October 12, 2025
Generated: 2025-10-12 17:03 UTC
Total Duration: 1h 48m 23s
Iterations: 1
Judge (classifier) model: azure/gpt-4.1
About this Benchmark
HolmesGPT is continuously evaluated against real-world Kubernetes and cloud troubleshooting scenarios.
If you find scenarios that HolmesGPT does not perform well on, please consider adding them as evals to the benchmark.
Model Accuracy Comparison
Model | Pass | Fail | Skip/Error | Total | Success Rate |
---|---|---|---|---|---|
gpt-4o | 52 | 41 | 12 | 105 | 🟡 56% (52/93) |
eu.anthropic.claude-sonnet-4-20250514-v1:0 | 82 | 13 | 10 | 105 | 🟡 86% (82/95) |
gpt-4.1 | 67 | 27 | 11 | 105 | 🟡 71% (67/94) |
gpt-5 | 74 | 20 | 11 | 105 | 🟡 79% (74/94) |
novita/deepseek/deepseek-v3.1-terminus | 75 | 20 | 10 | 105 | 🟡 79% (75/95) |
novita/qwen/qwen3-next-80b-a3b-instruct | 55 | 40 | 10 | 105 | 🟡 58% (55/95) |
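The figures in the Success Rate column are consistent with dividing passes by the tests that actually completed (pass + fail), leaving skipped or errored tests out of the denominator. The report does not state its formula, so the following is only a minimal sketch under that assumption; the function name `success_rate` is hypothetical and not part of the benchmark tooling.

```python
def success_rate(passed: int, failed: int) -> float:
    """Return the pass percentage over tests that completed (pass + fail).

    Assumption: skipped/errored tests are excluded from the denominator,
    which matches the ratios printed in the table above.
    """
    ran = passed + failed
    if ran == 0:
        raise ValueError("no completed tests")
    return 100.0 * passed / ran


# (pass, fail, skip/error) copied from the table above
results = {
    "gpt-4o": (52, 41, 12),
    "eu.anthropic.claude-sonnet-4-20250514-v1:0": (82, 13, 10),
    "gpt-4.1": (67, 27, 11),
    "gpt-5": (74, 20, 11),
    "novita/deepseek/deepseek-v3.1-terminus": (75, 20, 10),
    "novita/qwen/qwen3-next-80b-a3b-instruct": (55, 40, 10),
}

for model, (passed, failed, skipped) in results.items():
    rate = success_rate(passed, failed)
    print(f"{model}: {rate:.0f}% ({passed}/{passed + failed}), "
          f"{skipped} skipped or errored")
```

Because the denominator differs per model (93, 94, or 95), the percentages are not directly comparable as counts out of 105.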
Model Cost Comparison
Model | Tests | Avg Cost | Min Cost | Max Cost | Total Cost |
---|---|---|---|---|---|
gpt-4o | 93 | $0.18 | $0.03 | $1.00 | $16.67 |
eu.anthropic.claude-sonnet-4-20250514-v1:0 | 93 | $0.25 | $0.06 | $1.01 | $22.85 |
gpt-4.1 | 94 | $0.11 | $0.02 | $0.66 | $10.68 |
gpt-5 | 94 | $0.19 | $0.02 | $0.59 | $17.43 |
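The Avg Cost column appears to be Total Cost divided by Tests, rounded to the cent. That relationship is an inference from the numbers rather than something the report states, so treat this sketch as a sanity check only; `average_cost` is a hypothetical helper name.

```python
# (tests, total cost in USD) copied from the table above.
# The two novita-hosted models have no rows in the cost table.
costs = {
    "gpt-4o": (93, 16.67),
    "eu.anthropic.claude-sonnet-4-20250514-v1:0": (93, 22.85),
    "gpt-4.1": (94, 10.68),
    "gpt-5": (94, 17.43),
}


def average_cost(tests: int, total: float) -> float:
    """Mean cost per test, rounded to cents (assumed to match Avg Cost)."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    return round(total / tests, 2)


for model, (tests, total) in costs.items():
    print(f"{model}: ${average_cost(tests, total):.2f} per test over {tests} tests")
```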
Model Latency Comparison
Model | Avg (s) | Min (s) | Max (s) | P50 (s) | P95 (s) |
---|---|---|---|---|---|
gpt-4o | 26.6 | 8.0 | 67.1 | 25.9 | 55.3 |
eu.anthropic.claude-sonnet-4-20250514-v1:0 | 48.9 | 9.8 | 263.9 | 43.4 | 100.8 |
gpt-4.1 | 40.7 | 5.5 | 645.1 | 25.4 | 51.7 |
gpt-5 | 138.5 | 17.4 | 859.1 | 81.6 | 752.3 |
novita/deepseek/deepseek-v3.1-terminus | 75.5 | 21.1 | 221.1 | 71.1 | 142.2 |
novita/qwen/qwen3-next-80b-a3b-instruct | 82.8 | 12.0 | 1100.6 | 34.8 | 296.6 |
⚠️ Note: 7 test(s) excluded from latency calculations due to throttling/timeout errors (eu.anthropic.claude-sonnet-4-20250514-v1:0: 2, novita/qwen/qwen3-next-80b-a3b-instruct: 5)
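As a rough illustration of how such latency statistics can be derived once timed-out runs are dropped, the sketch below uses the nearest-rank percentile definition. The benchmark's actual percentile method is not documented here, so both the method and the `percentile` helper are assumptions. The sample is the first five Claude Sonnet 4 rows of the detailed results (evals 01 to 05), where 05_image_version is marked as a timeout.

```python
import math
import statistics


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# (latency in seconds, timed_out) for evals 01-05, Claude Sonnet 4 column
samples = [
    (21.4, False),
    (35.6, False),
    (29.0, False),
    (30.9, False),
    (621.2, True),  # marked as a timeout, so excluded below
]

# Drop throttled/timed-out runs before computing any statistic
kept = [seconds for seconds, timed_out in samples if not timed_out]

print(f"avg={statistics.mean(kept):.1f}s min={min(kept)}s max={max(kept)}s")
print(f"p50={statistics.median(kept):.2f}s p95={percentile(kept, 95)}s")
```

Excluding the 621.2s run matters: keeping it would dominate both the average and the maximum for this small sample.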
Performance by Tag
Success rate by test category and model:
Tag | gpt-4o | eu.anthropic.claude-sonnet-4-20250514-v1:0 | gpt-4.1 | gpt-5 | novita/deepseek/deepseek-v3.1-terminus | novita/qwen/qwen3-next-80b-a3b-instruct | Warnings |
---|---|---|---|---|---|---|---|
chain-of-causation | 🔴 0% (0/7) | 🟡 71% (5/7) | 🔴 0% (0/7) | 🟡 57% (4/7) | 🟡 29% (2/7) | 🟡 14% (1/7) | ⚠️ 6 skipped |
context_window | 🟡 14% (1/7) | 🟡 57% (4/7) | 🟡 57% (4/7) | 🟡 86% (6/7) | 🟡 57% (4/7) | 🟡 29% (2/7) | |
counting | 🟢 100% (4/4) | 🟢 100% (4/4) | 🟢 100% (4/4) | 🟡 75% (3/4) | 🟢 100% (4/4) | 🟢 100% (4/4) | |
database | 🔴 0% (0/1) | 🟢 100% (1/1) | 🔴 0% (0/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | ⚠️ 18 skipped |
datadog | 🟡 67% (2/3) | 🟡 75% (3/4) | 🟡 75% (3/4) | 🟡 75% (3/4) | 🟡 75% (3/4) | 🟡 75% (3/4) | ⚠️ 1 skipped |
datetime | 🟡 50% (2/4) | 🟡 50% (2/4) | 🟡 50% (2/4) | 🟢 100% (4/4) | 🟡 75% (3/4) | 🟡 50% (2/4) | ⚠️ 12 skipped |
easy | 🟡 91% (32/35) | 🟡 92% (33/36) | 🟡 97% (35/36) | 🟡 83% (30/36) | 🟡 94% (34/36) | 🟡 86% (31/36) | ⚠️ 1 skipped |
hard | 🟡 20% (3/15) | 🟡 80% (12/15) | 🟡 20% (3/15) | 🟡 47% (7/15) | 🟡 47% (7/15) | 🟡 40% (6/15) | ⚠️ 30 skipped |
kafka | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚠️ 12 skipped |
kubernetes | 🟡 49% (23/47) | 🟡 85% (40/47) | 🟡 66% (31/47) | 🟡 79% (37/47) | 🟡 72% (34/47) | 🟡 55% (26/47) | ⚠️ 6 skipped |
logs | 🟡 46% (12/26) | 🟡 78% (21/27) | 🟡 62% (16/26) | 🟡 74% (20/27) | 🟡 74% (20/27) | 🟡 44% (12/27) | ⚠️ 38 skipped |
medium | 🟡 40% (17/43) | 🟡 84% (37/44) | 🟡 67% (29/43) | 🟡 86% (37/43) | 🟡 77% (34/44) | 🟡 41% (18/44) | ⚠️ 33 skipped |
network | 🟡 75% (3/4) | 🟢 100% (4/4) | 🟡 75% (3/4) | 🟡 75% (3/4) | 🟢 100% (4/4) | 🟢 100% (4/4) | |
no-cicd | 🟢 100% (1/1) | 🔴 0% (0/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | |
numerical | 🟢 100% (1/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | 🟢 100% (1/1) | |
port-forward | 🟡 22% (2/9) | 🟡 67% (6/9) | 🟡 56% (5/9) | 🟡 67% (6/9) | 🟡 22% (2/9) | 🟡 22% (2/9) | |
prometheus | 🟡 25% (1/4) | 🟡 75% (3/4) | 🟢 100% (4/4) | 🟢 100% (4/4) | 🟡 50% (2/4) | 🟡 25% (1/4) | |
question-answer | 🟡 75% (3/4) | 🟢 100% (4/4) | 🟢 100% (4/4) | 🟢 100% (4/4) | 🟢 100% (4/4) | 🟡 75% (3/4) | |
runbooks | 🟡 67% (4/6) | 🟢 100% (6/6) | 🟡 83% (5/6) | 🟢 100% (6/6) | 🟢 100% (6/6) | 🟡 67% (4/6) | ⚠️ 6 skipped |
slackbot | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚠️ 6 skipped |
traces | 🔴 0% (0/5) | 🟡 60% (3/5) | 🔴 0% (0/5) | 🟡 80% (4/5) | 🔴 0% (0/5) | 🔴 0% (0/5) | |
transparency | 🟡 79% (11/14) | 🟡 93% (13/14) | 🟡 93% (13/14) | 🟡 86% (12/14) | 🟡 71% (10/14) | 🟡 43% (6/14) | ⚠️ 6 skipped |
Overall | 🟡 56% (52/93) | 🟡 86% (82/95) | 🟡 71% (67/94) | 🟡 79% (74/94) | 🟡 79% (75/95) | 🟡 58% (55/95) | ⚠️ 64 skipped |
Raw Results
Status of all evaluations across models. Color coding:
- 🟢 Passing 100% (stable)
- 🟡 Passing 1-99%
- 🔴 Passing 0% (failing)
- 🔧 Mock data failure (missing or invalid test data)
- ⚠️ Setup failure (environment/infrastructure issue)
- ⏱️ Timeout or rate limit error
- ⏭️ Test skipped (e.g., known issue or precondition not met)
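The first three legend entries amount to a simple mapping from pass counts to an icon, which might be sketched as below. This is a hedged illustration, not the report generator's code: the function and constant names are hypothetical, and the white circle for cells with no runs is an assumption based on how it appears in the tables, since the legend does not list it. The mock-data, setup, timeout, and skip icons depend on error information that pass counts alone do not carry, so they are not modelled.

```python
STABLE = "🟢"    # passing 100%
PARTIAL = "🟡"   # passing 1-99%
FAILING = "🔴"   # passing 0%
NOT_RUN = "⚪️"   # assumption: shown in the tables when no result exists


def status_icon(passed: int, total: int) -> str:
    """Map a pass count to the legend icon for that pass rate."""
    if passed < 0 or passed > total:
        raise ValueError("passed must be between 0 and total")
    if total == 0:
        return NOT_RUN
    if passed == total:
        return STABLE
    if passed == 0:
        return FAILING
    return PARTIAL


for passed, total in [(4, 4), (3, 4), (0, 7), (0, 0)]:
    print(f"{status_icon(passed, total)} {passed}/{total}")
```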
Detailed Raw Results
Eval ID | gpt-4o | eu.anthropic.claude-sonnet-4-20250514-v1:0 | gpt-4.1 | gpt-5 | novita/deepseek/deepseek-v3.1-terminus | novita/qwen/qwen3-next-80b-a3b-instruct |
---|---|---|---|---|---|---|
01_how_many_pods 🔗 | 🟢 100% (1/1) / ⏱️ 21.6s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 21.4s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 20.6s / 💰 $0.05 | 🟢 100% (1/1) / ⏱️ 26.4s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 27.3s | 🟢 100% (1/1) / ⏱️ 15.4s |
02_what_is_wrong_with_pod 🔗 | 🔴 0% (0/1) / ⏱️ 30.4s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 35.6s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 26.4s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 63.6s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 75.8s | 🟢 100% (1/1) / ⏱️ 31.6s |
03_what_is_the_command_to_port_forward 🔗 | 🟢 100% (1/1) / ⏱️ 26.0s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 29.0s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 26.3s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 58.9s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 43.8s | 🔴 0% (0/1) / ⏱️ 24.1s |
04_related_k8s_events 🔗 | 🟢 100% (1/1) / ⏱️ 18.2s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 30.9s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 22.4s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 51.7s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 44.6s | 🟢 100% (1/1) / ⏱️ 27.4s |
05_image_version 🔗 | 🟢 100% (1/1) / ⏱️ 16.2s / 💰 $0.07 | ⏱️ 0% (0/1) / ⏱️ 621.2s | 🟢 100% (1/1) / ⏱️ 23.3s / 💰 $0.07 | 🔴 0% (0/1) / ⏱️ 20.2s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 45.7s | 🟢 100% (1/1) / ⏱️ 22.1s |
08_sock_shop_frontend 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
09_crashpod 🔗 | 🟢 100% (1/1) / ⏱️ 55.3s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 43.7s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 631.4s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 53.6s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 74.5s | 🟢 100% (1/1) / ⏱️ 30.1s |
100a_historical_logs 🔗 | 🟢 100% (1/1) / ⏱️ 23.7s / 💰 $0.15 | 🔴 0% (0/1) / ⏱️ 83.8s / 💰 $0.35 | 🟢 100% (1/1) / ⏱️ 24.2s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 859.1s / 💰 $0.49 | 🔴 0% (0/1) / ⏱️ 86.2s | 🟢 100% (1/1) / ⏱️ 49.2s |
100b_historical_logs_nonstandard_label 🔗 | 🔴 0% (0/1) / ⏱️ 29.7s / 💰 $0.20 | 🔴 0% (0/1) / ⏱️ 94.3s / 💰 $0.39 | 🔴 0% (0/1) / ⏱️ 23.2s / 💰 $0.11 | 🔴 0% (0/1) / ⏱️ 171.0s / 💰 $0.28 | 🔴 0% (0/1) / ⏱️ 83.4s | 🔴 0% (0/1) / ⏱️ 77.2s |
101_historical_logs_pod_deleted 🔗 | 🔴 0% (0/1) / ⏱️ 22.2s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 60.1s / 💰 $0.32 | 🔴 0% (0/1) / ⏱️ 27.5s / 💰 $0.11 | 🔴 0% (0/1) / ⏱️ 191.0s / 💰 $0.32 | 🔴 0% (0/1) / ⏱️ 93.8s | 🔴 0% (0/1) / ⏱️ 164.4s |
103_logs_transparency_default_limit 🔗 | 🔴 0% (0/1) / ⏱️ 22.4s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 37.3s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 29.3s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 44.2s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 64.4s | 🔴 0% (0/1) / ⏱️ 65.1s |
104a_postgres_root_issue 🔗 | 🔴 0% (0/1) / ⏱️ 30.5s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 55.5s / 💰 $0.39 | 🔴 0% (0/1) / ⏱️ 48.6s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 191.3s / 💰 $0.48 | 🟢 100% (1/1) / ⏱️ 94.5s | 🟢 100% (1/1) / ⏱️ 62.8s |
104b_postgres_missing_index_pgstat 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
104c_postgres_minimal_missing_index 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
105_redis_wrong_data_structure 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
107_log_filter_http_status_code 🔗 | 🔴 0% (0/1) / ⏱️ 31.7s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 70.4s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 25.4s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 192.2s / 💰 $0.35 | 🟢 100% (1/1) / ⏱️ 123.2s | 🔴 0% (0/1) / ⏱️ 161.0s |
108_logs_nearby_lines 🔗 | 🔴 0% (0/1) / ⏱️ 29.3s / 💰 $0.22 | 🟢 100% (1/1) / ⏱️ 50.8s / 💰 $0.40 | 🔴 0% (0/1) / ⏱️ 32.6s / 💰 $0.30 | 🔴 0% (0/1) / ⏱️ 159.7s / 💰 $0.27 | 🔴 0% (0/1) / ⏱️ 106.3s | 🔴 0% (0/1) / ⏱️ 71.5s |
109_logs_transparency_not_found 🔗 | 🟢 100% (1/1) / ⏱️ 26.5s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 29.1s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 24.3s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 72.9s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 71.2s | 🔴 0% (0/1) / ⏱️ 49.0s |
10_image_pull_backoff 🔗 | 🟢 100% (1/1) / ⏱️ 30.4s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 44.6s / 💰 $0.25 | 🟢 100% (1/1) / ⏱️ 33.1s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 93.5s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 56.9s | 🟢 100% (1/1) / ⏱️ 71.5s |
110_k8s_events_image_pull 🔗 | 🟢 100% (1/1) / ⏱️ 23.4s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 31.2s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 23.8s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 46.3s / 💰 $0.05 | 🟢 100% (1/1) / ⏱️ 82.9s | 🟢 100% (1/1) / ⏱️ 31.2s |
111_disabled_datadog_traces 🔗 | 🟢 100% (1/1) / ⏱️ 14.9s / 💰 $0.03 | 🟢 100% (1/1) / ⏱️ 27.0s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 14.3s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 128.7s / 💰 $0.15 | 🔴 0% (0/1) / ⏱️ 66.3s | 🟢 100% (1/1) / ⏱️ 23.2s |
111_pod_names_contain_service 🔗 | 🟢 100% (1/1) / ⏱️ 25.1s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 56.4s / 💰 $0.25 | 🟢 100% (1/1) / ⏱️ 37.5s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 146.3s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 57.3s | 🟢 100% (1/1) / ⏱️ 34.8s |
112_find_pvcs_by_uuid 🔗 | 🔴 0% (0/1) / ⏱️ 17.3s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 41.7s / 💰 $0.23 | 🔴 0% (0/1) / ⏱️ 30.1s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 53.9s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 65.8s | 🔴 0% (0/1) / ⏱️ 36.7s |
114_checkout_latency_tracing_rebuild[0] 🔗 | 🔴 0% (0/1) / ⏱️ 65.7s / 💰 $1.00 | 🟢 100% (1/1) / ⏱️ 66.8s / 💰 $0.55 | 🔴 0% (0/1) / ⏱️ 27.6s / 💰 $0.17 | 🔴 0% (0/1) / ⏱️ 206.5s / 💰 $0.51 | 🔴 0% (0/1) / ⏱️ 221.1s | 🔴 0% (0/1) / ⏱️ 81.1s |
115_checkout_errors_tracing[0] 🔗 | 🔴 0% (0/1) / ⏱️ 67.1s / 💰 $0.77 | 🟢 100% (1/1) / ⏱️ 76.9s / 💰 $0.52 | 🔴 0% (0/1) / ⏱️ 645.1s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 167.7s / 💰 $0.40 | 🔴 0% (0/1) / ⏱️ 135.7s | 🔴 0% (0/1) / ⏱️ 81.0s |
11_init_containers 🔗 | 🟢 100% (1/1) / ⏱️ 25.4s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 44.8s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 27.8s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 90.8s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 69.0s | 🟢 100% (1/1) / ⏱️ 33.2s |
121_new_relic_checkout_errors_tracing[0] 🔗 | 🔴 0% (0/1) / ⏱️ 25.9s / 💰 $0.16 | 🔴 0% (0/1) / ⏱️ 100.3s / 💰 $0.44 | 🔴 0% (0/1) / ⏱️ 21.2s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 208.5s / 💰 $0.35 | 🔴 0% (0/1) / ⏱️ 66.4s | 🔴 0% (0/1) / ⏱️ 240.6s |
122_new_relic_checkout_latency_tracing_rebuild[0] 🔗 | 🔴 0% (0/1) / ⏱️ 23.0s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 105.7s / 💰 $0.55 | 🔴 0% (0/1) / ⏱️ 33.1s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 153.9s / 💰 $0.35 | 🔴 0% (0/1) / ⏱️ 146.8s | 🔴 0% (0/1) / ⏱️ 956.2s |
123_new_relic_checkout_errors_tracing[0] 🔗 | 🔴 0% (0/1) / ⏱️ 17.4s / 💰 $0.07 | 🔴 0% (0/1) / ⏱️ 124.9s / 💰 $0.69 | 🔴 0% (0/1) / ⏱️ 24.8s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 225.8s / 💰 $0.58 | 🔴 0% (0/1) / ⏱️ 38.4s | 🔴 0% (0/1) / ⏱️ 22.8s |
12_job_crashing 🔗 | 🟢 100% (1/1) / ⏱️ 28.1s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 49.4s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 28.5s / 💰 $0.11 | 🔴 0% (0/1) / ⏱️ 82.8s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 112.4s | ⏱️ 0% (0/1) / ⏱️ 489.4s |
13a_pending_node_selector_basic 🔗 | 🟢 100% (1/1) / ⏱️ 25.4s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 41.3s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 28.1s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 99.5s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 87.1s | 🔴 0% (0/1) / ⏱️ 404.9s |
13b_pending_node_selector_detailed 🔗 | 🔴 0% (0/1) / ⏱️ 30.8s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 43.7s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 33.5s / 💰 $0.11 | 🔴 0% (0/1) / ⏱️ 21.8s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 91.1s | 🔴 0% (0/1) / ⏱️ 18.3s |
14_pending_resources 🔗 | 🟢 100% (1/1) / ⏱️ 33.7s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 48.2s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 24.2s / 💰 $0.11 | 🔴 0% (0/1) / ⏱️ 24.3s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 77.0s | 🔴 0% (0/1) / ⏱️ 74.3s |
156_kafka_opensearch_latency 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
159_prometheus_high_cardinality_cpu[0] 🔗 | 🔴 0% (0/1) / ⏱️ 60.9s / 💰 $0.67 | 🟢 100% (1/1) / ⏱️ 43.9s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 28.9s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 163.2s / 💰 $0.22 | 🔴 0% (0/1) / ⏱️ 63.2s | 🔴 0% (0/1) / ⏱️ 34.7s |
159_prometheus_high_cardinality_cpu[1] 🔗 | 🔴 0% (0/1) / ⏱️ 31.9s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 45.1s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 25.0s / 💰 $0.22 | 🟢 100% (1/1) / ⏱️ 108.4s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 168.3s | 🔴 0% (0/1) / ⏱️ 41.9s |
159_prometheus_high_cardinality_cpu[2] 🔗 | 🔴 0% (0/1) / ⏱️ 30.4s / 💰 $0.25 | 🟢 100% (1/1) / ⏱️ 39.0s / 💰 $0.36 | 🟢 100% (1/1) / ⏱️ 25.0s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 81.4s / 💰 $0.10 | 🔴 0% (0/1) / ⏱️ 64.5s | 🔴 0% (0/1) / ⏱️ 296.6s |
15_failed_readiness_probe 🔗 | 🟢 100% (1/1) / ⏱️ 28.2s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 46.3s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 23.8s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 71.4s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 77.8s | 🟢 100% (1/1) / ⏱️ 31.7s |
16_failed_no_toolset_found 🔗 | 🔴 0% (0/1) / ⏱️ 20.8s / 💰 $0.07 | 🔴 0% (0/1) / ⏱️ 19.2s / 💰 $0.06 | 🔴 0% (0/1) / ⏱️ 15.9s / 💰 $0.03 | 🔴 0% (0/1) / ⏱️ 39.8s / 💰 $0.04 | 🔴 0% (0/1) / ⏱️ 91.0s | 🔴 0% (0/1) / ⏱️ 15.3s |
17_oom_kill 🔗 | 🟢 100% (1/1) / ⏱️ 26.0s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 43.4s / 💰 $0.25 | 🟢 100% (1/1) / ⏱️ 23.4s / 💰 $0.05 | 🟢 100% (1/1) / ⏱️ 112.1s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 68.3s | 🟢 100% (1/1) / ⏱️ 43.7s |
19_detect_missing_app_details 🔗 | 🟢 100% (1/1) / ⏱️ 29.5s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 69.4s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 38.7s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 65.8s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 75.0s | 🟢 100% (1/1) / ⏱️ 710.0s |
20_long_log_file_search 🔗 | 🟢 100% (1/1) / ⏱️ 33.7s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 69.7s / 💰 $0.25 | 🟢 100% (1/1) / ⏱️ 29.3s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 81.6s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 111.8s | 🟢 100% (1/1) / ⏱️ 29.0s |
21_job_fail_curl_no_svc_account 🔗 | 🟢 100% (1/1) / ⏱️ 30.3s / 💰 $0.22 | 🟢 100% (1/1) / ⏱️ 39.9s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 26.5s / 💰 $0.09 | 🔴 0% (0/1) / ⏱️ 19.8s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 67.5s | 🟢 100% (1/1) / ⏱️ 35.9s |
22_high_latency_dbi_down 🔗 | 🔴 0% (0/1) / ⏱️ 55.4s / 💰 $0.57 | 🟢 100% (1/1) / ⏱️ 78.8s / 💰 $0.51 | 🔴 0% (0/1) / ⏱️ 31.4s / 💰 $0.16 | 🔴 0% (0/1) / ⏱️ 118.0s / 💰 $0.24 | 🟢 100% (1/1) / ⏱️ 87.5s | 🔴 0% (0/1) / ⏱️ 17.2s |
23_app_error_in_current_logs 🔗 | 🟢 100% (1/1) / ⏱️ 29.1s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 44.9s / 💰 $0.26 | 🟢 100% (1/1) / ⏱️ 36.1s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 151.8s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 82.7s | 🟢 100% (1/1) / ⏱️ 78.2s |
24_misconfigured_pvc 🔗 | 🟢 100% (1/1) / ⏱️ 29.1s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 50.7s / 💰 $0.24 | 🟢 100% (1/1) / ⏱️ 32.2s / 💰 $0.14 | 🔴 0% (0/1) / ⏱️ 18.8s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 97.4s | 🟢 100% (1/1) / ⏱️ 31.1s |
24a_misconfigured_pvc_basic 🔗 | 🟢 100% (1/1) / ⏱️ 33.4s / 💰 $0.23 | 🔴 0% (0/1) / ⏱️ 38.2s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 27.2s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 91.2s / 💰 $0.15 | 🔴 0% (0/1) / ⏱️ 34.0s | 🟢 100% (1/1) / ⏱️ 40.2s |
24b_misconfigured_pvc_detailed 🔗 | 🔴 0% (0/1) / ⏱️ 23.8s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 41.3s / 💰 $0.13 | 🔴 0% (0/1) / ⏱️ 31.2s / 💰 $0.11 | 🔴 0% (0/1) / ⏱️ 19.6s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 88.1s | 🟢 100% (1/1) / ⏱️ 32.7s |
25_misconfigured_ingress_class 🔗 | 🔴 0% (0/1) / ⏱️ 13.5s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 263.9s / 💰 $0.42 | 🔴 0% (0/1) / ⏱️ 14.5s / 💰 $0.06 | 🔴 0% (0/1) / ⏱️ 243.1s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 112.8s | 🟢 100% (1/1) / ⏱️ 201.1s |
26_page_render_times 🔗 | 🟢 100% (1/1) / ⏱️ 23.8s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 40.9s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 22.7s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 771.7s / 💰 $0.32 | 🟢 100% (1/1) / ⏱️ 64.3s | 🟢 100% (1/1) / ⏱️ 26.3s |
27a_multi_container_logs 🔗 | 🔴 0% (0/1) / ⏱️ 12.6s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 30.5s / 💰 $0.12 | 🔴 0% (0/1) / ⏱️ 11.8s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 61.5s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 56.4s | 🔴 0% (0/1) / ⏱️ 38.5s |
27b_multi_container_logs 🔗 | 🟢 100% (1/1) / ⏱️ 21.4s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 31.7s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 27.3s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 76.7s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 37.5s | 🟢 100% (1/1) / ⏱️ 22.7s |
28_permissions_error 🔗 | 🟢 100% (1/1) / ⏱️ 15.5s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 19.7s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 18.2s / 💰 $0.06 | 🔴 0% (0/1) / ⏱️ 40.2s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 29.7s | 🔴 0% (0/1) / ⏱️ 25.3s |
33_cpu_metrics_discovery 🔗 | 🟢 100% (1/1) / ⏱️ 20.6s / 💰 $0.13 | ⏱️ 0% (0/1) / ⏱️ 622.2s | 🟢 100% (1/1) / ⏱️ 23.7s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 69.7s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 148.6s | 🟢 100% (1/1) / ⏱️ 17.5s |
39_failed_toolset 🔗 | 🟢 100% (1/1) / ⏱️ 17.7s / 💰 $0.03 | 🟢 100% (1/1) / ⏱️ 45.2s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 17.2s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 147.5s / 💰 $0.34 | 🔴 0% (0/1) / ⏱️ 78.1s | 🔴 0% (0/1) / ⏱️ 692.3s |
41_setup_argo 🔗 | 🟢 100% (1/1) / ⏱️ 14.9s / 💰 $0.03 | 🟢 100% (1/1) / ⏱️ 19.9s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 17.0s / 💰 $0.03 | 🟢 100% (1/1) / ⏱️ 70.2s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 37.5s | 🔴 0% (0/1) / ⏱️ 140.8s |
42_dns_issues_result_new_tools_no_runbook 🔗 | 🟢 100% (1/1) / ⏱️ 27.0s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 80.7s / 💰 $0.53 | 🟢 100% (1/1) / ⏱️ 39.4s / 💰 $0.29 | 🟢 100% (1/1) / ⏱️ 166.8s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 111.3s | 🟢 100% (1/1) / ⏱️ 105.4s |
42_dns_issues_steps_new_tools 🔗 | 🟢 100% (1/1) / ⏱️ 46.1s / 💰 $0.26 | 🟢 100% (1/1) / ⏱️ 100.8s / 💰 $0.33 | 🟢 100% (1/1) / ⏱️ 51.7s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 260.1s / 💰 $0.22 | 🟢 100% (1/1) / ⏱️ 109.7s | 🟢 100% (1/1) / ⏱️ 105.5s |
43_current_datetime_from_prompt 🔗 | 🟢 100% (1/1) / ⏱️ 15.2s / 💰 $0.03 | 🟢 100% (1/1) / ⏱️ 14.8s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 16.5s / 💰 $0.04 | 🟢 100% (1/1) / ⏱️ 29.8s / 💰 $0.02 | 🟢 100% (1/1) / ⏱️ 26.1s | 🟢 100% (1/1) / ⏱️ 22.3s |
43_slack_deployment_logs 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
44_slack_statefulset_logs 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
45_fetch_deployment_logs_simple 🔗 | 🔴 0% (0/1) / ⏱️ 42.0s / 💰 $0.40 | 🟢 100% (1/1) / ⏱️ 31.2s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 24.5s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 652.0s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 55.6s | 🟢 100% (1/1) / ⏱️ 24.3s |
48_logs_since_thursday 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
50_logs_since_specific_date 🔗 | ⚪️ - | 🟢 100% (1/1) / ⏱️ 23.6s / 💰 $0.12 | ⚪️ - | 🟢 100% (1/1) / ⏱️ 41.4s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 37.8s | 🟢 100% (1/1) / ⏱️ 14.7s |
50a_logs_since_last_specific_month 🔗 | 🔴 0% (0/1) / ⏱️ 24.4s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 38.9s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 20.0s / 💰 $0.05 | 🟢 100% (1/1) / ⏱️ 90.4s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 84.0s | 🔴 0% (0/1) / ⏱️ 17.2s |
51_logs_summarize_errors 🔗 | 🟢 100% (1/1) / ⏱️ 25.6s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 33.8s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 26.9s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 49.7s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 60.2s | 🟢 100% (1/1) / ⏱️ 41.3s |
52_logs_login_issues 🔗 | 🔴 0% (0/1) / ⏱️ 27.6s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 45.2s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 32.0s / 💰 $0.14 | 🔴 0% (0/1) / ⏱️ 26.7s / 💰 $0.03 | 🟢 100% (1/1) / ⏱️ 69.1s | 🔴 0% (0/1) / ⏱️ 1100.6s |
53_logs_find_term 🔗 | 🟢 100% (1/1) / ⏱️ 28.1s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 35.6s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 25.4s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 52.1s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 60.0s | 🟢 100% (1/1) / ⏱️ 26.3s |
54_not_truncated_when_getting_pods 🔗 | 🟢 100% (1/1) / ⏱️ 17.5s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 42.0s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 31.4s / 💰 $0.13 | 🔴 0% (0/1) / ⏱️ 770.4s / 💰 $0.59 | 🟢 100% (1/1) / ⏱️ 71.1s | 🟢 100% (1/1) / ⏱️ 28.9s |
55_kafka_runbook 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
57_wrong_namespace 🔗 | 🔴 0% (0/1) / ⏱️ 23.2s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 33.2s / 💰 $0.16 | 🔴 0% (0/1) / ⏱️ 23.0s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 75.3s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 52.2s | 🔴 0% (0/1) / ⏱️ 654.0s |
59_label_based_counting 🔗 | 🟢 100% (1/1) / ⏱️ 19.7s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 21.9s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 21.2s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 27.7s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 34.7s | 🟢 100% (1/1) / ⏱️ 22.2s |
60_count_less_than 🔗 | 🟢 100% (1/1) / ⏱️ 20.3s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 22.6s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 20.8s / 💰 $0.06 | 🔴 0% (0/1) / ⏱️ 29.4s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 60.8s | 🟢 100% (1/1) / ⏱️ 20.2s |
61_exact_match_counting 🔗 | 🟢 100% (1/1) / ⏱️ 20.9s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 22.7s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 21.7s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 28.5s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 32.4s | 🟢 100% (1/1) / ⏱️ 24.5s |
62_fetch_error_logs_with_errors 🔗 | 🟢 100% (1/1) / ⏱️ 25.6s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 31.8s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 24.5s / 💰 $0.07 | 🔴 0% (0/1) / ⏱️ 27.6s / 💰 $0.03 | 🟢 100% (1/1) / ⏱️ 41.8s | 🟢 100% (1/1) / ⏱️ 38.5s |
63_fetch_error_logs_no_errors 🔗 | 🟢 100% (1/1) / ⏱️ 24.1s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 34.9s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 23.8s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 48.8s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 57.6s | 🟢 100% (1/1) / ⏱️ 24.2s |
64_keda_vs_hpa_confusion 🔗 | 🔴 0% (0/1) / ⏱️ 15.0s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 61.9s / 💰 $0.32 | 🔴 0% (0/1) / ⏱️ 20.3s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 96.1s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 68.2s | 🔴 0% (0/1) / ⏱️ 21.0s |
65_health_check_followup 🔗 | 🟢 100% (1/1) / ⏱️ 26.7s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 62.2s / 💰 $0.41 | 🟢 100% (1/1) / ⏱️ 33.9s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 129.9s / 💰 $0.31 | 🟢 100% (1/1) / ⏱️ 89.4s | 🟢 100% (1/1) / ⏱️ 188.2s |
71_connection_pool_starvation 🔗 | 🔴 0% (0/1) / ⏱️ 30.3s / 💰 $0.20 | 🔴 0% (0/1) / ⏱️ 47.1s / 💰 $0.40 | 🔴 0% (0/1) / ⏱️ 25.2s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 62.5s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 117.1s | 🟢 100% (1/1) / ⏱️ 128.1s |
73a_time_window_anomaly 🔗 | 🔴 0% (0/1) / ⏱️ 29.9s / 💰 $0.22 | 🔴 0% (0/1) / ⏱️ 48.6s / 💰 $0.23 | 🔴 0% (0/1) / ⏱️ 30.9s / 💰 $0.22 | 🟢 100% (1/1) / ⏱️ 57.6s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 62.2s | 🔴 0% (0/1) / ⏱️ 28.9s |
73b_time_window_anomaly 🔗 | 🔴 0% (0/1) / ⏱️ 28.0s / 💰 $0.17 | 🔴 0% (0/1) / ⏱️ 40.9s / 💰 $0.37 | 🔴 0% (0/1) / ⏱️ 31.9s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 70.3s / 💰 $0.16 | 🔴 0% (0/1) / ⏱️ 59.1s | 🔴 0% (0/1) / ⏱️ 30.3s |
76_service_discovery_issue 🔗 | 🔴 0% (0/1) / ⏱️ 42.4s / 💰 $0.38 | 🟢 100% (1/1) / ⏱️ 47.7s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 164.0s / 💰 $0.36 | 🟢 100% (1/1) / ⏱️ 86.5s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 71.6s | 🔴 0% (0/1) / ⏱️ 47.2s |
77_liveness_probe_misconfiguration 🔗 | 🔴 0% (0/1) / ⏱️ 20.0s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 49.1s / 💰 $0.24 | 🔴 0% (0/1) / ⏱️ 25.4s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 102.5s / 💰 $0.25 | 🟢 100% (1/1) / ⏱️ 77.9s | 🟢 100% (1/1) / ⏱️ 34.7s |
78a_missing_cpu_limits 🔗 | 🔴 0% (0/1) / ⏱️ 22.3s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 36.6s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 27.2s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 92.4s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 85.5s | ⏱️ 0% (0/1) / ⏱️ 307.2s |
78b_cpu_quota_exceeded 🔗 | 🔴 0% (0/1) / ⏱️ 26.6s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 49.4s / 💰 $0.21 | 🔴 0% (0/1) / ⏱️ 26.8s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 94.9s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 82.9s | 🔴 0% (0/1) / ⏱️ 39.9s |
79_configmap_mount_issue 🔗 | 🟢 100% (1/1) / ⏱️ 32.5s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 37.6s / 💰 $0.18 | 🟢 100% (1/1) / ⏱️ 25.2s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 51.5s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 47.5s | 🟢 100% (1/1) / ⏱️ 27.6s |
80_pvc_storage_class_mismatch 🔗 | 🔴 0% (0/1) / ⏱️ 32.3s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 43.7s / 💰 $0.20 | 🔴 0% (0/1) / ⏱️ 30.7s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 104.7s / 💰 $0.23 | 🔴 0% (0/1) / ⏱️ 54.2s | 🟢 100% (1/1) / ⏱️ 104.6s |
81_service_account_permission_denied 🔗 | 🟢 100% (1/1) / ⏱️ 26.0s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 53.9s / 💰 $0.27 | 🟢 100% (1/1) / ⏱️ 34.6s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 73.2s / 💰 $0.26 | 🟢 100% (1/1) / ⏱️ 103.4s | 🟢 100% (1/1) / ⏱️ 30.6s |
82_pod_anti_affinity_conflict 🔗 | 🔴 0% (0/1) / ⏱️ 26.2s / 💰 $0.18 | 🔴 0% (0/1) / ⏱️ 60.0s / 💰 $0.24 | 🔴 0% (0/1) / ⏱️ 20.5s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 106.8s / 💰 $0.19 | 🔴 0% (0/1) / ⏱️ 103.4s | 🔴 0% (0/1) / ⏱️ 43.4s |
83_secret_not_found 🔗 | 🔴 0% (0/1) / ⏱️ 31.3s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 42.7s / 💰 $0.20 | 🔴 0% (0/1) / ⏱️ 22.7s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 752.3s / 💰 $0.33 | 🟢 100% (1/1) / ⏱️ 70.7s | 🟢 100% (1/1) / ⏱️ 51.0s |
84_network_policy_blocking_traffic 🔗 | 🟢 100% (1/1) / ⏱️ 32.7s / 💰 $0.24 | 🟢 100% (1/1) / ⏱️ 60.2s / 💰 $0.23 | 🟢 100% (1/1) / ⏱️ 31.1s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 82.4s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 107.2s | 🟢 100% (1/1) / ⏱️ 77.7s |
85_hpa_not_scaling 🔗 | 🔴 0% (0/1) / ⏱️ 22.9s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 43.4s / 💰 $0.16 | 🟢 100% (1/1) / ⏱️ 23.2s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 75.4s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 80.1s | 🟢 100% (1/1) / ⏱️ 28.2s |
86_configmap_like_but_secret 🔗 | 🟢 100% (1/1) / ⏱️ 29.9s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 42.6s / 💰 $0.21 | 🟢 100% (1/1) / ⏱️ 27.8s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 140.3s / 💰 $0.28 | 🟢 100% (1/1) / ⏱️ 85.7s | 🟢 100% (1/1) / ⏱️ 46.6s |
89_runbook_missing_cloudwatch 🔗 | 🟢 100% (1/1) / ⏱️ 21.6s / 💰 $0.09 | 🟢 100% (1/1) / ⏱️ 31.3s / 💰 $0.10 | 🟢 100% (1/1) / ⏱️ 18.7s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 47.1s / 💰 $0.04 | 🟢 100% (1/1) / ⏱️ 56.7s | 🟢 100% (1/1) / ⏱️ 101.1s |
90_runbook_basic_selection 🔗 | 🟢 100% (1/1) / ⏱️ 33.2s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 118.3s / 💰 $1.01 | 🟢 100% (1/1) / ⏱️ 174.1s / 💰 $0.66 | 🟢 100% (1/1) / ⏱️ 247.8s / 💰 $0.39 | 🟢 100% (1/1) / ⏱️ 133.5s | 🟢 100% (1/1) / ⏱️ 130.8s |
91f_datadog_logs_historical_pod 🔗 | 🔴 0% (0/1) / ⏱️ 14.9s / 💰 $0.04 | 🔴 0% (0/1) / ⏱️ 85.8s / 💰 $0.30 | 🔴 0% (0/1) / ⏱️ 31.8s / 💰 $0.14 | 🔴 0% (0/1) / ⏱️ 138.3s / 💰 $0.21 | 🔴 0% (0/1) / ⏱️ 74.6s | 🔴 0% (0/1) / ⏱️ 150.4s |
93_calling_datadog[0] 🔗 | 🟢 100% (1/1) / ⏱️ 16.2s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 10.3s / 💰 $0.27 | 🟢 100% (1/1) / ⏱️ 7.1s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 29.1s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 30.9s | 🟢 100% (1/1) / ⏱️ 17.9s |
93_calling_datadog[1] 🔗 | ⚪️ - | 🟢 100% (1/1) / ⏱️ 9.8s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 6.7s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 21.6s / 💰 $0.12 | 🟢 100% (1/1) / ⏱️ 21.1s | 🟢 100% (1/1) / ⏱️ 12.1s |
93_calling_datadog[2] 🔗 | 🟢 100% (1/1) / ⏱️ 14.8s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 10.1s / 💰 $0.15 | 🟢 100% (1/1) / ⏱️ 5.5s / 💰 $0.07 | 🟢 100% (1/1) / ⏱️ 25.8s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 22.2s | 🟢 100% (1/1) / ⏱️ 12.0s |
93_events_since_specific_date 🔗 | 🟢 100% (1/1) / ⏱️ 10.4s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 17.0s / 💰 $0.17 | 🟢 100% (1/1) / ⏱️ 8.9s / 💰 $0.06 | ⚪️ - | 🟢 100% (1/1) / ⏱️ 23.9s | 🟢 100% (1/1) / ⏱️ 12.6s |
94_runbook_transparency 🔗 | 🔴 0% (0/1) / ⏱️ 28.8s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 61.2s / 💰 $0.29 | 🟢 100% (1/1) / ⏱️ 30.6s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 810.0s / 💰 $0.55 | 🟢 100% (1/1) / ⏱️ 112.1s | 🔴 0% (0/1) / ⏱️ 85.4s |
96_no_matching_runbook 🔗 | 🔴 0% (0/1) / ⏱️ 27.5s / 💰 $0.19 | 🟢 100% (1/1) / ⏱️ 72.9s / 💰 $0.69 | 🔴 0% (0/1) / ⏱️ 32.4s / 💰 $0.13 | 🟢 100% (1/1) / ⏱️ 126.5s / 💰 $0.20 | 🟢 100% (1/1) / ⏱️ 84.9s | 🔴 0% (0/1) / ⏱️ 15.1s |
97_logs_clarification_needed 🔗 | 🟢 100% (1/1) / ⏱️ 8.0s / 💰 $0.03 | 🟢 100% (1/1) / ⏱️ 35.1s / 💰 $0.14 | 🟢 100% (1/1) / ⏱️ 15.3s / 💰 $0.05 | 🟢 100% (1/1) / ⏱️ 17.4s / 💰 $0.02 | 🔴 0% (0/1) / ⏱️ 142.2s | 🔴 0% (0/1) / ⏱️ 81.3s |
98_logs_transparency_default_time 🔗 | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - | ⚪️ - |
99_logs_transparency_custom_time 🔗 | 🟢 100% (1/1) / ⏱️ 22.5s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 33.8s / 💰 $0.11 | 🟢 100% (1/1) / ⏱️ 21.5s / 💰 $0.06 | 🟢 100% (1/1) / ⏱️ 37.0s / 💰 $0.08 | 🟢 100% (1/1) / ⏱️ 59.6s | 🟢 100% (1/1) / ⏱️ 34.2s |
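Each cell in the table above packs a pass ratio, a latency, and (for some models) a cost into one string. For readers who want to post-process the table, the following is a minimal parsing sketch. The `Cell` type and `parse_cell` function are hypothetical names, and the cell layout is inferred from the rows above rather than from any published schema.

```python
import re
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    passed: int
    total: int
    seconds: Optional[float]   # None if no latency is shown
    cost_usd: Optional[float]  # None if no cost is shown


_RATIO = re.compile(r"\((\d+)/(\d+)\)")       # e.g. "(1/1)"
_SECONDS = re.compile(r"(\d+(?:\.\d+)?)s\b")  # e.g. "21.6s"
_COST = re.compile(r"\$(\d+(?:\.\d+)?)")      # e.g. "$0.11"


def parse_cell(text: str) -> Optional[Cell]:
    """Parse one results cell; return None when it has no pass ratio
    (cells shown as a white circle followed by a dash)."""
    ratio = _RATIO.search(text)
    if ratio is None:
        return None
    seconds = _SECONDS.search(text)
    cost = _COST.search(text)
    return Cell(
        passed=int(ratio.group(1)),
        total=int(ratio.group(2)),
        seconds=float(seconds.group(1)) if seconds else None,
        cost_usd=float(cost.group(1)) if cost else None,
    )


# Cells copied from the 01_how_many_pods and 08_sock_shop_frontend rows
print(parse_cell("🟢 100% (1/1) / ⏱️ 21.6s / 💰 $0.11"))
print(parse_cell("🟢 100% (1/1) / ⏱️ 27.3s"))
print(parse_cell("⚪️ -"))
```

Searching for each field independently, rather than matching the whole cell with one pattern, keeps the parser tolerant of cells that omit the cost segment.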
Results are automatically generated and updated weekly. Full traces and detailed analysis are available in the Braintrust experiment local-benchmark-20251012-151418.