HolmesGPT LLM Evaluation Benchmark Results

Generated: 2025-09-30 15:37 UTC

Total Duration: 6h 16m 42s

Iterations: 5

Judge (classifier) model: gpt-4o

About this Benchmark

HolmesGPT is continuously evaluated against real-world Kubernetes and cloud troubleshooting scenarios.

If you find scenarios that HolmesGPT does not perform well on, please consider adding them as evals to the benchmark.

Model Accuracy Comparison

Model Pass Fail Skip/Error Total Success Rate
gpt-4o 295 174 56 525 🟡 63% (295/469)
gpt-4.1 346 122 57 525 🟡 74% (346/468)
gpt-5 360 104 61 525 🟡 78% (360/464)
sonnet-4-20250514 419 51 55 525 🟡 89% (419/470)
sonnet-4-5-20250929 420 50 55 525 🟡 89% (420/470)
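
The Success Rate column above appears to be computed as Pass / (Pass + Fail), with skipped and errored runs excluded from the denominator. The sketch below reproduces that arithmetic from the table's own figures; the `ModelResult` class and `format_rate` helper are hypothetical names used only for illustration and are not part of HolmesGPT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelResult:
    """One row of the accuracy table (hypothetical container)."""
    model: str
    passed: int
    failed: int
    skipped_or_error: int

    @property
    def total(self) -> int:
        # Every run, including skipped and errored ones.
        return self.passed + self.failed + self.skipped_or_error

    @property
    def scored(self) -> int:
        # Runs that count toward the success rate.
        return self.passed + self.failed


def format_rate(result: ModelResult) -> str:
    """Render a rate the way the table does, e.g. '63% (295/469)'."""
    if result.scored == 0:
        return "-"
    percent = round(100 * result.passed / result.scored)
    return f"{percent}% ({result.passed}/{result.scored})"


# Figures copied from the table above.
RESULTS = [
    ModelResult("gpt-4o", 295, 174, 56),
    ModelResult("gpt-4.1", 346, 122, 57),
    ModelResult("gpt-5", 360, 104, 61),
    ModelResult("sonnet-4-20250514", 419, 51, 55),
    ModelResult("sonnet-4-5-20250929", 420, 50, 55),
]

if __name__ == "__main__":
    for row in RESULTS:
        print(f"{row.model}: {format_rate(row)} of {row.total} runs")
```

Under this reading, a model's rate is unaffected by how many of its runs were skipped, which is why the denominators differ slightly between models even though each has 525 total runs.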

Model Cost Comparison

Model Tests Avg Cost Min Cost Max Cost Total Cost
gpt-4o 468 $0.14 $0.01 $0.85 $64.90
gpt-4.1 468 $0.11 $0.02 $1.07 $52.00
gpt-5 464 $0.13 $0.02 $0.58 $61.76
sonnet-4-20250514 468 $0.17 $0.06 $1.05 $80.54
sonnet-4-5-20250929 467 $0.16 $0.06 $0.64 $75.56

Model Latency Comparison

Model Avg (s) Min (s) Max (s) P50 (s) P95 (s)
gpt-4o 49.0 8.9 278.2 43.5 94.7
gpt-4.1 53.8 5.2 236.8 48.2 109.3
gpt-5 190.3 22.5 1136.0 158.1 442.5
sonnet-4-20250514 89.6 10.4 879.7 64.8 231.5
sonnet-4-5-20250929 73.0 10.6 663.3 60.0 154.6
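
The report does not state how the P50 and P95 columns are derived. As a rough illustration only, the sketch below uses the nearest-rank method, which is one common way to compute percentiles; the `percentile` function and the sample latencies are hypothetical and are not taken from the benchmark.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    # Made-up per-test latencies in seconds, for illustration only.
    latencies = [12.0, 30.0, 45.0, 60.0, 200.0]
    print("P50:", percentile(latencies, 50))
    print("P95:", percentile(latencies, 95))
```

A single slow run dominates P95 and Max far more than it moves P50, which is worth keeping in mind when comparing the Avg and P50 columns above.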

Performance by Tag

Success rate by test category and model:

Tag gpt-4o gpt-4.1 gpt-5 sonnet-4-20250514 sonnet-4-5-20250929 Warnings
chain-of-causation 🔴 0% (0/30) 🟡 3% (1/30) 🟡 40% (12/30) 🟡 63% (19/30) 🟡 70% (21/30) ⚠️ 50 skipped
context_window 🟡 57% (20/35) 🟡 77% (27/35) 🟡 83% (29/35) 🟡 86% (30/35) 🟡 77% (27/35)
counting 🟢 100% (20/20) 🟢 100% (20/20) 🟡 95% (19/20) 🟢 100% (20/20) 🟢 100% (20/20)
database 🔴 0% (0/5) 🟡 60% (3/5) 🟢 100% (5/5) 🟢 100% (5/5) 🟢 100% (5/5) ⚠️ 75 skipped
datadog 🟡 75% (15/20) 🟡 80% (16/20) 🟡 95% (18/19) 🟢 100% (20/20) 🟢 100% (20/20) ⚠️ 1 skipped
datetime 🟡 65% (13/20) 🟡 65% (13/20) 🟡 95% (19/20) 🟡 75% (15/20) 🟡 85% (17/20) ⚠️ 50 skipped
easy 🟡 97% (175/180) 🟡 96% (173/180) 🟡 80% (144/179) 🟡 97% (174/180) 🟡 96% (172/180) ⚠️ 1 skipped
hard 🟡 11% (8/70) 🟡 29% (20/70) 🟡 57% (40/70) 🟡 77% (54/70) 🟡 80% (56/70) ⚠️ 150 skipped
kafka ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚠️ 50 skipped
kubernetes 🟡 55% (129/235) 🟡 71% (168/235) 🟡 69% (163/235) 🟡 89% (208/235) 🟡 87% (205/235) ⚠️ 25 skipped
logs 🟡 62% (80/130) 🟡 67% (87/129) 🟡 77% (100/130) 🟡 75% (98/130) 🟡 82% (106/130) ⚠️ 176 skipped
medium 🟡 51% (112/219) 🟡 70% (153/218) 🟡 82% (176/215) 🟡 87% (191/220) 🟡 87% (192/220) ⚠️ 133 skipped
network 🟡 45% (9/20) 🟡 60% (12/20) 🟡 85% (17/20) 🟢 100% (20/20) 🟢 100% (20/20)
numerical 🟢 100% (5/5) 🟢 100% (5/5) 🟢 100% (5/5) 🟢 100% (5/5) 🟢 100% (5/5)
port-forward 🟡 29% (13/45) 🟡 44% (20/45) 🟡 53% (24/45) 🟡 49% (22/45) 🟡 42% (19/45)
prometheus 🟡 65% (13/20) 🟡 95% (19/20) 🟢 100% (20/20) 🟢 100% (20/20) 🟡 80% (16/20)
question-answer 🟢 100% (20/20) 🟢 100% (20/20) 🟡 95% (19/20) 🟢 100% (20/20) 🟢 100% (20/20)
runbooks 🟡 73% (22/30) 🟡 73% (22/30) 🟡 93% (28/30) 🟢 100% (30/30) 🟡 97% (29/30) ⚠️ 25 skipped
slackbot ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚠️ 25 skipped
traces 🔴 0% (0/25) 🟡 4% (1/25) 🟡 40% (10/25) 🟡 56% (14/25) 🟡 64% (16/25)
transparency 🟡 71% (50/70) 🟡 71% (50/70) 🟡 84% (59/70) 🟡 81% (57/70) 🟡 84% (59/70) ⚠️ 25 skipped
Overall 🟡 63% (295/469) 🟡 74% (346/468) 🟡 78% (360/464) 🟡 89% (419/470) 🟡 89% (420/470) ⚠️ 284 skipped

Raw Results

Status of all evaluations across models. Color coding:

  • 🟢 Passing 100% (stable)
  • 🟡 Passing 1-99%
  • 🔴 Passing 0% (failing)
  • 🔧 Mock data failure (missing or invalid test data)
  • ⚠️ Setup failure (environment/infrastructure issue)
  • ⏱️ Timeout or rate limit error
  • ⏭️ Test skipped (e.g., known issue or precondition not met)
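
The three pass-rate colors in the legend above can be sketched as a small mapping from pass counts to an icon. This is a minimal illustration, not the report generator's actual code: `status_icon` is a hypothetical helper, and the ⚪️ case for "no scored runs" is inferred from the "⚪️ -" cells that appear in the tables rather than from the legend.

```python
PASSING = "🟢"   # 100% of scored runs passed
PARTIAL = "🟡"   # some, but not all, scored runs passed
FAILING = "🔴"   # no scored run passed
NO_DATA = "⚪️"   # assumption: nothing was scored (all runs skipped or errored)


def status_icon(passed: int, scored: int) -> str:
    """Map a pass count out of scored runs to a legend icon."""
    if passed < 0 or scored < 0 or passed > scored:
        raise ValueError("expected 0 <= passed <= scored")
    if scored == 0:
        return NO_DATA
    if passed == scored:
        return PASSING
    if passed == 0:
        return FAILING
    return PARTIAL


if __name__ == "__main__":
    for passed, scored in [(5, 5), (4, 5), (0, 5), (0, 0)]:
        print(f"{passed}/{scored} -> {status_icon(passed, scored)}")
```

With five iterations per eval, a single failed run is enough to move an eval from 🟢 to 🟡, so the yellow band covers everything from 1/5 to 4/5.
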
Eval ID gpt-4o gpt-4.1 gpt-5 sonnet-4-20250514 sonnet-4-5-20250929
01_how_many_pods 🔗 🟢 🟢 🟢 🟢 🟢
02_what_is_wrong_with_pod 🔗 🟢 🟢 🟢 🟢 🟢
03_what_is_the_command_to_port_forward 🔗 🟢 🟢 🟢 🟢 🟢
04_related_k8s_events 🔗 🟢 🟢 🟡 🟢 🟢
05_image_version 🔗 🟢 🟢 🟡 🟢 🟢
09_crashpod 🔗 🟢 🟢 🟡 🟢 🟢
100a_historical_logs 🔗 🔴 🔴 🔴 🔴 🔴
100b_historical_logs_nonstandard_label 🔗 🔴 🔴 🔴 🔴 🔴
101_historical_logs_pod_deleted 🔗 🔴 🔴 🔴 🔴 🔴
103_logs_transparency_default_limit 🔗 🔴 🔴 🟢 🔴 🟢
104a_postgres_root_issue 🔗 🔴 🟡 🟢 🟢 🟢
107_log_filter_http_status_code 🔗 🟡 🟢 🟢 🟢 🟢
108_logs_nearby_lines 🔗 🔴 🔴 🟡 🟡 🔴
109_logs_transparency_not_found 🔗 🟡 🟢 🟢 🟢 🟢
10_image_pull_backoff 🔗 🟢 🟢 🟡 🟢 🟢
110_k8s_events_image_pull 🔗 🟢 🟢 🟢 🟢 🟢
111_disabled_datadog_traces 🔗 🔴 🟡 🟡 🟢 🟢
111_pod_names_contain_service 🔗 🟢 🟢 🟡 🟢 🟢
112_find_pvcs_by_uuid 🔗 🔴 🟡 🟢 🟢 🟢
114_checkout_latency_tracing_rebuild[0] 🔗 🔴 🟡 🟡 🟡 🟡
115_checkout_errors_tracing[0] 🔗 🔴 🔴 🟡 🟡 🟡
11_init_containers 🔗 🟢 🟢 🟡 🟢 🟢
121_new_relic_checkout_errors_tracing[0] 🔗 🔴 🔴 🟡 🟡 🟡
122_new_relic_checkout_latency_tracing_rebuild[0] 🔗 🔴 🔴 🟡 🟢 🟢
123_new_relic_checkout_errors_tracing[0] 🔗 🔴 🔴 🟡 🟢 🟢
12_job_crashing 🔗 🟡 🟢 🟡 🟢 🟢
13a_pending_node_selector_basic 🔗 🟢 🟢 🟡 🟢 🟡
13b_pending_node_selector_detailed 🔗 🔴 🟡 🟡 🟢 🟢
14_pending_resources 🔗 🟢 🟢 🟡 🟢 🟢
159_prometheus_high_cardinality_cpu[0] 🔗 🟢 🟢 🟢 🟢 🟢
159_prometheus_high_cardinality_cpu[1] 🔗 🟡 🟡 🟢 🟢 🟢
159_prometheus_high_cardinality_cpu[2] 🔗 🔴 🟢 🟢 🟢 🟡
15_failed_readiness_probe 🔗 🟢 🟢 🟡 🟢 🟢
16_failed_no_toolset_found 🔗 🔴 🔴 🔴 🟡 🔴
17_oom_kill 🔗 🟢 🟢 🟡 🟢 🟢
19_detect_missing_app_details 🔗 🟢 🟡 🟢 🟢 🟢
20_long_log_file_search 🔗 🟢 🟢 🟢 🟢 🟢
21_job_fail_curl_no_svc_account 🔗 🟡 🟢 🟡 🟢 🟢
23_app_error_in_current_logs 🔗 🟢 🟢 🟢 🟡 🟢
24_misconfigured_pvc 🔗 🟢 🟢 🔴 🟢 🟢
24a_misconfigured_pvc_basic 🔗 🟡 🟢 🔴 🟢 🟢
24b_misconfigured_pvc_detailed 🔗 🔴 🟡 🟡 🟢 🟢
25_misconfigured_ingress_class 🔗 🔴 🔴 🟡 🟢 🟢
26_page_render_times 🔗 🟢 🟢 🟢 🟢 🟢
27a_multi_container_logs 🔗 🟢 🟢 🟢 🟢 🟢
27b_multi_container_logs 🔗 🟢 🟡 🟡 🟢 🟢
28_permissions_error 🔗 🟡 🟡 🟡 🔴 🔴
33_cpu_metrics_discovery 🔗 🟢 🟢 🟢 🟢 🟢
39_failed_toolset 🔗 🟢 🟡 🟡 🟡 🟢
41_setup_argo 🔗 🟡 🟢 🟢 🟢 🟢
42_dns_issues_result_new_tools_no_runbook 🔗 🟡 🟡 🟢 🟢 🟢
42_dns_issues_steps_new_tools 🔗 🟢 🟢 🟢 🟢 🟢
43_current_datetime_from_prompt 🔗 🟢 🟢 🟢 🟢 🟢
45_fetch_deployment_logs_simple 🔗 🟢 🟢 🟢 🟢 🟢
50a_logs_since_last_specific_month 🔗 🟢 🟢 🟢 🟢 🟢
51_logs_summarize_errors 🔗 🟢 🟢 🟢 🟢 🟢
52_logs_login_issues 🔗 🟡 🟢 🟡 🟢 🟢
53_logs_find_term 🔗 🟢 🟢 🟢 🟢 🟢
54_not_truncated_when_getting_pods 🔗 🟢 🟢 🟡 🟢 🟡
57_wrong_namespace 🔗 🔴 🔴 🟢 🟡 🟢
59_label_based_counting 🔗 🟢 🟢 🟢 🟢 🟢
60_count_less_than 🔗 🟢 🟢 🟡 🟢 🟢
61_exact_match_counting 🔗 🟢 🟢 🟢 🟢 🟢
62_fetch_error_logs_with_errors 🔗 🟢 🟢 🟡 🟢 🟢
63_fetch_error_logs_no_errors 🔗 🟢 🟢 🟡 🟢 🟡
64_keda_vs_hpa_confusion 🔗 🔴 🔴 🟡 🟢 🟢
65_health_check_followup 🔗 🟢 🟢 🟡 🟢 🟢
71_connection_pool_starvation 🔗 🟡 🟢 🟡 🟢 🟢
73a_time_window_anomaly 🔗 🔴 🟡 🟢 🔴 🟡
73b_time_window_anomaly 🔗 🟡 🟡 🟡 🟢 🟢
76_service_discovery_issue 🔗 🔴 🟢 🟢 🟢 🟢
77_liveness_probe_misconfiguration 🔗 🟡 🟢 🟢 🟢 🟢
78a_missing_cpu_limits 🔗 🔴 🟢 🟢 🟢 🟢
78b_cpu_quota_exceeded 🔗 🔴 🟡 🟢 🟢 🟢
79_configmap_mount_issue 🔗 🟢 🟢 🟢 🟢 🟢
80_pvc_storage_class_mismatch 🔗 🔴 🔴 🟢 🟢 🟢
81_service_account_permission_denied 🔗 🟡 🟡 🟡 🟢 🟢
82_pod_anti_affinity_conflict 🔗 🟢 🟢 🟢 🟢 🟢
83_secret_not_found 🔗 🟢 🟢 🟢 🟢 🟢
84_network_policy_blocking_traffic 🔗 🟡 🟡 🟢 🟢 🟢
85_hpa_not_scaling 🔗 🔴 🟡 🟢 🟢 🟢
86_configmap_like_but_secret 🔗 🟡 🟢 🟢 🟢 🟢
89_runbook_missing_cloudwatch 🔗 🟡 🟡 🟢 🟢 🟢
90_runbook_basic_selection 🔗 🟢 🟢 🟢 🟢 🟡
91f_datadog_logs_historical_pod 🔗 🔴 🟡 🟡 🟢 🟢
93_calling_datadog[0] 🔗 🟢 🟢 🟢 🟢 🟢
93_calling_datadog[1] 🔗 🟢 🟢 🟢 🟢 🟢
94_runbook_transparency 🔗 🟢 🟢 🟢 🟢 🟢
96_no_matching_runbook 🔗 🔴 🔴 🟡 🟢 🟢
97_logs_clarification_needed 🔗 🟢 🟢 🟢 🟢 🟢
99_logs_transparency_custom_time 🔗 🟢 🟢 🟢 🟢 🟢
50_logs_since_specific_date 🔗 🟢 🟢 🟢 🟢 🟢
93_calling_datadog[2] 🔗 🟢 🟢 🟢 🟢 🟢
93_events_since_specific_date 🔗 🟢 🟢 🔧 🟢 🟢
44_slack_statefulset_logs 🔗 🔧 🔧 🔧 🔧 🔧
48_logs_since_thursday 🔗 🔧 🔧 🔧 🔧 🔧
22_high_latency_dbi_down 🔗 ⚠️ ⚠️ ⚠️ ⚠️ ⚠️
08_sock_shop_frontend 🔗 ⏭️ ⏭️ ⏭️ ⏭️ ⏭️
104b_postgres_missing_index_pgstat 🔗 ⏭️ ⏭️ ⏭️ ⏭️ ⏭️
104c_postgres_minimal_missing_index 🔗 ⏭️ ⏭️ ⏭️ ⏭️ ⏭️
105_redis_wrong_data_structure 🔗 ⏭️ ⏭️ ⏭️ ⏭️ ⏭️
156_kafka_opensearch_latency 🔗 ⏭️ ⏭️ ⏭️ ⏭️ ⏭️
43_slack_deployment_logs 🔗 ⏭️ ⏭️ ⏭️ ⏭️ ⏭️
55_kafka_runbook 🔗 ⏭️ ⏭️ ⏭️ ⏭️ ⏭️
98_logs_transparency_default_time 🔗 ⏭️ ⏭️ ⏭️ ⏭️ ⏭️
SUMMARY 🟡 63% (295/469) 🟡 74% (346/468) 🟡 78% (360/464) 🟡 89% (419/470) 🟡 89% (420/470)

Detailed Raw Results

Eval ID gpt-4o gpt-4.1 gpt-5 sonnet-4-20250514 sonnet-4-5-20250929
01_how_many_pods 🔗 🟢 100% (5/5) / ⏱️ 31.3s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 33.2s / 💰 $0.05 🟢 100% (5/5) / ⏱️ 43.4s / 💰 $0.04 🟢 100% (5/5) / ⏱️ 34.3s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 33.6s / 💰 $0.08
02_what_is_wrong_with_pod 🔗 🟢 100% (5/5) / ⏱️ 42.9s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 38.7s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 123.9s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 53.5s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 67.5s / 💰 $0.10
03_what_is_the_command_to_port_forward 🔗 🟢 100% (5/5) / ⏱️ 61.2s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 53.7s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 68.9s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 43.0s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 51.9s / 💰 $0.09
04_related_k8s_events 🔗 🟢 100% (5/5) / ⏱️ 39.4s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 38.7s / 💰 $0.06 🟡 80% (4/5) / ⏱️ 69.1s / 💰 $0.05 🟢 100% (5/5) / ⏱️ 58.8s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 62.4s / 💰 $0.09
05_image_version 🔗 🟢 100% (5/5) / ⏱️ 43.6s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 56.3s / 💰 $0.07 🟡 80% (4/5) / ⏱️ 73.8s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 37.0s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 38.1s / 💰 $0.09
09_crashpod 🔗 🟢 100% (5/5) / ⏱️ 43.0s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 37.3s / 💰 $0.06 🟡 80% (4/5) / ⏱️ 92.4s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 73.0s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 64.8s / 💰 $0.14
100a_historical_logs 🔗 🔴 0% (0/5) / ⏱️ 52.3s / 💰 $0.12 🔴 0% (0/5) / ⏱️ 54.8s / 💰 $0.07 🔴 0% (0/5) / ⏱️ 500.6s / 💰 $0.29 🔴 0% (0/5) / ⏱️ 116.2s / 💰 $0.27 🔴 0% (0/5) / ⏱️ 98.0s / 💰 $0.19
100b_historical_logs_nonstandard_label 🔗 🔴 0% (0/5) / ⏱️ 50.9s / 💰 $0.11 🔴 0% (0/5) / ⏱️ 57.5s / 💰 $0.07 🔴 0% (0/5) / ⏱️ 363.3s / 💰 $0.22 🔴 0% (0/5) / ⏱️ 157.0s / 💰 $0.18 🔴 0% (0/5) / ⏱️ 102.9s / 💰 $0.17
101_historical_logs_pod_deleted 🔗 🔴 0% (0/5) / ⏱️ 53.2s / 💰 $0.12 🔴 0% (0/5) / ⏱️ 53.8s / 💰 $0.08 🔴 0% (0/5) / ⏱️ 268.6s / 💰 $0.16 🔴 0% (0/5) / ⏱️ 97.5s / 💰 $0.16 🔴 0% (0/5) / ⏱️ 85.9s / 💰 $0.15
103_logs_transparency_default_limit 🔗 🔴 0% (0/5) / ⏱️ 63.1s / 💰 $0.15 🔴 0% (0/5) / ⏱️ 105.8s / 💰 $0.39 🟢 100% (5/5) / ⏱️ 137.1s / 💰 $0.09 🔴 0% (0/5) / ⏱️ 81.0s / 💰 $0.41 🟢 100% (5/5) / ⏱️ 74.6s / 💰 $0.12
104a_postgres_root_issue 🔗 🔴 0% (0/5) / ⏱️ 48.3s / 💰 $0.18 🟡 60% (3/5) / ⏱️ 85.6s / 💰 $0.35 🟢 100% (5/5) / ⏱️ 233.2s / 💰 $0.21 🟢 100% (5/5) / ⏱️ 71.9s / 💰 $0.19 🟢 100% (5/5) / ⏱️ 106.0s / 💰 $0.24
107_log_filter_http_status_code 🔗 🟡 40% (2/5) / ⏱️ 54.0s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 57.4s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 472.2s / 💰 $0.30 🟢 100% (5/5) / ⏱️ 127.5s / 💰 $0.22 🟢 100% (5/5) / ⏱️ 100.3s / 💰 $0.24
108_logs_nearby_lines 🔗 🔴 0% (0/5) / ⏱️ 64.2s / 💰 $0.17 🔴 0% (0/5) / ⏱️ 57.6s / 💰 $0.23 🟡 40% (2/5) / ⏱️ 345.4s / 💰 $0.26 🟡 20% (1/5) / ⏱️ 111.3s / 💰 $0.36 🔴 0% (0/5) / ⏱️ 89.7s / 💰 $0.22
109_logs_transparency_not_found 🔗 🟡 80% (4/5) / ⏱️ 47.2s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 44.5s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 135.7s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 44.4s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 48.1s / 💰 $0.10
10_image_pull_backoff 🔗 🟢 100% (5/5) / ⏱️ 47.3s / 💰 $0.18 🟢 100% (5/5) / ⏱️ 55.9s / 💰 $0.10 🟡 60% (3/5) / ⏱️ 99.9s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 59.4s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 60.0s / 💰 $0.13
110_k8s_events_image_pull 🔗 🟢 100% (5/5) / ⏱️ 34.7s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 42.4s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 100.1s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 72.1s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 53.4s / 💰 $0.10
111_disabled_datadog_traces 🔗 🔴 0% (0/5) / ⏱️ 40.5s / 💰 $0.03 🟡 60% (3/5) / ⏱️ 39.6s / 💰 $0.03 🟡 80% (4/5) / ⏱️ 235.0s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 87.4s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 44.8s / 💰 $0.06
111_pod_names_contain_service 🔗 🟢 100% (5/5) / ⏱️ 71.3s / 💰 $0.16 🟢 100% (5/5) / ⏱️ 68.3s / 💰 $0.10 🟡 40% (2/5) / ⏱️ 210.5s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 77.3s / 💰 $0.20 🟢 100% (5/5) / ⏱️ 66.9s / 💰 $0.16
112_find_pvcs_by_uuid 🔗 🔴 0% (0/5) / ⏱️ 45.8s / 💰 $0.12 🟡 20% (1/5) / ⏱️ 58.2s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 147.8s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 67.5s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 88.6s / 💰 $0.13
114_checkout_latency_tracing_rebuild[0] 🔗 🔴 0% (0/5) / ⏱️ 69.7s / 💰 $0.20 🟡 20% (1/5) / ⏱️ 89.3s / 💰 $0.16 🟡 40% (2/5) / ⏱️ 377.2s / 💰 $0.34 🟡 20% (1/5) / ⏱️ 148.2s / 💰 $0.31 🟡 40% (2/5) / ⏱️ 173.2s / 💰 $0.52
115_checkout_errors_tracing[0] 🔗 🔴 0% (0/5) / ⏱️ 87.3s / 💰 $0.22 🔴 0% (0/5) / ⏱️ 93.8s / 💰 $0.21 🟡 40% (2/5) / ⏱️ 265.8s / 💰 $0.20 🟡 20% (1/5) / ⏱️ 136.2s / 💰 $0.30 🟡 20% (1/5) / ⏱️ 255.3s / 💰 $0.51
11_init_containers 🔗 🟢 100% (5/5) / ⏱️ 45.3s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 54.0s / 💰 $0.07 🟡 80% (4/5) / ⏱️ 139.5s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 65.4s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 73.8s / 💰 $0.11
121_new_relic_checkout_errors_tracing[0] 🔗 🔴 0% (0/5) / ⏱️ 35.5s / 💰 $0.10 🔴 0% (0/5) / ⏱️ 40.7s / 💰 $0.05 🟡 60% (3/5) / ⏱️ 530.6s / 💰 $0.41 🟡 40% (2/5) / ⏱️ 189.5s / 💰 $0.48 🟡 60% (3/5) / ⏱️ 145.3s / 💰 $0.41
122_new_relic_checkout_latency_tracing_rebuild[0] 🔗 🔴 0% (0/5) / ⏱️ 42.6s / 💰 $0.20 🔴 0% (0/5) / ⏱️ 65.4s / 💰 $0.25 🟡 40% (2/5) / ⏱️ 583.9s / 💰 $0.36 🟢 100% (5/5) / ⏱️ 293.4s / 💰 $0.41 🟢 100% (5/5) / ⏱️ 156.6s / 💰 $0.39
123_new_relic_checkout_errors_tracing[0] 🔗 🔴 0% (0/5) / ⏱️ 63.9s / 💰 $0.11 🔴 0% (0/5) / ⏱️ 50.6s / 💰 $0.06 🟡 20% (1/5) / ⏱️ 343.2s / 💰 $0.31 🟢 100% (5/5) / ⏱️ 155.5s / 💰 $0.44 🟢 100% (5/5) / ⏱️ 124.7s / 💰 $0.37
12_job_crashing 🔗 🟡 60% (3/5) / ⏱️ 49.7s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 41.9s / 💰 $0.08 🟡 80% (4/5) / ⏱️ 184.2s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 92.1s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 65.8s / 💰 $0.14
13a_pending_node_selector_basic 🔗 🟢 100% (5/5) / ⏱️ 52.8s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 52.0s / 💰 $0.10 🟡 20% (1/5) / ⏱️ 84.3s / 💰 $0.04 🟢 100% (5/5) / ⏱️ 119.2s / 💰 $0.14 🟡 80% (4/5) / ⏱️ 54.6s / 💰 $0.11
13b_pending_node_selector_detailed 🔗 🔴 0% (0/5) / ⏱️ 42.9s / 💰 $0.13 🟡 80% (4/5) / ⏱️ 45.4s / 💰 $0.09 🟡 40% (2/5) / ⏱️ 110.8s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 66.7s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 63.6s / 💰 $0.14
14_pending_resources 🔗 🟢 100% (5/5) / ⏱️ 58.7s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 70.5s / 💰 $0.10 🟡 20% (1/5) / ⏱️ 70.4s / 💰 $0.04 🟢 100% (5/5) / ⏱️ 114.5s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 80.2s / 💰 $0.13
159_prometheus_high_cardinality_cpu[0] 🔗 🟢 100% (5/5) / ⏱️ 39.3s / 💰 $0.16 🟢 100% (5/5) / ⏱️ 51.0s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 231.2s / 💰 $0.20 🟢 100% (5/5) / ⏱️ 66.9s / 💰 $0.17 🟢 100% (5/5) / ⏱️ 59.8s / 💰 $0.16
159_prometheus_high_cardinality_cpu[1] 🔗 🟡 60% (3/5) / ⏱️ 43.7s / 💰 $0.20 🟡 80% (4/5) / ⏱️ 46.6s / 💰 $0.16 🟢 100% (5/5) / ⏱️ 154.2s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 84.9s / 💰 $0.22 🟢 100% (5/5) / ⏱️ 82.1s / 💰 $0.19
159_prometheus_high_cardinality_cpu[2] 🔗 🔴 0% (0/5) / ⏱️ 35.1s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 50.6s / 💰 $0.16 🟢 100% (5/5) / ⏱️ 130.8s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 155.1s / 💰 $0.22 🟡 20% (1/5) / ⏱️ 53.2s / 💰 $0.19
15_failed_readiness_probe 🔗 🟢 100% (5/5) / ⏱️ 42.9s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 45.5s / 💰 $0.09 🟡 80% (4/5) / ⏱️ 141.4s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 88.2s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 52.1s / 💰 $0.14
16_failed_no_toolset_found 🔗 🔴 0% (0/5) / ⏱️ 46.5s / 💰 $0.09 🔴 0% (0/5) / ⏱️ 38.1s / 💰 $0.03 🔴 0% (0/5) / ⏱️ 64.5s / 💰 $0.02 🟡 60% (3/5) / ⏱️ 38.1s / 💰 $0.06 🔴 0% (0/5) / ⏱️ 32.5s / 💰 $0.06
17_oom_kill 🔗 🟢 100% (5/5) / ⏱️ 55.6s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 59.1s / 💰 $0.08 🟡 60% (3/5) / ⏱️ 116.0s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 71.5s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 61.3s / 💰 $0.12
19_detect_missing_app_details 🔗 🟢 100% (5/5) / ⏱️ 78.8s / 💰 $0.44 🟡 80% (4/5) / ⏱️ 66.1s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 267.1s / 💰 $0.18 🟢 100% (5/5) / ⏱️ 102.3s / 💰 $0.21 🟢 100% (5/5) / ⏱️ 95.1s / 💰 $0.16
20_long_log_file_search 🔗 🟢 100% (5/5) / ⏱️ 56.0s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 57.5s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 126.4s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 123.3s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 84.4s / 💰 $0.11
21_job_fail_curl_no_svc_account 🔗 🟡 80% (4/5) / ⏱️ 51.1s / 💰 $0.25 🟢 100% (5/5) / ⏱️ 79.9s / 💰 $0.16 🟡 80% (4/5) / ⏱️ 174.0s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 74.5s / 💰 $0.21 🟢 100% (5/5) / ⏱️ 66.5s / 💰 $0.19
23_app_error_in_current_logs 🔗 🟢 100% (5/5) / ⏱️ 82.7s / 💰 $0.19 🟢 100% (5/5) / ⏱️ 91.4s / 💰 $0.30 🟢 100% (5/5) / ⏱️ 249.1s / 💰 $0.19 🟡 80% (4/5) / ⏱️ 78.8s / 💰 $0.25 🟢 100% (5/5) / ⏱️ 76.9s / 💰 $0.17
24_misconfigured_pvc 🔗 🟢 100% (5/5) / ⏱️ 60.4s / 💰 $0.17 🟢 100% (5/5) / ⏱️ 89.5s / 💰 $0.13 🔴 0% (0/5) / ⏱️ 58.4s / 💰 $0.02 🟢 100% (5/5) / ⏱️ 88.4s / 💰 $0.16 🟢 100% (5/5) / ⏱️ 112.6s / 💰 $0.17
24a_misconfigured_pvc_basic 🔗 🟡 80% (4/5) / ⏱️ 51.4s / 💰 $0.19 🟢 100% (5/5) / ⏱️ 72.4s / 💰 $0.10 🔴 0% (0/5) / ⏱️ 30.1s / 💰 $0.02 🟢 100% (5/5) / ⏱️ 75.8s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 68.9s / 💰 $0.16
24b_misconfigured_pvc_detailed 🔗 🔴 0% (0/5) / ⏱️ 55.8s / 💰 $0.18 🟡 20% (1/5) / ⏱️ 59.5s / 💰 $0.12 🟡 20% (1/5) / ⏱️ 93.6s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 89.7s / 💰 $0.17 🟢 100% (5/5) / ⏱️ 195.9s / 💰 $0.17
25_misconfigured_ingress_class 🔗 🔴 0% (0/5) / ⏱️ 48.5s / 💰 $0.13 🔴 0% (0/5) / ⏱️ 62.6s / 💰 $0.14 🟡 40% (2/5) / ⏱️ 187.9s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 121.1s / 💰 $0.26 🟢 100% (5/5) / ⏱️ 100.2s / 💰 $0.35
26_page_render_times 🔗 🟢 100% (5/5) / ⏱️ 41.5s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 42.0s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 347.0s / 💰 $0.26 🟢 100% (5/5) / ⏱️ 73.5s / 💰 $0.16 🟢 100% (5/5) / ⏱️ 48.4s / 💰 $0.16
27a_multi_container_logs 🔗 🟢 100% (5/5) / ⏱️ 44.7s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 53.5s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 197.6s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 75.2s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 47.2s / 💰 $0.12
27b_multi_container_logs 🔗 🟢 100% (5/5) / ⏱️ 56.3s / 💰 $0.14 🟡 80% (4/5) / ⏱️ 64.4s / 💰 $0.08 🟡 80% (4/5) / ⏱️ 124.0s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 55.0s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 63.9s / 💰 $0.11
28_permissions_error 🔗 🟡 60% (3/5) / ⏱️ 22.3s / 💰 $0.04 🟡 40% (2/5) / ⏱️ 26.9s / 💰 $0.05 🟡 40% (2/5) / ⏱️ 138.5s / 💰 $0.09 🔴 0% (0/5) / ⏱️ 32.5s / 💰 $0.07 🔴 0% (0/5) / ⏱️ 27.3s / 💰 $0.07
33_cpu_metrics_discovery 🔗 🟢 100% (5/5) / ⏱️ 46.7s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 58.9s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 266.9s / 💰 $0.22 🟢 100% (5/5) / ⏱️ 76.5s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 59.9s / 💰 $0.13
39_failed_toolset 🔗 🟢 100% (5/5) / ⏱️ 27.2s / 💰 $0.04 🟡 40% (2/5) / ⏱️ 40.8s / 💰 $0.07 🟡 80% (4/5) / ⏱️ 251.5s / 💰 $0.19 🟡 80% (4/5) / ⏱️ 169.5s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 56.9s / 💰 $0.11
41_setup_argo 🔗 🟡 80% (4/5) / ⏱️ 49.1s / 💰 $0.03 🟢 100% (5/5) / ⏱️ 35.2s / 💰 $0.02 🟢 100% (5/5) / ⏱️ 171.0s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 29.1s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 30.0s / 💰 $0.06
42_dns_issues_result_new_tools_no_runbook 🔗 🟡 60% (3/5) / ⏱️ 55.0s / 💰 $0.22 🟡 60% (3/5) / ⏱️ 80.6s / 💰 $0.18 🟢 100% (5/5) / ⏱️ 291.8s / 💰 $0.23 🟢 100% (5/5) / ⏱️ 163.9s / 💰 $0.36 🟢 100% (5/5) / ⏱️ 109.7s / 💰 $0.26
42_dns_issues_steps_new_tools 🔗 🟢 100% (5/5) / ⏱️ 56.5s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 62.4s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 471.2s / 💰 $0.23 🟢 100% (5/5) / ⏱️ 165.8s / 💰 $0.26 🟢 100% (5/5) / ⏱️ 157.3s / 💰 $0.31
43_current_datetime_from_prompt 🔗 🟢 100% (5/5) / ⏱️ 32.6s / 💰 $0.02 🟢 100% (5/5) / ⏱️ 42.0s / 💰 $0.04 🟢 100% (5/5) / ⏱️ 66.7s / 💰 $0.03 🟢 100% (5/5) / ⏱️ 23.5s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 23.4s / 💰 $0.06
45_fetch_deployment_logs_simple 🔗 🟢 100% (5/5) / ⏱️ 37.8s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 46.1s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 100.0s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 41.4s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 50.7s / 💰 $0.11
50a_logs_since_last_specific_month 🔗 🟢 100% (5/5) / ⏱️ 41.9s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 51.0s / 💰 $0.05 🟢 100% (5/5) / ⏱️ 314.7s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 54.4s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 47.2s / 💰 $0.09
51_logs_summarize_errors 🔗 🟢 100% (5/5) / ⏱️ 45.9s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 46.7s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 133.0s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 159.3s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 55.1s / 💰 $0.10
52_logs_login_issues 🔗 🟡 40% (2/5) / ⏱️ 84.3s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 78.5s / 💰 $0.38 🟡 60% (3/5) / ⏱️ 152.1s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 69.7s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 61.7s / 💰 $0.11
53_logs_find_term 🔗 🟢 100% (5/5) / ⏱️ 37.7s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 46.7s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 107.3s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 50.9s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 53.2s / 💰 $0.13
54_not_truncated_when_getting_pods 🔗 🟢 100% (5/5) / ⏱️ 58.6s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 69.7s / 💰 $0.11 🟡 80% (4/5) / ⏱️ 196.2s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 142.2s / 💰 $0.15 🟡 80% (4/5) / ⏱️ 65.7s / 💰 $0.11
57_wrong_namespace 🔗 🔴 0% (0/5) / ⏱️ 40.6s / 💰 $0.10 🔴 0% (0/5) / ⏱️ 47.2s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 145.2s / 💰 $0.08 🟡 60% (3/5) / ⏱️ 77.4s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 91.7s / 💰 $0.10
59_label_based_counting 🔗 🟢 100% (5/5) / ⏱️ 33.8s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 32.9s / 💰 $0.05 🟢 100% (5/5) / ⏱️ 77.8s / 💰 $0.03 🟢 100% (5/5) / ⏱️ 51.4s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 34.9s / 💰 $0.08
60_count_less_than 🔗 🟢 100% (5/5) / ⏱️ 85.4s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 56.3s / 💰 $0.06 🟡 80% (4/5) / ⏱️ 88.5s / 💰 $0.05 🟢 100% (5/5) / ⏱️ 37.1s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 36.5s / 💰 $0.09
61_exact_match_counting 🔗 🟢 100% (5/5) / ⏱️ 33.9s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 34.0s / 💰 $0.05 🟢 100% (5/5) / ⏱️ 60.9s / 💰 $0.04 🟢 100% (5/5) / ⏱️ 32.2s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 36.0s / 💰 $0.08
62_fetch_error_logs_with_errors 🔗 🟢 100% (5/5) / ⏱️ 60.6s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 50.9s / 💰 $0.07 🟡 80% (4/5) / ⏱️ 102.9s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 46.9s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 43.3s / 💰 $0.09
63_fetch_error_logs_no_errors 🔗 🟢 100% (5/5) / ⏱️ 39.9s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 45.1s / 💰 $0.07 🟡 60% (3/5) / ⏱️ 138.8s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 46.6s / 💰 $0.09 🟡 80% (4/5) / ⏱️ 39.6s / 💰 $0.07
64_keda_vs_hpa_confusion 🔗 🔴 0% (0/5) / ⏱️ 71.8s / 💰 $0.42 🔴 0% (0/5) / ⏱️ 51.2s / 💰 $0.08 🟡 80% (4/5) / ⏱️ 191.3s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 112.3s / 💰 $0.20 🟢 100% (5/5) / ⏱️ 93.1s / 💰 $0.20
65_health_check_followup 🔗 🟢 100% (5/5) / ⏱️ 50.3s / 💰 $0.18 🟢 100% (5/5) / ⏱️ 69.6s / 💰 $0.22 🟡 80% (4/5) / ⏱️ 277.0s / 💰 $0.20 🟢 100% (5/5) / ⏱️ 328.5s / 💰 $0.24 🟢 100% (5/5) / ⏱️ 94.4s / 💰 $0.27
71_connection_pool_starvation 🔗 🟡 80% (4/5) / ⏱️ 47.1s / 💰 $0.17 🟢 100% (5/5) / ⏱️ 49.2s / 💰 $0.10 🟡 20% (1/5) / ⏱️ 152.5s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 59.9s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 65.3s / 💰 $0.17
73a_time_window_anomaly 🔗 🔴 0% (0/5) / ⏱️ 48.7s / 💰 $0.15 🟡 20% (1/5) / ⏱️ 58.6s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 187.9s / 💰 $0.13 🔴 0% (0/5) / ⏱️ 84.2s / 💰 $0.13 🟡 40% (2/5) / ⏱️ 81.8s / 💰 $0.15
73b_time_window_anomaly 🔗 🟡 60% (3/5) / ⏱️ 56.0s / 💰 $0.16 🟡 40% (2/5) / ⏱️ 68.9s / 💰 $0.08 🟡 80% (4/5) / ⏱️ 165.5s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 189.3s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 67.7s / 💰 $0.14
76_service_discovery_issue 🔗 🔴 0% (0/5) / ⏱️ 45.5s / 💰 $0.20 🟢 100% (5/5) / ⏱️ 66.1s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 205.8s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 67.4s / 💰 $0.22 🟢 100% (5/5) / ⏱️ 65.1s / 💰 $0.16
77_liveness_probe_misconfiguration 🔗 🟡 40% (2/5) / ⏱️ 42.7s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 58.6s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 182.8s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 69.0s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 54.0s / 💰 $0.13
78a_missing_cpu_limits 🔗 🔴 0% (0/5) / ⏱️ 49.4s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 59.7s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 206.0s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 72.8s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 65.1s / 💰 $0.14
78b_cpu_quota_exceeded 🔗 🔴 0% (0/5) / ⏱️ 55.1s / 💰 $0.18 🟡 20% (1/5) / ⏱️ 49.1s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 152.7s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 73.5s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 61.6s / 💰 $0.14
79_configmap_mount_issue 🔗 🟢 100% (5/5) / ⏱️ 42.6s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 47.1s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 197.3s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 61.4s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 69.9s / 💰 $0.12
80_pvc_storage_class_mismatch 🔗 🔴 0% (0/5) / ⏱️ 76.1s / 💰 $0.12 🔴 0% (0/5) / ⏱️ 64.0s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 191.5s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 89.4s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 72.9s / 💰 $0.14
81_service_account_permission_denied 🔗 🟡 20% (1/5) / ⏱️ 47.2s / 💰 $0.14 🟡 80% (4/5) / ⏱️ 56.2s / 💰 $0.11 🟡 80% (4/5) / ⏱️ 198.0s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 103.5s / 💰 $0.21 🟢 100% (5/5) / ⏱️ 73.0s / 💰 $0.17
82_pod_anti_affinity_conflict 🔗 🟢 100% (5/5) / ⏱️ 55.6s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 61.8s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 173.8s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 77.3s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 108.8s / 💰 $0.14
83_secret_not_found 🔗 🟢 100% (5/5) / ⏱️ 44.7s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 44.8s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 185.9s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 81.7s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 85.1s / 💰 $0.12
84_network_policy_blocking_traffic 🔗 🟡 20% (1/5) / ⏱️ 47.4s / 💰 $0.18 🟡 80% (4/5) / ⏱️ 58.1s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 226.9s / 💰 $0.14 🟢 100% (5/5) / ⏱️ 131.7s / 💰 $0.24 🟢 100% (5/5) / ⏱️ 85.7s / 💰 $0.23
85_hpa_not_scaling 🔗 🔴 0% (0/5) / ⏱️ 42.0s / 💰 $0.11 🟡 80% (4/5) / ⏱️ 60.3s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 183.9s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 67.2s / 💰 $0.16 🟢 100% (5/5) / ⏱️ 68.2s / 💰 $0.17
86_configmap_like_but_secret 🔗 🟡 80% (4/5) / ⏱️ 50.8s / 💰 $0.18 🟢 100% (5/5) / ⏱️ 58.3s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 227.5s / 💰 $0.17 🟢 100% (5/5) / ⏱️ 76.0s / 💰 $0.13 🟢 100% (5/5) / ⏱️ 158.1s / 💰 $0.15
89_runbook_missing_cloudwatch 🔗 🟡 80% (4/5) / ⏱️ 44.0s / 💰 $0.07 🟡 80% (4/5) / ⏱️ 31.9s / 💰 $0.04 🟢 100% (5/5) / ⏱️ 258.4s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 55.3s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 47.5s / 💰 $0.11
90_runbook_basic_selection 🔗 🟢 100% (5/5) / ⏱️ 58.0s / 💰 $0.20 🟢 100% (5/5) / ⏱️ 71.2s / 💰 $0.16 🟢 100% (5/5) / ⏱️ 365.1s / 💰 $0.29 🟢 100% (5/5) / ⏱️ 216.9s / 💰 $0.49 🟡 80% (4/5) / ⏱️ 138.6s / 💰 $0.47
91f_datadog_logs_historical_pod 🔗 🔴 0% (0/5) / ⏱️ 46.0s / 💰 $0.16 🟡 20% (1/5) / ⏱️ 64.1s / 💰 $0.14 🟡 80% (4/5) / ⏱️ 302.2s / 💰 $0.19 🟢 100% (5/5) / ⏱️ 74.8s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 67.1s / 💰 $0.14
93_calling_datadog[0] 🔗 🟢 100% (5/5) / ⏱️ 61.2s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 15.6s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 54.2s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 13.7s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 12.5s / 💰 $0.15
93_calling_datadog[1] 🔗 🟢 100% (5/5) / ⏱️ 73.2s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 12.9s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 63.4s / 💰 $0.08 🟢 100% (5/5) / ⏱️ 20.4s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 11.8s / 💰 $0.15
94_runbook_transparency 🔗 🟢 100% (5/5) / ⏱️ 60.9s / 💰 $0.25 🟢 100% (5/5) / ⏱️ 85.7s / 💰 $0.20 🟢 100% (5/5) / ⏱️ 309.8s / 💰 $0.25 🟢 100% (5/5) / ⏱️ 116.3s / 💰 $0.23 🟢 100% (5/5) / ⏱️ 94.6s / 💰 $0.24
96_no_matching_runbook 🔗 🔴 0% (0/5) / ⏱️ 56.5s / 💰 $0.22 🔴 0% (0/5) / ⏱️ 128.6s / 💰 $0.55 🟡 60% (3/5) / ⏱️ 304.2s / 💰 $0.20 🟢 100% (5/5) / ⏱️ 203.2s / 💰 $0.57 🟢 100% (5/5) / ⏱️ 119.7s / 💰 $0.27
97_logs_clarification_needed 🔗 🟢 100% (5/5) / ⏱️ 18.7s / 💰 $0.03 🟢 100% (5/5) / ⏱️ 30.9s / 💰 $0.03 🟢 100% (5/5) / ⏱️ 32.2s / 💰 $0.02 🟢 100% (5/5) / ⏱️ 95.0s / 💰 $0.19 🟢 100% (5/5) / ⏱️ 21.8s / 💰 $0.06
99_logs_transparency_custom_time 🔗 🟢 100% (5/5) / ⏱️ 38.0s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 46.2s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 99.7s / 💰 $0.07 🟢 100% (5/5) / ⏱️ 89.6s / 💰 $0.11 🟢 100% (5/5) / ⏱️ 95.6s / 💰 $0.11
50_logs_since_specific_date 🔗 🟢 100% (5/5) / ⏱️ 20.3s / 💰 $0.10 🟢 100% (4/4) / ⏱️ 25.4s / 💰 $0.06 🟢 100% (5/5) / ⏱️ 105.6s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 35.3s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 28.8s / 💰 $0.10
93_calling_datadog[2] 🔗 🟢 100% (5/5) / ⏱️ 57.6s / 💰 $0.12 🟢 100% (5/5) / ⏱️ 15.2s / 💰 $0.08 🟢 100% (4/4) / ⏱️ 72.4s / 💰 $0.09 🟢 100% (5/5) / ⏱️ 13.1s / 💰 $0.15 🟢 100% (5/5) / ⏱️ 11.9s / 💰 $0.15
93_events_since_specific_date 🔗 🟢 100% (4/4) / ⏱️ 20.2s / 💰 $0.10 🟢 100% (4/4) / ⏱️ 19.0s / 💰 $0.06 ⚪️ - 🟢 100% (5/5) / ⏱️ 24.3s / 💰 $0.10 🟢 100% (5/5) / ⏱️ 21.0s / 💰 $0.10
44_slack_statefulset_logs 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
48_logs_since_thursday 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
22_high_latency_dbi_down 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
08_sock_shop_frontend 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
104b_postgres_missing_index_pgstat 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
104c_postgres_minimal_missing_index 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
105_redis_wrong_data_structure 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
156_kafka_opensearch_latency 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
43_slack_deployment_logs 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
55_kafka_runbook 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -
98_logs_transparency_default_time 🔗 ⚪️ - ⚪️ - ⚪️ - ⚪️ - ⚪️ -

Results are automatically generated and updated weekly. Full traces and detailed analysis are available in the Braintrust experiment local-benchmark-20250930-092035.