Kubernetes

Which Kubernetes toolset should I use?

Holmes has three Kubernetes integrations. Most users only need the first:

Kubernetes (built-in): the default for most users. Read-only access to cluster resources via kubectl, authenticated with the pod's ServiceAccount when running in-cluster or with your local kubeconfig when using the CLI. No extra deployment is required.
Kubernetes (MCP): use when you need OAuth/OIDC authentication (e.g. AKS with Microsoft Entra ID, or per-user RBAC enforced by your identity provider). Replaces the built-in toolset.
Kubernetes Remediation (MCP): add on top of either of the above when you want Holmes to perform write actions (restart, scale, drain, patch, etc.). Complements the read-only toolsets rather than replacing them.

Toolsets

Core

Enabled by Default

This toolset is enabled by default and should typically remain enabled.

With this toolset enabled, HolmesGPT can describe and find Kubernetes resources such as nodes, deployments, and pods. The tools shell out to kubectl, authenticated with the pod's ServiceAccount when deployed in-cluster or with your local kubeconfig for CLI usage. Permissions are read-only by default, and secrets and other sensitive resources are excluded.

Configuration:

holmes:
    toolsets:
        kubernetes/core:
            enabled: true

Capabilities:

Tool Name	Description
kubernetes_jq_query	Query Kubernetes resources using jq filters with pagination
kubernetes_tabular_query	Extract specific fields from resources in tabular format with optional filtering
kubernetes_count	Count Kubernetes resources matching a jq filter

Logs

Enabled by Default

This toolset is enabled by default. You do not need to configure it.

With this toolset enabled, HolmesGPT can read Kubernetes pod logs.

Available Log Sources

Multiple logging toolsets can be enabled simultaneously. HolmesGPT will use the most appropriate source for each investigation.

Kubernetes logs - Direct pod log access (enabled by default)
Loki - Centralized logs via Loki
Elasticsearch / OpenSearch - Logs from Elasticsearch/OpenSearch
Coralogix - Logs via Coralogix platform
DataDog - Logs from DataDog

Configuration:

holmes:
    toolsets:
        kubernetes/logs:
            enabled: true

Capabilities:

Tool Name	Description
kubectl_logs	Fetch logs from a specific pod
kubectl_logs_all_containers	Fetch logs from all containers in a pod
kubectl_previous_logs	Fetch previous logs from a crashed pod
kubectl_previous_logs_all_containers	Fetch previous logs from all containers in a crashed pod
kubectl_container_previous_logs	Fetch previous logs from a specific container in a crashed pod
kubectl_container_logs	Fetch logs from a specific container in a pod
kubectl_logs_grep	Search for specific patterns in pod logs
kubectl_logs_all_containers_grep	Search for patterns in logs from all containers
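
When a log tool returns something unexpected, it can help to run a comparable kubectl command by hand. The sketch below wraps standard `kubectl logs` flags in two helper functions. The function names are hypothetical, and the exact commands HolmesGPT runs internally are an assumption rather than something stated on this page.

```shell
# Hypothetical helpers; they only wrap standard `kubectl logs` flags.

# Current logs from every container in a pod.
pod_logs_all_containers() {
  # $1 = pod name, $2 = namespace
  kubectl logs "$1" --namespace "$2" --all-containers
}

# Logs from the previous (crashed or restarted) instance of one container.
container_previous_logs() {
  # $1 = pod name, $2 = namespace, $3 = container name
  kubectl logs "$1" --namespace "$2" --container "$3" --previous
}

# Example (placeholder names):
#   container_previous_logs my-pod my-namespace my-container | grep -i error
```

Piping the output through grep, as in the commented example, roughly mirrors what the grep variants of the tools are described as doing.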

Live Metrics

Not Enabled by Default

This toolset is only available when kubectl top is supported (requires Metrics Server).

This toolset retrieves real-time CPU and memory usage for pods and nodes.
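
Before enabling the toolset, you may want to confirm that `kubectl top` actually works against your cluster. A minimal sketch, assuming kubectl is already pointed at the right cluster; the function name is hypothetical:

```shell
# Hypothetical helper: succeeds only if `kubectl top nodes` works, which
# requires the Metrics API (normally served by Metrics Server).
metrics_available() {
  kubectl top nodes >/dev/null 2>&1
}

# Example:
#   if metrics_available; then
#     echo "kubectl top is supported"
#   else
#     echo "Metrics Server is missing or not ready"
#   fi
```

If the check fails, the toolset is unlikely to be usable even with `enabled: true`.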

Configuration:

holmes:
    toolsets:
        kubernetes/live-metrics:
            enabled: true

Capabilities:

Tool Name	Description
kubectl_top_pods	Get current CPU and memory usage for pods
kubectl_top_nodes	Get current CPU and memory usage for nodes

Kube Prometheus Stack

Not Enabled by Default

This toolset must be explicitly enabled.

This toolset uses kubectl to proxy into a Prometheus service running in-cluster and fetch target definitions. This is different from the Prometheus toolset, which connects directly to a Prometheus server for metrics querying.

Configuration:

holmes:
    toolsets:
        kubernetes/kube-prometheus-stack:
            enabled: true

Capabilities:

Tool Name	Description
get_prometheus_target	Fetch the definition of a Prometheus target via kubectl proxy

Resource Lineage

Not Enabled by Default

This toolset must be explicitly enabled. It requires kube-lineage, installed either via kubectl krew or built from source.

Provides tools to fetch children/dependents and parents/dependencies of Kubernetes resources. Two variations are available depending on how kube-lineage is installed.

Configuration:

holmes:
    toolsets:
        # If kube-lineage was built from source:
        kubernetes/kube-lineage-extras:
            enabled: true
        # OR, if installed via krew:
        kubernetes/krew-extras:
            enabled: true

Capabilities:

Tool Name	Description
kubectl_lineage_children	Get child/dependent resources of a Kubernetes resource
kubectl_lineage_parents	Get parent/dependency resources of a Kubernetes resource
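
To check that kube-lineage itself works before enabling the toolset, you can call the plugin directly. The sketch below assumes the krew-installed plugin (invoked as `kubectl lineage`). The function names are hypothetical, and the `--dependencies` flag is taken from the kube-lineage project as an assumption, so confirm it with `kubectl lineage --help` for your version.

```shell
# Hypothetical helpers for a krew-installed kube-lineage (`kubectl lineage`).
# $1 = resource as kind/name (e.g. deployment/my-app), $2 = namespace

# Children/dependents of a resource (the plugin's default direction).
lineage_children() {
  kubectl lineage "$1" --namespace "$2"
}

# Parents/dependencies of a resource.
# Assumption: --dependencies reverses the direction; verify with --help.
lineage_parents() {
  kubectl lineage "$1" --namespace "$2" --dependencies
}

# Example (placeholder names):
#   lineage_children deployment/my-app my-namespace
```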

Adding Permissions for Additional Resources

In-Cluster Only

This section applies only to HolmesGPT running inside a Kubernetes cluster via Helm. For local CLI deployments, permissions are managed through your kubeconfig file.

HolmesGPT may require access to additional Kubernetes resources or CRDs for specific analyses. Permissions can be extended by modifying the ClusterRole rules.

Default CRD Permissions

HolmesGPT includes read-only permissions for common Kubernetes operators and tools by default. These can be individually enabled or disabled:

Holmes Helm Chart:

crdPermissions:
  argo: true
  flux: true
  kafka: true
  keda: true
  crossplane: true
  istio: true
  gatewayApi: true
  velero: true
  externalSecrets: true

Robusta Helm Chart:

enableHolmesGPT: true
holmes:
  crdPermissions:
    argo: true
    flux: true
    kafka: true
    keda: true
    crossplane: true
    istio: true
    gatewayApi: true
    velero: true
    externalSecrets: true
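
A single permission group can also be switched off from the command line instead of editing the values file. This is a sketch for the Holmes Helm chart only. It assumes the release name and chart reference `holmes holmes/holmes` and a `values.yaml` in the current directory; the function name is hypothetical.

```shell
# Hypothetical helper: turn off one CRD permission group on the Holmes Helm chart.
# Assumes release "holmes", chart "holmes/holmes", and ./values.yaml.
disable_crd_permission() {
  # $1 = key under crdPermissions, e.g. kafka
  helm upgrade holmes holmes/holmes \
    --values=values.yaml \
    --set "crdPermissions.$1=false"
}

# Example:
#   disable_crd_permission kafka
```

With the Robusta Helm chart the same key sits under the `holmes:` prefix, so the `--set` path would presumably be `holmes.crdPermissions.<key>`.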

Adding Custom Permissions

For resources not covered by the default CRD permissions, you can add custom ClusterRole rules.

Common scenarios:

External Integrations and CRDs - Access to custom resources from other operators
Additional Kubernetes resources - Resources not included in the default permissions

Example: Adding Cert-Manager Permissions

To enable HolmesGPT to analyze cert-manager certificates and issuers (not included in default permissions), add custom ClusterRole rules:

Holmes Helm Chart:

Update your values.yaml:

customClusterRoleRules:
  - apiGroups: ["cert-manager.io"]
    resources: ["certificates", "certificaterequests", "issuers", "clusterissuers"]
    verbs: ["get", "list", "watch"]

Apply the configuration:

helm upgrade holmes holmes/holmes --values=values.yaml

Robusta Helm Chart:

Update your generated_values.yaml (note: add the holmes: prefix):

enableHolmesGPT: true
holmes:
  customClusterRoleRules:
    - apiGroups: ["cert-manager.io"]
      resources: ["certificates", "certificaterequests", "issuers", "clusterissuers"]
      verbs: ["get", "list", "watch"]

Apply the configuration:

helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>
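
After the upgrade, you can check whether the new rules took effect by asking the API server what the HolmesGPT ServiceAccount is allowed to do. A minimal sketch using `kubectl auth can-i` with impersonation: the function name is hypothetical, and the ServiceAccount name and namespace depend on your installation, so they are left as placeholders.

```shell
# Hypothetical helper: ask the API server whether a ServiceAccount may list
# a resource across all namespaces. kubectl prints "yes" or "no".
service_account_can_list() {
  # $1 = resource (e.g. certificates.cert-manager.io)
  # $2 = namespace of the ServiceAccount, $3 = ServiceAccount name
  kubectl auth can-i list "$1" \
    --all-namespaces \
    --as="system:serviceaccount:$2:$3"
}

# Example (placeholders, fill in your own values):
#   service_account_can_list certificates.cert-manager.io <NAMESPACE> <SERVICE_ACCOUNT>
```

Note that `--as` relies on impersonation, so the user running the command needs permission to impersonate ServiceAccounts.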