Kubernetes¶

Which Kubernetes toolset should I use?

HolmesGPT has three Kubernetes integrations. Most users need only the first:

Kubernetes (built-in): the default for most users. Read-only access to cluster resources via kubectl, authenticated with the pod's ServiceAccount in-cluster or with your local kubeconfig for the CLI. No extra deployment is required.
Kubernetes (MCP): use when you need OAuth/OIDC authentication (e.g. AKS with Microsoft Entra ID, or per-user RBAC enforced by your identity provider). Replaces the built-in toolset.
Kubernetes Remediation (MCP): add on top of either of the above when you want HolmesGPT to perform write actions (restart, scale, drain, patch, etc.). Complements the read-only toolsets rather than replacing them.

Toolsets¶

Core¶

Enabled by Default

This toolset is enabled by default and should typically remain enabled.

With this toolset enabled, HolmesGPT can find and describe Kubernetes resources such as nodes, deployments, and pods. The tools shell out to kubectl, authenticated with the pod's ServiceAccount when deployed in-cluster or with your local kubeconfig for CLI usage. Permissions are read-only by default, and secrets and other sensitive resources are excluded.

Configuration:

holmes:
    toolsets:
        kubernetes/core:
            enabled: true

Capabilities:

Tool Name	Description
kubernetes_jq_query	Query Kubernetes resources using jq filters with pagination
kubernetes_tabular_query	Extract specific fields from resources in tabular format with optional filtering
kubernetes_count	Count Kubernetes resources matching a jq filter

Logs¶

Enabled by Default

This toolset is enabled by default. You do not need to configure it.

With this toolset enabled, HolmesGPT can read Kubernetes pod logs.

Available Log Sources

Multiple logging toolsets can be enabled simultaneously. HolmesGPT will use the most appropriate source for each investigation.

Kubernetes logs - Direct pod log access (enabled by default)
Loki - Centralized logs via Loki
Elasticsearch / OpenSearch - Logs from Elasticsearch/OpenSearch
Coralogix - Logs via Coralogix platform
DataDog - Logs from DataDog

Configuration:

holmes:
    toolsets:
        kubernetes/logs:
            enabled: true

Capabilities:

Tool Name	Description
kubectl_logs	Fetch logs from a specific pod
kubectl_logs_all_containers	Fetch logs from all containers in a pod
kubectl_previous_logs	Fetch previous logs from a crashed pod
kubectl_previous_logs_all_containers	Fetch previous logs from all containers in a crashed pod
kubectl_container_previous_logs	Fetch previous logs from a specific container in a crashed pod
kubectl_container_logs	Fetch logs from a specific container in a pod
kubectl_logs_grep	Search for specific patterns in pod logs
kubectl_logs_all_containers_grep	Search for patterns in logs from all containers

Live Metrics¶

Not Enabled by Default

This toolset is only available when kubectl top is supported (requires Metrics Server).

This toolset retrieves real-time CPU and memory usage for pods and nodes.

Configuration:

holmes:
    toolsets:
        kubernetes/live-metrics:
            enabled: true

Capabilities:

Tool Name	Description
kubectl_top_pods	Get current CPU and memory usage for pods
kubectl_top_nodes	Get current CPU and memory usage for nodes

Kube Prometheus Stack¶

Not Enabled by Default

This toolset must be explicitly enabled.

This toolset uses kubectl to proxy into a Prometheus service running in-cluster and fetch target definitions. This is different from the Prometheus toolset, which connects directly to a Prometheus server for metrics querying.

Configuration:

holmes:
    toolsets:
        kubernetes/kube-prometheus-stack:
            enabled: true

Capabilities:

Tool Name	Description
get_prometheus_target	Fetch the definition of a Prometheus target via kubectl proxy

Resource Lineage¶

Not Enabled by Default

This toolset must be explicitly enabled. Requires kube-lineage installed either via kubectl krew or built from source.

Provides tools to fetch children/dependents and parents/dependencies of Kubernetes resources. Two variations are available depending on how kube-lineage is installed.

Configuration:

holmes:
    toolsets:
        # If kube-lineage was built from source:
        kubernetes/kube-lineage-extras:
            enabled: true
        # OR, if kube-lineage was installed via krew:
        kubernetes/krew-extras:
            enabled: true

Capabilities:

Tool Name	Description
kubectl_lineage_children	Get child/dependent resources of a Kubernetes resource
kubectl_lineage_parents	Get parent/dependency resources of a Kubernetes resource
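
The per-toolset snippets above can be merged into a single values block. The following is a minimal sketch that uses only the toolset keys documented on this page; it assumes the entries can sit side by side under one toolsets mapping, and it picks the krew variant of the lineage toolset purely for illustration.

```yaml
holmes:
    toolsets:
        # Enabled by default; listed here only to make the full set explicit
        kubernetes/core:
            enabled: true
        kubernetes/logs:
            enabled: true
        # Needs Metrics Server so that `kubectl top` works
        kubernetes/live-metrics:
            enabled: true
        # Proxies into an in-cluster Prometheus service via kubectl
        kubernetes/kube-prometheus-stack:
            enabled: true
        # Lineage tools; use kubernetes/kube-lineage-extras instead
        # if kube-lineage was built from source
        kubernetes/krew-extras:
            enabled: true
```

Leaving out the optional entries keeps the default behavior, where only the core and logs toolsets are active.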

Permissions¶

Read-Only by Default

The permissions described on this page are read-only (get, list, watch). The built-in Kubernetes toolset does not create, modify, update, or delete any Kubernetes resources; it only reads cluster information for troubleshooting and analysis.

If you also want HolmesGPT to take remediation actions (restart pods, scale deployments, etc.), you can opt in by enabling the Kubernetes Remediation (MCP) toolset, which grants scoped write access alongside the read-only toolset.

How HolmesGPT Inherits Permissions¶

HolmesGPT inherits permissions for accessing Kubernetes from its environment:

When running locally: HolmesGPT uses your current kubectl context and the permissions configured in your kubeconfig file.
When running in-cluster: HolmesGPT uses the ServiceAccount defined in the Helm chart. The Helm chart automatically creates a ServiceAccount, ClusterRole, and ClusterRoleBinding when createServiceAccount: true (default). See the Service Account Configuration section for details.

The complete ServiceAccount, ClusterRole, and ClusterRoleBinding definitions can be found in the Helm chart template:

View Service Account Template
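
For orientation, the three objects mentioned above generally have the following shape in Kubernetes RBAC. This is an illustrative sketch, not the chart's actual template: the object names and namespace are hypothetical, and the rule list is abbreviated to a few common resources.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: holmes-example-sa          # hypothetical name
  namespace: holmes-example        # hypothetical namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: holmes-example-reader      # hypothetical name
rules:
  # Core API group: read-only verbs only, no secrets
  - apiGroups: [""]
    resources: ["pods", "pods/log", "events", "nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: holmes-example-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: holmes-example-reader
subjects:
  # Binds the ClusterRole to the ServiceAccount the pod runs as
  - kind: ServiceAccount
    name: holmes-example-sa
    namespace: holmes-example
```

The pods/log subresource is listed separately because reading logs is authorized independently of reading the pod object itself.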

Adaptive Behavior¶

HolmesGPT automatically adjusts its behavior to the permissions it has:

You can narrow or extend the permissions, and HolmesGPT will work with whatever access is available.
If HolmesGPT runs a kubectl command it lacks permission for, it detects the denial, continues with the resources it can access, and tells you about the limitation.

Recommended Permissions¶

For most users, we recommend granting read access to all non-sensitive resources in the cluster. This allows HolmesGPT to:

Investigate issues across all namespaces
Access logs and events
Analyze resource configurations
Provide comprehensive troubleshooting insights

The default permissions created by the Helm chart follow this recommendation and include read-only access (get, list, watch) to core Kubernetes resources, custom resources, and monitoring resources across all namespaces.

Adding Permissions for Additional Resources¶

In-Cluster Only

This section applies only to HolmesGPT running inside a Kubernetes cluster via Helm. For local CLI deployments, permissions are managed through your kubeconfig file.

HolmesGPT may require access to additional Kubernetes resources or CRDs for specific analyses. Permissions can be extended by modifying the ClusterRole rules.

Default CRD Permissions¶

HolmesGPT includes read-only permissions for common Kubernetes operators and tools by default. These can be individually enabled or disabled:

Holmes Helm Chart:

crdPermissions:
  argo: true
  flux: true
  kafka: true
  keda: true
  crossplane: true
  istio: true
  gatewayApi: true
  velero: true
  externalSecrets: true

Robusta Helm Chart:

enableHolmesGPT: true
holmes:
  crdPermissions:
    argo: true
    flux: true
    kafka: true
    keda: true
    crossplane: true
    istio: true
    gatewayApi: true
    velero: true
    externalSecrets: true
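
Because each entry is an independent toggle, a cluster that runs only some of these operators can switch the rest off. The sketch below uses the Holmes Helm Chart layout (for the Robusta Helm Chart, nest the same block under holmes:). The choice of which operators to disable is purely illustrative, and it assumes that setting an entry to false drops the corresponding read rules from the ClusterRole.

```yaml
crdPermissions:
  argo: true
  flux: false             # illustrative: Flux is not installed in this cluster
  kafka: false            # illustrative: no Kafka operator
  keda: true
  crossplane: false       # illustrative: no Crossplane
  istio: true
  gatewayApi: true
  velero: false           # illustrative: no Velero
  externalSecrets: true
```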

Adding Custom Permissions¶

For resources not covered by the default CRD permissions, you can add custom ClusterRole rules.

Common scenarios:

External Integrations and CRDs - Access to custom resources from other operators
Additional Kubernetes resources - Resources not included in the default permissions

Example: Adding Cert-Manager Permissions

To enable HolmesGPT to analyze cert-manager certificates and issuers (not included in default permissions), add custom ClusterRole rules:

Holmes Helm Chart:

Update your values.yaml:

customClusterRoleRules:
  - apiGroups: ["cert-manager.io"]
    resources: ["certificates", "certificaterequests", "issuers", "clusterissuers"]
    verbs: ["get", "list", "watch"]

Apply the configuration:

helm upgrade holmes holmes/holmes --values=values.yaml

Robusta Helm Chart:

Update your generated_values.yaml (note: add the holmes: prefix):

enableHolmesGPT: true
holmes:
  customClusterRoleRules:
    - apiGroups: ["cert-manager.io"]
      resources: ["certificates", "certificaterequests", "issuers", "clusterissuers"]
      verbs: ["get", "list", "watch"]

Apply the configuration:

helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>
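
The same customClusterRoleRules list can carry several rules, which covers the second scenario above (additional built-in Kubernetes resources) alongside CRDs. The sketch below uses the Holmes Helm Chart layout. The networking.k8s.io rule is only an illustration; whether the default permissions already cover that resource is not confirmed here.

```yaml
customClusterRoleRules:
  # CRDs from another operator (the cert-manager example above)
  - apiGroups: ["cert-manager.io"]
    resources: ["certificates", "certificaterequests", "issuers", "clusterissuers"]
    verbs: ["get", "list", "watch"]
  # A built-in API group, shown for illustration only
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["get", "list", "watch"]
```

Keeping the verbs to get, list, and watch preserves the read-only posture described in the Permissions section.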

Using an Existing ServiceAccount¶

If you prefer to use an existing ServiceAccount with custom permissions instead of having the Helm chart create one:

Holmes Helm Chart:

createServiceAccount: false
customServiceAccountName: "your-existing-service-account"

Robusta Helm Chart:

enableHolmesGPT: true
holmes:
  createServiceAccount: false
  customServiceAccountName: "your-existing-service-account"
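
As one possible use of an existing ServiceAccount, the sketch below grants read access in a single namespace instead of cluster-wide, relying on the adaptive behavior described earlier. The namespaces and the Role and RoleBinding names are hypothetical, and it assumes the ServiceAccount lives in the namespace where HolmesGPT is installed.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: your-existing-service-account
  namespace: holmes-example          # hypothetical: namespace of the HolmesGPT release
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: holmes-example-ns-reader     # hypothetical name
  namespace: team-a                  # hypothetical: the only namespace HolmesGPT may read
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log", "events"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: holmes-example-ns-reader
  namespace: team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: holmes-example-ns-reader
subjects:
  # A RoleBinding may reference a ServiceAccount from another namespace
  - kind: ServiceAccount
    name: your-existing-service-account
    namespace: holmes-example
```

With a setup like this, requests outside team-a would be denied by the API server, and HolmesGPT would be expected to report that limitation rather than fail.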