Skip to content

Anthropic

Configure HolmesGPT to use Anthropic's Claude models.

Setup

Get an Anthropic API key.

Configuration

export ANTHROPIC_API_KEY="your-anthropic-api-key"
holmes ask "what pods are failing?" --model="anthropic/<your-claude-model>"

Create Kubernetes Secret:

kubectl create secret generic holmes-secrets \
  --from-literal=anthropic-api-key="sk-ant-..." \
  -n <namespace>

Configure Helm Values:

# values.yaml
additionalEnvVars:
  - name: ANTHROPIC_API_KEY
    valueFrom:
      secretKeyRef:
        name: holmes-secrets
        key: anthropic-api-key

# Configure at least one model using modelList
modelList:
  claude-sonnet-4:
    api_key: "{{ env.ANTHROPIC_API_KEY }}"
    model: claude-sonnet-4-20250514
    temperature: 1
    thinking:
      budget_tokens: 10000
      type: enabled

  claude-opus-4:
    api_key: "{{ env.ANTHROPIC_API_KEY }}"
    model: anthropic/claude-opus-4-1-20250805
    temperature: 1

# Optional: Set default model (use modelList key name, not the model path)
config:
  model: "claude-sonnet-4"  # This refers to the key name in modelList above

Create Kubernetes Secret:

kubectl create secret generic robusta-holmes-secret \
  --from-literal=anthropic-api-key="sk-ant-..." \
  -n <namespace>

Configure Helm Values:

# values.yaml
holmes:
  additionalEnvVars:
    - name: ANTHROPIC_API_KEY
      valueFrom:
        secretKeyRef:
          name: robusta-holmes-secret
          key: anthropic-api-key

  # Configure at least one model using modelList
  modelList:
    claude-sonnet-4:
      api_key: "{{ env.ANTHROPIC_API_KEY }}"
      model: claude-sonnet-4-20250514
      temperature: 1
      thinking:
        budget_tokens: 10000
        type: enabled

    claude-opus-4:
      api_key: "{{ env.ANTHROPIC_API_KEY }}"
      model: anthropic/claude-opus-4-1-20250805
      temperature: 1

  # Optional: Set default model (use modelList key name, not the model path)
  config:
    model: "claude-sonnet-4"  # This refers to the key name in modelList above

Using CLI Parameters

You can also pass the API key directly as a command-line parameter:

holmes ask "what pods are failing?" --model="anthropic/<your-claude-model>" --api-key="your-api-key"

Prompt Caching

HolmesGPT adds Anthropic's prompt caching feature, which can significantly reduce costs and latency for repeated API calls with similar prompts.

HolmesGPT automatically adds cache control to the last message in each API call. This caches everything from the beginning of the conversation up to that point, making subsequent calls with the same prefix much faster and cheaper.

How It Works

  • Anthropic uses prefix-based caching - it caches the exact sequence of messages up to the cache control point
  • The cache has a 5-minute lifetime by default
  • Cached content must be at least 1024 tokens to be effective
  • You're charged for cache writes on the first call, but subsequent cache hits are much cheaper

Benefits in HolmesGPT

Prompt caching is particularly effective for HolmesGPT because:

  • System prompts with tool definitions are large and static - perfect for caching
  • Tool investigation loops reuse the same context multiple times
  • Multi-step investigations benefit from cached conversation history

Additional Resources

HolmesGPT uses the LiteLLM API to support Anthropic provider. Refer to LiteLLM Anthropic docs for more details.