AKS Node Health
Consider Azure MCP instead
Most users should start with the Azure MCP integration, which provides broad access to all Azure APIs including AKS node diagnostics. This standalone toolset is only needed if you require specific AKS node health CLI commands that aren't available through the MCP server.
When this toolset is enabled, HolmesGPT can run specialized health checks and troubleshooting for Azure Kubernetes Service (AKS) nodes, including node-specific diagnostics and performance analysis.
Prerequisites
- Azure CLI installed and configured
- Appropriate Azure RBAC permissions for AKS clusters
- Access to the target AKS cluster
- Node-level access permissions
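The prerequisites above can be spot-checked from a terminal before enabling the toolset. The following is a minimal sketch, not part of HolmesGPT: `check_aks_prereqs` is a hypothetical helper name, and it assumes `kubectl` is installed and already pointed at the target cluster.

```shell
# Hypothetical helper: rough check of the prerequisites listed above.
check_aks_prereqs() {
  # Azure CLI must be available on PATH.
  command -v az >/dev/null 2>&1 || { echo "missing: az"; return 1; }
  # 'az account show' fails when there is no active Azure login.
  az account show >/dev/null 2>&1 || { echo "not logged in: run 'az login'"; return 1; }
  # Assumption: kubectl's current context is the target AKS cluster.
  kubectl auth can-i list nodes >/dev/null 2>&1 || { echo "cannot list nodes"; return 1; }
  echo "prerequisites look OK"
}
```

Run `check_aks_prereqs` in the same shell. A passing check only shows that the CLI, login, and basic node read access are in place; it does not verify every RBAC permission the toolset may need.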
Configuration
First, ensure you're authenticated with Azure:
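One way to do this with the standard Azure CLI is sketched below. The wrapper name `aks_authenticate` is hypothetical and the three arguments are placeholders for your own values; whether you need the `az account set` step depends on how many subscriptions your account can see.

```shell
# Hypothetical wrapper around standard Azure CLI commands.
# Usage: aks_authenticate <subscription-id> <resource-group> <cluster-name>
aks_authenticate() {
  subscription_id="$1"
  resource_group="$2"
  cluster_name="$3"
  # Interactive login (opens a browser or prints a device code).
  az login
  # Select the subscription that contains the AKS cluster.
  az account set --subscription "$subscription_id"
  # Merge the cluster's credentials into the local kubeconfig.
  az aks get-credentials --resource-group "$resource_group" --name "$cluster_name"
}
```

For example: `aks_authenticate "<your Azure subscription ID>" "<your AKS resource group>" "<your AKS cluster name>"`.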
Then add the following to ~/.holmes/config.yaml. Create the file if it doesn't exist:
toolsets:
  aks/node-health:
    enabled: true
    config:
      subscription_id: "<your Azure subscription ID>"
      resource_group: "<your AKS resource group>"
      cluster_name: "<your AKS cluster name>"
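If you do not have the three values at hand, they can usually be read back from the Azure CLI. This is a sketch under the assumption that you are already logged in; `show_aks_config_values` is a hypothetical helper name, while the `az` commands and `--query` expressions are standard Azure CLI usage.

```shell
# Hypothetical helper: print the values needed for the config above.
show_aks_config_values() {
  # ID of the currently active subscription -> subscription_id
  az account show --query id --output tsv
  # Name and resource group of each AKS cluster -> cluster_name, resource_group
  az aks list --query "[].{name:name, resourceGroup:resourceGroup}" --output table
}
```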
After making changes to your configuration, run `holmes toolset refresh`.
Advanced Configuration
You can configure additional health check parameters:
toolsets:
  aks/node-health:
    enabled: true
    config:
      subscription_id: "<your Azure subscription ID>"
      resource_group: "<your AKS resource group>"
      cluster_name: "<your AKS cluster name>"
      health_check_interval: 300  # Health check interval in seconds
      max_unhealthy_nodes: 3      # Maximum number of unhealthy nodes to report
Capabilities
| Tool Name | Description |
|---|---|
| aks_check_node_health | Perform comprehensive health checks on AKS nodes |
| aks_get_node_metrics | Get detailed metrics for AKS nodes |
| aks_diagnose_node_issues | Diagnose common node-level issues |
| aks_check_node_readiness | Check if nodes are ready and schedulable |
| aks_get_node_events | Get events related to specific nodes |
| aks_check_node_resources | Check resource utilization on nodes |
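These tools are not called directly; HolmesGPT chooses among them while answering a question. As an illustration only, a question about a single node might be asked as follows. `ask_node_health` is a hypothetical wrapper, the node name is a placeholder, and which tools HolmesGPT actually invokes for a given prompt is an assumption, not something this page guarantees.

```shell
# Hypothetical wrapper: ask HolmesGPT about one AKS node.
# Usage: ask_node_health <node-name>
ask_node_health() {
  node_name="$1"
  # 'holmes ask' sends a free-form question to HolmesGPT.
  holmes ask "Is AKS node ${node_name} healthy? Check readiness, recent events, and resource utilization."
}
```

For example: `ask_node_health "<your node name>"`.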