
Commit 98996ea

WIP kubevirt: Add troubleshoot VM tool to toolset
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
1 parent bb6e786 commit 98996ea

4 files changed: +361 -0 lines changed

pkg/toolsets/kubevirt/toolset.go

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,7 @@ import (
 	internalk8s "github.com/containers/kubernetes-mcp-server/pkg/kubernetes"
 	"github.com/containers/kubernetes-mcp-server/pkg/toolsets"
 	vm_create "github.com/containers/kubernetes-mcp-server/pkg/toolsets/kubevirt/vm/create"
+	vm_troubleshoot "github.com/containers/kubernetes-mcp-server/pkg/toolsets/kubevirt/vm/troubleshoot"
 )
 
 type Toolset struct{}
@@ -24,6 +25,7 @@ func (t *Toolset) GetDescription() string {
 func (t *Toolset) GetTools(o internalk8s.Openshift) []api.ServerTool {
 	return slices.Concat(
 		vm_create.Tools(),
+		vm_troubleshoot.Tools(),
 	)
 }
 
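The change to `GetTools` registers the new tool by appending one more slice to `slices.Concat`. As a rough sketch of that composition pattern, the program below uses a hypothetical `tool` type in place of `api.ServerTool`, and the tool name `vm_create` is assumed for illustration (only `vm_troubleshoot` appears in this commit). `slices.Concat` requires Go 1.22 or later.

```go
package main

import (
	"fmt"
	"slices"
)

// tool is a hypothetical stand-in for api.ServerTool; only the name matters here.
type tool struct{ Name string }

// createTools and troubleshootTools mimic the per-package Tools() functions.
// The name "vm_create" is an assumption made for this sketch.
func createTools() []tool       { return []tool{{Name: "vm_create"}} }
func troubleshootTools() []tool { return []tool{{Name: "vm_troubleshoot"}} }

// allTools mirrors the shape of GetTools: each package contributes a slice,
// and slices.Concat joins them in argument order into one new slice.
func allTools() []tool {
	return slices.Concat(createTools(), troubleshootTools())
}

// toolNames extracts the names so the result is easy to print and compare.
func toolNames(ts []tool) []string {
	names := make([]string, 0, len(ts))
	for _, t := range ts {
		names = append(names, t.Name)
	}
	return names
}

func main() {
	fmt.Println(toolNames(allTools())) // prints [vm_create vm_troubleshoot]
}
```

Registering a further tool package in this layout would only need one more argument to `slices.Concat`.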
Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
+# VirtualMachine Troubleshooting Guide
+
+## VM: {{.Name}} (namespace: {{.Namespace}})
+
+Follow these steps to diagnose issues with the VirtualMachine:
+
+---
+
+## Step 1: Check VirtualMachine Status
+
+Use the `resources_get` tool to inspect the VirtualMachine:
+- **apiVersion**: `kubevirt.io/v1`
+- **kind**: `VirtualMachine`
+- **namespace**: `{{.Namespace}}`
+- **name**: `{{.Name}}`
+
+**What to look for:**
+- `status.printableStatus` - Should be "Running" for a healthy VM
+- `status.ready` - Should be `true`
+- `status.conditions` - Look for conditions with `status: "False"` or error messages
+- `spec.runStrategy` - Check if it's "Always", "Manual", "Halted", or "RerunOnFailure"
+
+---
+
+## Step 2: Check VirtualMachineInstance Status
+
+If the VM exists but isn't running, check if a VirtualMachineInstance was created:
+
+Use the `resources_get` tool:
+- **apiVersion**: `kubevirt.io/v1`
+- **kind**: `VirtualMachineInstance`
+- **namespace**: `{{.Namespace}}`
+- **name**: `{{.Name}}`
+
+**What to look for:**
+- `status.phase` - Should be "Running" for a healthy VMI
+- `status.conditions` - Check for "Ready" condition with `status: "True"`
+- `status.guestOSInfo` - Confirms guest agent is running
+- If VMI doesn't exist and VM runStrategy is "Always", this indicates a problem
+
+---
+
+## Step 3: Check DataVolume Status (if applicable)
+
+If the VM uses DataVolumeTemplates, check their status:
+
+Use the `resources_list` tool:
+- **apiVersion**: `cdi.kubevirt.io/v1beta1`
+- **kind**: `DataVolume`
+- **namespace**: `{{.Namespace}}`
+
+Look for DataVolumes with names starting with `{{.Name}}-`
+
+**What to look for:**
+- `status.phase` - Should be "Succeeded" when ready
+- `status.progress` - Shows import/clone progress (e.g., "100.0%")
+- Common issues:
+  - Phase "Pending" - Waiting for resources
+  - Phase "ImportScheduled" or "ImportInProgress" - Still importing
+  - Phase "Failed" - Check `status.conditions` for error details
+
+---
+
+## Step 4: Check virt-launcher Pod
+
+The virt-launcher pod runs the actual VM. Find and inspect it:
+
+Use the `pods_list_in_namespace` tool:
+- **namespace**: `{{.Namespace}}`
+- **labelSelector**: `kubevirt.io=virt-launcher,vm.kubevirt.io/name={{.Name}}`
+
+**What to look for:**
+- Pod should be in "Running" phase
+- All containers should be ready (e.g., "2/2")
+- Check pod events and conditions for errors
+
+If pod exists, get detailed status with `pods_get`:
+- **namespace**: `{{.Namespace}}`
+- **name**: `virt-launcher-{{.Name}}-xxxxx` (use actual pod name from list)
+
+Get pod logs with `pods_log`:
+- **namespace**: `{{.Namespace}}`
+- **name**: `virt-launcher-{{.Name}}-xxxxx`
+- **container**: `compute` (main VM container)
+
+---
+
+## Step 5: Check Events
+
+Events provide crucial diagnostic information:
+
+Use the `events_list` tool:
+- **namespace**: `{{.Namespace}}`
+
+Filter output for events related to `{{.Name}}` - look for warnings or errors.
+
+---
+
+## Step 6: Check Instance Type and Preference (if used)
+
+If the VM uses instance types or preferences, verify they exist:
+
+For instance types, use `resources_get`:
+- **apiVersion**: `instancetype.kubevirt.io/v1beta1`
+- **kind**: `VirtualMachineClusterInstancetype`
+- **name**: (check VM spec for instancetype name)
+
+For preferences, use `resources_get`:
+- **apiVersion**: `instancetype.kubevirt.io/v1beta1`
+- **kind**: `VirtualMachineClusterPreference`
+- **name**: (check VM spec for preference name)
+
+---
+
+## Common Issues and Solutions
+
+### VM stuck in "Stopped" or "Halted"
+- Check `spec.runStrategy` - if "Halted", the VM is intentionally stopped
+- Change runStrategy to "Always" to start the VM
+
+### VMI doesn't exist
+- Check VM conditions for admission errors
+- Verify instance type and preference exist
+- Check resource quotas in the namespace
+
+### DataVolume stuck in "ImportInProgress"
+- Check CDI controller pods in `cdi` namespace
+- Verify source image is accessible
+- Check PVC storage class exists and has available capacity
+
+### virt-launcher pod in CrashLoopBackOff
+- Check pod logs for container `compute`
+- Common causes:
+  - Insufficient resources (CPU/memory)
+  - Invalid VM configuration
+  - Storage issues (PVC not available)
+
+### VM starts but guest doesn't boot
+- Check virt-launcher logs for QEMU errors
+- Verify boot disk is properly configured
+- Check if guest agent is installed (for cloud images)
+- Ensure correct architecture (amd64 vs arm64)
+
+---
+
+## Additional Resources
+
+For more detailed diagnostics:
+- Check KubeVirt components: `pods_list` in `kubevirt` namespace
+- Check CDI components: `pods_list` in `cdi` namespace (if using DataVolumes)
+- Review resource consumption: `pods_top` for the virt-launcher pod
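Steps 3 and 4 of the guide ask the caller to match resources to the VM by name prefix and by label selector. A minimal sketch of those two lookups follows; the helper names `launcherSelector` and `dataVolumesFor` are hypothetical and not part of this commit. Note that the `<name>-` prefix is the guide's own heuristic: DataVolume names are taken from the VM's `spec.dataVolumeTemplates`, so a given VM may not follow it.

```go
package main

import (
	"fmt"
	"strings"
)

// launcherSelector builds the label selector quoted in Step 4 of the guide.
func launcherSelector(vmName string) string {
	return fmt.Sprintf("kubevirt.io=virt-launcher,vm.kubevirt.io/name=%s", vmName)
}

// dataVolumesFor keeps only the names that start with "<vmName>-",
// which is the matching rule Step 3 of the guide suggests.
func dataVolumesFor(vmName string, names []string) []string {
	prefix := vmName + "-"
	var matched []string
	for _, n := range names {
		if strings.HasPrefix(n, prefix) {
			matched = append(matched, n)
		}
	}
	return matched
}

func main() {
	fmt.Println(launcherSelector("test-vm"))
	// prints kubevirt.io=virt-launcher,vm.kubevirt.io/name=test-vm

	candidates := []string{"test-vm-rootdisk", "other-vm-rootdisk", "test-vm"}
	fmt.Println(dataVolumesFor("test-vm", candidates))
	// prints [test-vm-rootdisk]
}
```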
Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+package troubleshoot
+
+import (
+	_ "embed"
+	"fmt"
+	"strings"
+	"text/template"
+
+	"github.com/containers/kubernetes-mcp-server/pkg/api"
+	"github.com/google/jsonschema-go/jsonschema"
+	"k8s.io/utils/ptr"
+)
+
+//go:embed plan.tmpl
+var planTemplate string
+
+func Tools() []api.ServerTool {
+	return []api.ServerTool{
+		{
+			Tool: api.Tool{
+				Name:        "vm_troubleshoot",
+				Description: "Generate a comprehensive troubleshooting guide for a VirtualMachine, providing step-by-step instructions to diagnose common issues",
+				InputSchema: &jsonschema.Schema{
+					Type: "object",
+					Properties: map[string]*jsonschema.Schema{
+						"namespace": {
+							Type:        "string",
+							Description: "The namespace of the virtual machine",
+						},
+						"name": {
+							Type:        "string",
+							Description: "The name of the virtual machine",
+						},
+					},
+					Required: []string{"namespace", "name"},
+				},
+				Annotations: api.ToolAnnotations{
+					Title:           "Virtual Machine: Troubleshoot",
+					ReadOnlyHint:    ptr.To(true),
+					DestructiveHint: ptr.To(false),
+					IdempotentHint:  ptr.To(true),
+					OpenWorldHint:   ptr.To(false),
+				},
+			},
+			Handler: troubleshoot,
+		},
+	}
+}
+
+type troubleshootParams struct {
+	Namespace string
+	Name      string
+}
+
+func troubleshoot(params api.ToolHandlerParams) (*api.ToolCallResult, error) {
+	// Parse required parameters
+	namespace, err := getRequiredString(params, "namespace")
+	if err != nil {
+		return api.NewToolCallResult("", err), nil
+	}
+
+	name, err := getRequiredString(params, "name")
+	if err != nil {
+		return api.NewToolCallResult("", err), nil
+	}
+
+	// Prepare template parameters
+	templateParams := troubleshootParams{
+		Namespace: namespace,
+		Name:      name,
+	}
+
+	// Render template
+	tmpl, err := template.New("troubleshoot").Parse(planTemplate)
+	if err != nil {
+		return api.NewToolCallResult("", fmt.Errorf("failed to parse template: %w", err)), nil
+	}
+
+	var result strings.Builder
+	if err := tmpl.Execute(&result, templateParams); err != nil {
+		return api.NewToolCallResult("", fmt.Errorf("failed to render template: %w", err)), nil
+	}
+
+	return api.NewToolCallResult(result.String(), nil), nil
+}
+
+func getRequiredString(params api.ToolHandlerParams, key string) (string, error) {
+	args := params.GetArguments()
+	val, ok := args[key]
+	if !ok {
+		return "", fmt.Errorf("%s parameter required", key)
+	}
+	str, ok := val.(string)
+	if !ok {
+		return "", fmt.Errorf("%s parameter must be a string", key)
+	}
+	return str, nil
+}
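The handler above re-parses the template on every call and accepts an empty string as a valid `namespace` or `name`. One possible variation, offered only as a sketch and not as what this commit does, parses the template once at package initialization with `template.Must` and rejects blank values. The `planText` constant is a shortened stand-in for the embedded `plan.tmpl`, and `requiredString`, `render`, and `planParams` are hypothetical names that work on a plain map instead of `api.ToolHandlerParams`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"text/template"
)

// planText is a shortened, hypothetical stand-in for the embedded plan.tmpl.
const planText = "## VM: {{.Name}} (namespace: {{.Namespace}})\n"

// plan is parsed once at package initialization. template.Must panics at
// startup if the template is malformed, instead of failing on each call.
var plan = template.Must(template.New("troubleshoot").Parse(planText))

type planParams struct {
	Namespace string
	Name      string
}

// requiredString applies the same checks as getRequiredString in the commit
// and additionally rejects blank values (an assumption of this sketch).
func requiredString(args map[string]any, key string) (string, error) {
	val, ok := args[key]
	if !ok {
		return "", fmt.Errorf("%s parameter required", key)
	}
	str, ok := val.(string)
	if !ok {
		return "", fmt.Errorf("%s parameter must be a string", key)
	}
	if strings.TrimSpace(str) == "" {
		return "", fmt.Errorf("%s parameter must not be empty", key)
	}
	return str, nil
}

// render validates both arguments, reports every problem at once via
// errors.Join, and then executes the pre-parsed template.
func render(args map[string]any) (string, error) {
	ns, nsErr := requiredString(args, "namespace")
	name, nameErr := requiredString(args, "name")
	if err := errors.Join(nsErr, nameErr); err != nil {
		return "", err
	}
	var out strings.Builder
	if err := plan.Execute(&out, planParams{Namespace: ns, Name: name}); err != nil {
		return "", fmt.Errorf("failed to render template: %w", err)
	}
	return out.String(), nil
}

func main() {
	guide, err := render(map[string]any{"namespace": "test-ns", "name": "test-vm"})
	fmt.Print(guide, err == nil) // prints the rendered heading line, then true

	_, err = render(map[string]any{"namespace": "test-ns", "name": ""})
	fmt.Println(err) // prints name parameter must not be empty
}
```

Whether blank values should be rejected is a design choice for the tool's authors; the sketch only shows where such a check could live.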
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
+package troubleshoot
+
+import (
+	"context"
+	"strings"
+	"testing"
+
+	"github.com/containers/kubernetes-mcp-server/pkg/api"
+	internalk8s "github.com/containers/kubernetes-mcp-server/pkg/kubernetes"
+)
+
+type mockToolCallRequest struct {
+	arguments map[string]interface{}
+}
+
+func (m *mockToolCallRequest) GetArguments() map[string]any {
+	return m.arguments
+}
+
+func TestTroubleshoot(t *testing.T) {
+	tests := []struct {
+		name      string
+		args      map[string]interface{}
+		wantErr   bool
+		checkFunc func(t *testing.T, result string)
+	}{
+		{
+			name: "generates troubleshooting guide",
+			args: map[string]interface{}{
+				"namespace": "test-ns",
+				"name":      "test-vm",
+			},
+			wantErr: false,
+			checkFunc: func(t *testing.T, result string) {
+				if !strings.Contains(result, "VirtualMachine Troubleshooting Guide") {
+					t.Errorf("Expected troubleshooting guide header")
+				}
+				if !strings.Contains(result, "test-vm") {
+					t.Errorf("Expected VM name in guide")
+				}
+				if !strings.Contains(result, "test-ns") {
+					t.Errorf("Expected namespace in guide")
+				}
+				if !strings.Contains(result, "Step 1: Check VirtualMachine Status") {
+					t.Errorf("Expected step 1 header")
+				}
+				if !strings.Contains(result, "resources_get") {
+					t.Errorf("Expected resources_get tool reference")
+				}
+				if !strings.Contains(result, "VirtualMachineInstance") {
+					t.Errorf("Expected VMI section")
+				}
+				if !strings.Contains(result, "virt-launcher") {
+					t.Errorf("Expected virt-launcher pod section")
+				}
+			},
+		},
+		{
+			name: "missing namespace",
+			args: map[string]interface{}{
+				"name": "test-vm",
+			},
+			wantErr: true,
+		},
+		{
+			name: "missing name",
+			args: map[string]interface{}{
+				"namespace": "test-ns",
+			},
+			wantErr: true,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			params := api.ToolHandlerParams{
+				Context:         context.Background(),
+				Kubernetes:      &internalk8s.Kubernetes{},
+				ToolCallRequest: &mockToolCallRequest{arguments: tt.args},
+			}
+
+			result, err := troubleshoot(params)
+			if err != nil {
+				t.Errorf("troubleshoot() unexpected Go error: %v", err)
+				return
+			}
+
+			if result == nil {
+				t.Error("Expected non-nil result")
+				return
+			}
+
+			if tt.wantErr {
+				if result.Error == nil {
+					t.Error("Expected error in result.Error, got nil")
+				}
+			} else {
+				if result.Error != nil {
+					t.Errorf("Expected no error in result, got: %v", result.Error)
+				}
+				if result.Content == "" {
+					t.Error("Expected non-empty result content")
+				}
+				if tt.checkFunc != nil {
+					tt.checkFunc(t, result.Content)
+				}
+			}
+		})
+	}
+}
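The success case in the test checks the rendered guide with a series of separate `strings.Contains` calls. A small helper that collects every missing fragment in one pass could keep such checks table-shaped; the sketch below assumes a hypothetical helper name `missingSections` and a shortened sample guide, neither of which is part of this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// missingSections returns the fragments from want that do not occur in guide,
// preserving the order of want. An empty result means every fragment was found.
func missingSections(guide string, want []string) []string {
	var missing []string
	for _, w := range want {
		if !strings.Contains(guide, w) {
			missing = append(missing, w)
		}
	}
	return missing
}

func main() {
	// A shortened sample standing in for the rendered troubleshooting guide.
	guide := "# VirtualMachine Troubleshooting Guide\n" +
		"## Step 1: Check VirtualMachine Status\n" +
		"Use the `resources_get` tool\n"

	want := []string{
		"VirtualMachine Troubleshooting Guide",
		"Step 1: Check VirtualMachine Status",
		"resources_get",
		"virt-launcher",
	}
	fmt.Println(missingSections(guide, want)) // prints [virt-launcher]
}
```

Inside a test, a non-empty result could be reported with a single `t.Errorf` call that lists every absent fragment at once.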
