Description
Include Kubernetes Events associated with failed TaskRuns in the LLM context to provide richer diagnostic information for AI-powered analysis.
Context
This was suggested in a review comment on PR #2292.
Currently, the LLM context for failed tasks (pkg/llm/context/assembler.go:232-248) includes:
- Task name, reason, message
- Log snippet
- Display name and completion time
It does not include Kubernetes Events, which can provide valuable diagnostic information such as:
- Pod scheduling issues
- Container failures
- Resource constraints (memory/CPU limits)
- Image pull errors
- Volume mount issues
- Other system-level failures
Proposed Changes
- Extend TaskInfos struct (pkg/apis/pipelinesascode/v1alpha1/types.go:72):

      type TaskInfos struct {
          Name           string       `json:"name"`
          Message        string       `json:"message,omitempty"`
          LogSnippet     string       `json:"log_snippet,omitempty"`
          Reason         string       `json:"reason,omitempty"`
          DisplayName    string       `json:"display_name,omitempty"`
          CompletionTime *metav1.Time `json:"completion_time,omitempty"`
          Events         []Event      `json:"events,omitempty"` // NEW
      }

- Update CollectFailedTasksLogSnippet (pkg/kubeinteraction/status/task_status.go:76):
  - For each failed TaskRun, call kinteract.GetEvents(ctx, namespace, "TaskRun", taskRunName)
  - Add events to the TaskInfos struct
- Update buildErrorContent (pkg/llm/context/assembler.go:232):
  - Include events in the failedTask map when building context
Benefits
- Better root cause analysis for failures related to infrastructure/platform issues
- LLM can distinguish application failures from platform failures
- More comprehensive context for suggesting fixes
Implementation Notes
The infrastructure already exists:
- GetEvents method available in pkg/kubeinteraction/events.go:11
- Already used for Repository and PipelineRun objects in pkg/cmd/tknpac/describe/describe.go:190,207
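For reference, Kubernetes lets clients select events for one object with a field selector on `involvedObject`. The sketch below builds such a selector for a TaskRun. The helper name `eventFieldSelector` is hypothetical, and whether GetEvents builds its query exactly this way is an assumption, not something confirmed here.

```go
package main

import (
	"fmt"
	"strings"
)

// eventFieldSelector builds a field selector that matches events whose
// involved object has the given namespace, kind, and name.
// Hypothetical helper for illustration only.
func eventFieldSelector(namespace, kind, name string) string {
	return strings.Join([]string{
		"involvedObject.namespace=" + namespace,
		"involvedObject.kind=" + kind,
		"involvedObject.name=" + name,
	}, ",")
}

func main() {
	// Prints the selector for a TaskRun named "build-run" in namespace "ci".
	fmt.Println(eventFieldSelector("ci", "TaskRun", "build-run"))
}
```

The same string can be passed to `kubectl get events --field-selector` to check by hand which events a failed TaskRun would contribute.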
Priority
This is a future enhancement and not blocking for the initial LLM analysis feature.