Feature heuristic checks + cleanup #44
Open
What are the issues or discussions related to this PR?
Finding a better way to add heuristic checks. Cleaned up all the unnecessary files and unused functionality. Fixed the bug with versioning specialist definitions, so every time a specialist is enriched its version also increases. You can also manually bump the version of your specialist.
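As a rough illustration of the versioning behaviour described above, here is a minimal sketch. It assumes specialist versions follow a `major.minor.patch` format and that enrichment bumps the patch number; the PR does not state either. `SpecialistDefinition`, `bumpVersion`, and `enrich` are hypothetical names, not the project's actual API.

```typescript
// Hypothetical shape of a specialist definition (assumption, not the real schema).
interface SpecialistDefinition {
  name: string;
  version: string; // assumed to be "major.minor.patch"
  notes: string[];
}

type BumpPart = "major" | "minor" | "patch";

// Manually bump one part of the version; lower parts reset to zero.
function bumpVersion(version: string, part: BumpPart = "patch"): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`Unsupported version format: ${version}`);
  }
  const major = Number(match[1]);
  const minor = Number(match[2]);
  const patch = Number(match[3]);
  switch (part) {
    case "major":
      return `${major + 1}.0.0`;
    case "minor":
      return `${major}.${minor + 1}.0`;
    default:
      return `${major}.${minor}.${patch + 1}`;
  }
}

// Enriching a specialist returns a new definition whose version has increased.
function enrich(def: SpecialistDefinition, note: string): SpecialistDefinition {
  return {
    ...def,
    notes: [...def.notes, note],
    version: bumpVersion(def.version),
  };
}

const original: SpecialistDefinition = { name: "nextjs", version: "0.1.0", notes: [] };
const enriched = enrich(original, "prefer client components for interactive UI");
const manual = bumpVersion(enriched.version, "minor");
```

Keeping `enrich` pure (it returns a new object instead of mutating) makes it easy to assert that every enrichment produces a strictly newer version.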
What's added in this PR?
scenario.yaml files now contain a section for heuristic checks. A check can be a command run against the code or a pattern match that looks for sections of a PRD.
Along with the new enhancements, the template version has increased.
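To make the two kinds of heuristic checks concrete, the sketch below models them in TypeScript and evaluates the pattern-matching kind against a PRD. The field names (`kind`, `name`, `run`, `pattern`) and the example command are assumptions for illustration only; the real schema is the one documented in docs/heuristic-checks-guide.md.

```typescript
// Hypothetical check shapes (assumption: the real scenario.yaml schema may differ).
type CommandCheck = { kind: "command"; name: string; run: string };
type PatternCheck = { kind: "pattern"; name: string; pattern: string };
type HeuristicCheck = CommandCheck | PatternCheck;

interface CheckResult {
  name: string;
  passed: boolean;
}

// Evaluate only the pattern checks against a document such as a PRD.
// Command checks would be executed by a separate runner and are skipped here.
function runPatternChecks(checks: HeuristicCheck[], document: string): CheckResult[] {
  return checks
    .filter((c): c is PatternCheck => c.kind === "pattern")
    .map((c) => ({
      name: c.name,
      // "i" ignores case; "m" lets ^ match at the start of each line.
      passed: new RegExp(c.pattern, "im").test(document),
    }));
}

const checks: HeuristicCheck[] = [
  { kind: "command", name: "typecheck", run: "pnpm tsc --noEmit" }, // illustrative command
  { kind: "pattern", name: "has-goals-section", pattern: "^#+\\s*Goals" },
  { kind: "pattern", name: "has-risks-section", pattern: "^#+\\s*Risks" },
];

const prd = "# Product\n\n## Goals\nShip the client component.\n";
const results = runPatternChecks(checks, prd);
```

Here `results` reports one entry per pattern check, so a missing PRD section shows up as a named failure rather than a single pass/fail flag.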
What are the steps to test this PR?
pnpm bench
Run 002-client-component in the next.js suite to view heuristic changes. To test the version bumping, enable enrichment as well.
Documentation update for this PR (if applicable)?
Documentation already exists in docs/heuristic-checks-guide.md, which covers the heuristic checks feature comprehensively. No additional documentation updates are required, as the existing guide already covers the new heuristic_checks section in scenario.yaml files.
(Optional) What's left to be done for this PR?
[] MCP + OAuth should be created @Nsttt
[] Make an easier way to create pattern matching heuristics
[] Prompts and conversations should be sent to R2 instead of D1, since the JSON gets really big for longer benchmarks
(Optional) What's the potential risk and how to mitigate it?
Who do you wish to review this PR other than required reviewers?
@Nsttt @zackarychapple
(Required) Pre-PR/Merge checklist