-
Notifications
You must be signed in to change notification settings - Fork 3
Test placeholders #32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e623f8c to
39972a2
Compare
37184fb to
d5b9ea1
Compare
d5b9ea1 to
15005fb
Compare
cbbayburt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @ycedres.
Just some minor comments below, but overall, looks good.
src/mcp_server_uyuni/server.py
Outdated
| logger.info(log_string) | ||
| await ctx.info(log_string) | ||
|
|
||
| is_confirmed = str(confirm).lower() == 'true' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This confirmed check looks different than the others. Can we extract the following parsing logic to a helper method to unify it and avoid repetition?
is_confirmed = str(confirm).lower() in ('true', 'yes', '1')
test/test_cases_ops.json
Outdated
| { | ||
| "id": "TC-OPS-001_confirm_request", | ||
| "prompt": "Add a new system at host 10.145.211.203 with activation key '1-DEBLIKE-KEY'", | ||
| "prompt": "Can you add a new system at host {new_system_host} with activation key '{key_deblike}'?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the imperative "Do this .." instead of "Can you do this .." feels more natural IMO. What's the reason we prefer this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I was using Gemini to batch refactor and replace fixed values with placeholders and I did not realise it changed also this.
cbbayburt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes @ycedres. LGTM 👍 feel free to merge
The goal of this PR is improving the test runner and the test results. At this point some more automatic testing improvement is needed but this is a step forward.
Add placeholders to allow easier test configuration.
Improve judge model system prompt.
Improve confirmation mechanism for write-enabled tools (this allows confirmation mechanism of testing to also work).
Included patch produced in (so related tests pass)
Fix nested tool calls #34