Safe default for Context.score
#1185
Pull Request Overview
This pull request updates the Context class to change the default relevance score and improve its documentation. The changes make the "unset" state of the score field more explicit and provide clearer guidance on the scoring scale.
- Changed the default `score` value from `5` to `-1` to explicitly indicate an unset state
- Added a `UNSET_RELEVANCE` class variable to define the sentinel value
- Enhanced the `score` field documentation to describe the 0-10 scoring scale
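The sentinel-default pattern summarized above can be sketched roughly as follows. This is a minimal stand-in built on the standard library's `dataclasses`, not the project's actual model definition; the `text` field and the `has_score` helper are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass
class Context:
    """Simplified stand-in for the real Context; only `score` is sketched."""

    # Sentinel meaning "no relevance score was assigned".
    # ClassVar keeps it out of the generated __init__.
    UNSET_RELEVANCE: ClassVar[int] = -1

    text: str  # hypothetical field, for illustration only
    # 0-10 scale: 0 is irrelevant, 1 is barely relevant, 10 is most relevant.
    score: int = UNSET_RELEVANCE

    def has_score(self) -> bool:
        """Hypothetical helper: True once a real score has been assigned."""
        return self.score != self.UNSET_RELEVANCE


unscored = Context(text="some summary")
scored = Context(text="another summary", score=7)
```

With a sentinel outside the 0-10 range, a forgotten score can be told apart from a deliberate mid-range score, which a default of `5` does not allow.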
    default=UNSET_RELEVANCE,
    description=(
        "Relevance score for this context to the question."
        " The range used here is 0-10, where 0 is 'irrelevant',"
        " 1 is barely relevant, and 10 is most relevant."
        " The default is -1 to have a 'sorting safe' default as sub-relevant."
Copilot
AI
Nov 7, 2025
Changing the default score from 5 to -1 is a breaking change. The comment in core.py line 353-355 states "If we don't assign scores, just default to 5. why 5? Because we filter out 0s in another place and 5/10 is the only default I could come up with". This indicates the previous default was chosen to pass the filtering thresholds. With the new default of -1, any Context objects created without an explicit score will be filtered out by get_unique_docs_from_contexts(score_threshold=0) and context_serializer (which uses evidence_relevance_score_cutoff=1). While production code appears to always set scores explicitly, this could break downstream code or require updates to test fixtures that create Context objects without scores.
Suggested change:

    -    default=UNSET_RELEVANCE,
    -    description=(
    -        "Relevance score for this context to the question."
    -        " The range used here is 0-10, where 0 is 'irrelevant',"
    -        " 1 is barely relevant, and 10 is most relevant."
    -        " The default is -1 to have a 'sorting safe' default as sub-relevant."
    +    default=5,
    +    description=(
    +        "Relevance score for this context to the question."
    +        " The range used here is 0-10, where 0 is 'irrelevant',"
    +        " 1 is barely relevant, and 10 is most relevant."
    +        " The default is 5 to pass filtering thresholds."
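The filtering concern raised in this review comment can be illustrated with a small sketch. It does not call the project's real functions: `ScoredContext` and `keep_above` are hypothetical stand-ins, and the strict `>` comparison is an assumption about how a threshold might be applied.

```python
from dataclasses import dataclass

UNSET_RELEVANCE = -1  # sentinel value taken from the PR


@dataclass
class ScoredContext:
    """Hypothetical stand-in for a context carrying a relevance score."""

    name: str
    score: int = UNSET_RELEVANCE


def keep_above(contexts: list[ScoredContext], threshold: int) -> list[ScoredContext]:
    """Keep contexts scoring strictly above `threshold` (assumed comparison)."""
    return [c for c in contexts if c.score > threshold]


contexts = [
    ScoredContext("explicit", score=8),
    ScoredContext("old_default", score=5),  # what an unscored context used to get
    ScoredContext("unset"),  # falls back to UNSET_RELEVANCE
]

# The two cutoffs mentioned in the review comment: 0 and 1.
kept_at_0 = [c.name for c in keep_above(contexts, 0)]
kept_at_1 = [c.name for c in keep_above(contexts, 1)]
```

Under either cutoff the unset context is dropped while a context at the old default of `5` survives, which is the behavioral difference the comment describes.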
This comment is basically reverting this PR; see the description. Yes, it's a breaking change, but it's not a user-facing change; it's more of a change in PQA internals.
> The comment in `core.py` line 353-355 states "If we don't assign scores, just default to 5. why 5? Because we filter out 0s in another place and 5/10 is the only default I could come up with".
Per this, it's fine that aget_evidence has its own internal defaults. This PR is about Context as a generally-useful primitive data structure
`Context.score` is used for sorting the contextual summaries, so it's really a relative metric (not an absolute one). However, constructing a standalone `Context` (outside of `Docs.aget_evidence`) puts one in a regime of thinking in absolute terms.

Secondly, if one forgets to specify the `score`, it defaults to `5`, which may or may not be appropriate.

This PR updates the default `score` to `-1` so at least the default behavior is somewhat safe and requires less thought.

Note

Set `Context.score` default to -1 (via `UNSET_RELEVANCE`) and document the 0–10 relevance scale.

- `src/paperqa/types.py`:
  - `Context`:
    - Add `UNSET_RELEVANCE = -1` constant.
    - Change `score` to a `Field` with default `UNSET_RELEVANCE` (previously `5`) and add a description clarifying the 0–10 relevance scale and the purpose of `-1` as a sorting-safe default.

Written by Cursor Bugbot for commit 3f68f52.
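To illustrate why `-1` is described as a sorting-safe default, here is a minimal sketch of ranking summaries by score. The `Summary` type and the sample data are hypothetical; only the `-1` sentinel and the 0-10 scale come from the PR.

```python
from typing import NamedTuple

UNSET_RELEVANCE = -1  # sentinel value taken from the PR


class Summary(NamedTuple):
    """Hypothetical stand-in for a contextual summary with a relevance score."""

    text: str
    score: int = UNSET_RELEVANCE


summaries = [
    Summary("barely relevant", 1),
    Summary("no score given"),  # falls back to UNSET_RELEVANCE
    Summary("most relevant", 10),
    Summary("irrelevant", 0),
]

# Highest score first; the unset entry sorts below every score on the 0-10 scale.
ranked = sorted(summaries, key=lambda s: s.score, reverse=True)
ranked_texts = [s.text for s in ranked]
```

Because the sentinel sits below the whole 0-10 range, an unscored entry never outranks a scored one, whereas a default of `5` would place it mid-list.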