Commit 938c92b
feat(retrieval): Implement Top-K chunk retrieval for context generation
Refactor document processing to handle chunks instead of whole documents. This allows for finer-grained retrieval.
Modify the tool to:
- Calculate cosine similarity between the question embedding and all cached chunk embeddings.
- Efficiently select the top K (currently 5) most relevant chunks based on similarity score (a sketch of the flow follows this list).
- Combine the text content of these top K chunks, including source metadata, into a single context string for the LLM.
- Update the system and user prompts to instruct the LLM to synthesize an answer based on multiple context snippets.
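A minimal sketch of the retrieval flow described above, assuming Rust (the commit deals with f32 embeddings and a crate dependency). The names `Chunk`, `Scored`, and `build_context`, and the use of a bounded min-heap, are illustrative assumptions; the commit's actual identifiers and the elided data-structure name are not visible in the message.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// A cached document chunk with its precomputed embedding (illustrative type).
struct Chunk {
    source: String, // e.g. file path or document title, kept as metadata
    text: String,
    embedding: Vec<f32>,
}

/// A chunk paired with its similarity score. f32 is not `Ord`, so ordering is
/// defined via `f32::total_cmp` to make it usable inside a `BinaryHeap`.
struct Scored<'a> {
    score: f32,
    chunk: &'a Chunk,
}

impl PartialEq for Scored<'_> {
    fn eq(&self, other: &Self) -> bool {
        self.score.total_cmp(&other.score).is_eq()
    }
}
impl Eq for Scored<'_> {}
impl PartialOrd for Scored<'_> {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Scored<'_> {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        self.score.total_cmp(&other.score)
    }
}

/// Cosine similarity between two equal-length embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

/// Score every cached chunk against the question embedding, keep the top `k`
/// with a bounded min-heap, and join the winners (with their source metadata)
/// into a single context string for the LLM prompt.
fn build_context(question_embedding: &[f32], chunks: &[Chunk], k: usize) -> String {
    // Min-heap of at most k elements: the weakest of the current top-k sits on
    // top and is evicted whenever a better-scoring chunk arrives.
    let mut heap: BinaryHeap<Reverse<Scored<'_>>> = BinaryHeap::with_capacity(k + 1);

    for chunk in chunks {
        let score = cosine_similarity(question_embedding, &chunk.embedding);
        heap.push(Reverse(Scored { score, chunk }));
        if heap.len() > k {
            heap.pop(); // discard the current weakest match
        }
    }

    // `into_sorted_vec` on a heap of `Reverse` yields the highest score first.
    heap.into_sorted_vec()
        .iter()
        .map(|Reverse(s)| {
            format!("[source: {} | score: {:.3}]\n{}", s.chunk.source, s.score, s.chunk.text)
        })
        .collect::<Vec<_>>()
        .join("\n\n---\n\n")
}
```

Bounding the heap at k entries keeps the scan over all cached chunks at roughly O(n log k) instead of sorting every score, which matches the "efficiently find the top K" goal stated above.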
Add a dependency to enable storing f32 scores in the top-K selection structure.
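The dependency's name is elided above. One common choice for giving f32 a total ordering (an assumption, not confirmed by this commit) is the `ordered-float` crate, whose `OrderedFloat<f32>` wrapper implements `Ord` and can therefore live directly in a `BinaryHeap`:

```rust
// Hedged sketch: assumes the `ordered-float` crate; the commit does not name
// the dependency it actually adds.
use ordered_float::OrderedFloat;
use std::collections::BinaryHeap;

/// Return the `k` highest scores, relying on `OrderedFloat` for total ordering.
fn top_k_scores(scores: &[f32], k: usize) -> Vec<f32> {
    let mut heap: BinaryHeap<OrderedFloat<f32>> =
        scores.iter().copied().map(OrderedFloat).collect();
    (0..k).filter_map(|_| heap.pop()).map(|s| s.into_inner()).collect()
}
```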
Remove the previous token counting and cost estimation logic from the embedding generation process.

1 parent: 5c6e9da
File tree: src
9 files changed, +716 −564 lines