feat: add multiple processing modes for diffgraph generation #12
base: main
Conversation
Introduced a modular architecture for processing modes that enables different strategies for analyzing code changes and generating diffgraphs.

Added:
- BaseProcessor abstract class for an extensible processor interface
- Processor registry with @register_processor decorator
- Factory pattern via get_processor() for creating processors
- CLI options: --mode/-m to select a mode, --list-modes to discover modes
- OpenAI Agents mode as 'openai-agents-dependency-graph' (default)
- Comprehensive test suite in tests/ directory with pytest structure
- Developer guide: docs/ADDING_PROCESSING_MODES.md

Changed:
- Refactored CodeAnalysisAgent into OpenAIAgentsProcessor
- CLI now uses the processor factory pattern instead of direct imports
- Improved extensibility for adding new modes (tree-sitter, data-flow, etc.)

Removed:
- diffgraph/ai_analysis.py (refactored into processing_modes/)

BREAKING CHANGES: None

This is fully backward compatible. The default mode preserves all existing functionality. Users can continue using 'wild diff' without changes.

Future modes can be added by:
1. Creating a new processor in diffgraph/processing_modes/
2. Inheriting from BaseProcessor
3. Registering with the @register_processor decorator

Closes: #[issue-number] (if applicable)
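The registry, decorator, and factory described above can be sketched roughly as follows. This is a minimal illustration, not the actual diffgraph implementation: the names `BaseProcessor`, `register_processor`, `get_processor`, and `OpenAIAgentsProcessor` match the PR, but all signatures and internals are assumptions.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type

# Hypothetical sketch of the pluggable-processor pattern this PR describes.
_REGISTRY: Dict[str, Type["BaseProcessor"]] = {}

class BaseProcessor(ABC):
    """Interface every processing mode implements."""

    @abstractmethod
    def analyze_changes(self, files, progress_callback=None):
        """Analyze changed files and return an analysis result."""

def register_processor(mode_name: str):
    """Class decorator: add a processor class to the registry under mode_name."""
    def decorator(cls: Type[BaseProcessor]) -> Type[BaseProcessor]:
        _REGISTRY[mode_name] = cls
        return cls
    return decorator

def get_processor(mode_name: str, **kwargs) -> BaseProcessor:
    """Factory: look up a registered mode and instantiate its processor."""
    if mode_name not in _REGISTRY:
        raise ValueError(
            f"Unknown processing mode {mode_name!r}. "
            f"Available: {', '.join(sorted(_REGISTRY))}"
        )
    return _REGISTRY[mode_name](**kwargs)

@register_processor("openai-agents-dependency-graph")
class OpenAIAgentsProcessor(BaseProcessor):
    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key

    def analyze_changes(self, files, progress_callback=None):
        return f"analyzed {len(files)} files"
```

The key design point is that new modes register themselves via the decorator at import time, so the CLI never needs to import processor classes directly.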
Walkthrough

This pull request introduces a modular Processing Modes System, refactoring the monolithic CodeAnalysisAgent into a pluggable architecture with a BaseProcessor abstract base class, a processor registry, and a factory pattern. The CLI gains mode selection and mode listing capabilities. The version is bumped to 1.1.0 with comprehensive test coverage.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant Registry as Processor<br/>Registry
    participant Factory as Factory
    participant Processor
    participant GraphManager
    User->>CLI: diffgraph --mode openai-agents-dependency-graph
    CLI->>Registry: Import processing_modes
    Registry->>Processor: @register_processor("openai-agents-dependency-graph")
    Processor-->>Registry: Register OpenAIAgentsProcessor
    CLI->>Factory: get_processor("openai-agents-dependency-graph", api_key=key)
    Factory->>Registry: Look up "openai-agents-dependency-graph"
    Registry-->>Factory: Return OpenAIAgentsProcessor class
    Factory->>Processor: Instantiate OpenAIAgentsProcessor(api_key=key)
    Processor-->>Factory: Instance created
    Factory-->>CLI: Processor instance
    CLI->>Processor: analyze_changes(files, progress_callback)
    Processor->>GraphManager: Process changes
    GraphManager-->>Processor: Analysis result
    Processor-->>CLI: DiffAnalysis(summary, mermaid_diagram)
    CLI-->>User: Output results
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
diffgraph/processing_modes/openai_agents_dependency.py (5)
30-40: Fix mutable default lists in the Pydantic model.

Using `[]` as a default shares the same list across instances. Use `Field(default_factory=list)` and import `Field`.
Apply this diff:
```diff
-from pydantic import BaseModel
+from pydantic import BaseModel, Field
@@
 class ComponentAnalysis(BaseModel):
@@
-    dependencies: List[str] = []
-    dependents: List[str] = []
-    nested_components: List[str] = []  # names of components that are nested within this one
+    dependencies: List[str] = Field(default_factory=list)
+    dependents: List[str] = Field(default_factory=list)
+    nested_components: List[str] = Field(default_factory=list)  # names of components that are nested within this one
```

Also applies to: 11-11
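For plain Python classes the shared-mutable-default pitfall behind this suggestion is easy to demonstrate (Pydantic's own default-copying behavior varies by version, so `default_factory` is the safe choice either way). A minimal illustration with two hypothetical classes:

```python
from dataclasses import dataclass, field
from typing import List

class WithSharedDefault:
    """The classic pitfall: one list object is created when the function is
    defined and then reused by every call that omits the argument."""
    def __init__(self, deps: List[str] = []):  # noqa: B006 - deliberately wrong
        self.deps = deps

@dataclass
class WithFactory:
    """default_factory builds a fresh list for each instance instead."""
    deps: List[str] = field(default_factory=list)
```

Mutating `a.deps` on one `WithSharedDefault` instance is visible through every other instance that used the default, while `WithFactory` instances stay isolated.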
98-105: Propagate the API key to the runtime; don't just store it.

`self.api_key` is never applied to the SDK. Ensure downstream libraries see it.
Apply this diff:
```diff
         self.api_key = api_key or os.getenv("OPENAI_API_KEY")
         if not self.api_key:
             raise ValueError("OpenAI API key is required. Set OPENAI_API_KEY environment variable.")
+        # Ensure the Agents/OpenAI SDKs can discover the key
+        os.environ["OPENAI_API_KEY"] = self.api_key
+        # If older OpenAI SDK patterns are in use, optionally set attribute (harmless if unused)
+        try:
+            setattr(openai, "api_key", self.api_key)
+        except Exception:
+            pass
```
167-171: Correct the return type of _run_agent_analysis.

The function returns a structured model but is annotated as str.
Apply this diff:
```diff
-    def _run_agent_analysis(self, prompt: str) -> str:
+    def _run_agent_analysis(self, prompt: str) -> CodeChangeAnalysis:
         """Run the agent analysis with retry logic."""
         result = Runner.run_sync(self.agent, prompt)
         return result.final_output
```
242-254: Harden ChangeType parsing against LLM output variance.

Strip/upper the value and provide a safe fallback to avoid a KeyError on unexpected labels.
Apply this diff:
```diff
-        change_type = ChangeType[comp.change_type.upper()]
+        try:
+            change_type = ChangeType[comp.change_type.strip().upper()]
+        except Exception:
+            # Default to MODIFIED if the model returns an unknown change_type
+            change_type = ChangeType.MODIFIED
```
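The same defensive lookup can be factored into a small helper. In this sketch the `ChangeType` members are assumptions standing in for whatever diffgraph's `graph_manager` actually defines:

```python
from enum import Enum

class ChangeType(Enum):
    # Hypothetical members mirroring what graph_manager likely defines.
    ADDED = "added"
    MODIFIED = "modified"
    REMOVED = "removed"

def parse_change_type(raw, default: ChangeType = ChangeType.MODIFIED) -> ChangeType:
    """Normalize an LLM-produced label; fall back instead of raising KeyError.

    Handles surrounding whitespace, mixed case, unknown labels, and even
    non-string values (e.g. None) without propagating an exception.
    """
    try:
        return ChangeType[raw.strip().upper()]
    except (KeyError, AttributeError):
        return default
```

Centralizing the fallback keeps the per-component loop free of try/except noise.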
275-285: Add a post-pass to resolve cross-file dependencies after all components exist.

The current linking runs while processing a single file; references to components in later files remain unresolved. Do a second pass before rendering.
Apply this diff:
```diff
-        # Generate the final Mermaid diagram
-        if progress_callback:
-            progress_callback(None, total_files, "generating_diagram")
+        # Second pass: resolve dependencies/dependents now that all components are known
+        for comp_id, comp_node in self.graph_manager.component_nodes.items():
+            pseudo = ComponentAnalysis(
+                name=comp_node.name,
+                component_type=getattr(comp_node, "component_type", "function"),
+                change_type=comp_node.change_type.value,
+                summary=comp_node.summary or "",
+                parent=getattr(comp_node, "parent", None),
+                dependencies=comp_node.dependencies,
+                dependents=comp_node.dependents,
+                nested_components=[]
+            )
+            self._process_dependencies(pseudo, comp_node.file_path, DependencyMode.DEPENDENCY)
+            self._process_dependencies(pseudo, comp_node.file_path, DependencyMode.DEPENDENT)
+
+        # Generate the final Mermaid diagram
+        if progress_callback:
+            progress_callback(None, total_files, "generating_diagram")
         try:
-            mermaid_diagram = self.graph_manager.get_mermaid_diagram()
-            print(f"Mermaid diagram generated successfully: {mermaid_diagram}")
+            mermaid_diagram = self.graph_manager.get_mermaid_diagram()
+            print("Mermaid diagram generated successfully.")
         except Exception as e:
             print(f"Error generating Mermaid diagram: {str(e)}")
             mermaid_diagram = "Error generating diagram"
```

diffgraph/cli.py (1)
192-209: Abstraction violation: accessing processor-specific internals.

Line 198 accesses `processor.graph_manager.processed_files`, which assumes all processors have a `graph_manager` attribute with a `processed_files` property. This violates the `BaseProcessor` abstraction and creates tight coupling to the `OpenAIAgentsProcessor` implementation. Future processors may track progress differently or may not have a `graph_manager` at all.

Consider one of these solutions:

Option 1 (Recommended): Add a method to the `BaseProcessor` interface, in `diffgraph/processing_modes/base.py`:

```python
@abstractmethod
def get_progress_count(self) -> int:
    """Return the number of files processed so far."""
    pass
```

Then update this line to:

```diff
-        current_index = len(processor.graph_manager.processed_files) + 1
+        current_index = processor.get_progress_count() + 1
```

Option 2: Pass `current_index` as a parameter to the progress callback instead of computing it in the CLI. Modify the processor interface so the callback signature is `(current_file, current_index, total_files, status)`.
🧹 Nitpick comments (8)
tests/README.md (1)
7-18: Specify a language for the fenced code block.

The code block showing the directory structure should specify a language identifier for better rendering.
Apply this diff:
````diff
-```
+```text
 tests/
 ├── __init__.py
````

diffgraph/processing_modes/__init__.py (1)
93-99: Consider sorting `__all__` alphabetically.

For consistency, consider sorting the exported names alphabetically.
Apply this diff:
```diff
 __all__ = [
     "BaseProcessor",
     "DiffAnalysis",
-    "register_processor",
     "get_processor",
+    "list_available_modes",
+    "register_processor",
-    "list_available_modes",
 ]
```

diffgraph/processing_modes/openai_agents_dependency.py (3)
189-190: Remove the unused processed_files variable.

It is incremented but never used; rely on `graph_manager.processed_files` instead.
Apply this diff:
```diff
-        processed_files = 0
@@
-            processed_files += 1
@@
-            processed_files += 1
```

Also applies to: 265-266, 273-274
55-86: Broaden 429 retry handling and use the Retry-After header when present.

Catching only `openai.RateLimitError` may miss SDK/HTTP stack variations (e.g., `APIStatusError` with status 429). Also prefer header parsing when available.
Refactor idea:
- Catch a tuple of rate limit–like errors (including generic exceptions where e.status_code == 429).
- When available, read retry-after from e.response.headers.get("retry-after") (case-insensitive).
- Keep exponential backoff as fallback.
If you want, I can draft a concrete drop-in patch.
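The backoff-plus-Retry-After logic sketched in the bullets above can live in a small pure helper, which keeps the retry loop trivial to unit-test. Exception types and header access in the real code depend on the OpenAI SDK version, so only the delay calculation is shown here:

```python
import random
from typing import Optional

def retry_delay(attempt: int, retry_after_header: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Pick a sleep duration after a 429 response.

    Honors a numeric Retry-After header when parseable (capped), otherwise
    falls back to capped exponential backoff with multiplicative jitter.
    """
    if retry_after_header is not None:
        try:
            return min(float(retry_after_header), cap)
        except ValueError:
            pass  # HTTP-date form of Retry-After: fall through in this sketch
    backoff = min(base * (2 ** attempt), cap)
    return backoff * (0.5 + random.random() / 2)  # jitter factor in [0.5, 1.0)
```

The caller would catch its rate-limit exceptions, read the header from the response when available, and `time.sleep(retry_delay(attempt, header))` before retrying.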
379-381: Use logging instead of print for unresolved relations; consider collecting metrics.

Printing a warning for every unresolved link can spam the CLI. Log at debug/warning level and optionally count unresolved edges for a summary.
tests/conftest.py (1)
4-6: Remove unused import.

`os` is imported but unused.
Apply this diff:
```diff
-import os
```

tests/processing_modes/test_openai_agents.py (2)
17-23: Make the API key requirement test resilient to the ambient environment.

If OPENAI_API_KEY is set in CI, this test may not fail. Clear it via monkeypatch.
Apply this diff:
```diff
-def test_openai_processor_requires_api_key():
+def test_openai_processor_requires_api_key(monkeypatch):
     """Test that OpenAI processor requires API key."""
+    monkeypatch.delenv("OPENAI_API_KEY", raising=False)
     with pytest.raises(ValueError) as exc_info:
         get_processor("openai-agents-dependency-graph")
```
8-15: Optional: lazily initialize the Agent to speed up tests and reduce coupling.

Creating the Agent in `__init__` forces SDK presence in unit tests. Consider deferring creation to the first `analyze_changes` call.
I can provide a small refactor (_ensure_agent()) if you want it in this PR.
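A minimal sketch of that `_ensure_agent()` deferral (illustrative only; `object()` stands in for the real OpenAI Agents SDK object, and the class name is hypothetical):

```python
class LazyAgentProcessorSketch:
    """Defer expensive Agent construction until first use, so unit tests can
    instantiate the processor without the SDK being exercised."""

    def __init__(self, api_key: str):
        self.api_key = api_key
        self._agent = None  # not built yet

    def _ensure_agent(self):
        if self._agent is None:
            # The real implementation would construct the SDK Agent here.
            self._agent = object()
        return self._agent

    def analyze_changes(self, files):
        self._ensure_agent()  # SDK is touched only on first real use
        return f"{len(files)} files analyzed"
```

Construction stays cheap, and any test that never calls `analyze_changes` never pays the SDK cost.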
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- CHANGELOG.md (1 hunks)
- README.md (1 hunks)
- diffgraph/__init__.py (1 hunks)
- diffgraph/cli.py (5 hunks)
- diffgraph/processing_modes/__init__.py (1 hunks)
- diffgraph/processing_modes/base.py (1 hunks)
- diffgraph/processing_modes/openai_agents_dependency.py (6 hunks)
- docs/ADDING_PROCESSING_MODES.md (1 hunks)
- setup.py (1 hunks)
- tests/README.md (1 hunks)
- tests/__init__.py (1 hunks)
- tests/conftest.py (1 hunks)
- tests/processing_modes/__init__.py (1 hunks)
- tests/processing_modes/test_base.py (1 hunks)
- tests/processing_modes/test_openai_agents.py (1 hunks)
- tests/processing_modes/test_registry.py (1 hunks)
- tests/test_cli.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
tests/processing_modes/test_base.py (1)
- diffgraph/processing_modes/base.py (2)
  - BaseProcessor (18-77)
  - DiffAnalysis (12-15)

diffgraph/processing_modes/__init__.py (2)
- diffgraph/processing_modes/base.py (3)
  - BaseProcessor (18-77)
  - DiffAnalysis (12-15)
  - description (65-67)
- diffgraph/processing_modes/openai_agents_dependency.py (1)
  - description (148-150)

tests/processing_modes/test_openai_agents.py (3)
- diffgraph/processing_modes/__init__.py (1)
  - get_processor (33-55)
- diffgraph/processing_modes/base.py (2)
  - name (59-61)
  - description (65-67)
- diffgraph/processing_modes/openai_agents_dependency.py (2)
  - name (143-145)
  - description (148-150)

tests/processing_modes/test_registry.py (3)
- diffgraph/processing_modes/__init__.py (2)
  - get_processor (33-55)
  - list_available_modes (58-85)
- diffgraph/processing_modes/base.py (3)
  - name (59-61)
  - analyze_changes (36-55)
  - description (65-67)
- diffgraph/processing_modes/openai_agents_dependency.py (3)
  - name (143-145)
  - analyze_changes (172-297)
  - description (148-150)

diffgraph/processing_modes/base.py (2)
- diffgraph/processing_modes/openai_agents_dependency.py (4)
  - analyze_changes (172-297)
  - name (143-145)
  - description (148-150)
  - get_required_config (153-155)
- diffgraph/cli.py (1)
  - progress_callback (192-209)

diffgraph/cli.py (3)
- diffgraph/processing_modes/__init__.py (2)
  - get_processor (33-55)
  - list_available_modes (58-85)
- diffgraph/processing_modes/base.py (2)
  - description (65-67)
  - analyze_changes (36-55)
- diffgraph/processing_modes/openai_agents_dependency.py (2)
  - description (148-150)
  - analyze_changes (172-297)

tests/test_cli.py (1)
- diffgraph/cli.py (1)
  - main (142-250)

diffgraph/processing_modes/openai_agents_dependency.py (4)
- diffgraph/graph_manager.py (4)
  - GraphManager (54-303)
  - FileStatus (15-20)
  - ChangeType (8-13)
  - ComponentNode (37-52)
- diffgraph/processing_modes/base.py (6)
  - BaseProcessor (18-77)
  - DiffAnalysis (12-15)
  - name (59-61)
  - description (65-67)
  - get_required_config (70-77)
  - analyze_changes (36-55)
- diffgraph/processing_modes/__init__.py (1)
  - register_processor (15-30)
- diffgraph/cli.py (1)
  - progress_callback (192-209)
🪛 markdownlint-cli2 (0.18.1)
tests/README.md
7-7: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.14.2)
diffgraph/processing_modes/__init__.py
49-52: Avoid specifying long messages outside the exception class
(TRY003)
81-81: Do not catch blind exception: Exception
(BLE001)
81-81: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
90-90: Unused noqa directive (non-enabled: F401, E402)
Remove unused noqa directive
(RUF100)
93-99: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
🔇 Additional comments (16)
setup.py (1)
5-5: LGTM! Version bump is consistent.

The version update to 1.1.0 aligns with the modular processing modes feature introduced in this PR.
diffgraph/__init__.py (1)
5-5: LGTM! Version consistent across package.

The version update matches setup.py and correctly reflects the 1.1.0 release.
tests/processing_modes/__init__.py (1)
1-3: LGTM! Appropriate test package marker.

Simple and clear documentation for the test subpackage.
README.md (2)
61-78: LGTM! Clear documentation of new CLI options.

The updated usage examples effectively demonstrate the new mode selection and listing capabilities.
80-107: LGTM! Comprehensive processing modes documentation.

The new section clearly explains available modes, future plans, and how to extend with custom processors. This provides excellent guidance for both users and developers.
CHANGELOG.md (1)
8-35: LGTM! Comprehensive and well-structured changelog.

The 1.1.0 release notes clearly document all additions, changes, and removals. The format follows Keep a Changelog standards.
diffgraph/processing_modes/base.py (2)
12-15: LGTM! Clean data model.

The DiffAnalysis Pydantic model provides a clear structure for processor output.
18-77: LGTM! Well-designed abstract base class.

The BaseProcessor interface is clean and extensible:
- Clear contract via abstract methods
- Flexible configuration through **kwargs
- Optional progress callback support
- Sensible defaults for get_required_config
This provides an excellent foundation for the modular processing modes system.
diffgraph/processing_modes/__init__.py (2)
15-30: LGTM! Clean decorator implementation.

The @register_processor decorator provides a clear and intuitive way to register processing modes.
33-55: LGTM! Helpful error handling in factory.

The get_processor factory provides clear error messages listing available modes when an unknown mode is requested.
tests/__init__.py (1)
1-3: LGTM.Docstring only; no issues.
tests/processing_modes/test_base.py (1)
4-6: No action needed — re-exports confirmed.

The imports are present in `diffgraph/processing_modes/__init__.py` (line 9: `from .base import BaseProcessor, DiffAnalysis`), enabling the test to import successfully from `diffgraph.processing_modes`.

tests/processing_modes/test_registry.py (1)
1-50: LGTM! Comprehensive test coverage for the processor registry.

The tests effectively validate:
- Registry enumeration via `list_available_modes()`
- Processor instantiation via `get_processor()`
- Error handling for invalid modes
- Interface compliance with the `BaseProcessor` contract
diffgraph/cli.py (2)
139-152: LGTM! Clean implementation of mode selection and listing.

The CLI options for `--mode`, `-m`, and `--list-modes` are well-designed with clear help text and sensible defaults. The list-modes handler provides user-friendly output formatting.
182-189: LGTM! Robust error handling for processor initialization.

The try-catch block properly handles invalid mode errors and guides users to use `--list-modes`. The error messages are clear and actionable.

docs/ADDING_PROCESSING_MODES.md (1)
1-257: LGTM! Excellent developer documentation.

This guide is comprehensive and well-structured, providing:
- Clear architectural overview of the processing mode system
- Step-by-step instructions with complete code examples
- Best practices for progress reporting, error handling, and diagram generation
- Practical examples of future modes to implement
- Testing and documentation guidance
The examples align with the established patterns in the codebase and will help developers create new processing modes efficiently.
```python
            description = desc
        else:
            description = "No description available"
    except Exception as e:
```
🛠️ Refactor suggestion | 🟠 Major
Replace unused exception variable.
The caught exception is never used. Replace e with _ to indicate it's intentionally ignored.
Apply this diff:
```diff
-    except Exception as e:
+    except Exception:
         # Fallback: try to get from docstring or use default
         description = processor_class.__doc__.split('\n')[0] if processor_class.__doc__ else "No description available"
```

🧰 Tools
🪛 Ruff (0.14.2)
81-81: Do not catch blind exception: Exception
(BLE001)
81-81: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
🤖 Prompt for AI Agents
In diffgraph/processing_modes/__init__.py around line 81, the except clause
declares an unused variable `e`; replace it with `_` to signal the exception is
intentionally ignored by changing `except Exception as e:` to `except Exception
as _:` (or simply `except Exception:`) and ensure no references to `e` remain in
that block.
```python
def test_cli_invalid_mode_error(cli_runner):
    """Test that invalid mode shows helpful error."""
    # We need a git repo and changes for this to work properly
    # This test will fail early with invalid mode error
    result = cli_runner.invoke(main, ['diff', '--mode', 'invalid-mode'])

    # Should show error about invalid mode
    # (might fail earlier if not in a git repo, which is fine)
    assert result.exit_code != 0 or 'Available processing modes' in result.output
```
Strengthen the assertion to verify both exit code and error message.
The assertion on line 51 uses or, which allows the test to pass if either condition is true. This is too permissive and may allow false positives. When an invalid mode is provided, the CLI should both exit with a non-zero code AND display a helpful error message.
Consider applying this diff to strengthen the test:
```diff
 def test_cli_invalid_mode_error(cli_runner):
     """Test that invalid mode shows helpful error."""
-    # We need a git repo and changes for this to work properly
-    # This test will fail early with invalid mode error
     result = cli_runner.invoke(main, ['diff', '--mode', 'invalid-mode'])
-    # Should show error about invalid mode
-    # (might fail earlier if not in a git repo, which is fine)
-    assert result.exit_code != 0 or 'Available processing modes' in result.output
+    # Should exit with error and show helpful message
+    assert result.exit_code != 0
+    assert 'Unknown processing mode' in result.output or 'list-modes' in result.output
```

🤖 Prompt for AI Agents
+ assert 'Unknown processing mode' in result.output or 'list-modes' in result.output🤖 Prompt for AI Agents
In tests/test_cli.py around lines 43 to 52, the final assertion currently uses
`or` which is too permissive; change the assertion to require both a non-zero
exit code and the presence of the helpful error text. Update the assertion so it
asserts result.exit_code != 0 AND 'Available processing modes' in result.output
(ensuring both conditions must be true), and keep the surrounding comment about
possible early git-repo failures if desired.
Long version:
Changes
Architecture
- `BaseProcessor` that defines the interface for all processing modes
- `get_processor()` factory function for instantiating processors
- `--mode` CLI option

New Module Structure
Code Changes
Created
- `diffgraph/processing_modes/base.py`: Abstract `BaseProcessor` class with `analyze_changes()` interface
- `__init__.py`: Registry decorator and factory function
- `openai_agents_dependency.py`: Refactored existing `CodeAnalysisAgent` into `OpenAIAgentsProcessor`

Removed
- `diffgraph/ai_analysis.py`

Updated
- `diffgraph/cli.py`:
  - `--mode` option to select processing mode (default: `openai-agents-dependency-graph`)
  - `--list-modes` option to display available modes with descriptions

Updated Documentation
- `README.md`: Added "Processing Modes" section with usage examples
- `CHANGELOG.md`: Documented all changes for v1.1.0
- `docs/ADDING_PROCESSING_MODES.md`: Guide for implementing new processing modes

Comprehensive Test Suite
Version Bump
- `1.0.0` → `1.1.0` (semantic versioning: new features)
- Updated in `diffgraph/__init__.py` and `setup.py`
- Tagged `v1.1.0`

Usage
List Available Modes

```shell
diffgraph --list-modes
```

Use Specific Mode

```shell
diffgraph --mode openai-agents-dependency-graph [files]
```

Default Behavior (Unchanged)

```shell
diffgraph [files]  # Uses openai-agents-dependency-graph by default
```

Current Processing Modes
Future Extensibility
This architecture enables easy addition of new processing modes:
- `tree-sitter-dependency-graph`: AST-based static analysis (planned)
- `dataflow-analysis`: Data flow tracking
- `architecture-view`: System architecture focus

See `docs/ADDING_PROCESSING_MODES.md` for the implementation guide.

Testing
Run the test suite:
All tests pass with the new modular architecture.
Breaking Changes
None. The default behavior remains unchanged. Existing users will see no difference unless they opt in to new modes via the `--mode` flag.

Checklist
- Removed legacy code (`ai_analysis.py`) for clean refactoring
- Version bumped (`v1.1.0`)

Related Issues
Implements foundation for multiple analysis approaches, enabling experimentation with different code analysis techniques.
Summary by CodeRabbit
Release Notes
New Features
- `--mode` CLI option.
- `--list-modes` CLI command to view available processing modes.
- `openai-agents-dependency-graph` mode for component-level analysis and dependency graphs.

Documentation
Chores