cat comparison.md
```

## Version Comparison Workflow

For comparing performance across multiple commits (e.g., to find when a regression was introduced), use the `compile_benchmark_versions.sh` script.

### `compile_benchmark_versions.sh`

This script compiles the benchmark tool for every commit in a range, making it easy to run performance comparisons across different versions.

**Features:**

- **Idempotent**: Only compiles versions whose binaries don't already exist
- **Safe**: Uses git worktrees in temporary directories, so your working directory is untouched
- **Convenient**: Names each binary after its commit SHA for easy identification
- **Non-intrusive**: Works even with uncommitted changes in your main working directory
- **Storage**: Binaries live in `$XDG_DATA_HOME/string_pipeline/benchmarks/` (typically `~/.local/share/string_pipeline/benchmarks/`)
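
The idempotency check comes down to a file-existence test on the SHA-named binary in the storage directory. A minimal sketch of that logic (the `needs_compile` helper is illustrative, not part of the actual script):

```shell
#!/usr/bin/env sh
# Illustrative sketch: binaries are keyed by short commit SHA, and a version
# is recompiled only when its binary is missing from the storage directory.
BENCH_DIR="${XDG_DATA_HOME:-$HOME/.local/share}/string_pipeline/benchmarks"

needs_compile() {
    # $1 = short commit SHA; true (exit 0) when no binary exists for it yet
    [ ! -x "$BENCH_DIR/bench_throughput_$1" ]
}
```

Because the check is per-binary, re-running the script after adding new commits only compiles the new ones.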

**Usage:**

```bash
# Compile all versions from 78594af (stabilized benchmark tool v1.0.0) to HEAD
./scripts/compile_benchmark_versions.sh

# Compile a specific range
./scripts/compile_benchmark_versions.sh --start abc1234 --end def5678

# See what would be compiled (dry run)
./scripts/compile_benchmark_versions.sh --dry-run

# List already compiled versions
./scripts/compile_benchmark_versions.sh --list

# Remove all compiled versions
./scripts/compile_benchmark_versions.sh --clean

# Verbose output for debugging
./scripts/compile_benchmark_versions.sh --verbose
```

**Example Workflow - Finding a Performance Regression:**

```bash
# 1. Compile all versions
./scripts/compile_benchmark_versions.sh

# 2. Set up the benchmark directory path
BENCH_DIR="${XDG_DATA_HOME:-$HOME/.local/share}/string_pipeline/benchmarks"

# 3. Run benchmarks on two versions
"$BENCH_DIR/bench_throughput_abc1234" \
    --sizes 10000 \
    --iterations 100 \
    --output before.json

"$BENCH_DIR/bench_throughput_def5678" \
    --sizes 10000 \
    --iterations 100 \
    --output after.json

# 4. Compare results
python3 scripts/compare_benchmarks.py before.json after.json

# 5. If a regression is found, bisect by testing commits in between
"$BENCH_DIR/bench_throughput_xyz9999" --sizes 10000 --iterations 100 --output middle.json
python3 scripts/compare_benchmarks.py before.json middle.json
```
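
At its core, the regression check in step 4 is the relative change between the two runs' timings. A tiny illustrative sketch of that arithmetic (integer percentages only; the real statistical comparison is done by `compare_benchmarks.py`):

```shell
#!/usr/bin/env sh
# Illustrative only: percent change between two mean timings in milliseconds.
# A positive result means the "after" version is slower (a regression).
percent_change() {
    before_ms=$1
    after_ms=$2
    echo $(( (after_ms - before_ms) * 100 / before_ms ))
}

percent_change 200 210   # prints 5 (5% slower)
```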

### `compare_benchmark_versions.sh`

After compiling benchmark binaries, use this script to quickly compare performance between two versions using hyperfine.

**Features:**

- **Fast comparison**: Uses hyperfine for accurate benchmark timing
- **Automatic validation**: Checks that both binaries exist before running
- **Flexible parameters**: Customize warmup, runs, and sizes
- **Clear output**: Shows which version is faster, with statistical confidence

**Requirements:**

- hyperfine must be installed (`apt install hyperfine` or `brew install hyperfine`)

**Usage:**

```bash
# Basic comparison with defaults
./scripts/compare_benchmark_versions.sh 78594af c5a8a11

# Custom warmup and runs for better accuracy
./scripts/compare_benchmark_versions.sh 78594af c5a8a11 --warmup 5 --runs 20

# Compare with specific benchmark parameters
./scripts/compare_benchmark_versions.sh abc1234 def5678 --sizes 10000
```
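
Conceptually, such a comparison reduces to a single hyperfine invocation over the two SHA-named binaries. A hedged sketch of that idea (the `compare_two` wrapper is illustrative; the real script's flags and validation may differ):

```shell
#!/usr/bin/env sh
# Illustrative wrapper, not the real script: time two SHA-named binaries
# with hyperfine, which reports mean, stddev, and a relative-speed summary.
BENCH_DIR="${XDG_DATA_HOME:-$HOME/.local/share}/string_pipeline/benchmarks"

compare_two() {
    a=$1
    b=$2
    # Fail early if either binary is missing, mirroring the script's validation
    [ -x "$BENCH_DIR/bench_throughput_$a" ] || { echo "missing binary for $a" >&2; return 1; }
    [ -x "$BENCH_DIR/bench_throughput_$b" ] || { echo "missing binary for $b" >&2; return 1; }
    hyperfine --warmup 3 --runs 10 \
        "$BENCH_DIR/bench_throughput_$a --sizes 10000" \
        "$BENCH_DIR/bench_throughput_$b --sizes 10000"
}
```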

**Example Workflow - Performance Comparison:**

```bash
# 1. Compile the versions you want to compare
./scripts/compile_benchmark_versions.sh --start 78594af --end c5a8a11

# 2. Run the hyperfine comparison
./scripts/compare_benchmark_versions.sh 78594af c5a8a11

# Output shows:
# - Mean execution time for each version
# - Standard deviation
# - Min/max range
# - Relative speed comparison (e.g., "1.05x faster")
```

**Important Notes:**

- This compares the **execution time** of the entire benchmark run, not the benchmark's throughput metrics
- Both versions run with identical parameters for a fair comparison
- Hyperfine handles warmup runs and statistical analysis automatically
- For more detailed performance analysis, use the benchmark JSON output with `compare_benchmarks.py`

## Configuration

### Benchmark Parameters