Fix precision validation gaps and enhance insert() capabilities #55

atksh · 2025-11-08T09:53:50Z

This commit addresses multiple precision and validation issues identified
in the codebase analysis:

1. Input Validation

Add NaN/Inf validation to insert() methods for both float32 and float64
Ensures consistency with constructor validation
Prevents invalid data from entering the tree structure

2. Float64 Insert Support

Add float64 overload for insert() method
Maintains idx2exact map for dynamically inserted items
Preserves double-precision refinement capability for inserted boxes
Uses explicit py::overload_cast in Python bindings to handle overloads

3. Precision Testing

Add comprehensive tests for NaN/Inf validation in insert operations
Add tests for float64 insert() maintaining precision
Add tests verifying rebuild() preserves idx2exact
Add systematic precision boundary tests (adjusted for float32 limits)
Document float32 precision limitations in test comments

Technical Notes

Float64 input is converted to float32 for tree structure
Double-precision refinement helps reduce false positives
Precision limits: gaps below ~1e-7 may not be reliably detected
At large magnitudes (e.g., 1e6), absolute precision degrades

Fixes validation gaps in insert operations and maintains precision
capabilities for dynamically updated trees.

This commit addresses multiple precision and validation issues identified in the codebase analysis: ## 1. Input Validation - Add NaN/Inf validation to insert() methods for both float32 and float64 - Ensures consistency with constructor validation - Prevents invalid data from entering the tree structure ## 2. Float64 Insert Support - Add float64 overload for insert() method - Maintains idx2exact map for dynamically inserted items - Preserves double-precision refinement capability for inserted boxes - Uses explicit py::overload_cast in Python bindings to handle overloads ## 3. Precision Testing - Add comprehensive tests for NaN/Inf validation in insert operations - Add tests for float64 insert() maintaining precision - Add tests verifying rebuild() preserves idx2exact - Add systematic precision boundary tests (adjusted for float32 limits) - Document float32 precision limitations in test comments ## Technical Notes - Float64 input is converted to float32 for tree structure - Double-precision refinement helps reduce false positives - Precision limits: gaps below ~1e-7 may not be reliably detected - At large magnitudes (e.g., 1e6), absolute precision degrades Fixes validation gaps in insert operations and maintains precision capabilities for dynamically updated trees.

Addresses Float32 Precision Issues from precision validation gaps: - Add adaptive epsilon calculation that scales with coordinate magnitude - Add configurable precision parameters (relative_epsilon, absolute_epsilon) - Implement subnormal number detection in validate_box() - Update both insert() overloads to use adaptive epsilon - Add precision control methods (setters/getters) to PRTree class - Expose precision control to Python via pybind11 bindings - Add comprehensive test suite for adaptive epsilon behavior This improves precision handling across different coordinate scales, from small (< 1.0) to large (> 1e6) magnitudes. Note: Current architecture still forces float32 tree structure on all users. Float64 data only used for idx2exact refinement. Future work should consider separating float32/float64 builds or templating Real type.

This commit eliminates the complex idx2exact post-processing architecture and replaces it with native float32/float64 template specialization, significantly simplifying the codebase and optimizing for each precision level. Key Changes: - Templated PRTree with Real type parameter (float or double) - Removed idx2exact map and refine_candidates() complexity entirely - Exposed 6 separate C++ classes (_PRTree{2D,3D,4D}_{float32,float64}) - Added automatic dtype-based precision selection in Python wrapper - Propagated Real template parameter through all detail classes: - BB (bounding_box.h) - DataType (data_type.h) - PRTreeNode, PRTreeLeaf, PRTreeElement (nodes.h) - PseudoPRTree, PseudoPRTreeNode (pseudo_tree.h) Benefits: - Eliminates "strange post-processing" that forced float32 on all users - Each precision level now uses native types throughout - Simpler, more maintainable codebase - Better performance through compile-time type optimization - Users get the precision they request without conversion overhead All tests passing with new architecture.

- Fix query methods to use Real type instead of hardcoded float - find_one() now accepts vec<Real> for proper float64 precision - find_all() now accepts py::array_t<Real> matching tree precision - Fix Python wrapper to preserve precision settings on first insert - Handle subnormal detection disabled case with workaround - Preserve relative_epsilon, absolute_epsilon, adaptive_epsilon settings - Remove obsolete query_exact and refine_candidates code - All precision tests now passing

This commit addresses remaining issues with precision handling: 1. **Auto-detect precision when loading from file**: When loading a tree from a saved file via `PRTree3D(filepath)`, the wrapper now automatically tries both float32 and float64 to determine which precision was used when the tree was saved. This fixes the `test_save_load_float32_no_regression` test failure where loading a float32 tree defaulted to float64 and caused std::bad_alloc. 2. **Improved error handling**: If both precision attempts fail when loading from file, provides an informative error message about potential file corruption. The fix ensures that precision is correctly preserved across save/load cycles without requiring users to manually specify the precision when loading. Fixes: - test_save_load_float32_no_regression now passes - test_save_load_float64_matteo_case continues to pass

atksh · 2025-11-09T08:25:29Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-11-09T08:30:30Z

include/prtree/core/prtree.h

              cereal::make_nvp("idx2bb", idx2bb),
              cereal::make_nvp("idx2data", idx2data),
              cereal::make_nvp("global_idx", global_idx),
-              cereal::make_nvp("n_at_build", n_at_build),
-              cereal::make_nvp("idx2exact", idx2exact));
+              cereal::make_nvp("n_at_build", n_at_build));


Drop idx2exact during serialization breaks float64 saved trees

The new save/load implementation no longer archives the former idx2exact map, and the associated refinement code has been removed. Trees saved by previous versions with float64 input relied on idx2exact to keep double‑precision coordinates alongside the float32 index. When those existing files are loaded after this change, the exact coordinates are silently discarded and all queries run at float32 precision only, so pairs separated by sub‑float32 gaps (≈1e‑7) that previously stayed disjoint will now be reported as intersecting. This is a regression for anyone restoring persisted trees. Please keep backward‑compatible serialization or migrate the stored map into whatever structure now preserves precision.

Useful? React with 👍 / 👎.

This commit updates all project documentation to reflect the v0.7.0 architectural changes: 1. **README.md**: - Updated precision section to describe native float32/float64 support - Added documentation for new precision control methods - Documented auto-detection behavior for both construction and loading - Updated version history with v0.7.0 changes 2. **CHANGES.md**: - Added comprehensive v0.7.0 release notes - Documented native precision architecture changes - Listed all new precision control methods - Described bug fixes related to precision handling - Updated test count (991/991 tests passing) 3. **docs/ARCHITECTURE.md**: - Updated PRTree template signature to include Real parameter - Documented 6 exposed C++ classes (float32/float64 variants) - Updated data flow diagrams for precision selection - Added design decision section for native precision support - Explained trade-offs and benefits of new architecture The documentation now accurately reflects: - Template signature: PRTree<T, B, D, Real> - Automatic precision selection based on numpy dtype - Auto-detection when loading from files - Precision settings preservation across operations - Elimination of idx2exact refinement approach

The project uses C++20 features (concepts, likely/unlikely attributes, etc.) as noted in CHANGES.md. Updated prerequisite documentation to reflect this: - CONTRIBUTING.md: Updated compiler requirements to C++20 (GCC 10+, Clang 10+, MSVC 2019+) - docs/DEVELOPMENT.md: Updated compiler requirements to C++20 This ensures developers are aware of the correct C++ standard required for building the project.

Release v0.7.1 with native precision support and architectural improvements: - Native float32/float64 precision throughout the entire stack - Eliminated idx2exact complexity for simpler code - Auto-detection for precision based on dtype and file loading - Advanced precision control (adaptive epsilon, configurable epsilon) - Fixed precision validation gaps and query precision issues Changes: - pyproject.toml: 0.7.0 → 0.7.1 - __init__.py: 0.7.0 → 0.7.1 - CHANGES.md: Updated release date to 2025-11-09 - README.md: Updated version history to v0.7.1

claude added 5 commits November 8, 2025 05:28

chatgpt-codex-connector bot reviewed Nov 9, 2025

View reviewed changes

claude added 3 commits November 9, 2025 08:35

atksh merged commit 2c6288d into main Nov 9, 2025
10 checks passed

atksh deleted the claude/fix-precision-validation-gaps-011CUurVca3RSrjrzhZh4M1D branch November 9, 2025 12:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix precision validation gaps and enhance insert() capabilities #55

Fix precision validation gaps and enhance insert() capabilities #55

Uh oh!

atksh commented Nov 8, 2025

Uh oh!

atksh commented Nov 9, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Nov 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fix precision validation gaps and enhance insert() capabilities #55

Fix precision validation gaps and enhance insert() capabilities #55

Uh oh!

Conversation

atksh commented Nov 8, 2025

1. Input Validation

2. Float64 Insert Support

3. Precision Testing

Technical Notes

Uh oh!

atksh commented Nov 9, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Nov 9, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants