ENH: Add engine='polars' support in read_csv #61989
Closed
🚀 Enhancement: Add `engine='polars'` Support in `read_csv`
🔧 Summary of Changes
This PR introduces support for using [Polars](https://pola-rs.github.io/polars/py-polars/html/reference/api/pl.read_csv.html) as a backend CSV parsing engine in `pandas.read_csv`, providing faster parsing capabilities for large files.
The following changes are included:
✅ Added support for `engine="polars"` in `pandas.read_csv`
✅ Dynamically imported Polars and handled `ImportError` gracefully
✅ Filtered `read_csv()` kwargs to only allow those compatible with Polars
✅ Converted `Path` input to string (Polars does not accept path-like objects in all versions)
✅ Added test case `test_read_csv_with_polars` under `tests/io/parser`
✅ Updated version to `2.3.3.dev0` in `__init__.py` and `pyproject.toml` (as part of the development build)
✅ Resolved all `ruff` linter errors and pre-commit hook failures (e.g., B904, E501, F841, SC1017)
✅ Formatted shell scripts using `dos2unix` to fix line-ending issues across:
- `ci/code_checks.sh`
- `ci/run_tests.sh`
- `scripts/cibw_before_build.sh`
- `scripts/download_wheels.sh`
- `scripts/upload_wheels.sh`
- `gitpod/workspace_config`
📆 Usage Example
✅ Expected Output:
💡 Why This Matters
Polars is a high-performance DataFrame library designed for speed and multi-threaded performance. Adding it as a supported backend:
- […] `c`, `python`, or `polars`)
✅ Tests & Quality Checks
- […] `test_read_csv_with_polars`
- […] `ruff`, `shellcheck`, `cython-lint`, `codespell`, etc.
- […] `dos2unix` for consistent CI/CD compatibility
🧠 Notes
- `polars` is treated as an optional dependency
- “Polars is not installed. Please install it with 'pip install polars'.”
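For reviewers who want a concrete picture of the optional-dependency handling, kwarg filtering, and `Path` conversion described above, here is a minimal standalone sketch. It is not the code from this PR: the helper names (`import_polars`, `translate_kwargs`, `normalize_source`, `read_csv_via_polars`) and the `PANDAS_TO_POLARS` mapping are hypothetical, and the two keywords in the mapping are only an assumed example of what "compatible" could mean. The error message is the one quoted in the notes.

```python
from pathlib import Path

# Hypothetical allow-list mapping pandas keyword names to Polars ones.
# The PR does not state which keywords it forwards; these two are an assumption.
PANDAS_TO_POLARS = {
    "sep": "separator",
    "nrows": "n_rows",
}


def import_polars():
    """Import Polars lazily so it stays an optional dependency."""
    try:
        import polars as pl
    except ImportError as err:
        # Chaining with "from err" keeps the original traceback (ruff B904).
        raise ImportError(
            "Polars is not installed. Please install it with 'pip install polars'."
        ) from err
    return pl


def translate_kwargs(kwargs):
    """Keep only allow-listed keywords and rename them for Polars."""
    return {
        PANDAS_TO_POLARS[name]: value
        for name, value in kwargs.items()
        if name in PANDAS_TO_POLARS
    }


def normalize_source(source):
    """Convert Path objects to str; leave other inputs untouched."""
    return str(source) if isinstance(source, Path) else source


def read_csv_via_polars(source, **kwargs):
    """Parse a CSV with Polars and return a pandas DataFrame."""
    pl = import_polars()
    frame = pl.read_csv(normalize_source(source), **translate_kwargs(kwargs))
    # to_pandas() needs pandas and pyarrow available at runtime.
    return frame.to_pandas()
```

With these helpers, `read_csv_via_polars(Path("data.csv"), sep=";", memory_map=True)` would forward only `separator=";"` to Polars and silently drop `memory_map`. Whether unsupported keywords should be dropped or should raise is a design choice the PR description leaves open.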
🙌 Acknowledgements
Thanks to the maintainers for reviewing this contribution!
Looking forward to feedback or further improvements.