Releases: pymupdf/PyMuPDF
Minor bug fixes and enhancements
Fixes: #1505, #1484, #1479, #1474.
Changes:
- Full support of PDF page rectangles like /ArtBox etc.
- New global variable TESSDATA_PREFIX for comfortably checking presence of OCR support
- Changed Document.xref_set_key() such that dictionary keys will physically be removed if set to value "null".
- Changed Document.extract_font() to optionally return a dictionary (instead of a tuple).
New features for class Pixmap and several fixes
Fixes:
#1351, #1417, #1418, #1430, #1433
- New or changed Pixmap methods color_topusage(), color_count(), warp(). Some of them solve #1397.
- New Annot method and property irt_xref, set_irt_xref(). Implements #1450.
- New Rect / IRect method torect() which creates a matrix to transform between given rectangles.
- Page.get_texttrace() now also supports non-horizontal text.
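To illustrate the kind of matrix that torect() is described as creating, here is a minimal pure-Python sketch. It does not call PyMuPDF; the names Box, rect_to_rect and apply are hypothetical and chosen for this example only. It assumes the PDF matrix convention (a, b, c, d, e, f) and a transformation made of scaling plus translation.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Hypothetical stand-in for a rectangle: top-left (x0, y0), bottom-right (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float


# PDF-style matrix (a, b, c, d, e, f): a point (x, y) maps to
# (a*x + c*y + e, b*x + d*y + f).
Matrix = Tuple[float, float, float, float, float, float]


def rect_to_rect(src: Box, dst: Box) -> Matrix:
    """Return the scale-and-shift matrix that maps src onto dst."""
    src_w = src.x1 - src.x0
    src_h = src.y1 - src.y0
    if src_w == 0 or src_h == 0:
        raise ValueError("source rectangle must have non-zero width and height")
    sx = (dst.x1 - dst.x0) / src_w  # horizontal scale factor
    sy = (dst.y1 - dst.y0) / src_h  # vertical scale factor
    # Translation is chosen so that src's top-left lands on dst's top-left.
    return (sx, 0.0, 0.0, sy, dst.x0 - src.x0 * sx, dst.y0 - src.y0 * sy)


def apply(m: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Transform the point (x, y) with matrix m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


src = Box(0, 0, 100, 200)
dst = Box(50, 50, 100, 150)
m = rect_to_rect(src, dst)
print(apply(m, src.x0, src.y0))  # top-left corner of dst
print(apply(m, src.x1, src.y1))  # bottom-right corner of dst
```

A typical use of such a matrix is to map coordinates measured on one rectangle (for example a page area) onto another (for example an image area of different size).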
Improvements for drawings extraction and bug fixes
Important improvements for OCR support
OCR of a document page has been improved a lot compared to v1.19.0.
Text extractions now also come with an integrated sort.
Fixes: #1328
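The release note does not say which ordering the integrated sort uses. As a rough sketch of what sorting extracted text can mean, the following pure-Python example orders text blocks top-to-bottom and then left-to-right. The block layout (x0, y0, x1, y1, text) and the function name sort_blocks are assumptions made for this illustration, not the library's implementation.

```python
from typing import List, Tuple

# Assumed block layout for this sketch: (x0, y0, x1, y1, text)
Block = Tuple[float, float, float, float, str]


def sort_blocks(blocks: List[Block]) -> List[Block]:
    """Order blocks by their top edge first, then by their left edge."""
    return sorted(blocks, key=lambda b: (b[1], b[0]))


blocks = [
    (300, 100, 500, 120, "right column, first row"),
    (50, 400, 250, 420, "footer"),
    (50, 100, 250, 120, "left column, first row"),
]
ordered = [b[4] for b in sort_blocks(blocks)]
print(ordered)
```

Without such a sort, text usually comes back in the order in which it was stored in the file, which need not match reading order.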
First version to support MuPDF v1.19.*
Introduces major new features like PDF journalling and OCR support by directly invoking Tesseract-OCR.
In addition, it is possible to detect whether objects are covered (hidden) by other objects.
As part of the new version, the following issues have been resolved:
#1313, #1311, #1290, #1286, #1287, #1284.
Hotfix
Implement various fixes
Performance improvement for drawings extraction
Improved test scripts: `show_pdf_page` and `insert_image` are now tested with rotated insertions.
Layout Preserving Text Extraction
Support of Small Capitals, assigning subset font name tags
Apart from some minor fixes, this release introduces support for small caps in TextWriter based text output.
In addition, method Document.subset_fonts() now prefixes subsetted font names with the six-letter uppercase tag prescribed by the PDF standard.
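For readers unfamiliar with the naming rule, the PDF standard writes a subset font name as a tag of six uppercase letters, a plus sign, and the original font name. The sketch below shows that format in plain Python. How Document.subset_fonts() actually chooses its tag is not stated here, so the random choice and the helper names subset_tag and tagged_name are assumptions for illustration only.

```python
import random
import string


def subset_tag(rng: random.Random) -> str:
    """Return six random uppercase ASCII letters (an assumed way to pick a tag)."""
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(6))


def tagged_name(font_name: str, tag: str) -> str:
    """Build a subset font name of the form TAG+FontName."""
    if len(tag) != 6 or not (tag.isascii() and tag.isalpha() and tag.isupper()):
        raise ValueError("tag must be exactly six uppercase ASCII letters")
    return f"{tag}+{font_name}"


name = tagged_name("Helvetica", "ABCDEF")
print(name)

random_name = tagged_name("Helvetica", subset_tag(random.Random()))
print(random_name)
```

The tag lets a PDF consumer recognise that the embedded font contains only a subset of the glyphs of the named font.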