Inline images extraction #1231
There are some small pictures in my PDFs, and I am not able to extract them with page.getImageList() (it returns an empty list). Can you please recommend what I can do to extract those images, maybe with another library? I need those images very much.
Replies: 2 comments 11 replies
If this happens, you …
Drawings can be extracted via … Inline images are only contained in the internal page command source (the …



page.get_text("dict")["blocks"] is a list of blocks on the page. Each one is either a text block or an image block; see the documentation for TextPage. Image blocks have block["type"] = 1. The image binary is contained in block["image"]. More information is contained in the other dict keys.

The list of drawing dicts returned by page.get_drawings() can be used to redraw each one on some other page; see the documentation.

Each "path" dict therein has a path["rect"], which is the rectangle containing all the elementary draws in it.

You could also do a page.get_pixmap(..., clip=path["rect"]) to create an image of the path. Of course, there is a risk that other things (not belonging to the path) are also part of that …
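
The two approaches described above could be combined roughly as follows. This is only a sketch, assuming a PyMuPDF version that provides the snake_case method names used in this reply. The helper names (image_blocks, save_page_images, save_drawing_clips, extract_all) and the output file naming are made up for illustration and are not part of PyMuPDF.

```python
from pathlib import Path


def image_blocks(page_dict):
    """Return the image blocks (type 1) of a page.get_text("dict") result."""
    return [block for block in page_dict["blocks"] if block["type"] == 1]


def save_page_images(page, out_dir, prefix="img"):
    """Write each image block of `page` to `out_dir` and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, block in enumerate(image_blocks(page.get_text("dict"))):
        # block["image"] holds the image bytes, block["ext"] the format (e.g. "png").
        target = out_dir / f"{prefix}-{i}.{block['ext']}"
        target.write_bytes(block["image"])
        paths.append(target)
    return paths


def save_drawing_clips(page, out_dir, prefix="drawing"):
    """Render the bounding rectangle of each drawing path on `page` to a PNG."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, path in enumerate(page.get_drawings()):
        rect = path["rect"]
        if rect.is_empty:
            # Straight horizontal or vertical lines have no area to render.
            continue
        # Anything else lying inside the rectangle is rendered as well.
        pix = page.get_pixmap(clip=rect)
        target = out_dir / f"{prefix}-{i}.png"
        pix.save(str(target))
        paths.append(target)
    return paths


def extract_all(pdf_path, out_dir):
    """Hypothetical driver: run both helpers on every page of a PDF."""
    import fitz  # PyMuPDF; imported here so the helpers above need no import

    with fitz.open(pdf_path) as doc:
        for page in doc:
            save_page_images(page, out_dir, prefix=f"p{page.number}-img")
            save_drawing_clips(page, out_dir, prefix=f"p{page.number}-drawing")
```

Calling extract_all("input.pdf", "out") (both names are placeholders) would write one file per image block and one PNG per drawing path with a non-empty rectangle. The clipped PNGs may contain unrelated page content that overlaps the rectangle, as noted above.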