CentOS, Python, pdf2image: "May not be a PDF file" error

Posted by tcbh2hod on 2022-11-07 in Python

On CentOS 8, I get an error when converting PDF pages to JPG files with Python.

from pdf2image import convert_from_path

# Render every page at 500 DPI, then save each one as a JPEG
images = convert_from_path("test.pdf", 500)
for i, image in enumerate(images):
    image.save('page' + str(i) + '.jpg', 'JPEG')

It fails with the error below. I can open the PDF file on my local machine, but converting it to JPG does not work.

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 479, in pdfinfo_from_path
    raise ValueError
ValueError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pdf_conv.py", line 7, in <module>
    images = convert_from_path(pdf_path,500)
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 98, in convert_from_path
    page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 489, in pdfinfo_from_path
    "Unable to get page count.\n%s" % err.decode("utf8", "ignore")
pdf2image.exceptions.PDFPageCountError: Unable to get page count.
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table

xkftehaa #1

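PDF != PDF: the format comes in different versions. Perhaps your Python pdf2image does not like, or does not recognize, the *kind* of PDF you are feeding it. Use Acrobat Reader or something similar to inspect what you are trying to convert, and check whether pdf2image supports it.
See "Which ISO standards does pdf2image support" (in short: pdf2image supports all the PDF standards that poppler supports).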
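
Before comparing PDF versions, it may be worth checking that the file on the CentOS machine is an intact PDF at all. The poppler messages in the traceback ("May not be a PDF file", "Couldn't find trailer dictionary") typically show up when the `%PDF-` header or the trailing `%%EOF` marker is missing. The sketch below is only a heuristic: the helper name `looks_like_pdf` and the 1024-byte search window are assumptions made for illustration, not part of pdf2image or poppler.

```python
from pathlib import Path


def looks_like_pdf(path, window=1024):
    """Heuristic check (illustrative helper, not a pdf2image API).

    Returns a pair (has_header, has_eof):
      has_header: b"%PDF-" appears within the first `window` bytes
      has_eof:    b"%%EOF" appears within the last `window` bytes
    """
    data = Path(path).read_bytes()
    has_header = b"%PDF-" in data[:window]
    has_eof = b"%%EOF" in data[-window:]
    return has_header, has_eof


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py test.pdf [more.pdf ...]
    for name in sys.argv[1:]:
        header, eof = looks_like_pdf(name)
        print(name, "header found:", header, "EOF marker found:", eof)
```

If the header is missing, the file is probably not a PDF (for example, an HTML error page saved under a `.pdf` name). If the header is present but the EOF marker is not, the file may have been truncated or altered while being copied to the server, and copying it again in binary mode is a reasonable next step. If both markers are present, the version mismatch described above becomes the more likely explanation.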
