Python: TypeError: expected str, bytes or os.PathLike object, not PdfFileReader

z3yyvxxp posted on 2023-01-19 in Python

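I have the code below. It is only a starting point. Later I want to read items from a CSV file and loop over each one to replace the static "Hello world" text. I want the watermark on every page.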

# importing the required modules
import PyPDF2
import io
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

def add_watermark(wmFile, pageObj):
    # opening watermark pdf file
    wmFileObj = open(wmFile, 'rb')

    # creating pdf reader object of watermark pdf file
    pdfReader = PyPDF2.PdfFileReader(wmFileObj)

    # merging watermark pdf's first page with passed page object.
    pageObj.mergePage(pdfReader.getPage(0))

    # closing the watermark pdf file object
    wmFileObj.close()

    # returning watermarked page object
    return pageObj

def main():
    import PyPDF2
    import io
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import letter
    # watermark pdf file name
    packet = io.BytesIO()
    # Create a new PDF with Reportlab
    can = canvas.Canvas(packet, pagesize=letter)
    can.setFont('Helvetica-Bold',18)
    can.drawString(10, 100, "Hello world")
    can.showPage()
    can.save()

    # Move to the beginning of the StringIO buffer
    packet.seek(0)
    mywatermark = PyPDF2.PdfFileReader(packet)

    # original pdf file name
    origFileName = 'Module1.pdf'

    # new pdf file name
    newFileName = 'watermarked_example.pdf'

    # creating pdf File object of original pdf
    pdfFileObj = open(origFileName, 'rb')

    # creating a pdf Reader object
    pdfReader = PyPDF2.PdfFileReader(pdfFileObj)

    # creating a pdf writer object for new pdf
    pdfWriter = PyPDF2.PdfFileWriter()

    # adding watermark to each page
    for page in range(pdfReader.numPages):
        # creating watermarked page object
        wmpageObj = add_watermark(mywatermark, pdfReader.getPage(page))

        # adding watermarked page object to pdf writer
        pdfWriter.addPage(wmpageObj)

    # new pdf file object
    newFile = open(newFileName, 'wb')

    # writing watermarked pages to new file
    pdfWriter.write(newFile)

    # closing the original pdf file object
    pdfFileObj.close()
    # closing the new pdf file object
    newFile.close()

if __name__ == "__main__":
    main()

The error I get is:

Traceback (most recent call last):
  File "watermark.py", line 101, in <module>
    main()
  File "watermark.py", line 83, in main
    wmpageObj = add_watermark(mywatermark, pdfReader.getPage(page))
  File "watermark.py", line 32, in add_watermark
    wmFileObj = open(wmFile, 'rb')
TypeError: expected str, bytes or os.PathLike object, not PdfFileReader

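I think I understand the message: open() expects a string, bytes, or a path-like object, and I am passing it none of those, just a PdfFileReader "object".
I tried several things, but whatever I tried actually made things worse.
Can anyone help? I am fairly sure it is something small, because I am good at overlooking the obvious.
Any help is appreciated.
Thanks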


avwztpqn1#

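I will leave the guidelines and flaws for last. Here is how you fix this code:
1) Set the variable "packet" to the name of an existing PDF file in the directory the script is in: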

packet = 'my_watermark.pdf'

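2) Delete the seek to the beginning of the buffer and the reader built from it. They are no longer needed, because "packet" is now a file name rather than a buffer: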

packet.seek(0)     # delete this
mywatermark = PyPDF2.PdfFileReader(packet) #delete this too

3) In the for loop, pass "packet" as the argument instead of "mywatermark":

wmpageObj = add_watermark(packet, pdfReader.getPage(page))

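4) In the add_watermark function, remove the file open and close calls and keep only the construction of the PdfFileReader instance, but give it the parameter "wmFile":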

wmFileObj = open(wmFile, 'rb')                # delete this
pdfReader = PyPDF2.PdfFileReader(wmFile)      # let this be, but change wmFileObj to wmFile
pageObj.mergePage(pdfReader.getPage(0))       # let this be
wmFileObj.close()                             # delete this
return pageObj                                # let this be
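
The root cause can be reproduced with the standard library alone. The sketch below is only an illustration, not part of the original script: open() accepts a path (str, bytes or os.PathLike) or an integer file descriptor, so handing it any other object raises the same kind of TypeError as in the traceback. FakeReader, try_open and is_path_like are hypothetical names made up for this example; FakeReader merely stands in for an already-constructed PdfFileReader.

```python
import io
import os


class FakeReader:
    """Hypothetical stand-in for an already-constructed PdfFileReader."""


def try_open(source):
    """Return the TypeError message open() raises for `source`, or None on success."""
    try:
        with open(source, "rb"):
            return None
    except TypeError as exc:
        return str(exc)


def is_path_like(source):
    """True if `source` is a str, bytes or os.PathLike path (file descriptors are ignored here)."""
    return isinstance(source, (str, bytes, os.PathLike))


# open() rejects the reader object, just as it rejected PdfFileReader in the question.
message = try_open(FakeReader())
print(message)

# A file name is path-like; an in-memory buffer or a reader object is not.
print(is_path_like("my_watermark.pdf"), is_path_like(io.BytesIO()), is_path_like(FakeReader()))
```

This is why step 4 drops open(): PdfFileReader itself accepts either a file name or a file-like object, so the extra open() call only got in the way.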

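Also, your main function contains some imports; move them to the top of the file, and read some documentation. PyPDF2's documentation shows how to merge pages (that is the module's signature feature, tbh), although it is a bit terse. ReportLab's user guide, on the other hand, is very thorough yet straightforward. Always try to see the meaning behind the code.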
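
If you would rather keep the in-memory ReportLab watermark from the question than depend on a file on disk, one possible alternative is to pass the already-parsed watermark page into add_watermark and skip open() altogether. This is a rough sketch, assuming the PyPDF2 method names used in the question (mergePage, getPage); _StubPage is a hypothetical stand-in, not a PyPDF2 class, and exists only so the snippet runs without a real PDF.

```python
def add_watermark(watermark_page, page):
    """Merge an already-loaded watermark page onto `page` and return `page`.

    Assumes `page` behaves like the page objects in the question's code,
    i.e. it has a mergePage() method.
    """
    page.mergePage(watermark_page)
    return page


class _StubPage:
    """Hypothetical stand-in for a PDF page; records what was merged into it."""

    def __init__(self, name):
        self.name = name
        self.merged = []

    def mergePage(self, other):
        self.merged.append(other.name)


# One watermark page is reused for every page of the document.
watermark = _StubPage("watermark")
pages = [_StubPage("page-%d" % i) for i in range(3)]
stamped = [add_watermark(watermark, p) for p in pages]
print([p.merged for p in stamped])
```

In the question's main(), the call inside the loop would then become add_watermark(mywatermark.getPage(0), pdfReader.getPage(page)), with mywatermark still built from the BytesIO buffer after packet.seek(0). Newer PyPDF2 releases rename these classes and methods, so check the documentation for the version you have installed.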
