Read the content of a PDF with PHP?

Posted by 0g0grzrc on 2023-09-29 in PHP


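I need to read certain parts of a complex PDF. I searched the web, and some people say FPDF is good, but it cannot read PDFs, only write them. Is there a library that lets you get certain content out of a given PDF?
If not, what is a good way to read certain parts of a given PDF?
Thank you.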

php

Source: https://stackoverflow.com/questions/8835833/read-the-content-of-a-pdf-with-php

4 Answers


zzwlnbp81#

I see two solutions here:

Convert your PDF file to another format first: text or HTML.
Use a library to do it. The bad news is that most of them are written in Java.

https://whatisprymas.wordpress.com/2010/04/28/lucene-how-to-index-pdf-files/ (archived version from 2012)
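
The first option, converting the PDF to text before reading it, might look roughly like the sketch below. It assumes the `pdftotext` command-line tool from Poppler is installed and on the PATH; that tool is not named in this answer, and the function names `buildPdfToTextCommand` and `pdfToText` are hypothetical.

```php
<?php
// Sketch: convert a PDF to plain text with the external `pdftotext` tool
// (part of Poppler; assumed to be installed, not mentioned in the answer).

// Build the shell command. The trailing "-" tells pdftotext to write to stdout,
// and -layout asks it to keep the original physical layout as far as possible.
function buildPdfToTextCommand(string $pdfPath): string
{
    return 'pdftotext -layout ' . escapeshellarg($pdfPath) . ' -';
}

// Run the command and return the extracted text, or null if the tool failed.
function pdfToText(string $pdfPath): ?string
{
    $output = [];
    $exitCode = 0;
    exec(buildPdfToTextCommand($pdfPath), $output, $exitCode);
    return $exitCode === 0 ? implode("\n", $output) : null;
}
```

Calling `pdfToText('sample.pdf')` would then give a plain string that can be searched with the usual PHP string or regex functions. `escapeshellarg()` is used so that file names containing spaces or quotes do not break the command.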


jgwigjjp2#

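// pdf2text() is not built into PHP; it is defined in the source code linked below
$result = pdf2text('sample.pdf');
echo '<pre>' . htmlspecialchars($result) . '</pre>';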

How to get "clean" text: source code of pdf2text
http://webcheatsheet.com/php/reading_clean_text_from_pdf.php


pvcm50d13#

How about this one?
http://www.phpclasses.org/package/702-PHP-Searches-pdf-documents-for-text.html
PS: I have not tested this class; I have only read its description.


k5hmc34c4#

Nowadays there is also https://github.com/smalot/pdfparser:

require 'vendor/autoload.php'; // assumes smalot/pdfparser was installed with Composer

use Smalot\PdfParser\Parser;

$pdfParser = new Parser();
$pdf = $pdfParser->parseFile('../path/to/your.pdf');

// plain text of the whole document
$content = $pdf->getText();

// or if you need to maintain the paragraphs
$content = preg_replace('/\s{3,}/m', "\n\n", trim($pdf->getText()));
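
Since the question is about reading only certain parts of a PDF, here is a rough sketch of two ways to narrow the result down: taking the text of a single page, and cutting out the text between two markers. It assumes the library was installed with Composer. `textOfPage` and `extractBetween` are hypothetical helper names, not part of smalot/pdfparser, and the marker strings in the usage note are made up for illustration.

```php
<?php
// Hypothetical helpers built on top of smalot/pdfparser; not part of the library.

// Return the text of one page (1-based), or null if that page does not exist.
// Assumes the library was installed with Composer (vendor/autoload.php).
function textOfPage(string $pdfPath, int $pageNumber): ?string
{
    require_once 'vendor/autoload.php';
    $pdf = (new \Smalot\PdfParser\Parser())->parseFile($pdfPath);
    // Re-index so that page 1 is at index 0 regardless of the keys returned.
    $pages = array_values($pdf->getPages());
    return isset($pages[$pageNumber - 1]) ? $pages[$pageNumber - 1]->getText() : null;
}

// Return the trimmed text between two marker strings,
// or null if either marker is missing.
function extractBetween(string $text, string $startMarker, string $endMarker): ?string
{
    $start = strpos($text, $startMarker);
    if ($start === false) {
        return null;
    }
    $start += strlen($startMarker);
    $end = strpos($text, $endMarker, $start);
    if ($end === false) {
        return null;
    }
    return trim(substr($text, $start, $end - $start));
}
```

For example, `extractBetween($content, 'Total:', 'EUR')` would return whatever sits between those two strings in the extracted text. Text extracted from PDFs often contains unexpected whitespace or line breaks, so markers that work for one document may need adjusting for another.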

