Remove a paragraph from a PDF - Java

yfjy0ee7  posted on 2023-01-16  in  Java

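Greetings. What is the simplest way to remove text or a paragraph from a PDF document? The program takes a PDF document and creates a separate PDF document for each page. Each resulting document contains text from the original that I want to remove. I tried several examples, but they either do not work or use old libraries. I am using the iText 7 library.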

private void processPDF(String src, String dest) throws IOException, DocumentException {

    PdfReader reader = new PdfReader(src);
    PdfDictionary dict = reader.getPageN(1);
    PdfObject object = dict.getDirectObject(PdfName.CONTENTS);

    if (object instanceof PRStream) {
        PRStream stream = (PRStream) object;
        byte[] data = PdfReader.getStreamBytes(stream);
        String dd = new String(data, "UTF8")
                .replace("Hand made software", "");
        stream.setData(dd.getBytes("UTF8"));
        if (dd.contains("Hand made software")) {
            System.out.println("Contains");
        } 
    }

    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    stamper.close();
    reader.close();

}

private void processPDF2(String src, String dest) throws InvalidPasswordException, IOException {
    Map<String, String> map = new HashMap<>();
    map.put("Hand made software", "");
    File template = new File(src);
    PDDocument document = PDDocument.load(template);
    List<PDField> fields = document.getDocumentCatalog().getAcroForm().getFields();
    for (PDField field : fields) {
        for (Map.Entry<String, String> entry : map.entrySet()) {
            if (entry.getKey().equals(field.getFullyQualifiedName())) {
                field.setValue(entry.getValue());
                field.setReadOnly(true);
            }
        }
    }
    File out = new File(dest);
    document.save(out);
    document.close();
}

I want to remove the "Hand made software" line.

0md85ypi  #1

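You can easily iterate over the PDF elements. First create a PDF reader and a PDF writer; the reader reads the template located at the path given by the src string.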

File template = new File(src);
PdfReader reader = new PdfReader(template);

File out = new File(dest);
PdfWriter writer = new PdfWriter(out);

Then create the Document object, first creating a PdfDocument:

PdfDocument pdf = new PdfDocument(reader, writer);
Document document = new Document(pdf);

Next, iterate over the elements of the PDF document until you find the "Hand made software" line:

for (int i = 0; i < document.getRoots().size(); i++) {
     if (document.getRoots().get(i) instanceof Paragraph) { //iterate only through paragraphs
         Paragraph paragraph = (Paragraph) document.getRoots().get(i);
         if(paragraph.getText().equals("Hand made software")){ //if the paragraph equals to the string to be removed, remove from the document
             document.getRoots().remove(i);
             i--;
         }
     }
}
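
The index handling in the loop above (decrementing i after a removal, so the element that shifts into position i is not skipped) can be illustrated with plain Java collections. This is only a minimal sketch: it uses a List<String> as a stand-in for the document's elements, the class and method names are hypothetical, and no iText API is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: a plain list of strings stands in for the PDF's paragraphs.
public class RemoveMatchingLines {

    // Removes every element equal to target, in place, using an index loop.
    static void removeAll(List<String> lines, String target) {
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).equals(target)) {
                lines.remove(i); // remove(int index): later elements shift left by one
                i--;             // re-check position i, which now holds the next element
            }
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>(Arrays.asList(
                "Invoice 42", "Hand made software", "Hand made software", "Total: 10"));
        removeAll(lines, "Hand made software");
        System.out.println(lines); // prints [Invoice 42, Total: 10]
    }
}
```

Without the i-- step, the second of two adjacent matches would be skipped, because it moves into the index that the loop has just passed.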

Finally, close the document to save the changes:

document.close();
