Parsing and Displaying a PDF Fetched with Jsoup Using Android's PdfRenderer

Reposted by x33g5p2x on 2021-12-28 in Android

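Preface: An Android project I worked on recently needed to fetch a PDF from a web page and display it in an ImageView. Almost every solution I found online uses pdfbox together with ImageIO, but Android does not support ImageIO, so I used Android's built-in PdfRenderer instead.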

1. Getting the PDF stream

Which scraping tool to use is a matter of preference; I chose Jsoup:

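// cookies_innet and cookies are cookie maps defined elsewhere in the project
Connection connection = Jsoup.connect(url);
connection.header("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36");
Connection.Response response = connection.cookies(cookies_innet).cookies(cookies)
                .ignoreContentType(true)   // by default Jsoup rejects non-text content types such as application/pdf
                .followRedirects(true)
                .maxBodySize(0)            // 0 removes Jsoup's default body size cap, which could truncate a large PDF
                .method(Connection.Method.GET).execute();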

Here url is the PDF download link. The call returns a response, and we print its contentType:

System.out.println(response.contentType());

The result is:

application/pdf
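
The Content-Type header only tells us what the server claims to be sending. As an extra, cautious check, the first bytes of the body can be compared with the `%PDF-` signature that PDF files normally start with. The sketch below uses only the JDK; the class name `PdfSniffer` and its method names are hypothetical and not part of the original project.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class PdfSniffer {
    // Byte signature found at the start of a PDF file
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** True if the header value names application/pdf, ignoring case and parameters. */
    static boolean isPdfContentType(String contentType) {
        if (contentType == null) return false;
        // Drop parameters such as "; charset=binary" before comparing
        String type = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return type.equals("application/pdf");
    }

    /** True if the data begins with the "%PDF-" signature. */
    static boolean hasPdfSignature(byte[] data) {
        if (data == null || data.length < MAGIC.length) return false;
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] head = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(isPdfContentType("application/pdf; charset=binary"));
        System.out.println(hasPdfSignature(head));
    }
}
```

With Jsoup, the bytes to check could come from `response.bodyAsBytes()`, used in place of `bodyStream()` since the body can only be consumed one way.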

2. Saving the PDF as PNG files with pdfbox

First, here is how to save the PDF stream as JPG or PNG files with pdfbox:

<dependencies>
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>2.0.4</version>
    </dependency>
</dependencies>

First get the response's bodyStream():

BufferedInputStream in = response.bodyStream();

Then render the pages and write them to files:

PDDocument doc = PDDocument.load(in);
PDFRenderer renderer = new PDFRenderer(doc);
int pageCount = doc.getNumberOfPages();
for (int i = 0; i < pageCount; i++) {
    BufferedImage image = renderer.renderImageWithDPI(i, 500); // render page i at 500 DPI
    // BufferedImage image = renderer.renderImage(i, 1.0f);
    // Write one file per page; a fixed file name would be overwritten by every page
    ImageIO.write(image, "PNG", new File("page-" + i + ".png"));
}
doc.close();
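
PDF page sizes are measured in points (1/72 inch), so rendering at a given DPI scales each side by `dpi / 72`. The sketch below estimates the resulting pixel size and the memory needed at 4 bytes per pixel. It assumes an A4 page of roughly 595 x 842 points and simple floor rounding, which may differ slightly from what pdfbox does internally; `DpiEstimate` and its methods are illustrative names.

```java
final class DpiEstimate {
    /** Pixel length of a side given in points when rendered at the given DPI. */
    static int pixels(double points, double dpi) {
        return (int) Math.floor(points * dpi / 72.0);
    }

    /** Bytes needed for an uncompressed image at 4 bytes per pixel. */
    static long argbBytes(int widthPx, int heightPx) {
        return 4L * widthPx * heightPx;
    }

    public static void main(String[] args) {
        int w = pixels(595, 500); // assumed A4 width in points
        int h = pixels(842, 500); // assumed A4 height in points
        long mib = argbBytes(w, h) / (1024 * 1024);
        System.out.println(w + " x " + h + " px, about " + mib + " MiB");
    }
}
```

Under these assumptions a single A4 page at 500 DPI takes on the order of 90 MiB uncompressed, so a lower DPI is often enough when the image is only shown on screen.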

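Unfortunately, ImageIO cannot be used on Android (the javax.imageio and java.awt packages are not part of the Android SDK), so we have to fall back on Android's own PDF package, available since API level 21:
      android.graphics.pdf.PdfRenderer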

3. Android's built-in PdfRenderer

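First, save the PDF stream to a PDF file. (I was not sure whether this step is required; my reasoning was that creating the ParcelFileDescriptor below needs a PDF file, so I saved one first. PdfRenderer does require a seekable file descriptor, so a network stream cannot be passed to it directly, and saving to a file is a simple way to get one.)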

// Despite its name, this method saves the PDF stream to score.pdf on external storage
static void saveImage(BufferedInputStream in) throws IOException {
    File file = new File(Environment.getExternalStorageDirectory() + "/score.pdf");
    System.out.println(file.getPath());
    int len; // number of bytes read in the current pass
    byte[] bytes = new byte[1024];
    FileOutputStream out = new FileOutputStream(file);
    while ((len = in.read(bytes)) != -1) {
        out.write(bytes, 0, len);
    }
    out.flush(); // flushing once after the loop is enough
    in.close();
    out.close();
}
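
The same copy loop can also be written with try-with-resources, so that both streams are closed even if an exception is thrown partway through. This is a minimal JDK-only sketch: `StreamSaver.save` is a hypothetical name, and the destination file is passed in rather than hard-coded.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamSaver {
    /** Copies the whole stream into target, closes both, and returns the byte count. */
    static long save(InputStream in, File target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        // Both resources are closed automatically, in reverse order of declaration
        try (InputStream source = in;
             OutputStream out = new FileOutputStream(target)) {
            int read;
            while ((read = source.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("score", ".pdf");
        tmp.deleteOnExit();
        long n = save(new ByteArrayInputStream(new byte[]{1, 2, 3}), tmp);
        System.out.println(n + " bytes written to " + tmp);
    }
}
```

On Android, pointing the target at app-private storage, for example `new File(context.getCacheDir(), "score.pdf")`, would likely avoid the need for an external storage permission.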

Next, create a PdfRenderer object from that PDF file:

File file = new File(Environment.getExternalStorageDirectory() +"/score.pdf");
ParcelFileDescriptor pdfFile = ParcelFileDescriptor.open(file, ParcelFileDescriptor.MODE_READ_ONLY);
PdfRenderer renderer = new PdfRenderer(pdfFile);

Finally, render each page to a Bitmap and display it in an ImageView:

final int pageCount = renderer.getPageCount(); // number of pages in the PDF
Bitmap[] bitmaps = new Bitmap[pageCount]; // one bitmap per PDF page
Display display = getWindowManager().getDefaultDisplay();
Point outSize = new Point();
display.getSize(outSize); // required: fills outSize with the screen size
int screenWidth = outSize.x; // screen width in pixels
int screenHeight = outSize.y; // screen height in pixels
for (int i = 0; i < pageCount; i++) {
    PdfRenderer.Page page = renderer.openPage(i); // open page i
    // Bitmap height is set to the screen width; the width is scaled to keep the page's aspect ratio
    Bitmap bitmap = Bitmap.createBitmap(page.getWidth() * screenWidth / page.getHeight(), screenWidth, Bitmap.Config.ARGB_8888);
// bitmap = adjustPhotoRotation(bitmap, 90);
    page.render(bitmap, null, null, PdfRenderer.Page.RENDER_MODE_FOR_DISPLAY);
    bitmaps[i] = bitmap;
    page.close(); // only one page may be open at a time
}
renderer.close(); // also closes the underlying file descriptor
imageView.setImageBitmap(bitmaps[0]); // show the first page
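
The bitmap size used in the loop is a proportional scaling: one side is fixed and the other follows from the page's aspect ratio. The arithmetic can be sketched in plain Java. `PageScaler` is an illustrative name, and the 595 x 842 page size (A4 in points) and the 1080-pixel target are assumed example values.

```java
final class PageScaler {
    /** Width that keeps the page's aspect ratio when the height is fixed. */
    static int widthForHeight(int pageWidth, int pageHeight, int targetHeight) {
        return pageWidth * targetHeight / pageHeight;
    }

    /** Height that keeps the page's aspect ratio when the width is fixed. */
    static int heightForWidth(int pageWidth, int pageHeight, int targetWidth) {
        return pageHeight * targetWidth / pageWidth;
    }

    public static void main(String[] args) {
        // Assumed values: A4 page in points, 1080-pixel target side
        System.out.println(widthForHeight(595, 842, 1080));
        System.out.println(heightForWidth(595, 842, 1080));
    }
}
```

The original loop corresponds to the first form, with the screen width used as the bitmap height. For a portrait layout that fills the screen width, the second form is probably the more natural choice.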
