Using the AWS Textract QUERIES feature with Java / Spring Boot

1l5u6lss posted 11 months ago in Spring

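I'm working on a Spring Boot project that needs to use AWS Textract, and I want to use its Queries feature, but I can't figure out where the queries go.
I went through the AWS documentation and started from their Java SDK v2 sample code, but it doesn't show where to add a query such as "What is the customer's name?". Where in the request should I put the query for the document?
Here is the code I'm using: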

public static void analyzeDoc(TextractClient textractClient, String sourceDoc) {

        try {
            InputStream sourceStream = new FileInputStream(new File(sourceDoc));
            SdkBytes sourceBytes = SdkBytes.fromInputStream(sourceStream);

            // Get the input Document object as bytes
            Document myDoc = Document.builder()
                    .bytes(sourceBytes)
                    .build();

            List<FeatureType> featureTypes = new ArrayList<>();
            featureTypes.add(FeatureType.QUERIES);

            AnalyzeDocumentRequest analyzeDocumentRequest = AnalyzeDocumentRequest.builder()
                    .featureTypes(featureTypes)
                    .document(myDoc)
                    .build();

            AnalyzeDocumentResponse analyzeDocument = textractClient.analyzeDocument(analyzeDocumentRequest);
            List<Block> docInfo = analyzeDocument.blocks();
            Iterator<Block> blockIterator = docInfo.iterator();

            while(blockIterator.hasNext()) {
                Block block = blockIterator.next();
                System.out.println("The block type is " + block.blockType().toString());
            }

        } catch (TextractException | FileNotFoundException e) {

            System.err.println(e.getMessage());
            System.exit(1);
        }

    }


Answer #1 (3htmauhk)

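After spending a lot of time on this, I finally found a way to include the queries: build a QueriesConfig containing your Query objects and pass it to the request with queriesConfig(...), alongside FeatureType.QUERIES. Hopefully this helps anyone else who gets stuck like I did.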

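import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.textract.TextractClient;
import software.amazon.awssdk.services.textract.model.AnalyzeDocumentRequest;
import software.amazon.awssdk.services.textract.model.AnalyzeDocumentResponse;
import software.amazon.awssdk.services.textract.model.Block;
import software.amazon.awssdk.services.textract.model.BlockType;
import software.amazon.awssdk.services.textract.model.Document;
import software.amazon.awssdk.services.textract.model.FeatureType;
import software.amazon.awssdk.services.textract.model.QueriesConfig;
import software.amazon.awssdk.services.textract.model.Query;
import software.amazon.awssdk.services.textract.model.TextractException;

public static void analyzeDoc(TextractClient textractClient, String sourceDoc) {

    // try-with-resources closes the file stream even if the Textract call fails
    try (InputStream sourceStream = new FileInputStream(sourceDoc)) {
        SdkBytes sourceBytes = SdkBytes.fromInputStream(sourceStream);

        // Get the input Document object as bytes
        Document myDoc = Document.builder()
                .bytes(sourceBytes)
                .build();

        // QUERIES must be enabled as a feature type for queriesConfig to be used
        List<FeatureType> featureTypes = new ArrayList<>();
        featureTypes.add(FeatureType.QUERIES);

        // Build the query config with the question(s) to ask about the document,
        // e.g. "What is the customer's name?"
        QueriesConfig queryConfig = QueriesConfig.builder()
                .queries(Query.builder().text("YOUR QUERY GOES HERE").build())
                .build();

        AnalyzeDocumentRequest analyzeDocumentRequest = AnalyzeDocumentRequest.builder()
                .featureTypes(featureTypes)
                .queriesConfig(queryConfig)
                .document(myDoc)
                .build();

        AnalyzeDocumentResponse analyzeDocument = textractClient.analyzeDocument(analyzeDocumentRequest);

        for (Block block : analyzeDocument.blocks()) {
            System.out.println("The block type is " + block.blockType().toString());
            // QUERY_RESULT blocks carry the answer text for a query
            if (block.blockType() == BlockType.QUERY_RESULT) {
                System.out.println(block.text());
            }
        }

    } catch (TextractException | IOException e) {
        // IOException also covers FileNotFoundException and errors while closing the stream
        System.err.println(e.getMessage());
        System.exit(1);
    }

}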
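If you send more than one query, printing every QUERY_RESULT doesn't tell you which answer belongs to which question. As far as I understand the response format, each QUERY block carries a relationship of type ANSWER whose ids point at its QUERY_RESULT block(s). Below is a rough sketch of that pairing logic. To keep it runnable without the AWS SDK on the classpath it uses a made-up `SimpleBlock` record instead of the SDK's `Block`, and the sample values are invented. With the real SDK you would presumably read the same data from `block.id()`, `block.query().text()`, `block.relationships()` (filtering on `RelationshipType.ANSWER`) and `block.text()`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: SimpleBlock is a hypothetical stand-in, NOT an AWS SDK class.
class QueryAnswerPairing {

    // id        : block id
    // blockType : "QUERY" or "QUERY_RESULT" (other types are ignored here)
    // text      : question text for QUERY, answer text for QUERY_RESULT
    // answerIds : ids listed in the QUERY block's ANSWER relationship
    record SimpleBlock(String id, String blockType, String text, List<String> answerIds) {}

    // Returns question text -> answer text (null when no answer was found),
    // preserving the order in which the queries appear.
    static Map<String, String> pairAnswers(List<SimpleBlock> blocks) {
        // Index every QUERY_RESULT block by id so queries can look answers up.
        Map<String, String> resultTextById = new LinkedHashMap<>();
        for (SimpleBlock block : blocks) {
            if ("QUERY_RESULT".equals(block.blockType())) {
                resultTextById.put(block.id(), block.text());
            }
        }

        Map<String, String> answers = new LinkedHashMap<>();
        for (SimpleBlock block : blocks) {
            if (!"QUERY".equals(block.blockType())) {
                continue;
            }
            // Take the first referenced result that actually exists.
            String answer = null;
            for (String answerId : block.answerIds()) {
                if (resultTextById.containsKey(answerId)) {
                    answer = resultTextById.get(answerId);
                    break;
                }
            }
            answers.put(block.text(), answer);
        }
        return answers;
    }

    public static void main(String[] args) {
        // Invented sample data shaped like a two-query response.
        List<SimpleBlock> blocks = List.of(
                new SimpleBlock("q1", "QUERY", "What is the customer's name?", List.of("r1")),
                new SimpleBlock("r1", "QUERY_RESULT", "Jane Doe", List.of()),
                new SimpleBlock("q2", "QUERY", "What is the invoice date?", List.of()));

        pairAnswers(blocks).forEach((question, answer) ->
                System.out.println(question + " -> " + answer));
    }
}
```

A query that Textract could not answer shows up with a null value here rather than being dropped, so the caller can tell "asked but not found" apart from "never asked".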
