I'm currently testing the Google Vision API for some basic handwritten text recognition, and I have no trouble getting a decent response for my image.
However, I'm struggling to save the received response locally. I have tried several ways of parsing the AnnotateImageResponse. I tried:
- using MessageToJson and MessageToDict from google.protobuf.json_format
- wrapping response.SerializeToString() inside json.loads()
- saving the response as a binary file and reloading it from disk for JSON parsing
- saving only parts of the response (response.text_annotations and response.full_text_annotation, respectively)
I keep getting an error telling me the object has no attribute named "DESCRIPTOR".
When I looked at the response.txt file I created by converting the response to a string, I noticed that full_text_annotation has no descriptor marker on its top-level key, so I figured saving only text_annotations would solve the problem. Unfortunately, I still get the same error.
Here is my code so far (not working):
import io
import os
import json
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from google.cloud import vision
from google.protobuf.json_format import MessageToJson
from google.protobuf import json_format
vision_client = vision.ImageAnnotatorClient()
path = './images/'
name = 'test.jpg'
with io.open(path + name, 'rb') as image_file:
    opened_image = image_file.read()
image = vision.Image(content=opened_image)
response = vision_client.document_text_detection(image=image, image_context={"language_hints": ["en-t-i0-handwrit"]})
# tried extracting only whole words here - doesn't work
all_words = response.text_annotations
all_words_json = MessageToJson(all_words)
# Causes AttributeError: 'RepeatedComposite' object has no attribute 'DESCRIPTOR'
Any help is appreciated, whether on how to convert the response directly into a JSON file or on how to properly iterate over it to build the JSON myself. Thanks!
Here is part of the Cloud Vision response received:
...
text_annotations {
description: "Congratulations"
bounding_poly {
vertices {
x: 2334
y: 2452
}
vertices {
x: 3284
y: 2464
}
vertices {
x: 3282
y: 2615
}
vertices {
x: 2332
y: 2603
}
}
}
text_annotations {
description: "!"
bounding_poly {
vertices {
x: 3321
y: 2464
}
vertices {
x: 3411
y: 2465
}
vertices {
x: 3409
y: 2615
}
vertices {
x: 3319
y: 2614
}
}
}
full_text_annotation {
pages {
property {
detected_languages {
language_code: "en"
confidence: 0.991942465
}
}
width: 4032
height: 3024
blocks {
bounding_box {
vertices {
x: 446
y: 486
}
vertices {
x: 3541
y: 475
}
vertices {
x: 3549
y: 2618
}
vertices {
x: 454
y: 2629
}
}
paragraphs {
bounding_box {
vertices {
x: 2490
y: 912
}
vertices {
x: 2633
y: 910
}
vertices {
x: 2634
y: 957
}
vertices {
x: 2491
y: 959
}
}
words {
bounding_box {
vertices {
x: 2490
y: 912
}
vertices {
x: 2633
y: 910
}
vertices {
x: 2634
y: 957
}
vertices {
x: 2491
y: 959
}
}
symbols {
bounding_box {
vertices {
x: 2490
y: 913
}
vertices {
x: 2552
y: 912
}
vertices {
x: 2553
y: 958
}
vertices {
x: 2491
y: 959
}
}
text: "a"
confidence: 0.961887
}
symbols {
property {
detected_break {
type_: LINE_BREAK
}
}
bounding_box {
vertices {
x: 2552
y: 911
}
vertices {
x: 2633
y: 910
}
vertices {
x: 2634
y: 955
}
vertices {
x: 2553
y: 956
}
}
text: "m"
confidence: 0.957640529
}
confidence: 0.959763765
}
confidence: 0.959763765
}
paragraphs {
bounding_box {
vertices {
x: 471
y: 485
}
...
1 Answer
I figured it out: AnnotateImageResponse is a wrapper around a protobuf object, so a special notation is needed when passing the response to the MessageToJson function. Note that we pass
response._pb
rather than response
to MessageToJson().