Elasticsearch: searching for words in a local JSON file [Python]

bmp9r5qi  posted on 2021-06-15  in ElasticSearch

I am trying to use Elasticsearch from Python to look up values and term frequencies in a local JSON file. The file contains many entries like the following:

[{"id": "251088", "tweet": "lorem ipsum", "username": "Ahmet"},
{"id": "251059", "tweet": "bla bla bla", "username": "Ali"},
...
]
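
For reference, here is a minimal sketch of loading a file shaped like this and checking that it really is a list of tweet objects. The helper name `load_tweets` and the `REQUIRED_KEYS` set are hypothetical, and the sketch assumes the whole file is a single JSON array. Python's `json` module is strict, so a trailing comma inside an object would make parsing fail.

```python
import json

# Keys every tweet object is assumed to carry, based on the sample above.
REQUIRED_KEYS = {"id", "tweet", "username"}

def load_tweets(text):
    """Parse a JSON array of tweet objects and validate its shape."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    for position, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"item {position} is not a JSON object")
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"item {position} is missing keys: {sorted(missing)}")
    return data

sample = '''[{"id": "251088", "tweet": "lorem ipsum", "username": "Ahmet"},
{"id": "251059", "tweet": "bla bla bla", "username": "Ali"}]'''
tweets = load_tweets(sample)
print(len(tweets), tweets[0]["tweet"])
```

The point of the check is that the file parses to a Python list, so each element is a candidate document rather than the file as a whole.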

The JSON file contains about 500k tweets with their metadata.
My goal is to get term frequencies faster by using Elasticsearch.

import requests, json, os
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
i = 1
f = open("tweets_test.json")
docket_content = f.read()

# print(docket_content)

# only wait for 1 second, regardless of the client's default
es.cluster.health(wait_for_status='yellow', request_timeout=1)

es.index(index='tweets', ignore=[400, 404], doc_type='docket', id=i, body=json.loads(docket_content))

res = es.search(index="tweets", doc_type="docket", body={"query": {"match": {"tweet": "any-word"}}})
print("%d documents found" % res['hits']['total'])
for doc in res['hits']['hits']:
    print("%s) %s" % (doc['_id'], doc['_source']['content']))

The output is:

0 documents found 
Process finished with exit code 0

Why doesn't this code work?
Am I using the wrong platform for getting term frequencies?
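
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the script passes the entire JSON array as the body of a single document, and `ignore=[400, 404]` hides any error the server returns for that request. The print loop also reads `doc['_source']['content']`, a field the sample data does not have. The version below instead indexes one document per tweet with `elasticsearch.helpers.bulk`, refreshes the index so the documents are searchable, and counts the matching documents. The function names `build_actions` and `index_and_count` are hypothetical. The index name `tweets` and the file name come from the question. The code assumes an elasticsearch-py 7.x client talking to a local node.

```python
import json

def build_actions(tweets, index_name="tweets"):
    """Yield one bulk action per tweet, using the tweet's own id as the document id."""
    for tweet in tweets:
        yield {"_index": index_name, "_id": tweet["id"], "_source": tweet}

def index_and_count(es, path, word, index_name="tweets"):
    """Bulk-index the tweets in `path`, then count documents whose tweet matches `word`."""
    # Imported here so build_actions can be used without the client installed.
    from elasticsearch import helpers

    with open(path, encoding="utf-8") as f:
        tweets = json.load(f)  # assumed to be a JSON array of tweet objects

    helpers.bulk(es, build_actions(tweets, index_name))
    # Make the newly indexed documents visible to search right away.
    es.indices.refresh(index=index_name)

    res = es.count(index=index_name, body={"query": {"match": {"tweet": word}}})
    return res["count"]

# Usage, assuming a node on localhost:9200 (7.x client):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch([{"host": "localhost", "port": 9200}])
#   print(index_and_count(es, "tweets_test.json", "lorem"))
```

Note that this returns the number of tweets containing the word (a document frequency), not the total number of occurrences. If per-occurrence counts are needed, the term vectors API may be worth a look.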
