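I'm new to Elasticsearch and have been exploring it recently.
I want to fetch all the data from one index in Elasticsearch, but only the documents where the "phoneNumber" field is not a valid phone number. How can I do that?
Data structure: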
{
"_index": "test_session3",
"_type": "_doc",
"_id": "f76adaaf-23e0-455a-9d74-6e2335b60bd9",
"_score": null,
"_source": {
"phoneNumber": "12424242424",
"utilizationFeeType": "flat",
"chargingEndTime": "2023-06-07T04:56:37.421Z",
"invFee": 22.22,
"sessionStartTime": "2023-06-07T04:56:38.340Z",
"chargerId": "584eb6b8-4f69-4628-b94a-3ea5c49e14bc",
"plugStatus": "connected",
"idleEndTime": "2023-06-07T04:56:37.421Z",
"sessionEndTime": "2023-06-07T04:56:39.108Z",
"activeFeeType": "MinimumFee",
"chargingStartTime": "2023-06-07T04:56:38.340Z",
"chargingRate": 7.7,
"createdAt": "2023-06-07T04:54:56.604Z",
"zone": "Asia/Dhaka",
"transactionId": 3507,
"energyConsumed": 0,
"updatedAt": "2023-06-07T04:56:38.340Z",
"userId": "53753b5f-1e5f-491f-b8bb-dcd6658f6e5d",
"sessionStatus": "ended",
"id": "f76adaaf-23e0-455a-9d74-6e2335b60bd9",
"chargingDuration": -919,
"idleDuration": 0,
"sessionDuration": -919,
"totals": 12.07,
"idleFee": 0,
"idleTime": "",
"userName": "fex fo",
"userType": "admin",
"doubleinvFee": 22.22,
"doubleTotals": 12.07,
"locationId": "e3ba8172-0a8c-4c09-9f71-f074e094de9d",
"propertyId": "a0099bbd-13aa-4c2c-a2ec-68b8d0967947",
"companyId": "cf58acca-d7df-42b4-9555-7693d4fcc73c",
"location": {
"currentPropertyId": "a0099bbd-13aa-4c2c-a2ec-68b8d0967947",
"currentCompanyId": "cf58acca-d7df-42b4-9555-7693d4fcc73c",
"title": "Lubowitz Avenue 53370812",
"landmark": "Southwest of the front entrance",
"zip": "10001",
"city": "New York",
"state": "New York",
"country": "US",
"address": "East Avenue, Rochester, NY, USA",
"latitude": "34.88923",
"longitude": "-118.13612",
"online": true,
"availableForGuest": true,
"status": "Active",
"id": "e3ba8172-0a8c-4c09-9f71-f074e094de9d",
"currentCompanyName": "Rich Information Technology",
"currentPropertyName": "East Avenue",
"chargersCount": 2,
"longitudeNum": -118.13612,
"latitudeNum": 34.88923,
"location": {
"lat": 34.88923,
"lon": -118.13612
}
},
"company": {
"zip": "10001",
"status": "Active",
"zohoNewComAdded": true,
"isDeleted": false,
"email": "contact@richinfotech.org",
"country": "US",
"name": "Rich Information Technology",
"state": "New York",
"city": "New York",
"byCreatedAt": "byCreatedAt",
"fileId": "15f98c05-5615-4d95-8cd6-23323668cc7f",
"byEmail": "byEmail",
"id": "cf58acca-d7df-42b4-9555-7693d4fcc73c",
"zohoVendorId": "4064488000000194001",
"phone": "1010101011",
"byPhone": "byPhone",
"website": "richinfotech.org",
"ein": "",
"createdAt": "2022-10-12T10:54:27.977Z",
"zohoCompanyId": "4064488000000318001",
"address": "Silicon Valley",
"byName": "byName",
"zohoCompanyErrorMessage": "",
"updatedAt": "2023-06-05T09:27:25.158Z",
"zohoCompanyError": false
}
},
"sort": [
1686113799108
]
}
I want to validate the data with the regular expression /^+?[0-9]{11}$/ -> the phone number must be 11 digits long and may have a leading "+".
Here is my approach:
const searchFilters = {
  query: {
    bool: {
      must: [],
      must_not: [
        {
          regexp: {
            'phoneNumber.keyword': '^\\+?[0-9]{11}$',
          },
        },
      ],
      filter: [],
    },
  },
  size: 10000,
};
const resp = await ElasticSearchHelper.search(IndexNames.SESSION, searchFilters);
const data = resp.body?.hits?.hits;
But it returns an empty array. I expect to get all the data where the phone number is invalid.
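2 Answers
Answer #1
TL;DR
You get no results because of which operators Elasticsearch's regexp query supports. The documentation states that Elasticsearch regular expressions do not support ^ and $:
Lucene's regular expression engine does not support anchor operators, such as ^ (beginning of line) or $ (end of line). To match a term, the regular expression must match the entire string.
You will need to write a different regular expression that fits the constraints of Lucene (the search library Elasticsearch is built on).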
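As a rough sketch of what that could look like for the query in the question: since a Lucene pattern already has to match the whole term, the ^ and $ can simply be dropped. The helper names buildInvalidPhoneQuery and isValidPhone are made up for illustration, and the field name phoneNumber.keyword is taken from the question on the assumption that the index has that keyword sub-field.

```javascript
// Lucene regexps must match the entire term, so no "^" or "$" is needed.
const PHONE_PATTERN = '\\+?[0-9]{11}';

// Hypothetical helper: builds a search body that excludes every document
// whose phone number fully matches the pattern, leaving the invalid ones.
function buildInvalidPhoneQuery(field = 'phoneNumber.keyword', size = 10000) {
  return {
    query: {
      bool: {
        must_not: [{ regexp: { [field]: PHONE_PATTERN } }],
      },
    },
    size,
  };
}

// Hypothetical local check: JavaScript regexps are not implicitly anchored,
// so the anchors are added back here to emulate whole-term matching.
function isValidPhone(value) {
  return new RegExp(`^(?:${PHONE_PATTERN})$`).test(value);
}

const searchFilters = buildInvalidPhoneQuery();
console.log(JSON.stringify(searchFilters, null, 2));
console.log(isValidPhone('12424242424'), isValidPhone('1242'));
```

The resulting searchFilters object can be passed to the same ElasticSearchHelper.search call used in the question. Keep in mind that a bool query with only must_not should also return documents that have no phoneNumber value at all, which may or may not be what you want.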
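Answer #2
Your approach looks basically right, but there are a couple of issues with the regular expression and the Elasticsearch query. Here are the adjustments you can make to validate the data against the given regular expression and retrieve the desired results:
1. Regular expression: the pattern you provided, ^+?[0-9]{11}$, has a small mistake. The + character needs to be escaped with a backslash (\) to be matched literally. The correct pattern is ^\+?[0-9]{11}$.
2. Elasticsearch query: instead of the regexp query, use the must_not clause together with a term query to filter on the phone number field.
In the must_not clause, set the value field of the term query to the specific invalid phone number you want to exclude. This fetches all documents whose phone number field is not equal to the specified value. Make sure to replace IndexNames.SESSION with the correct name of your Elasticsearch index.
With these adjustments you can filter out the phone number values you list. Note that a term query compares exact values only, so matching against the pattern itself still requires a regexp query, written without the ^ and $ anchors that Lucene does not support.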
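The query this answer refers to is not shown, so the following is only a guess at its shape: a minimal sketch that puts one term query per excluded value inside must_not. The helper name buildExcludeNumbersQuery and the example values are hypothetical, and phoneNumber.keyword is assumed from the question.

```javascript
// Hypothetical helper: excludes documents whose phone number is exactly
// equal to one of the given values (no pattern matching is involved).
function buildExcludeNumbersQuery(numbers, field = 'phoneNumber.keyword') {
  return {
    query: {
      bool: {
        // One term clause per value; a document is dropped if any clause matches.
        must_not: numbers.map((value) => ({ term: { [field]: { value } } })),
      },
    },
    size: 10000,
  };
}

// Example values are made up for illustration.
const excludeQuery = buildExcludeNumbersQuery(['0000', 'N/A']);
console.log(JSON.stringify(excludeQuery, null, 2));
```

Because every excluded value has to be known in advance, this only helps when the set of unwanted values is small and fixed.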