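I am writing data with the Elasticsearch (ES) BulkProcessor (I have tried a Python script, the Storm ES bolt, and the Flink ES sink), but after I create an index mapping, indexing becomes far too slow.
Case 1: with all index settings left at their defaults, the indexing rate reaches around 10,000 or more.
Case 2: simply creating the index mapping drops the indexing rate to 3,000.
In both cases I use the same data, the same code, and the same machines.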
Result
The Flink ES sink writes JSON data to ES:
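My data
I write the same document, shown below, over and over (the message field is the raw log, about 7 KB in size; part of its content has been removed here to stay within the question's length limit):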
{
"_index": "nyc_flink_test997",
"_type": "doc",
"_id": "k8uS92cBOH4ugSIjCzmn",
"_score": 1,
"_source": {
"exception": "false",
"log_id": "8F71AF1606EE46BFA9D57AA2282D8596",
"offset": "2368",
"message_length": "2103",
"level": "INFO",
"source": "/opt/hadoop/elastic-stack/s_login/Gusermanager.usermanager.s_login.20.log",
"sessionid": "provider-60-2883b4bd3ff2b",
"associate_id": "33d081b83a0654a2",
"message": """
[16:41:33.376][I][ec4edfe0b2584b73]log start:53F9A1A1E71044E281755E930E1B004C
[16:41:33.376][T][ec4edfe0b2584b73]入参0=__REQ__
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2815)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2322)
at cn.com.agree.addal.cp.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:46)
at tc.bank.aesb.mbs.MBS_DBIMPL.PyDBGetSel(MBS_DBIMPL.java:1624)
at tc.bank.aesb.mbs.MBS_DBIMPL.PyDBExecOneSQL(MBS_DBIMPL.java:466)
at tc.bank.aesb.mbs.MBS_DBIMPL.PyDBExecGrpSQL(MBS_DBIMPL.java:123)
at tc.bank.aesb.mbs.B_MBS_DataBase.B_DBUnityRptOpr(B_MBS_DataBase.java:121)
at CUST.CustomerInfoQry.TCustomerInfoQry$Step1$Node4.execute(TCustomerInfoQry.java:200)
at CUST.CustomerInfoQry.TCustomerInfoQry$Step1.execute(TCustomerInfoQry.java:113)
at CUST.CustomerInfoQry.TCustomerInfoQry.execute(TCustomerInfoQry.java:76)
at cn.com.agree.afa.svc.javaengine.JavaEngine.execute(JavaEngine.java:237)
at cn.com.agree.afa.svc.handler.TradeHandler.handle(TradeHandler.java:62)
[16:41:33.414][I][ec4edfe0b2584b73]log end:53F9A1A1E71044E281755E930E1B004C
""",
"exec_ip": "10.88.188.167",
"start_time": "2018-12-09 16:46:14.764",
"group_v2": "Gusermanager",
"script_exec_time": "1",
"trade_exec_time": "2"
}
}
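For reference, a minimal sketch of this kind of test, not the actual script I ran: it repeats one document into a bulk request body and measures documents per second. The names build_bulk_body, measure_rate and sample_doc are hypothetical, and send stands for whatever function posts the body to the _bulk endpoint (an assumption, since the client code is not shown here).

```python
import json
import time


def build_bulk_body(doc, index, doc_type, count):
    """Return an NDJSON bulk body that indexes `doc` `count` times."""
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    source = json.dumps(doc, ensure_ascii=False)
    # The bulk API takes one action line followed by one source line per
    # document, and the whole body must end with a newline.
    return "".join(f"{action}\n{source}\n" for _ in range(count))


def measure_rate(send, doc, index, doc_type, batch_size, batches):
    """Send `batches` bulk bodies through `send` and return docs per second."""
    body = build_bulk_body(doc, index, doc_type, batch_size)
    start = time.perf_counter()
    for _ in range(batches):
        send(body)  # hypothetical callable that posts the body to _bulk
    elapsed = time.perf_counter() - start
    return batch_size * batches / elapsed if elapsed > 0 else float("inf")


# Trimmed stand-in for the document above (assumption: most fields omitted).
sample_doc = {
    "level": "INFO",
    "exec_ip": "10.88.188.167",
    "message": "log start ... log end",
}
```

Running measure_rate once against an index with default settings and once against an index created with the mapping keeps the data and code identical, so only the mapping differs between the two runs.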
Index mapping
{
"mappings": {
"doc":{
"dynamic_templates": [
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
],
"properties": {
"@timestamp": {
"type": "date"
},
"@version": {
"type": "keyword"
},
"geoip": {
"dynamic": true,
"properties": {
"ip": {
"type": "ip"
},
"location": {
"type": "geo_point"
},
"latitude": {
"type": "half_float"
},
"longitude": {
"type": "half_float"
}
}
},
"exception": {
"type": "boolean"
},
"message":{
"type":"text",
"norms": false,
"analyzer": "ik_max_word"
},
"associate_id": {
"type": "text",
"analyzer": "ik_max_word"
},
"end_time": {
"type": "date",
"format": "date_time||yyyy-MM-dd HH:mm:ss.SSS||yyyy-MM-dd||epoch_millis||HH:mm:ss.SSS"
},
"start_time": {
"type": "date",
"format": "date_time||yyyy-MM-dd HH:mm:ss.SSS||yyyy-MM-dd||epoch_millis||HH:mm:ss.SSS"
},
"exec_ip": {
"type": "ip"
},
"level": {
"type": "keyword"
},
"script_exec_time": {
"type": "long"
},
"trade_exec_time": {
"type": "long"
},
"sessionid": {
"type": "text",
"analyzer": "ik_max_word"
},
"log_id": {
"type": "text",
"analyzer": "ik_max_word"
},
"discard_time": {
"type": "long"
},
"scene_code": {
"type": "text",
"analyzer": "ik_max_word"
},
"service_code": {
"type": "text",
"analyzer": "ik_max_word"
},
"group": {
"type": "text"
},
"group_v2" :{
"type": "text",
"analyzer": "ik_max_word"
},
"message_length":{
"type": "long"
},
"log_filename":{
"type": "text",
"analyzer": "ik_max_word"
},
"ingest_time":{
"type": "date"
}
}
}
}
}
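To make the difference from the default settings easier to see, here is a small sketch that walks a mapping like the one above and lists every text field together with its analyzer. The function name analyzed_fields is hypothetical, and the mapping in the sketch is a trimmed copy of mine (an assumption made to keep it short). It only reports what the mapping declares; it does not measure how much each analyzer costs.

```python
def analyzed_fields(properties, prefix=""):
    """Yield (field_path, analyzer) for text fields, recursing into sub-objects."""
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("type") == "text":
            # A text field without an explicit analyzer uses the index default.
            yield path, spec.get("analyzer", "(index default)")
        if "properties" in spec:
            yield from analyzed_fields(spec["properties"], path + ".")


# Trimmed copy of the mapping above (assumption: most fields omitted).
mapping = {
    "mappings": {"doc": {"properties": {
        "message": {"type": "text", "norms": False, "analyzer": "ik_max_word"},
        "log_id": {"type": "text", "analyzer": "ik_max_word"},
        "group": {"type": "text"},
        "level": {"type": "keyword"},
        "geoip": {"dynamic": True, "properties": {"ip": {"type": "ip"}}},
    }}}
}

for path, analyzer in analyzed_fields(mapping["mappings"]["doc"]["properties"]):
    print(path, analyzer)
```

With the full mapping, this shows that identifier-like fields such as log_id, sessionid and associate_id go through ik_max_word as well as the 7 KB message field.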
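I also tried writing with a Python script and with the Storm ES bolt, and the result is the same: the indexing rate drops once the index mapping is created. Can anyone give me some advice? Thanks in advance.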