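I am trying to build an autocomplete feature with AngularJS and Elasticsearch on a given field, countryname. The field can contain simple names such as "France" or "Spain", or compound names such as "Sierra Leone".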
In the mapping, this field is not_analyzed, to prevent Elasticsearch from tokenizing the compound names:
"COUNTRYNAME" : {"type" : "string", "store" : "yes","index": "not_analyzed" }
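I need to query Elasticsearch, but I can't use a wildcard on a not_analyzed field.
Here is my query. The wildcard in the "value" variable does not work, and the match is case sensitive.
A wildcard on its own works: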
curl -XGET 'local_host:9200/botanic/specimens/_search?size=0' -d '{ "fields": [ "COUNTRYNAME" ], "query": { "query_string": { "query": "COUNTRYNAME:*" } }, "aggs": { "general": { "terms": { "field": "COUNTRYNAME", "size": 0 } } } }'
But this does not work (Franc*):
curl -XGET 'local_host:9200/botanic/specimens/_search?size=0' -d '{ "fields": [ "COUNTRYNAME" ], "query": { "query_string": { "query": "COUNTRYNAME:Franc*" } }, "aggs": { "general": { "terms": { "field": "COUNTRYNAME", "size": 0 } } } }'
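A likely reason for the failure, assuming the Elasticsearch 1.x line that this mapping syntax belongs to: query_string lowercases wildcard terms by default (lowercase_expanded_terms defaults to true), so Franc* is searched as franc*, while a not_analyzed field stores France verbatim. The sketch below only builds a request body with that option switched off. The helper name case_sensitive_body is hypothetical, and the pattern is assumed to contain no quotes or backslashes.

```shell
# Build a query_string body that keeps the case of the wildcard term.
# Assumption: Elasticsearch 1.x, where lowercase_expanded_terms defaults to true.
# case_sensitive_body is a hypothetical helper name, not part of any tool.
case_sensitive_body() {
  # $1 is the wildcard pattern, e.g. Franc*
  # (assumed to contain no JSON-special characters such as quotes or backslashes)
  printf '{ "fields": [ "COUNTRYNAME" ], "query": { "query_string": { "query": "COUNTRYNAME:%s", "lowercase_expanded_terms": false } }, "aggs": { "general": { "terms": { "field": "COUNTRYNAME", "size": 0 } } } }\n' "$1"
}

# Print the request body for inspection
case_sensitive_body 'Franc*'
```

The body can then be sent with curl -XGET 'localhost:9200/botanic/specimens/_search?size=0' -d "$(case_sensitive_body 'Franc*')". The match stays case sensitive, so franc* would still not find France.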
I also tried a bool must query, but it does not work with this not_analyzed field and a wildcard either:
curl -XGET 'local_host:9200/botanic/specimens/_search?size=0' -d '{ "fields": [ "COUNTRYNAME" ], "query": { "bool": { "must": [ { "match": { "COUNTRYNAME": "Franc*" } } ] } }, "aggs": { "general": { "terms": { "field": "COUNTRYNAME", "size": 0 } } } }'
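The match query does not interpret * as a wildcard, so on a not_analyzed field it looks for the literal term Franc*. A term-level wildcard query is the usual alternative. The following is a minimal sketch that only builds such a body. The helper name wildcard_body is hypothetical, and the pattern is assumed to contain no quotes or backslashes.

```shell
# Build a term-level wildcard query body for the not_analyzed COUNTRYNAME field.
# wildcard_body is a hypothetical helper name used only for this illustration.
wildcard_body() {
  # $1 is the pattern, e.g. Franc* (no JSON-special characters assumed)
  printf '{ "fields": [ "COUNTRYNAME" ], "query": { "wildcard": { "COUNTRYNAME": "%s" } }, "aggs": { "general": { "terms": { "field": "COUNTRYNAME", "size": 0 } } } }\n' "$1"
}

# Print the request body for inspection
wildcard_body 'Franc*'
```

Because the pattern is compared with the stored term as-is, this stays case sensitive on a not_analyzed field: Franc* can match France, franc* cannot.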
What am I missing or doing wrong? Should I leave the field analyzed in the mapping and use a different analyzer that does not split compound names into tokens?
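I found a working solution: the "keyword" tokenizer. Create a custom analyzer and use it in the mapping for the fields that should be kept whole rather than split on whitespace: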
curl -XPUT 'localhost:9200/botanic/' -d '{ "settings":{ "index":{ "analysis":{ "analyzer":{ "keylower":{ "tokenizer":"keyword", "filter":"lowercase" } } } } }, "mappings":{ "specimens" : { "_all" : {"enabled" : true}, "_index" : {"enabled" : true}, "_id" : {"index": "not_analyzed", "store" : false}, "properties" : { "_id" : {"type" : "string", "store" : "no","index": "not_analyzed" } , ... "LOCATIONID" : {"type" : "string", "store" : "yes","index": "not_analyzed" } , "AVERAGEALTITUDEROUNDED" : {"type" : "string", "store" : "yes","index": "analyzed" } , "CONTINENT" : {"type" : "string","analyzer":"keylower" } , "COUNTRYNAME" : {"type" : "string","analyzer":"keylower" } , "COUNTRYCODE" : {"type" : "string", "store" : "yes","index": "analyzed" } , "COUNTY" : {"type" : "string","analyzer":"keylower" } , "LOCALITY" : {"type" : "string","analyzer":"keylower" } } } } }'
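To see what the keylower analyzer does to a value, here is a rough local approximation: the keyword tokenizer emits the whole input as a single token and the lowercase filter lowercases it. The helper names keylower_token and analyze_keylower are hypothetical. The second one calls the _analyze API as it exists in Elasticsearch 1.x and is only defined, not run, since it needs a node on localhost:9200.

```shell
# Local approximation of the keylower analyzer: one token, lowercased.
# keylower_token is a hypothetical helper; tr is only reliable for ASCII here.
keylower_token() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Ask the cluster itself which tokens the analyzer produces (ES 1.x _analyze API).
# Defined but not called here, because it needs a running node on localhost:9200.
analyze_keylower() {
  curl -s -XGET 'localhost:9200/botanic/_analyze?analyzer=keylower&pretty' -d "$1"
}

keylower_token 'Sierra Leone'                      # prints: sierra leone
keylower_token 'Bolivia, Plurinational State of'   # prints: bolivia, plurinational state of
```

Running analyze_keylower 'Sierra Leone' against the index should confirm that a single token is produced. Since the indexed token is lowercased, terms aggregations on such a field return lowercased keys, while hits still show the original text.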
So I can now use a wildcard in a query on the COUNTRYNAME field, and the field is not split into tokens:
curl -XGET 'localhost:9200/botanic/specimens/_search?size=10' -d '{ "fields" : ["COUNTRYNAME"], "query": {"query_string" : { "query": "COUNTRYNAME:bol*" }}, "aggs" : { "general" : { "terms" : { "field" : "COUNTRYNAME", "size":0 } } }}'
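One caveat for compound names: as far as I know, the query_string parser splits the query text on whitespace, so typed text such as sierra l would need the space escaped. A prefix query avoids that parsing step. It is not analyzed, so the typed text has to be lowercased first to line up with the tokens that keylower indexes. This is a sketch under those assumptions. The helper name autocomplete_body is hypothetical, and the input is assumed to contain no quotes or backslashes.

```shell
# Build a prefix query body for autocomplete against a keylower-analyzed field.
# autocomplete_body is a hypothetical helper name used only for this illustration.
autocomplete_body() {
  # Lowercase the typed text: prefix queries are not analyzed, and keylower
  # indexes lowercased tokens (tr is only reliable for ASCII here).
  term=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf '{ "query": { "prefix": { "COUNTRYNAME": "%s" } }, "aggs": { "general": { "terms": { "field": "COUNTRYNAME", "size": 0 } } } }\n' "$term"
}

# Print the request body for a partially typed compound name
autocomplete_body 'Sierra L'
```

The body can be sent with curl -XGET 'localhost:9200/botanic/specimens/_search?size=0' -d "$(autocomplete_body 'Sierra L')", and the aggregation buckets then hold the candidate country names.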
Result:
{ "took" : 14, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 45, "max_score" : 1.0, "hits" : [{ "_index" : "botanic", "_type" : "specimens", "_id" : "91E7B53B61DF4E76BF70C780315A5DFD", "_score" : 1.0, "fields" : { "COUNTRYNAME" : ["Bolivia, Plurinational State of"] } }, { "_index" : "botanic", "_type" : "specimens", "_id" : "7D811B5D08FF4F17BA174A3D294B5986", "_score" : 1.0, "fields" : { "COUNTRYNAME" : ["Bolivia, Plurinational State of"] } } ... ] }, "aggregations" : { "general" : { "buckets" : [{ "key" : "bolivia, plurinational state of", "doc_count" : 45 } ] } } }