I have Elasticsearch data like this:
PUT /text/_doc/1
{
  "name": "pdf1",
  "text": "For the past six weeks. The unemployment crisis has unfolded so suddenly and rapidly."
}

PUT /text/_doc/2
{
  "name": "pdf2",
  "text": "The unemployment crisis has unfolded so suddenly and rapidly."
}
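In this example I run a full-text search on the "text" field for all documents that contain the substring "unemployment". I want the results sorted in ascending order of the index at which "unemployment" appears in the "text" field. For example, the substring "unemployment" first appears at index 4 in doc2, so I want that document returned first.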
GET /text/_search?pretty
{
  "query": {
    "match": {
      "text": "unemployment"
    }
  }
}
I tried a few things, such as term_vector. This is the mapping I used, but it did not help.
PUT text/_mapping
{
  "properties": {
    "name": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword"
        }
      }
    },
    "text": {
      "type": "text",
      "term_vector": "with_positions_offsets"
    }
  }
}
Can anyone help me with the right mapping and search query?
Thanks in advance!
Try this query:
GET text/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {
          "text": "unemployment"
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": """
                // text.keyword has no value when the text is longer than ignore_above (256 in the mapping below)
                if (doc['text.keyword'].size() == 0) {
                  return 0;
                }
                def docval = doc['text.keyword'].value;
                def index = (float) docval.indexOf('unemployment');
                // the sooner the word appears the better, so 'invert' the 'index';
                // the + 1 avoids dividing by zero when the match is at index 0
                return index > -1 ? (1 / (index + 1)) : 0;
              """
            }
          }
        }
      ],
      "boost_mode": "sum"
    }
  }
}
This uses the auto-generated mapping:
{
  "text": {
    "mappings": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "text": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        }
      }
    }
  }
}
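Note that this is case-sensitive, so it would be reasonable to also have a lowercase-normalized keyword field and access that in the script_score script instead. This should put you on the right path.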
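As a rough sketch of the lowercase-normalized field mentioned above, something like the following could be sent as the body of PUT /text_v2 when creating a new index. The index name text_v2, the normalizer name lowercase_normalizer, and the sub-field name keyword_lower are all made up for illustration, and the ignore_above value of 8191 is only an assumption meant to keep longer texts in the keyword field; adjust it to your data.

```json
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "fields": {
          "keyword_lower": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer",
            "ignore_above": 8191
          }
        }
      }
    }
  }
}
```

After reindexing into that index, the script would read doc['text.keyword_lower'].value and look for the lowercase string 'unemployment', so "Unemployment" at the start of a sentence would match as well.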