小编典典

Elasticsearch returns too many results, need help filtering the query

elasticsearch

I'm having a lot of trouble understanding the basics of the ES query system.

I have the following query, for example:

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "referer": "www.xx.yy.com"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now",
              "lt": "now-1h"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "interval": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "0.5h"
      },
      "aggs": {
        "what": {
          "cardinality": {
            "field": "host"
          }
        }
      }
    }
  }
}

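This request pulls in too much data and fails with:

"status": 500, "reason": "ElasticsearchException[org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; nested: UncheckedExecutionException[org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; nested: CircuitBreakingException[Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; "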

I also tried this request:

{
  "size": 0,
  "filter": {
    "and": [
      {
        "term": {
          "referer": "www.geoportail.gouv.fr"
        }
      },
      {
        "range": {
          "@timestamp": {
            "from": "2014-10-04",
            "to": "2014-10-05"
          }
        }
      }
    ]
  },
  "aggs": {
    "interval": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "0.5h"
      },
      "aggs": {
        "what": {
          "cardinality": {
            "field": "host"
          }
        }
      }
    }
  }
}

I'd like to filter the data so that I get a correct result. Any help would be appreciated!


2020-06-22

1 answer

小编典典

I found a solution, and it's a bit odd. I followed dimzak's advice and cleared the cache:

curl --noproxy localhost -XPOST "http://localhost:9200/_cache/clear"
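
For scripted use, the same call can be sketched in Python with only the standard library. The function names and the default base URL are illustrative assumptions, not something from the original thread; the empty `ProxyHandler` is there to mirror curl's `--noproxy` option.

```python
import json
import urllib.request


def build_clear_cache_request(base_url="http://localhost:9200"):
    """Build (but do not send) a POST request to the clear-cache endpoint.

    base_url is an assumption; point it at your own cluster.
    """
    url = base_url.rstrip("/") + "/_cache/clear"
    # An empty body with method="POST" matches `curl -XPOST` without data.
    return urllib.request.Request(url, data=b"", method="POST")


def clear_cache(base_url="http://localhost:9200"):
    """Send the request and return the decoded JSON response (hypothetical helper)."""
    # An empty ProxyHandler disables any proxy taken from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(build_clear_cache_request(base_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `clear_cache()` against a running node should return the usual shard summary; nothing is sent until that function is called.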

Then, as Olly suggested, I used a filter instead of a query:

{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "term": {
          "referer": "www.xx.yy.fr"
        }
      },
      "filter": {
        "range": {
          "@timestamp": {
            "from": "2014-10-04T00:00",
            "to": "2014-10-05T00:00"
          }
        }
      }
    }
  },
  "aggs": {
    "interval": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "0.5h"
      },
      "aggs": {
        "what": {
          "cardinality": {
            "field": "host"
          }
        }
      }
    }
  }
}
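
A note for readers on newer clusters: the `filtered` query was deprecated in Elasticsearch 2.0 and removed in 5.0, where a `bool` query with a `filter` clause takes its place. The sketch below builds what is likely the closest equivalent request body. It assumes Elasticsearch 7.2 or later, where `fixed_interval` replaces `interval` in `date_histogram`, and it writes the range with `gte`/`lte` instead of `from`/`to`. The helper name `build_histogram_query` is hypothetical.

```python
import json


def build_histogram_query(referer, start, end, interval="30m"):
    """Return a request body equivalent to the `filtered` query above.

    Assumes Elasticsearch 7.2+ (for `fixed_interval`); hypothetical helper.
    """
    return {
        "size": 0,
        "query": {
            "bool": {
                # The term clause stays in query context, as in the original.
                "must": [{"term": {"referer": referer}}],
                # The range clause runs in filter context (no scoring, cacheable).
                "filter": [
                    {"range": {"@timestamp": {"gte": start, "lte": end}}}
                ],
            }
        },
        "aggs": {
            "interval": {
                # "30m" is the same bucket width as the original "0.5h".
                "date_histogram": {"field": "@timestamp", "fixed_interval": interval},
                "aggs": {"what": {"cardinality": {"field": "host"}}},
            }
        },
    }


if __name__ == "__main__":
    body = build_histogram_query(
        "www.xx.yy.fr", "2014-10-04T00:00", "2014-10-05T00:00"
    )
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request. On versions between 5.0 and 7.1, the `bool` part should still apply, but the histogram would keep the older `interval` key.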

I can't accept both of your answers. I think dimzak deserves it most, but an upvote to both of you :)

2020-06-22