iT邦幫忙

12th iThome Ironman Contest

DAY 13
Elastic Stack on Cloud

Python & Elasticsearch Introductory Series, Part 13

IT Ironman Day 13: Querying Data in Elasticsearch with Python (1)

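Today's article covers how to query documents in Elasticsearch from Python. Before getting to the code, you need to know how the ES search API works, so let's start with the search API.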

must(and)

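must is the equivalent of AND in digital logic: a document is returned only when every condition is satisfied, and the matches contribute to the score.
What "contributing to the score" means will be covered in a later article; today the focus is on learning to search!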

{
    "bool": {
        "must": [
            condition 1,
            condition 2
        ]
    }
}
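Since the rest of this article drives the search API from Python, here is a minimal sketch of how the must skeleton above can be built as a Python dict. The helper name `bool_must` is hypothetical, and the two conditions (on the `age` and `class` fields of the student data shown later) are only illustrative.

```python
import json


def bool_must(*conditions):
    """Wrap the given condition clauses in a bool/must request body."""
    return {"query": {"bool": {"must": list(conditions)}}}


# Two illustrative conditions; both have to match for a document to be returned.
body = bool_must({"term": {"age": 20}}, {"term": {"class": "資工一1"}})

# ensure_ascii=False keeps the Chinese class name readable when printed.
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON has the same shape as the skeleton above, with the placeholders replaced by real conditions.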

should(or)

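should is the equivalent of OR in digital logic: when the bool query contains only should clauses, a document is returned as soon as any one of the conditions is satisfied.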

{
    "bool": {
        "should": [
            condition 1,
            condition 2
        ]
    }
}
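One detail worth knowing: once a bool query also contains must or filter clauses, the should clauses become optional by default. The `minimum_should_match` parameter of the bool query makes the "at least N conditions" requirement explicit. The sketch below assumes a hypothetical helper named `bool_should`; the conditions are illustrative.

```python
import json


def bool_should(conditions, minimum_should_match=1):
    """Build a bool/should body that requires at least N conditions to match."""
    return {
        "query": {
            "bool": {
                "should": list(conditions),
                "minimum_should_match": minimum_should_match,
            }
        }
    }


# Either condition is enough for a document to be returned.
body = bool_should([{"term": {"age": 20}}, {"term": {"name": "王小明"}}])
print(json.dumps(body, ensure_ascii=False, indent=2))
```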

filter(and)

filter works the same way as must, except that it does not contribute to the score.

{
    "bool": {
        "filter": [
            condition 1,
            condition 2
        ]
    }
}
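must and filter can sit side by side in the same bool query. As a rough sketch (the field choices here are only an assumption for illustration), the condition that should influence ranking goes under must, and the condition that merely narrows the result set goes under filter:

```python
import json

body = {
    "query": {
        "bool": {
            # Scored: matching this clause contributes to _score.
            "must": [{"term": {"class": "資工一1"}}],
            # Not scored: this clause only decides whether a document is kept.
            "filter": [{"term": {"age": 20}}],
        }
    }
}

print(json.dumps(body, ensure_ascii=False, indent=2))
```

A document still has to satisfy both clauses to be returned; the only difference is which of them affects the score.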

must_not(not)

must_not excludes documents that match any of its conditions; it is usually combined with must or filter.

{
    "bool": {
        "must_not": [
            condition 1,
            condition 2
        ]
    }
}
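To make the four clauses concrete, here is a toy evaluator that applies them to plain Python dicts. It is only a sketch of the matching logic, not how Elasticsearch works internally: it ignores scoring, supports only `term` conditions, and the function names are hypothetical. The student records are the ones that appear in the responses later in this article.

```python
def term_matches(doc, condition):
    """Evaluate one {"term": {field: value}} condition by exact comparison."""
    field, value = next(iter(condition["term"].items()))
    return doc.get(field) == value


def as_list(clauses):
    """Accept a single clause or a list of clauses."""
    if clauses is None:
        return []
    return clauses if isinstance(clauses, list) else [clauses]


def bool_matches(doc, bool_query):
    """Return True when doc satisfies the bool query (matching only, no scores)."""
    required = as_list(bool_query.get("must")) + as_list(bool_query.get("filter"))
    optional = as_list(bool_query.get("should"))
    excluded = as_list(bool_query.get("must_not"))

    if not all(term_matches(doc, c) for c in required):
        return False
    if any(term_matches(doc, c) for c in excluded):
        return False
    # With no must/filter clauses, at least one should clause has to match.
    if optional and not required:
        return any(term_matches(doc, c) for c in optional)
    return True


students = [
    {"sid": "s1090101", "name": "王小明", "age": 18, "class": "資工一1"},
    {"sid": "s1090102", "name": "許小美", "age": 20, "class": "資工二2"},
    {"sid": "s1090103", "name": "風間", "age": 20, "class": "資工一1"},
]

query = {"must": {"term": {"age": 20}}, "must_not": {"term": {"sid": "s1090103"}}}
matched = [s["name"] for s in students if bool_matches(s, query)]
print(matched)
```

Here must keeps the two students aged 20, and must_not then removes s1090103, leaving a single record.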

term、terms

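term specifies an exact-value condition. The search value is not analyzed (tokenized) first; it is looked up directly in the inverted index. The student roster used in this series shows how to specify a condition: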

# Search for documents where name = 王小明
{
    "term": {
        "name": "王小明"
    }
}

terms matches several values against the same field in a single condition.

# Search for name = 王小明 or name = 許小美
{
    "terms": {
        "name": [
            "王小明",
            "許小美"
        ]
    }
}

match

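match also specifies a condition, but the search text is analyzed (tokenized) first, and the resulting tokens are then looked up in the inverted index. Again using the student roster: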

# Search for documents whose name contains "王", "小", or "明"
{
    "match": {
        "name": "王小明"
    }
}
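The difference between term and match is easier to see with a toy inverted index. The sketch below is an illustration only: it assumes the name field is analyzed into one token per character, and the functions `analyze`, `term_search`, and `match_search` are hypothetical stand-ins, not Elasticsearch APIs. Whether a real term query finds a full name depends on how the field was actually indexed.

```python
from collections import defaultdict


def analyze(text):
    """Toy analyzer: one token per character (an assumption for this sketch)."""
    return list(text)


names = {1: "王小明", 2: "許小美", 3: "風間"}

# Inverted index: token -> set of document ids that contain it.
index = defaultdict(set)
for doc_id, name in names.items():
    for token in analyze(name):
        index[token].add(doc_id)


def term_search(value):
    """term: look the value up as-is, without analyzing it."""
    return sorted(index.get(value, set()))


def match_search(text):
    """match: analyze the text first, then collect documents containing any token."""
    found = set()
    for token in analyze(text):
        found |= index.get(token, set())
    return sorted(found)


print(term_search("王小明"))   # the whole name is not a token in this toy index
print(term_search("小"))
print(match_search("王小明"))
```

In this toy index the term lookup for the full name finds nothing, while the match lookup also returns 許小美 because her name shares the token "小".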

An ES search combines must, should, filter, and must_not with conditions expressed as term, terms, or match. The complete format looks like this:

# Search for documents where age = 20
{
    "query": {
        "bool": {
            "must": {
                "term": {
                    "age": 20
                }
            }
        }
    }
}

The search result:

{
  "took" : 17,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "school_members",
        "_type" : "_doc",
        "_id" : "qMYuynQBjyvRpz1FfbP9",
        "_score" : 1.0,
        "_source" : {
          "sid" : "s1090102",
          "name" : "許小美",
          "age" : 20,
          "class" : "資工二2"
        }
      },
      {
        "_index" : "school_members",
        "_type" : "_doc",
        "_id" : "qcYuynQBjyvRpz1FfbP9",
        "_score" : 1.0,
        "_source" : {
          "sid" : "s1090103",
          "name" : "風間",
          "age" : 20,
          "class" : "資工一1"
        }
      }
    ]
  }
}
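When the same response arrives in Python it is a nested dict, and the documents themselves live under `hits.hits[*]._source`. A minimal sketch of pulling them out follows; the response is an abbreviated copy of the one above, and `extract_sources` is a hypothetical helper name.

```python
# Abbreviated copy of the response above, keeping only the fields used here.
response = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_id": "qMYuynQBjyvRpz1FfbP9", "_score": 1.0,
             "_source": {"sid": "s1090102", "name": "許小美", "age": 20, "class": "資工二2"}},
            {"_id": "qcYuynQBjyvRpz1FfbP9", "_score": 1.0,
             "_source": {"sid": "s1090103", "name": "風間", "age": 20, "class": "資工一1"}},
        ],
    }
}


def extract_sources(resp):
    """Return the original documents (_source) of every hit, in result order."""
    return [hit["_source"] for hit in resp["hits"]["hits"]]


total = response["hits"]["total"]["value"]
print("total hits:", total)
for student in extract_sources(response):
    print(student["sid"], student["name"])
```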

Excluding the document with sid = s1090103:

{
    "query": {
        "bool": {
            "must": {
                "term": {
                    "age": 20
                }
            },
            "must_not": {
                "term": {
                    "sid": "s1090103"
                }
            }
        }
    }
}

Result:

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "school_members",
        "_type" : "_doc",
        "_id" : "qMYuynQBjyvRpz1FfbP9",
        "_score" : 1.0,
        "_source" : {
          "sid" : "s1090102",
          "name" : "許小美",
          "age" : 20,
          "class" : "資工二2"
        }
      }
    ]
  }
}

Searching for age = 20 or name = 王小明:

POST school_members/_search
{
    "query": {
        "bool": {
            "should": [
              {
                "term": {
                    "age": 20
                }
              },
              {
                "term": {
                  "name": "王小明"
                }
              }
            ]
        }
    }
}

Result:

{
  "took" : 28,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "school_members",
        "_type" : "_doc",
        "_id" : "YLWMz3QBFtpQmKq58RfM",
        "_score" : 1.0,
        "_source" : {
          "sid" : "s1090102",
          "name" : "許小美",
          "age" : 20,
          "class" : "資工二2"
        }
      },
      {
        "_index" : "school_members",
        "_type" : "_doc",
        "_id" : "YbWMz3QBFtpQmKq58RfM",
        "_score" : 1.0,
        "_source" : {
          "sid" : "s1090103",
          "name" : "風間",
          "age" : 20,
          "class" : "資工一1"
        }
      },
      {
        "_index" : "school_members",
        "_type" : "_doc",
        "_id" : "X7WMz3QBFtpQmKq58RfM",
        "_score" : 0.9808291,
        "_source" : {
          "sid" : "s1090101",
          "name" : "王小明",
          "age" : 18,
          "class" : "資工一1"
        }
      }
    ]
  }
}

In Python, the search method of the ES client accepts the same query bodies shown above. Here is the complete code:

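from elasticsearch import Elasticsearch
import json


def get_query():
    """Return the request body that searches for documents where age = 20."""
    query = {
        "query": {
            "bool": {
                "must": {
                    "term": {
                        "age": 20
                    }
                }
            }
        }
    }
    return query


if __name__ == "__main__":
    # Connection arguments in the style of the 7.x Python client
    # (an assumption based on the ES 7 style responses shown above).
    es = Elasticsearch(hosts='192.168.1.59', port=9200)
    query = get_query()
    # Index name taken from the "_index" field of the responses shown above.
    result = es.search(index='school_members', body=query)
    # ensure_ascii=False keeps the Chinese names readable in the output.
    print(json.dumps(result, ensure_ascii=False))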
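If the age should be a parameter rather than hard-coded, the query can be wrapped in a small function. The sketch below is one possible shape, not the article's own code: `search_students` and `FakeClient` are hypothetical names, the call style `search(index=..., body=...)` mirrors the code above, and the fake client exists only so the function can be tried without a running cluster. It uses filter instead of must, on the assumption that the score is not needed here.

```python
def search_students(client, index, age):
    """Run a term query on age and return only the matched _source documents."""
    query = {"query": {"bool": {"filter": [{"term": {"age": age}}]}}}
    result = client.search(index=index, body=query)
    return [hit["_source"] for hit in result["hits"]["hits"]]


class FakeClient:
    """Stand-in for the real client, used only to exercise the function offline."""

    def search(self, index, body):
        # Remember what was asked so the call can be inspected afterwards.
        self.last_call = (index, body)
        return {"hits": {"hits": [{"_source": {"name": "許小美", "age": 20}}]}}


fake = FakeClient()
print(search_students(fake, "school_members", 20))
```

With a real cluster, the `es` object created in the code above would be passed in place of `fake`.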

That's it for today's article. The next one will show how to search object and parent/child fields.

