Elasticsearch is an efficient, scalable full-text search engine.
C:\Program Files\elasticsearch
cd "C:\Program Files\elasticsearch\bin" && elasticsearch.bat
service install elasticsearch
net start elasticsearch
net stop elasticsearch
{
  "status": 200,
  "name": "Smart Alec",
  "cluster_name": "elasticsearch",
  "version": {
    "number": "1.6.0",
    "build_hash": "cdd3ac4dde4f69524ec0a14de3828cb95bbb86d0",
    "build_timestamp": "2015-06-09T13:36:34Z",
    "build_snapshot": false,
    "lucene_version": "4.10.4"
  },
  "tagline": "You Know, for Search"
}
es exposes a standard REST API, and every cluster operation goes through it:
Format: curl -X<REST verb> <Node>:<Port>/<Index>/<Type>/<ID>
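As a rough illustration of the format above, the following Python sketch assembles the URL from its parts and prints the matching curl command. The helper names build_url and curl_command are made up for this example; they are not part of es or curl.

```python
def build_url(node, port, index=None, doc_type=None, doc_id=None):
    """Assemble <Node>:<Port>/<Index>/<Type>/<ID>, skipping parts that are not given."""
    parts = [str(p) for p in (index, doc_type, doc_id) if p is not None]
    return "http://{}:{}/{}".format(node, port, "/".join(parts))


def curl_command(verb, url):
    """Render the request as a curl command line (hypothetical helper)."""
    return "curl -X{} '{}'".format(verb, url)


url = build_url("localhost", 9200, "test1", "user", 1)
print(curl_command("GET", url))  # curl -XGET 'http://localhost:9200/test1/user/1'
```

Leaving out the trailing parts yields shorter paths, so the same helper covers index-level and cluster-level requests.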
cd "C:\Program Files\elasticsearch\bin" && plugin -i elasticsearch/marvel/latest
GET _cat/health?v
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks
1442227489 18:44:49  elasticsearch yellow          1         1     50  50    0    0       50             0
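Notes:
status: the health of the cluster; possible values are green, yellow, and red. green means all primary shards and replicas are allocated; yellow means all primary shards are allocated but at least one replica is not; red means at least one primary shard is not allocated
node.total: the number of nodes in the cluster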
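The column output of the _cat APIs is easy to read in a script. The sketch below is a minimal parser, assuming whitespace-separated columns with no empty cells; parse_cat_table is a hypothetical helper name, and the sample text is the health output shown above.

```python
def parse_cat_table(text):
    """Turn _cat-style column output into a list of dicts keyed by column name.

    Assumes every cell is non-empty; a blank cell would shift the columns.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


sample = """epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks
1442227489 18:44:49 elasticsearch yellow 1 1 50 50 0 0 50 0"""

row = parse_cat_table(sample)[0]
print(row["status"], row["node.total"], row["unassign"])  # yellow 1 50
```

Here the 50 unassigned shards are the replicas that cannot be placed on a single-node cluster, which is why the status is yellow.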
GET /_cat/nodes?v
host    ip            heap.percent ram.percent load node.role master name
silence 192.168.1.111           30          51      d         *      Thunderbird
Input: GET /_cat/indices?v
Output:
health status index             pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .marvel-2015.09.02   1   1      93564            0     78.4mb         78.4mb
yellow open   .marvel-2015.09.01   1   1      39581            0     45.9mb         45.9mb
Input: PUT /test1?pretty
Output:
{
  "acknowledged" : true
}
List all indices:
health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open   test1   5   1          0            0       575b           575b
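Notes:
health: only one node is running, and a replica cannot be placed on the same node as its primary shard, so the replicas are unassigned and the index status is yellow
index: the index name
pri: the number of primary shards
rep: the number of replicas of each primary shard
docs.count: the number of documents in the index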
Index a document
PUT
Output:
{
  "_index" : "test1",
  "_type" : "user",
  "_id" : "1",
  "_version" : 1,
  "created" : true
}
POST
Output:
{
  "_index" : "test1",
  "_type" : "user",
  "_id" : "2",
  "_version" : 1,
  "created" : true
}
POST
Output:
{
  "_index" : "test1",
  "_type" : "user",
  "_id" : "AU_MdQoXRYiHSIs7UGBQ",
  "_version" : 1,
  "created" : true
}
Note: to index a document under a specific ID, submit it with PUT or POST and state the ID explicitly; to have es generate the ID automatically, submit it with POST
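The rule in the note can be written down as a small decision function. This is only a sketch: index_request is a hypothetical helper that returns the HTTP verb and path for an indexing request, choosing PUT when an ID is given (POST would also work in that case) and POST when es should generate the ID.

```python
def index_request(index, doc_type, doc_id=None):
    """Return (verb, path) for indexing a document.

    With an explicit ID the path ends in that ID; without one, POST to the
    type so that es generates the ID.
    """
    base = "/{}/{}".format(index, doc_type)
    if doc_id is None:
        return "POST", base
    return "PUT", "{}/{}".format(base, doc_id)


print(index_request("test1", "user", 1))  # ('PUT', '/test1/user/1')
print(index_request("test1", "user"))     # ('POST', '/test1/user')
```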
Read a document:
Input: GET /test1/user/1?pretty
Output:
{
  "_index" : "test1",
  "_type" : "user",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source":{"name": "silence1"}
}
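Notes:
_index, _type: the Index and Type in which the document is stored
_id: the ID of the document
_version: the version number of the document, used mainly for optimistic locking under concurrent writes to prevent dirty data
found: whether the requested document exists
_source: the content of the document, in JSON format
Note: we never created the user Type beforehand; it was created automatically when the document was indexed. In es you can skip creating the Index and Type explicitly and rely on default parameters, or on settings inferred from the submitted data, but this is not recommended. When you are not sure what the consequences might be, create the Index and Type explicitly and set their parameters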
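To make the explicit approach concrete, here is a sketch of a request body for creating the index with its settings and a mapping for the user Type. It assumes the 1.x create-index body layout (settings plus mappings keyed by Type); the helper name create_index_body and the field types chosen for name and age are illustrative assumptions, not taken from the original setup.

```python
import json


def create_index_body(shards, replicas, doc_type, properties):
    """Build a create-index body with explicit shard settings and a type mapping."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {doc_type: {"properties": properties}},
    }


# Field types below are assumptions made for the example.
body = create_index_body(
    5, 1, "user",
    {"name": {"type": "string"}, "age": {"type": "integer"}},
)
print("PUT /test1")
print(json.dumps(body, indent=2))
```

On a single-node setup, passing 0 for replicas would avoid the yellow status described earlier, at the cost of having no redundancy.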
Delete a document:
Input: DELETE /test1/user/1?pretty
Output:
{
  "found" : true,
  "_index" : "test1",
  "_type" : "user",
  "_id" : "1",
  "_version" : 2
}
Output when the document is read again:
{
  "_index" : "test1",
  "_type" : "user",
  "_id" : "1",
  "found" : false
}
Input: DELETE /test1?pretty
Output:
{
  "acknowledged" : true
}
Input to create the initial document:
PUT
Input to modify the document:
PUT
Output when the document is read:
{
  "_index" : "test1",
  "_type" : "user",
  "_id" : "1",
  "_version" : 2,
  "found" : true,
  "_source":{"name" : "silence1"}
}
Input to update the data:
POST /test1/user/1/_update?pretty
{"doc" : {"name" : "silence3", "age":28}}
Output when the data is read:
{
  "_index" : "test1",
  "_type" : "user",
  "_id" : "1",
  "_version" : 3,
  "found" : true,
  "_source":{"name":"silence3","age":28}
}
Input to update the document:
POST
Output when the document is read:
{
  "_index" : "test1",
  "_type" : "user",
  "_id" : "1",
  "_version" : 4,
  "found" : true,
  "_source":{"name":"silence3","age":29}
}
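Note: to use a script in a POST update, script.groovy.sandbox.enabled: true must be set in elasticsearch/config/elasticsearch.yml
The difference between modifying (PUT) and updating (POST + _update): a modification overwrites the document stored in es with the submitted document, while an update overwrites only the fields named in the request and leaves the others as they were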
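The difference can be modelled locally with plain dictionaries. The sketch below is not how es implements it; replace_document and partial_update are hypothetical names, and the stored document is sample data chosen for the example.

```python
import copy


def replace_document(stored, submitted):
    """PUT semantics: the submitted document replaces the stored one entirely."""
    return copy.deepcopy(submitted)


def partial_update(stored, doc):
    """POST + _update semantics with "doc": merge the given fields into the stored document."""
    merged = copy.deepcopy(stored)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = partial_update(merged[key], value)  # merge nested objects
        else:
            merged[key] = copy.deepcopy(value)
    return merged


stored = {"name": "silence1", "age": 28}
print(replace_document(stored, {"name": "silence3"}))  # {'name': 'silence3'}
print(partial_update(stored, {"name": "silence3"}))    # {'name': 'silence3', 'age': 28}
```

With PUT the age field is lost because it was not resubmitted; with the partial update it survives.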
Input:
DELETE /test1/user/_query?pretty
{"query" : {"match" : {"name" : "silence3"}}}
Output:
{
  "_indices" : {
    "test1" : {
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      }
    }
  }
}
Input: GET /test1/user/_count?pretty
Output:
{
  "count" : 0,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  }
}
Input:
POST /test1/user/_bulk?pretty
{"index" : {"_id" : 1}}
{"name" : "silence1"}
{"index" : {"_id" : 2}}
{"name" : "silence2"}
{"index" : {}}
{"name" : "silence3"}
{"index" : {}}
{"name" : "silence4"}
Input:
POST /test1/user/_bulk?pretty
{"update" : {"_id" : 1}}
{"doc" : {"age" : 28}}
{"delete" : {"_id" : 2}}
Import data from a file: curl -XPOST "localhost:9200/test1/account/_bulk?pretty" --data-binary @accounts.json
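A bulk body is newline-delimited JSON: one action line, followed by a source line for every action except delete, and a newline at the very end. The following sketch builds such a body from a list of operations; bulk_body and the tuple layout are assumptions made for this example.

```python
import json


def bulk_body(operations):
    """Serialize (action, metadata, source) tuples as a newline-delimited bulk body.

    source is None for actions that carry no body line, such as delete.
    """
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}))
        if source is not None:
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = bulk_body([
    ("index", {"_id": 1}, {"name": "silence1"}),
    ("update", {"_id": 1}, {"doc": {"age": 28}}),
    ("delete", {"_id": 2}, None),
])
print(body, end="")
```

The result could be written to a file and sent with the curl command above, where --data-binary keeps the newlines intact.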
Queries can be submitted in two ways: as parameters in the query string, or as a request body sent through the REST API
Input to fetch all documents: GET /test1/user/_search?q=*&pretty
POST /test1/user/_search?pretty
{
  "query" : {"match_all" : {}}
}
Output:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "test1",
        "_type": "user",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "silence1",
          "age": 28
        }
      },
      {
        "_index": "test1",
        "_type": "user",
        "_id": "AU_M2zgwLNdQvgqQS3MP",
        "_score": 1,
        "_source": {
          "name": "silence3"
        }
      },
      {
        "_index": "test1",
        "_type": "user",
        "_id": "AU_M2zgwLNdQvgqQS3MQ",
        "_score": 1,
        "_source": {
          "name": "silence4"
        }
      }
    ]
  }
}
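Notes:
took: the time taken to execute the query (in milliseconds)
timed_out: whether the query timed out
_shards: how many shards took part in the query, and how many of them succeeded and failed
hits: the query results
hits.total: the total number of matching documents
_score, max_score: how well a document matches the query, and the highest such score among the results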
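Input:
POST /test1/account/_search?pretty
{
  "query" : {"match_all":{}},
  "size": 2,
  "from" : 6,
  "sort" : {
    "age" : {"order" : "asc"}
  }
}
Notes:
query: defines the query conditions
match_all: matches all documents
size: the number of documents to return; defaults to 10 if not set
from: the starting offset; es uses 0 as the first index. Often combined with size for pagination; defaults to 0 if not set
sort: sets the sort field and order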
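Pagination with from and size comes down to simple arithmetic: page n (counted from 1) starts at offset (n - 1) * size. The sketch below builds the request body for a given page; page_query is a hypothetical helper name.

```python
def page_query(page, page_size=10, sort_field=None, order="asc"):
    """Build a match_all search body for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    body = {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # es offsets are 0-based
        "size": page_size,
    }
    if sort_field is not None:
        body["sort"] = {sort_field: {"order": order}}
    return body


body = page_query(4, page_size=2, sort_field="age")
print(body["from"], body["size"])  # 6 2
```

Page 4 with two documents per page gives from 6 and size 2, the same values as in the request above.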
POST /test1/account/_search?pretty
{
  "query": {
    "match_all": {}
  },
  "_source":["firstname", "lastname", "age"]
}
Output:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1000,
    "max_score": 1,
    "hits": [
      {
        "_index": "test1",
        "_type": "account",
        "_id": "4",
        "_score": 1,
        "_source": {
          "firstname": "Rodriquez",
          "age": 31,
          "lastname": "Flores"
        }
      },
      {
        "_index": "test1",
        "_type": "account",
        "_id": "9",
        "_score": 1,
        "_source": {
          "firstname": "Opal",
          "age": 39,
          "lastname": "Meadows"
        }
      }
    ]
  }
}
POST /test1/account/_search?pretty
{
  "query": {
    "match": {"address" : "986 Wyckoff Avenue"}
  },
  "size" : 2
}
Output:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 216,
    "max_score": 4.1231737,
    "hits": [
      {
        "_index": "test1",
        "_type": "account",
        "_id": "4",
        "_score": 4.1231737,
        "_source": {
          "account_number": 4,
          "balance": 27658,
          "firstname": "Rodriquez",
          "lastname": "Flores",
          "age": 31,
          "gender": "F",
          "address": "986 Wyckoff Avenue",
          "employer": "Tourmania",
          "email": "rodriquezflores@tourmania.com",
          "city": "Eastvale",
          "state": "HI"
        }
      },
      {
        "_index": "test1",
        "_type": "account",
        "_id": "34",
        "_score": 0.59278774,
        "_source": {
          "account_number": 34,
          "balance": 35379,
          "firstname": "Ellison",
          "lastname": "Kim",
          "age": 30,
          "gender": "F",
          "address": "986 Revere Place",
          "employer": "Signity",
          "email": "ellisonkim@signity.com",
          "city": "Sehili",
          "state": "IL"
        }
      }
    ]
  }
}
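Note: the results show that the query does not return only documents whose address contains "986 Wyckoff Avenue"; it returns documents containing any one of the three terms 986, Wyckoff, and Avenue. This is the power of es tokenization
The results are ordered by _score (how well they match the query) from highest to lowest
What if you want only the documents whose address contains "986 Wyckoff Avenue"? Use match_phrase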
Input:
POST /test1/account/_search?pretty
{
  "query": {
    "match_phrase": {"address" : "986 Wyckoff Avenue"}
  }
}
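The difference between match and match_phrase can be illustrated with a toy model. This sketch is a simplification, not the es implementation: it assumes lowercased whitespace tokenization in place of a real analyzer and ignores scoring, and the function names are made up for the example.

```python
def tokens(text):
    """Stand-in for an analyzer: lowercase and split on whitespace (an assumption)."""
    return text.lower().split()


def match(field_value, query):
    """True if the field contains at least one of the query terms."""
    field_terms = tokens(field_value)
    return any(term in field_terms for term in tokens(query))


def match_phrase(field_value, query):
    """True only if the query terms appear next to each other, in the same order."""
    doc, q = tokens(field_value), tokens(query)
    return any(doc[i:i + len(q)] == q for i in range(len(doc) - len(q) + 1))


query = "986 Wyckoff Avenue"
print(match("986 Revere Place", query))          # True: the term 986 is shared
print(match_phrase("986 Revere Place", query))   # False: the phrase is absent
print(match_phrase("986 Wyckoff Avenue", query)) # True
```

This mirrors the results above: "986 Revere Place" is returned by match because of the shared term, but would be excluded by match_phrase.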
As you may have noticed, each query above has only one condition. To combine several conditions, use a bool query
Input:
POST /test1/account/_search?pretty
{
  "query": {
    "bool" : {
      "must":[
        {"match_phrase": {"address" : "986 Wyckoff Avenue"}},
        {"match" : {"age" : 31}}
      ]
    }
  }
}
Note: returns the documents that satisfy all of the conditions
Input:
POST /test1/account/_search
{
  "query": {
    "bool" : {
      "should":[
        {"match_phrase": {"address" : "986 Wyckoff Avenue"}},
        {"match_phrase": {"address" : "963 Neptune Avenue"}}
      ]
    }
  }
}
Note: returns the documents that satisfy at least one of the conditions
Input:
POST /test1/account/_search
{
  "query": {
    "bool" : {
      "must_not":[
        {"match": {"city" : "Eastvale"}},
        {"match": {"city" : "Olney"}}
      ]
    }
  }
}
Note: returns the documents that satisfy none of the conditions
In the Query DSL, must, must_not, and should can be combined
Input:
POST /test1/account/_search
{
  "query": {
    "bool" : {
      "must": [{
        "match" : {"age":20}
      }],
      "must_not":[
        {"match": {"city" : "Steinhatchee"}}
      ]
    }
  }
}
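When such queries are assembled in code, a small helper keeps the structure consistent. The following sketch builds a bool query body from lists of clauses and omits the sections that are empty; bool_query is a hypothetical helper name.

```python
import json


def bool_query(must=(), should=(), must_not=()):
    """Build a search body with a bool query, including only non-empty sections."""
    clauses = {}
    for name, items in (("must", must), ("should", should), ("must_not", must_not)):
        if items:
            clauses[name] = list(items)
    return {"query": {"bool": clauses}}


body = bool_query(
    must=[{"match": {"age": 20}}],
    must_not=[{"match": {"city": "Steinhatchee"}}],
)
print(json.dumps(body, indent=2))
```

The printed body is equivalent to the combined must and must_not request above.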
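Every hit returned by a Query search carries a _score value, and computing it takes work. In some cases we do not need _score. For those cases es provides Filters, which resemble queries but are more efficient: a filter does not compute _score, and its results can be cached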
POST /test1/account/_search?pretty
{
  "query": {
    "filtered":{
      "query": {
        "match_all" : {}
      },
      "filter": {
        "range" : {
          "age" : {
            "gte" : 20,
            "lt" : 28
          }
        }
      }
    }
  }
}
The simplest way to choose between a filter and a query is to ask whether you care about _score: if you do, use a query; if not, use a filter
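A filter is in essence a yes/no test per document, with no score attached. The sketch below models the range filter from the request above on local sample data; range_filter and the sample documents are assumptions made for the illustration.

```python
def range_filter(field, gte=None, lt=None):
    """Return a predicate that accepts documents with gte <= doc[field] < lt."""
    def accept(doc):
        value = doc.get(field)
        if value is None:
            return False
        if gte is not None and value < gte:
            return False
        if lt is not None and value >= lt:
            return False
        return True
    return accept


docs = [{"age": 19}, {"age": 20}, {"age": 27}, {"age": 28}]
keep = range_filter("age", gte=20, lt=28)
print([d["age"] for d in docs if keep(d)])  # [20, 27]
```

The lower bound is inclusive (gte) and the upper bound exclusive (lt), so 20 passes and 28 does not.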
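es provides Aggregations for grouping and aggregate queries, similar to GROUP BY and aggregate functions in relational databases. A call to the aggregation REST API returns both the document hits and the aggregation results, and it can return several aggregation results at once, which simplifies API usage and reduces network traffic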
Input:
POST /test1/account/_search?pretty
{
  "size" : 0,
  "aggs" : {
    "group_by_gender" : {
      "terms" : {"field":"gender"}
    }
  }
}
Output:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1000,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_gender": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "m",
          "doc_count": 507
        },
        {
          "key": "f",
          "doc_count": 493
        }
      ]
    }
  }
}
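Notes:
size: the number of document hits to return (0 here, so only the aggregation results come back)
aggs: defines the aggregations
terms: sets the field to group by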
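To show what a terms aggregation computes, the following sketch counts documents per field value on a small local sample, much like GROUP BY with COUNT. It is a model of the result shape only, not of how es computes it across shards; terms_aggregation and the sample data are made up for the example.

```python
from collections import Counter


def terms_aggregation(docs, field, size=10):
    """Count documents per value of field and return the top buckets by count."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


docs = [{"gender": "M"}, {"gender": "F"}, {"gender": "M"}]
print(terms_aggregation(docs, "gender"))
# [{'key': 'M', 'doc_count': 2}, {'key': 'F', 'doc_count': 1}]
```

In the es output above the keys appear in lowercase ("m", "f") because they are the indexed terms, whereas this local model keeps the raw values.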
Input:
POST /test1/account/_search?pretty
{
  "size" : 0,
  "aggs" : {
    "group_by_gender" : {
      "terms" : {
        "field":"state",
        "order" : {"avg_age":"desc"},
        "size" : 3
      },
      "aggs" : {
        "avg_age" : {
          "avg" : {"field" : "age"}
        },
        "max_age" : {
          "max" : {"field": "age"}
        },
        "min_age" : {
          "min": {"field":"age"}
        }
      }
    }
  }
}
Output:
{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1000,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_gender": {
      "doc_count_error_upper_bound": -1,
      "sum_other_doc_count": 992,
      "buckets": [
        {
          "key": "de",
          "doc_count": 1,
          "max_age": {
            "value": 37
          },
          "avg_age": {
            "value": 37
          },
          "min_age": {
            "value": 37
          }
        },
        {
          "key": "il",
          "doc_count": 3,
          "max_age": {
            "value": 39
          },
          "avg_age": {
            "value": 36.333333333333336
          },
          "min_age": {
            "value": 32
          }
        },
        {
          "key": "in",
          "doc_count": 4,
          "max_age": {
            "value": 39
          },
          "avg_age": {
            "value": 36
          },
          "min_age": {
            "value": 34
          }
        }
      ]
    }
  }
}
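Note: groups by state and computes the maximum, minimum, and average age of the people in each group; the buckets are sorted by average age and the top 3 are returned
To sort by the number of documents in each group instead, use _count as the key in order
The size setting of an aggregation sets how many top buckets are returned; the default is 10
Aggregations can be nested arbitrarily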
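The grouped statistics above can be modelled locally as follows. This is a sketch on made-up sample data, intended only to show what the terms bucket with avg, max, and min sub-aggregations, the order, and the size express; stats_by is a hypothetical helper name.

```python
from collections import defaultdict


def stats_by(docs, group_field, value_field, size=3):
    """Group docs, compute avg/max/min per group, sort by avg descending, keep the top size."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    buckets = [
        {
            "key": key,
            "doc_count": len(values),
            "avg_age": sum(values) / len(values),
            "max_age": max(values),
            "min_age": min(values),
        }
        for key, values in groups.items()
    ]
    buckets.sort(key=lambda b: b["avg_age"], reverse=True)
    return buckets[:size]


# Sample data invented for the illustration.
docs = [
    {"state": "IL", "age": 39}, {"state": "IL", "age": 32},
    {"state": "DE", "age": 37}, {"state": "IN", "age": 34},
    {"state": "HI", "age": 31},
]
result = stats_by(docs, "state", "age")
print([(b["key"], b["avg_age"]) for b in result])
# [('DE', 37.0), ('IL', 35.5), ('IN', 34.0)]
```

Sorting by len(values) instead of the average would correspond to ordering by _count.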
Input:
POST /test1/account/_search?pretty
{
  "size" : 0,
  "aggs" : {
    "group_by_age" : {
      "range" : {
        "field": "age",
        "ranges" : [{
          "from" : 20,
          "to" : 30
        }, {
          "from": 30,
          "to" : 40
        },{
          "from": 40,
          "to": 50
        }]
      },
      "aggs":{
        "group_by_gender" : {
          "terms" : {"field": "gender"},
          "aggs" : {
            "group_by_balance" :{
              "range" : {
                "field":"balance",
                "ranges" : [{
                  "to" : 5000
                }, {
                  "from" : 5000
                }]
              }
            }
          }
        }
      }
    }
  }
}
Output:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1000,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_age": {
      "buckets": [
        {
          "key": "20.0-30.0",
          "from": 20,
          "from_as_string": "20.0",
          "to": 30,
          "to_as_string": "30.0",
          "doc_count": 451,
          "group_by_gender": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "m",
                "doc_count": 232,
                "group_by_balance": {
                  "buckets": [
                    {
                      "key": "*-5000.0",
                      "to": 5000,
                      "to_as_string": "5000.0",
                      "doc_count": 9
                    },
                    {
                      "key": "5000.0-*",
                      "from": 5000,
                      "from_as_string": "5000.0",
                      "doc_count": 223
                    }
                  ]
                }
              },
              {
                "key": "f",
                "doc_count": 219,
                "group_by_balance": {
                  "buckets": [
                    {
                      "key": "*-5000.0",
                      "to": 5000,
                      "to_as_string": "5000.0",
                      "doc_count": 20
                    },
                    {
                      "key": "5000.0-*",
                      "from": 5000,
                      "from_as_string": "5000.0",
                      "doc_count": 199
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "key": "30.0-40.0",
          "from": 30,
          "from_as_string": "30.0",
          "to": 40,
          "to_as_string": "40.0",
          "doc_count": 504,
          "group_by_gender": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "f",
                "doc_count": 253,
                "group_by_balance": {
                  "buckets": [
                    {
                      "key": "*-5000.0",
                      "to": 5000,
                      "to_as_string": "5000.0",
                      "doc_count": 26
                    },
                    {
                      "key": "5000.0-*",
                      "from": 5000,
                      "from_as_string": "5000.0",
                      "doc_count": 227
                    }
                  ]
                }
              },
              {
                "key": "m",
                "doc_count": 251,
                "group_by_balance": {
                  "buckets": [
                    {
                      "key": "*-5000.0",
                      "to": 5000,
                      "to_as_string": "5000.0",