内容简介:Elasticsearch是一个基于Apache Lucene(TM)的开源搜索引擎(以下简称ES),是目前全文搜索引擎的首选。它可以快速存储、搜索和分析海量数据,Github,StackOverflow都在采用它。ES对照RMDB快速了解ES基本组成,它可以包含多个索引(indices)(数据库),每一个索引可以包含多个类型(types)(表),每一个类型包含多个文档(documents)(行),然后每个文档包含多个字段(Fields)(列),简化如下:索引
目录
-
- 3. 查看集群有哪些索引
- 6. 新增文档并建立索引
- 11. 查询索引的表和字段定义
- 12.查询DSL(Domain Specified Language,特定领域的语言 )
Elasticsearch是一个基于Apache Lucene(TM)的开源搜索引擎(以下简称ES),是目前全文搜索引擎的首选。它可以快速存储、搜索和分析海量数据,Github,StackOverflow都在采用它。
一、ES组成
ES对照RMDB快速了解ES基本组成,它可以包含多个索引(indices)(数据库),每一个索引可以包含多个类型(types)(表),每一个类型包含多个文档(documents)(行),然后每个文档包含多个字段(Fields)(列),简化如下:
索引 -> 数据库
类型 -> 表
文档 -> 行
字段 -> 列
二、常用查询命令
1. 查看_cat相关命令
GET /_cat/
结果:
➜ ~ curl -i -XGET http://192.168.11.119:9200/_cat/ HTTP/1.1 200 OK content-type: text/plain; charset=UTF-8 content-length: 493 >=^.^= /_cat/allocation /_cat/shards /_cat/shards/{index} /_cat/master /_cat/nodes /_cat/tasks /_cat/indices /_cat/indices/{index} /_cat/segments /_cat/segments/{index} /_cat/count /_cat/count/{index} /_cat/recovery /_cat/recovery/{index} /_cat/health /_cat/pending_tasks /_cat/aliases /_cat/aliases/{alias} /_cat/thread_pool /_cat/thread_pool/{thread_pools} /_cat/plugins /_cat/fielddata /_cat/fielddata/{fields} /_cat/nodeattrs /_cat/repositories /_cat/snapshots/{repository} /_cat/templates
2.查看集群健康
GET /_cat/health?v
结果:
➜ ~ curl -XGET http://192.168.11.119:9200/_cat/health\?v epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent 1533717572 08:39:32 elasticsearch yellow 1 1 315 315 0 0 315 0 - 50.0%
green:每个索引的primary shard和replica shard都是active状态的
yellow:每个索引的primary shard都是active状态的,但是部分replica shard不是active状态,处于不可用的状态
red:不是所有索引的primary shard都是active状态的,部分索引有数据丢失了
为什么现在会处于一个yellow状态?
我们现在就一台服务器,就启动了一个es进程,相当于就只有一个node。现在es中有一个index,就是kibana自己内置建立的index。由于默认的配置是给每个index分配5个primary shard和5个replica shard,而且primary shard和replica shard不能在同一台机器上(为了容错)。现在kibana自己建立的index是1个primary shard和1个replica shard。当前就一个node,所以只有1个primary shard被分配了和启动了,但是一个replica shard没有第二台机器去启动。
3. 查看集群有哪些索引
GET /_cat/indices\?v
结果:
➜ ~ curl -i -XGET http://192.168.11.119:9200/_cat/indices\?v HTTP/1.1 200 OK content-type: text/plain; charset=UTF-8 content-length: 8840 >health status index uuid pri rep docs.count docs.deleted store.size pri.store.size yellow open es_es_category_products u2TdPYcXS5yyFF8P3a3jYQ 5 1 95311 7103 156mb 156mb yellow open web_product_ar_new 8qhhh9C7QvuwEEu-YYrIgA 5 1 37610 77 55.6mb 55.6mb yellow open en_27_category_product VtVXVTuHQ3-xyNw4txpEXg 5 1 41206 20 68mb 68mb yellow open ar_27_category_product Id43cmuDQnKYkhaCepxrIg 5 1 41206 17 67.9mb 67.9mb yellow open it_28_category_product Gltx9R80Qn6PI22i6-Mflg 5 1 12659 25 22.5mb 22.5mb yellow open db_search WKYGbjjLSZmh0s_LyuT2tQ 5 1 230133 0 28.7mb 28.7mb yellow open de_28_category_product IUCYcmTIR6K4AzUpAWJmHg 5 1 12659 27 22.5mb 22.5mb
4. 创建索引
PUT /test_index?pretty
结果:
➜ ~ curl -i -XPUT http://192.168.11.119:9200/test_index\?pretty HTTP/1.1 200 OK content-type: application/json; charset=UTF-8 content-length: 60 { "acknowledged" : true, "shards_acknowledged" : true
5.删除索引
DELETE /test_index?pretty
6. 新增文档并建立索引
语法格式:
PUT /index/type/id { "json数据" }
index索引名、type类型名、id数据的id
PUT /test_index/user/1
{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳"]
}
结果如下:
➜ ~ curl -i -XPUT http://192.168.11.119:9200/test_index/user/1 -d '{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳"]
}'
>HTTP/1.1 201 Created
Location: /test_index/user/1
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 08:58:29 GMT"
content-type: application/json; charset=UTF-8
content-length: 143
>{"_index":"test_index","_type":"user","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}%
6.查询新增的文档
GET /索引/类型/字段值
例如:
➜ ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/1\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 232
{
"_index" : "test_index",
"_type" : "user",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "小明",
"email" : "[email protected]",
"tags" : [
"篮球",
"游泳"
]
}
}
7.修改文档
修改分为全部修改或部分修改,全部修改就是直接替换,需要带上全部字段才能修改,例如:
➜ ~ curl -i -XPUT http://192.168.11.119:9200/test_index/user/1 -d '{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳","足球"]
}'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 09:15:45 GMT"
content-type: application/json; charset=UTF-8
content-length: 144
{"_index":"test_index","_type":"user","_id":"1","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}
注意全部修改用的是PUT方法.
部分修改就是只更新部分,用的POST方法,参数部分增加了一个doc的key,例如:
➜ ~ curl -i -XPOST http://192.168.11.119:9200/test_index/user/1/_update -d '{
"doc":{
"email": "[email protected]"
}
}'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 09:18:26 GMT"
content-type: application/json; charset=UTF-8
content-length: 128
{"_index":"test_index","_type":"user","_id":"1","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0}}
8.删除文档
DELETE /test_index/user/1
例如:
➜ ~ curl -i -XDELETE http://192.168.11.119:9200/test_index/user/2 HTTP/1.1 200 OK content-type: application/json; charset=UTF-8 content-length: 141 {"found":true,"_index":"test_index","_type":"user","_id":"2","_version":2,"result":"deleted","_shards":{"total":2,"successful":1,"failed":0}}
9.查询字符串
GET /test_index/user
例如:
➜ ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 793
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "小王",
"email" : "[email protected]",
"tags" : [
"游泳"
]
}
},
{
"_index" : "test_index",
"_type" : "user",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "小明",
"email" : "[email protected]",
"tags" : [
"篮球",
"游泳",
"足球"
]
}
}
]
}
}
查询返回值参数说明
took:耗费了几毫秒 timed_out:是否超时,这里是没有 _shards:数据拆成了5个分片,所以对于搜索请求,会打到所有的primary shard(或者是它的某个replica shard也可以) hits.total:查询结果的数量,3个document hits.max_score:score的含义,就是document对于一个search的相关度的匹配分数,越相关,就越匹配,分数也高 hits.hits:包含了匹配搜索的document的详细数据
搜索名字为bruce的用户,而且按照email倒序
➜ ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty\&q=name:'bruce'&sort=email:desc
[1] 26574
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 479
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.1727304,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "4",
"_score" : 1.1727304,
"_source" : {
"name" : "Bruce",
"email" : "[email protected]",
"tags" : [
"Hello"
]
}
}
]
}
}
[1] + 26574 done curl -i -XGET
通过这个例子发现这样搜索是不区分大小写的.适用于临时的在命令行使用一些工具,比如curl,快速的发出请求,来检索想要的信息;但是如果查询请求很复杂,是很难去构建,在实际的生产环境中,几乎很少使用查询字符串.
11. 查询索引的表和字段定义
查询es所有的表和字段定义
GET /_mapping
查询某个索引的表定义
GET /test_index/_mapping
查询某个索引的表的字段定义
GET /test_index/user/_mapping
例如:
➜ ~ curl -i -XGET http://192.168.11.119:9200/test_index/_mapping\?pretty HTTP/1.1 200 OK content-type: application/json; charset=UTF-8 content-length: 1267 { "test_index" : { "mappings" : { "user" : { "properties" : { "email" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "tags" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } } } }, "role" : { "properties" : { "flag" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } } } } } } }
12.查询DSL(Domain Specified Language,特定领域的语言 )
http request body:请求体,可以用json的格式来构建查询语法,比较方便,可以构建各种复杂的语法,比查询字符串肯定强大多了
- 12.1查询所有文档
➜ ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty -d '
{
"query": {
"match_all": {
}
}
}
'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 12:58:15 GMT"
content-type: application/json; charset=UTF-8
content-length: 1895
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 6,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"name" : "bruce",
"email" : "[email protected]",
"tags" : [
"游泳1"
]
}
},
{
"_index" : "test_index",
"_type" : "user",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "Alex",
"email" : "[email protected]",
"tags" : [
"吃饭"
]
}
}
]
}
}
注意match_all是包含在query字典里的,query处于root节点位置
- 12.2查询包含输入字符的文档
query还是处于root节点,增加一个键值sort排序与query同级,示例:
➜ ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty -d ' { "query": { "match": { "name" : "br" } }, "sort": [ { "email" : "desc" } ] } ' HTTP/1.1 200 OK Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:03:30 GMT" content-type: application/json; charset=UTF-8 content-length: 193 { "took" : 1, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] } }
查询包含Br字符的文档(行),并对结果以email倒序。第一次运行上面语句时报错 Fielddata is disabled on text fields by default. Set fielddata=true on [email] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
,经查询资料,应该是5.x后对 排序 、聚合相关操作用单独的数据结构fileddata缓存到内存里,需调接口开启使用到的字段, 官方解释
, 执行下面的操作开启:
➜ ~ curl -i -XPUT http://192.168.11.119:9200/test_index/_mapping/user\?pretty -d ' { "properties": { "email": { "type": "text", "fielddata": true } } }' HTTP/1.1 200 OK Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:11:11 GMT" content-type: application/json; charset=UTF-8 content-length: 28 { "acknowledged" : true }
很多查询出来结果集很大,需要做分页,用DSL很简单,和query同级增加from和size键值,分表表示起始值和步长,示例
curl -i -XGET http://192.168.11.119:9200/test_index/user/_search?pretty -d ' { "query": { "match_all": { } }, "from" : 1, "size" : 2, "_source" : ["email"], "sort": [ { "email" : "asc" } ] } '
- 12.3查询过滤器
搜索商品名包含Rhinestone,售卖价格小于3大于等于1的商品,结果按售卖价升序,构造DSL语句:
curl -i -XGET http://192.168.11.119:9200/en_es_category_products/product/_search?pretty -d ' { "query": { "bool": { "must" : { "match" : { "product_name" : "Rhinestone" } }, "filter" : { "range" : { "store_price" : { "gte" : 1 "lt" : 3 } } } } }, "_source" : [ "product_id", "product_name", "store_price", "icon" ], "sort": [ { "store_price" : "asc" } ] } '
range操作符包含:
* gt :: 大于 * gte:: 大于等于 * lt :: 小于 * lte:: 小于等于
查询结果:
HTTP/1.1 200 OK Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:15:52 GMT" content-type: application/json; charset=UTF-8 content-length: 1141 { "took" : 2, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 2, "max_score" : null, "hits" : [ { "_index" : "en_es_category_products", "_type" : "product", "_id" : "22100", "_score" : null, "_source" : { "product_id" : 22100, "icon" : "http://patpatdev.s3.amazonaws.com/Product/22100/1688I-SL-003-00008-001.jpg/1464845443.jpg", "store_price" : "2.99", "product_name" : "U-shape Silver Faux Perarl & Rhinestone Clip" }, "sort" : [ 2.99 ] }, { "_index" : "en_es_category_products", "_type" : "product", "_id" : "354460", "_score" : null, "_source" : { "product_id" : 354460, "icon" : "http://patpatdev.s3.us-west-1.amazonaws.com/product/000766000119/5b0e5b0e49e8f.jpg", "store_price" : "2.99", "product_name" : "Pretty Star Decor Rhinestone Stud Hairband for Women" }, "sort" : [ 2.99 ] } ] } }
注意参数嵌套了好几层,很容易写错,query、_source、sort都处于root级,query/bool下包含must、filter两级
以上所述就是小编给大家介绍的《Elasticsearch学习》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!
猜你喜欢:- 一文读懂监督学习、无监督学习、半监督学习、强化学习这四种深度学习方式
- 学习:人工智能-机器学习-深度学习概念的区别
- 统计学习,机器学习与深度学习概念的关联与区别
- 混合学习环境下基于学习行为数据的学习预警系统设计与实现
- 学习如何学习
- 深度学习的学习历程
本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
深入理解Java虚拟机(第2版)
周志明 / 机械工业出版社 / 2013-9-1 / 79.00元
《深入理解Java虚拟机:JVM高级特性与最佳实践(第2版)》内容简介:第1版两年内印刷近10次,4家网上书店的评论近4?000条,98%以上的评论全部为5星级的好评,是整个Java图书领域公认的经典著作和超级畅销书,繁体版在台湾也十分受欢迎。第2版在第1版的基础上做了很大的改进:根据最新的JDK 1.7对全书内容进行了全面的升级和补充;增加了大量处理各种常见JVM问题的技巧和最佳实践;增加了若干......一起来看看 《深入理解Java虚拟机(第2版)》 这本书的介绍吧!