Learning Elasticsearch

Category: Servers · Apache · Published: 7 years ago


Contents

    3. Listing the indices in the cluster
    6. Adding a document and indexing it
    11. Viewing an index's types and field definitions
    12. Query DSL (Domain Specific Language)

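Elasticsearch (ES for short) is an open-source search engine built on Apache Lucene(TM) and is currently a leading choice for full-text search. It can store, search, and analyze large volumes of data quickly; GitHub and Stack Overflow both use it.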

I. Components of ES

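The quickest way to understand the basic structure of ES is by analogy with a relational database (RDBMS). An ES cluster can contain multiple indices (databases), each index can contain multiple types (tables), each type contains multiple documents (rows), and each document contains multiple fields (columns). In short: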

index -> database

type -> table

document -> row

field -> column

II. Common Query Commands

1. Listing the _cat commands

GET /_cat/

Result:

➜  ~ curl -i -XGET http://192.168.11.119:9200/_cat/
HTTP/1.1 200 OK
content-type: text/plain; charset=UTF-8
content-length: 493

=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates

2. Checking cluster health

GET /_cat/health?v

Result:

➜  ~ curl -XGET http://192.168.11.119:9200/_cat/health\?v
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1533717572 08:39:32  elasticsearch yellow          1         1    315 315    0    0      315             0                  -                 50.0%

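green: every index's primary shards and replica shards are all active.

yellow: every index's primary shards are active, but some replica shards are not active and are therefore unavailable.

red: not every index's primary shards are active, so some indices are missing data.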

Why is the cluster yellow right now?

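There is only one server here, running a single ES process, so the cluster has exactly one node. By default (in ES 5.x) each index is given 5 primary shards with one replica each, and a replica shard is never placed on the same node as its primary (for fault tolerance). With a single node, every primary shard is allocated and started, but the replicas have no second node to start on. For example, the index that Kibana creates for itself has 1 primary shard and 1 replica shard: on one node the primary is active and the replica stays unassigned. This matches the output above, where 315 shards are active, 315 are unassigned, and active_shards_percent is 50.0%.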
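The 50.0% figure can be checked by hand from the columns of the health output. Below is a minimal Python sketch, assuming the text returned by GET /_cat/health?v has been saved into a string. The names parse_cat_table and explain_health are hypothetical helpers for illustration, not part of any ES client.

```python
# Sample text copied from the health output shown above.
SAMPLE = """\
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1533717572 08:39:32  elasticsearch yellow          1         1    315 315    0    0      315             0                  -                 50.0%
"""


def parse_cat_table(text):
    """Turn whitespace-separated _cat output (with the ?v header) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def explain_health(row):
    """Summarise one _cat/health row in plain words."""
    active = int(row["shards"])
    initializing = int(row["init"])
    unassigned = int(row["unassign"])
    # Share of shards that are active out of all shards the cluster should have.
    percent = 100.0 * active / (active + initializing + unassigned)
    return "{}: {} active, {} unassigned ({:.1f}% active)".format(
        row["status"], active, unassigned, percent)


health = parse_cat_table(SAMPLE)[0]
print(explain_health(health))
```

Because every replica is unassigned on a single node, exactly half of the shards are active, which is why the status is yellow rather than green.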

3. Listing the indices in the cluster

GET /_cat/indices?v

Result:

➜  ~ curl -i -XGET http://192.168.11.119:9200/_cat/indices\?v
HTTP/1.1 200 OK
content-type: text/plain; charset=UTF-8
content-length: 8840

health status index                                    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   es_es_category_products                  u2TdPYcXS5yyFF8P3a3jYQ   5   1      95311         7103      156mb          156mb
yellow open   web_product_ar_new                       8qhhh9C7QvuwEEu-YYrIgA   5   1      37610           77     55.6mb         55.6mb
yellow open   en_27_category_product                   VtVXVTuHQ3-xyNw4txpEXg   5   1      41206           20       68mb           68mb
yellow open   ar_27_category_product                   Id43cmuDQnKYkhaCepxrIg   5   1      41206           17     67.9mb         67.9mb
yellow open   it_28_category_product                   Gltx9R80Qn6PI22i6-Mflg   5   1      12659           25     22.5mb         22.5mb
yellow open   db_search                                WKYGbjjLSZmh0s_LyuT2tQ   5   1     230133            0     28.7mb         28.7mb
yellow open   de_28_category_product                   IUCYcmTIR6K4AzUpAWJmHg   5   1      12659           27     22.5mb         22.5mb

4. Creating an index

PUT /test_index?pretty

Result:

➜  ~ curl -i -XPUT http://192.168.11.119:9200/test_index\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 60
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}

5. Deleting an index

DELETE /test_index?pretty

6. Adding a document and indexing it

Syntax:

PUT /index/type/id
{
    "JSON data"
}

index is the index name, type is the type name, and id is the ID of the document.

PUT /test_index/user/1
{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳"]
}

Result:

➜  ~ curl -i -XPUT http://192.168.11.119:9200/test_index/user/1 -d '{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳"]
}'

HTTP/1.1 201 Created
Location: /test_index/user/1
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 08:58:29 GMT"
content-type: application/json; charset=UTF-8
content-length: 143

{"_index":"test_index","_type":"user","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
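The Warning header in the response above says that content type detection is deprecated and that the Content-Type header should be set explicitly. With curl that means adding -H 'Content-Type: application/json'. The same request can be sketched in Python with the standard library; build_index_request is a hypothetical helper name, the host is the one used in this article, and the request is only built here, not sent.

```python
import json
import urllib.request

# Host and port are the ones used in this article; adjust for your cluster.
BASE_URL = "http://192.168.11.119:9200"


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT /index/type/id request."""
    url = "{}/{}/{}/{}".format(BASE_URL, index, doc_type, doc_id)
    body = json.dumps(document, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # An explicit content type avoids the deprecation warning seen above.
        headers={"Content-Type": "application/json"},
    )


request = build_index_request(
    "test_index", "user", 1,
    {"name": "小明", "tags": ["篮球", "游泳"]},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request).read()
```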

6. Retrieving the new document

GET /index/type/id

For example:

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/1\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 232
{
"_index" : "test_index",
"_type" : "user",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "小明",
"email" : "[email protected]",
"tags" : [
"篮球",
"游泳"
]
}
}

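7. Updating a document

An update is either full or partial. A full update simply replaces the stored document, so the request must carry every field. For example: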

➜  ~  curl -i -XPUT http://192.168.11.119:9200/test_index/user/1 -d '{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳","足球"]
}'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 09:15:45 GMT"
content-type: application/json; charset=UTF-8
content-length: 144
{"_index":"test_index","_type":"user","_id":"1","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}

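Note that a full update uses the PUT method.

A partial update changes only the fields that are sent. It uses the POST method on the _update endpoint, and the fields are wrapped in a doc key. For example: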

➜  ~ curl -i -XPOST http://192.168.11.119:9200/test_index/user/1/_update -d '{
"doc":{
"email": "[email protected]"
}
}'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 09:18:26 GMT"
content-type: application/json; charset=UTF-8
content-length: 128
{"_index":"test_index","_type":"user","_id":"1","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0}}
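The difference between the two kinds of update can be illustrated locally, without a cluster. The sketch below only simulates the semantics described above (it is not how ES implements them): full_update and partial_update are hypothetical names, and the email values are placeholders because the real addresses are obscured in this article.

```python
import copy


def full_update(stored, new_document):
    """PUT semantics: the stored document is replaced wholesale."""
    return copy.deepcopy(new_document)


def partial_update(stored, doc):
    """POST .../_update semantics, simplified: keys in `doc` overwrite or
    extend the stored document and all other keys are kept. Nested objects
    are merged recursively; arrays are replaced as a whole."""
    merged = copy.deepcopy(stored)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = partial_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


stored = {"name": "小明", "email": "old@example.com", "tags": ["篮球", "游泳"]}

replaced = full_update(stored, {"name": "小明"})
patched = partial_update(stored, {"email": "new@example.com"})

print(sorted(replaced))  # only the fields that were sent remain
print(sorted(patched))   # all original fields are still present
```

This is why a full update that omits a field silently drops it, while a partial update leaves untouched fields alone.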

8. Deleting a document

DELETE /test_index/user/1

For example:

➜  ~ curl -i -XDELETE http://192.168.11.119:9200/test_index/user/2
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 141
{"found":true,"_index":"test_index","_type":"user","_id":"2","_version":2,"result":"deleted","_shards":{"total":2,"successful":1,"failed":0}}

9. Query-string search

GET /test_index/user/_search

For example:

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 793
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "小王",
"email" : "[email protected]",
"tags" : [
"游泳"
]
}
},
{
"_index" : "test_index",
"_type" : "user",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "小明",
"email" : "[email protected]",
"tags" : [
"篮球",
"游泳",
"足球"
]
}
}
]
}
}

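Fields in the search response:

took: how many milliseconds the request took
timed_out: whether the request timed out (here it did not)
_shards: the index is split into 5 shards, so a search request is sent to every primary shard (or to one of its replica shards instead)
hits.total: the number of matching documents, 2 here
hits.max_score: the highest score among the hits; a score measures how relevant a document is to the search, and the more relevant the document, the higher its score
hits.hits: the full data of the documents that matched the search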
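As a rough illustration of how these fields are read in code, here is a minimal Python sketch. It assumes the ES 5.x response shape shown above, where hits.total is a plain number; the sample is a trimmed copy of that output, and summarize is a hypothetical helper name.

```python
import json

# Trimmed response in the same shape as the output above; the documents
# are reduced to two fields each.
RESPONSE_TEXT = """
{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {"_id": "2", "_score": 1.0, "_source": {"name": "小王", "tags": ["游泳"]}},
      {"_id": "1", "_score": 1.0, "_source": {"name": "小明", "tags": ["篮球", "游泳", "足球"]}}
    ]
  }
}
"""


def summarize(response):
    """Pull the fields described above out of a parsed search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "all_shards_ok": response["_shards"]["failed"] == 0,
        "total": hits["total"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


summary = summarize(json.loads(RESPONSE_TEXT))
print(summary["total"], summary["ids"])
```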

Search for users named bruce, sorted by email in descending order:

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty\&q=name:'bruce'&sort=email:desc
[1] 26574
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 479
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.1727304,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "4",
"_score" : 1.1727304,
"_source" : {
"name" : "Bruce",
"email" : "[email protected]",
"tags" : [
"Hello"
]
}
}
]
}
}
[1] + 26574 done curl -i -XGET

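This example shows that the search is case-insensitive: name is a text field, and the default analyzer lowercases terms at both index time and search time. Note also that the second & in the command above was not escaped, so the shell ran curl in the background (hence the [1] 26574 job lines) and the sort parameter never reached ES; quoting the whole URL avoids this. Query strings suit ad-hoc use on the command line with tools such as curl, when you want to fire off a quick request to look something up. Complex queries are hard to build this way, though, and query strings are rarely used in production.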
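One way to avoid hand-escaping ? and & is to let a library assemble the URL. The following Python sketch uses only the standard library; search_url is a hypothetical helper name and the host is the one used in this article.

```python
from urllib.parse import urlencode

BASE_URL = "http://192.168.11.119:9200"  # host used in this article


def search_url(index, doc_type, **params):
    """Build a query-string search URL with every parameter encoded."""
    return "{}/{}/{}/_search?{}".format(
        BASE_URL, index, doc_type, urlencode(params))


url = search_url("test_index", "user",
                 q="name:bruce", sort="email:desc", pretty="true")
print(url)
# In a shell, wrap the printed URL in single quotes so that & and ? are
# passed to curl instead of being interpreted by the shell.
```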

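11. Viewing an index's types and field definitions

Get the type and field definitions (mappings) of every index in ES:

GET /_mapping

Get the type definitions of one index:

GET /test_index/_mapping

Get the field definitions of one type in an index:

GET /test_index/user/_mapping

For example: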

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/_mapping\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 1267
{
  "test_index" : {
    "mappings" : {
      "user" : {
        "properties" : {
          "email" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "tags" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      },
      "role" : {
        "properties" : {
          "flag" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
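The mapping response is deeply nested, so a flat list of fields is often easier to read. Below is a minimal Python sketch that walks a 5.x-style mapping response like the one above and lists each field with its type, including multi-fields such as email.keyword. list_fields is a hypothetical helper name, and the sample is a shortened copy of the output above.

```python
def list_fields(mapping_response):
    """Yield (index, type, field path, field type) from a 5.x-style
    GET /<index>/_mapping response, including multi-fields."""
    for index, index_body in mapping_response.items():
        for doc_type, type_body in index_body["mappings"].items():
            for name, spec in type_body.get("properties", {}).items():
                yield index, doc_type, name, spec.get("type")
                for sub, sub_spec in spec.get("fields", {}).items():
                    yield (index, doc_type,
                           "{}.{}".format(name, sub), sub_spec.get("type"))


# Shortened copy of the response above (one type, two fields).
SAMPLE_MAPPING = {
    "test_index": {"mappings": {"user": {"properties": {
        "email": {"type": "text", "fields": {
            "keyword": {"type": "keyword", "ignore_above": 256}}},
        "name": {"type": "text", "fields": {
            "keyword": {"type": "keyword", "ignore_above": 256}}},
    }}}}
}

for row in list_fields(SAMPLE_MAPPING):
    print(*row)
```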

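12. Query DSL (Domain Specific Language)

HTTP request body: the query is written as JSON in the request body. This is convenient and can express all kinds of complex queries, so it is far more powerful than the query string.

  • 12.1 Matching all documents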
➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty -d '
{
"query": {
"match_all": {
}
}
}
'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 12:58:15 GMT"
content-type: application/json; charset=UTF-8
content-length: 1895
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 6,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"name" : "bruce",
"email" : "[email protected]",
"tags" : [
"游泳1"
]
}
},
{
"_index" : "test_index",
"_type" : "user",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "Alex",
"email" : "[email protected]",
"tags" : [
"吃饭"
]
}
}
]
}
}

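Note that match_all is nested inside the query object, and query sits at the root of the request body.

  • 12.2 Matching documents that contain the given text

query is still at the root; a sort key is added at the same level as query. Example: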

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty -d '
{
  "query": {
         "match": {
            "name" : "br"
          }
  },
  "sort": [
           {
             "email" : "desc"
           }
  ]
}
'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:03:30 GMT"
content-type: application/json; charset=UTF-8
content-length: 193
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

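This searches for documents (rows) whose name contains "br" and sorts the results by email in descending order. It returns no hits because a match query compares whole analyzed terms, so "br" does not match "bruce". The first time I ran the request it failed with the error "Fielddata is disabled on text fields by default. Set fielddata=true on [email] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory." From what I could find, since 5.x sorting and aggregating on a text field relies on a separate data structure, fielddata, that is cached in memory. It is disabled by default and has to be enabled through the mapping API for each field that needs it (see the official explanation). The following request enables it: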

➜  ~ curl -i -XPUT http://192.168.11.119:9200/test_index/_mapping/user\?pretty -d '
{
  "properties": {
    "email": {
      "type": "text",
      "fielddata": true
    }
  }
}'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:11:11 GMT"
content-type: application/json; charset=UTF-8
content-length: 28
{
  "acknowledged" : true
}
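An alternative that avoids loading fielddata into memory is to sort on the keyword sub-field instead. This assumes the default dynamic mapping shown in section 11, where email has an email.keyword sub-field; if your mapping differs, the field name below will not exist. A minimal Python sketch of such a request body, with sorted_search_body as a hypothetical helper name:

```python
import json


def sorted_search_body(match_field, text, sort_field, order="desc"):
    """Build a match query sorted on the keyword sub-field of sort_field."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return {
        "query": {"match": {match_field: text}},
        # email.keyword is a keyword field, so sorting on it uses doc values
        # and does not need fielddata on the text field.
        "sort": [{sort_field + ".keyword": order}],
    }


body = sorted_search_body("name", "bruce", "email")
print(json.dumps(body, indent=2))
```

Note that the sub-field in that mapping has ignore_above set to 256, so values longer than 256 characters are not indexed in it.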

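Many queries return large result sets that need paging. With the DSL this is simple: add from and size keys at the same level as query, giving the starting offset and the page size respectively. Example:

curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty -d '
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 2,
  "_source": ["email"],
  "sort": [
    {
      "email": "asc"
    }
  ]
}
'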
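Since from is a zero-based offset (the "from": 1 above skips the first hit), converting a page number into from and size is a common small step. A minimal Python sketch, assuming 1-based page numbers; page_body is a hypothetical helper name:

```python
def page_body(query, page, page_size):
    """Return a search body for the given 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": query,
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
    }


first = page_body({"match_all": {}}, page=1, page_size=2)
third = page_body({"match_all": {}}, page=3, page_size=2)
print(first["from"], third["from"])
```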
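  • 12.3 Filtering query results

Search for products whose name contains Rhinestone and whose selling price is at least 1 and below 3, sorted by selling price in ascending order. The DSL request: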

curl -i -XGET http://192.168.11.119:9200/en_es_category_products/product/_search\?pretty -d '
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "product_name": "Rhinestone"
        }
      },
      "filter": {
        "range": {
          "store_price": {
            "gte": 1,
            "lt": 3
          }
        }
      }
    }
  },
  "_source": [
    "product_id",
    "product_name",
    "store_price",
    "icon"
  ],
  "sort": [
    {
      "store_price": "asc"
    }
  ]
}
'

The range operators are:

* gt :: greater than
* gte:: greater than or equal to
* lt :: less than
* lte:: less than or equal to

Result:

HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:15:52 GMT"
content-type: application/json; charset=UTF-8
content-length: 1141
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "en_es_category_products",
        "_type" : "product",
        "_id" : "22100",
        "_score" : null,
        "_source" : {
          "product_id" : 22100,
          "icon" : "http://patpatdev.s3.amazonaws.com/Product/22100/1688I-SL-003-00008-001.jpg/1464845443.jpg",
          "store_price" : "2.99",
          "product_name" : "U-shape Silver Faux Perarl & Rhinestone Clip"
        },
        "sort" : [
          2.99
        ]
      },
      {
        "_index" : "en_es_category_products",
        "_type" : "product",
        "_id" : "354460",
        "_score" : null,
        "_source" : {
          "product_id" : 354460,
          "icon" : "http://patpatdev.s3.us-west-1.amazonaws.com/product/000766000119/5b0e5b0e49e8f.jpg",
          "store_price" : "2.99",
          "product_name" : "Pretty Star Decor Rhinestone Stud Hairband for Women"
        },
        "sort" : [
          2.99
        ]
      }
    ]
  }
}

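Note that the body is nested several levels deep and is easy to get wrong: query, _source, and sort all sit at the root level, while must and filter are both children of query/bool.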
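Building the body as a data structure and serializing it is one way to reduce nesting and comma mistakes. The Python sketch below reproduces the request from this section; filtered_search is a hypothetical helper name, and the check on range operators is an added convenience, not something ES requires on the client side.

```python
import json

RANGE_OPERATORS = {"gt", "gte", "lt", "lte"}


def filtered_search(field, text, range_field, bounds, source, sort_field):
    """Build a bool query: `must` match on text, `filter` on a range."""
    unknown = set(bounds) - RANGE_OPERATORS
    if unknown:
        raise ValueError("unknown range operators: {}".format(sorted(unknown)))
    return {
        "query": {
            "bool": {
                "must": {"match": {field: text}},
                "filter": {"range": {range_field: dict(bounds)}},
            }
        },
        "_source": list(source),
        "sort": [{sort_field: "asc"}],
    }


body = filtered_search(
    "product_name", "Rhinestone",
    "store_price", {"gte": 1, "lt": 3},
    ["product_id", "product_name", "store_price", "icon"],
    "store_price",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d, or sent from Python with an explicit Content-Type: application/json header.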

