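I ran into a rather painful problem: data was lost from MySQL.
There were two options: ask the DBA to dig out the data from half a year ago, or look for it in Elasticsearch (ES), which held a copy of the MySQL data.
Because there were more than 10,000 records and ES was configured to return at most 10,000 documents per query, fetching the data with scroll looked like the better approach.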
1 ElasticSearch 2.x
1.1 Query index information (including how much data it holds)
localhost:9200/_nodes/stats/indices/search?pretty
curl -XGET 'http://127.0.0.1:9400/dev_index1_20190118/docs/_search?pretty'
1.2 Use a scroll cursor
curl -XGET 'http://127.0.0.1:9400/dev_index1_20190118/docs/_search?scroll=10m' -d ' { "query": { "match_all": {}}, "sort" : ["_doc"], "size": 10000 }' >> es_scroll_data_20190118_1w.txt
1.3 Keep fetching the next page
curl -XGET 'http://127.0.0.1:9400/_search/scroll' -d ' { "scroll": "10m", "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAANKLTFjdfVno3Y2hGU1BpOUYtMW54OTV3TEEAAAAAADSi1BY3X1Z6N2NoRlNQaTlGLTFueDk1d0xBAAAAAAA0otYWN19WejdjaEZTUGk5Ri0xbng5NXdMQQAAAAAANKLVFjdfVno3Y2hGU1BpOUYtMW54OTV3TEEAAAAAADvJpxZzcU9YSExnLVRTNk5RY3JfMlNuWU9n" }' >> es_scroll_data_20190118_2w.txt
curl -XGET 'http://127.0.0.1:9400/_search/scroll' -d ' { "scroll": "10m", "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAANKLTFjdfVno3Y2hGU1BpOUYtMW54OTV3TEEAAAAAADSi1BY3X1Z6N2NoRlNQaTlGLTFueDk1d0xBAAAAAAA0otYWN19WejdjaEZTUGk5Ri0xbng5NXdMQQAAAAAANKLVFjdfVno3Y2hGU1BpOUYtMW54OTV3TEEAAAAAADvJpxZzcU9YSExnLVRTNk5RY3JfMlNuWU9n" }' >> es_scroll_data_20190118_3w.txt
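Repeating the next-page request by hand gets tedious once there are many batches. The loop below is a minimal sketch of how the same steps could be scripted, not a tested production tool: the function names and the output file are hypothetical, the host, index and type are copied from the commands above, and the `_scroll_id` is pulled out with a plain text match (a JSON tool such as `jq -r '._scroll_id'` would be more robust if it is installed).

```shell
#!/bin/sh
# Sketch only: function names and the output path are hypothetical.
# Host, index and type are taken from the commands above.
ES_URL="${ES_URL:-http://127.0.0.1:9400}"
INDEX_PATH="${INDEX_PATH:-dev_index1_20190118/docs}"

# Send a JSON body ($2) to an Elasticsearch path ($1) and print the response.
es_request() {
  curl -s -XGET "$ES_URL/$1" -d "$2"
}

# Read a response on stdin and print its _scroll_id (text match, not a JSON parser).
scroll_id_of() {
  sed -n 's/.*"_scroll_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Read a response on stdin; succeed if it contains an empty hits array.
is_last_page() {
  grep -Eq '"hits"[[:space:]]*:[[:space:]]*\[[[:space:]]*\]'
}

# Append every non-empty page to the file given as $1.
scroll_export() {
  out="$1"
  # Initial search: opens the scroll and returns the first batch.
  resp=$(es_request "$INDEX_PATH/_search?scroll=10m" \
    '{ "query": { "match_all": {}}, "sort": ["_doc"], "size": 10000 }')
  while :; do
    # An empty hits array means every document has been returned.
    if printf '%s' "$resp" | is_last_page; then
      return 0
    fi
    # Always use the _scroll_id from the most recent response.
    sid=$(printf '%s' "$resp" | scroll_id_of)
    if [ -z "$sid" ]; then
      echo "no _scroll_id in response, stopping" >&2
      return 1
    fi
    printf '%s\n' "$resp" >> "$out"
    resp=$(es_request "_search/scroll" "{ \"scroll\": \"10m\", \"scroll_id\": \"$sid\" }")
  done
}

# Usage: scroll_export es_scroll_data.txt
```

Each page is written as one line of the output file, and the loop stops when a page comes back with no hits or when a response carries no `_scroll_id` (for example, an error body).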
2 ElasticSearch 5.6.x
2.1 Query index information
localhost:9200/_nodes/stats/indices/search?pretty
curl -XGET 'http://127.0.0.1:9400/dev_index1_20190118/docs/_search?pretty'
2.2 Use a scroll cursor
curl -XGET 'http://127.0.0.1:9400/dev_index1_20190118/docs/_search?scroll=10m' -d ' { "query": { "match_all": {}}, "sort" : ["_doc"], "size": 10000 }' >> es_scroll_data_20190118_1w.txt
2.3 Keep fetching the next page
curl -XGET 'http://127.0.0.1:9400/_search/scroll?scroll=10m' -d ' { "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAANKLTFjdfVno3Y2hGU1BpOUYtMW54OTV3TEEAAAAAADSi1BY3X1Z6N2NoRlNQaTlGLTFueDk1d0xBAAAAAAA0otYWN19WejdjaEZTUGk5Ri0xbng5NXdMQQAAAAAANKLVFjdfVno3Y2hGU1BpOUYtMW54OTV3TEEAAAAAADvJpxZzcU9YSExnLVRTNk5RY3JfMlNuWU9n" }' >> es_scroll_data_20190118_2w.txt
curl -XGET 'http://127.0.0.1:9400/_search/scroll?scroll=10m' -d ' { "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAANKLTFjdfVno3Y2hGU1BpOUYtMW54OTV3TEEAAAAAADSi1BY3X1Z6N2NoRlNQaTlGLTFueDk1d0xBAAAAAAA0otYWN19WejdjaEZTUGk5Ri0xbng5NXdMQQAAAAAANKLVFjdfVno3Y2hGU1BpOUYtMW54OTV3TEEAAAAAADvJpxZzcU9YSExnLVRTNk5RY3JfMlNuWU9n" }' >> es_scroll_data_20190118_3w.txt
3 Problems encountered
3.1 Unknown key for a VALUE_STRING in [scroll_id].
{ "error": { "root_cause": [ { "type": "parsing_exception", "reason": "Unknown key for a VALUE_STRING in [scroll_id].", "line": 3, "col": 19 } ], "type": "parsing_exception", "reason": "Unknown key for a VALUE_STRING in [scroll_id].", "line": 3, "col": 19 }, "status": 400 }
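I hit this when the scroll_id sent in the second request did not match the one returned by the first request. Note that this parsing_exception is raised by the search body parser, so it also indicates that the request reached /_search rather than /_search/scroll.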
3.2 Unknown key for a VALUE_STRING in [scroll]
{ "error": { "root_cause": [ { "type": "parsing_exception", "reason": "Unknown key for a VALUE_STRING in [scroll].", "line": 3, "col": 15 } ], "type": "parsing_exception", "reason": "Unknown key for a VALUE_STRING in [scroll].", "line": 3, "col": 15 }, "status": 400 }
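The body of the second request contained an extra scroll parameter. The /_search endpoint does not accept scroll as a body key: it belongs in the URL (?scroll=10m) or in the body of a /_search/scroll request.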
3.3 Batch size is too large, size must be less than or equal to: [10000] but was [1000000]. Scroll batch sizes cost as much memory as result windows so they are controlled by the [index.max_result_window] index level setting.
{ "error": { "root_cause": [ { "type": "query_phase_execution_exception", "reason": "Batch size is too large, size must be less than or equal to: [10000] but was [1000000]. Scroll batch sizes cost as much memory as result windows so they are controlled by the [index.max_result_window] index level setting." } ], "type": "search_phase_execution_exception", "reason": "all shards failed", "phase": "query", "grouped": true, "failed_shards": [ { "shard": 0, "index": "dev_index1_20190118", "node": "8XqKY198S823M78QA43F8g", "reason": { "type": "query_phase_execution_exception", "reason": "Batch size is too large, size must be less than or equal to: [10000] but was [1000000]. Scroll batch sizes cost as much memory as result windows so they are controlled by the [index.max_result_window] index level setting." } } ] }, "status": 500 }
The size was set too high: it exceeded 10000, the limit imposed by the index.max_result_window setting.
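Because one batch cannot exceed index.max_result_window, the number of scroll batches an export needs follows from a ceiling division. A tiny sketch of that arithmetic; `batches_needed` is a hypothetical helper, and 1000000 is only an illustrative document count, not a figure from the index above:

```shell
# Hypothetical helper: batches needed for TOTAL documents at SIZE documents per batch.
batches_needed() {
  total="$1"
  size="$2"
  # Ceiling division using integer arithmetic.
  echo $(( (total + size - 1) / size ))
}

batches_needed 1000000 10000   # prints 100
```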
3.4 search_context_missing_exception
{ "error": { "root_cause": [ { "type": "search_context_missing_exception", "reason": "No search context found for id [3540965]" }, { "type": "search_context_missing_exception", "reason": "No search context found for id [3922089]" }, { "type": "search_context_missing_exception", "reason": "No search context found for id [3454995]" }, { "type": "search_context_missing_exception", "reason": "No search context found for id [3454996]" }, { "type": "search_context_missing_exception", "reason": "No search context found for id [3454994]" } ], "type": "search_phase_execution_exception", "reason": "all shards failed", "phase": "query", "grouped": true, "failed_shards": [ { "shard": -1, "index": null, "reason": { "type": "search_context_missing_exception", "reason": "No search context found for id [3540965]" } }, { "shard": -1, "index": null, "reason": { "type": "search_context_missing_exception", "reason": "No search context found for id [3922089]" } }, { "shard": -1, "index": null, "reason": { "type": "search_context_missing_exception", "reason": "No search context found for id [3454995]" } }, { "shard": -1, "index": null, "reason": { "type": "search_context_missing_exception", "reason": "No search context found for id [3454996]" } }, { "shard": -1, "index": null, "reason": { "type": "search_context_missing_exception", "reason": "No search context found for id [3454994]" } } ], "caused_by": { "type": "search_context_missing_exception", "reason": "No search context found for id [3454994]" } }, "status": 404 }
The scroll had actually timed out, so its search context had been deleted automatically.
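A scripted export could check each response for this exception type before carrying on. The snippet below is a small sketch under that assumption: `scroll_expired` is a hypothetical helper name, the sample response is abbreviated for illustration, and the check is a plain text match, not real JSON parsing.

```shell
# Hypothetical helper: succeeds if the response on stdin reports a lost scroll context.
scroll_expired() {
  grep -q '"search_context_missing_exception"'
}

# Usage sketch: $resp stands in for the body returned by a scroll request.
resp='{ "error": { "type": "search_phase_execution_exception", "caused_by": { "type": "search_context_missing_exception" } }, "status": 404 }'
if printf '%s' "$resp" | scroll_expired; then
  echo "scroll expired: restart from the initial search request"
fi
```

The keep-alive (the 10m in scroll=10m) only needs to cover the time spent on one batch, because every scroll request renews it.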