MeiliSearch
:zap: Ultra relevant and instant full-text search API
MeiliSearch is a powerful, fast, open-source search engine that is easy to use and deploy. Search and indexing are fully customizable and handle features like typo tolerance, filters, and synonyms. For more details about those features, go to our documentation.
Meili helps the Rust community find crates on crates.meilisearch.com
Features
- Search as-you-type experience (answers < 50ms)
- Full-text search
- Typo tolerant (understands typos and spelling mistakes)
- Supports Kanji
- Supports synonyms
- Easy to install, deploy, and maintain
- Whole documents returned
- Highly customizable
- RESTful API
Quick Start
Deploy the Server
Run it using Docker
docker run -it -p 7700:7700 --rm getmeili/meilisearch
Installation using Homebrew
brew update && brew install meilisearch
meilisearch
Installation using APT
echo "deb [trusted=yes] https://apt.fury.io/meilisearch/ /" > /etc/apt/sources.list.d/fury.list
apt update && apt install meilisearch-http
meilisearch
Download the binary
curl -L https://install.meilisearch.com | sh
./meilisearch
Run it on Heroku
Compile and run it from sources
If you have the Rust toolchain already installed, you can compile from source:
git clone https://github.com/meilisearch/MeiliSearch.git
cd MeiliSearch
cargo run --release
Create an Index and Upload Some Documents
We provide a movie dataset that you can use for testing purposes.
curl -L 'https://bit.ly/2PAcw9l' -o movies.json
MeiliSearch can serve multiple indexes, each holding a different kind of document, so you must create an index before sending documents to it.
curl -i -X POST 'http://127.0.0.1:7700/indexes' --data '{ "name": "Movies", "uid": "movies" }'
Now that the server knows about our brand new index, we can send it data. A small dataset is available in the datasets/ directory.
curl -i -X POST 'http://127.0.0.1:7700/indexes/movies/documents' \
  --header 'content-type: application/json' \
  --data-binary @movies.json
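For scripted setups, the same two requests can be built with Python's standard library. This is a minimal sketch, not an official client: the helper names (`create_index_request`, `add_documents_request`, `send`) are hypothetical, and sending a `Content-Type` header when creating the index is an assumption, since the curl command above omits it.

```python
import json
import urllib.request

# Default address used throughout this README.
BASE_URL = "http://127.0.0.1:7700"


def create_index_request(name, uid):
    """Build (but do not send) the POST /indexes request."""
    body = json.dumps({"name": name, "uid": uid}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/indexes",
        data=body,
        method="POST",
        # Assumption: the curl example sends no explicit content type.
        headers={"Content-Type": "application/json"},
    )


def add_documents_request(uid, documents):
    """Build (but do not send) the POST /indexes/<uid>/documents request."""
    body = json.dumps(documents).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/indexes/{uid}/documents",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send a prepared request and return (HTTP status, response body)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

With a server running, calling `send(create_index_request("Movies", "movies"))` and then `send(add_documents_request("movies", documents))`, where `documents` is the list loaded from movies.json, should mirror the two curl commands.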
Search for Documents
In command line
The search engine is now aware of our documents and can serve them via the HTTP server. The jq command-line tool can make the server responses much easier to read.
curl 'http://127.0.0.1:7700/indexes/movies/search?q=botman+robin&limit=2' | jq
{
  "hits": [
    {
      "id": "415",
      "title": "Batman & Robin",
      "poster": "https://image.tmdb.org/t/p/w1280/79AYCcxw3kSKbhGpx1LiqaCAbwo.jpg",
      "overview": "Along with crime-fighting partner Robin and new recruit Batgirl...",
      "release_date": "1997-06-20"
    },
    {
      "id": "411736",
      "title": "Batman: Return of the Caped Crusaders",
      "poster": "https://image.tmdb.org/t/p/w1280/GW3IyMW5Xgl0cgCN8wu96IlNpD.jpg",
      "overview": "Adam West and Burt Ward returns to their iconic roles of Batman and Robin...",
      "release_date": "2016-10-08"
    }
  ],
  "offset": 0,
  "limit": 2,
  "processingTimeMs": 1,
  "query": "botman robin"
}
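The query above can also be assembled and its response read from a script. The sketch below is illustrative only: `search_url` and `hit_titles` are hypothetical helper names, and `SAMPLE_RESPONSE` is a trimmed copy of the response shown above, not live server output.

```python
import json
from urllib.parse import urlencode

# Default address used throughout this README.
BASE_URL = "http://127.0.0.1:7700"


def search_url(index_uid, query, limit):
    """Build the search URL; urlencode turns spaces into '+'."""
    params = urlencode({"q": query, "limit": limit})
    return f"{BASE_URL}/indexes/{index_uid}/search?{params}"


def hit_titles(response_text):
    """Extract the title of every hit from a JSON search response."""
    payload = json.loads(response_text)
    return [hit["title"] for hit in payload["hits"]]


# Trimmed copy of the response shown above (posters and overviews removed).
SAMPLE_RESPONSE = """
{
  "hits": [
    {"id": "415", "title": "Batman & Robin"},
    {"id": "411736", "title": "Batman: Return of the Caped Crusaders"}
  ],
  "offset": 0,
  "limit": 2,
  "processingTimeMs": 1,
  "query": "botman robin"
}
"""

print(search_url("movies", "botman robin", 2))
print(hit_titles(SAMPLE_RESPONSE))
```

Note that the query contains the typo "botman" while both hits are Batman titles, which is the typo tolerance listed under Features at work.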
With the Web Interface
MeiliSearch provides a simple web interface containing a search bar in order to quickly test the instant search experience with a given set of documents.
This web interface is available in your browser at the root of the server. The default URL is http://127.0.0.1:7700.
Documentation
Now that you have a running MeiliSearch instance, you can learn more and tune your search engine using the documentation.
How it works
MeiliSearch uses LMDB as its internal key-value store. The key-value store allows us to handle updates and queries with small memory and CPU overheads. The whole ranking system is data-oriented and provides great performance.
You can read the deep dive if you want more information on the engine; it describes the whole process of generating updates and handling queries. Also, you can take a look at the typos and ranking rules if you want to know the default rules used to sort the documents.
Technical features
- Provides 6 default ranking criteria used to bucket sort documents
- Accepts custom criteria and can apply them in any custom order
- Supports ranged queries, useful for paginating results
- Can deduplicate (distinct) and filter returned documents based on context-defined rules
- Searches for concatenated and split query words to improve the search quality
- Can store complete documents or only the fields specified in the user schema
- The default tokenizer can index Latin- and kanji-based languages
- Returns the matching text areas, useful to highlight matched words in results
- Accepts query-time search configuration, such as the searchable attributes
- Supports runtime incremental indexing
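To make the first two points concrete, here is a minimal sketch of bucket sorting with an ordered list of criteria. It is not MeiliSearch's implementation: the two criteria (`typos` and `missing_words`) and the document fields are hypothetical, chosen only to show how each criterion splits the results into buckets and how the next criterion only reorders documents inside a bucket.

```python
from itertools import groupby


def bucket_sort(docs, criteria):
    """Order docs by the first criterion, then refine each bucket of ties
    with the remaining criteria. Lower scores rank first."""
    if not criteria or len(docs) <= 1:
        return list(docs)
    first, rest = criteria[0], criteria[1:]
    ordered = sorted(docs, key=first)  # stable sort keeps ties in input order
    ranked = []
    for _, bucket in groupby(ordered, key=first):
        # Documents in one bucket are equal for `first`; break ties with `rest`.
        ranked.extend(bucket_sort(list(bucket), rest))
    return ranked


# Hypothetical per-document scores, for illustration only.
docs = [
    {"id": "a", "typos": 1, "missing_words": 0},
    {"id": "b", "typos": 0, "missing_words": 1},
    {"id": "c", "typos": 0, "missing_words": 0},
    {"id": "d", "typos": 1, "missing_words": 0},
]


def by_typos(doc):
    return doc["typos"]


def by_missing_words(doc):
    return doc["missing_words"]


ranked = bucket_sort(docs, [by_typos, by_missing_words])
print([doc["id"] for doc in ranked])  # ['c', 'b', 'a', 'd']

# Applying the same criteria in another order changes the ranking.
reordered = bucket_sort(docs, [by_missing_words, by_typos])
print([doc["id"] for doc in reordered])  # ['c', 'a', 'd', 'b']
```

Because later criteria never move a document out of its bucket, the order of the criteria list decides which signal dominates the ranking.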
Performance
With a dataset of 100 353 documents, each with 352 attributes of which 3 are indexed (more than 300 000 indexed fields out of 35 million stored), we can handle more than 2.8k req/sec with an average response time of 9 ms on an Intel i7-7700 (8) @ 4.2GHz.
Requests are made using wrk and scripted to simulate real users' queries.
Running 10s test @ http://localhost:2230
  2 threads and 25 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.52ms    7.61ms  99.25ms   84.58%
    Req/Sec     1.41k   119.11     1.78k    64.50%
  28080 requests in 10.01s, 7.42MB read
Requests/sec:   2806.46
Transfer/sec:    759.17KB
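The figures quoted above can be checked with a few lines of arithmetic. This sketch only recomputes numbers already given in this section; the variable names are illustrative.

```python
# Dataset size, taken from the paragraph above.
documents = 100_353
attributes_per_document = 352
indexed_attributes = 3

indexed_fields = documents * indexed_attributes
stored_fields = documents * attributes_per_document

# Totals reported by wrk.
requests_completed = 28_080
elapsed_seconds = 10.01
throughput = requests_completed / elapsed_seconds

print(indexed_fields)     # 301059, i.e. "more than 300 000 fields indexed"
print(stored_fields)      # 35324256, i.e. about 35 million stored
print(round(throughput))  # 2805, in line with the "2.8k req/sec" claim
```

The small gap between this throughput and the 2806.46 printed by wrk is most likely because wrk divides by the unrounded test duration, while only the rounded 10.01s is available here.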
We also indexed a dataset containing about 12 million city names in 24 minutes on a machine with 8 cores, 64 GB of RAM, and a 300 GB NVMe SSD.
The resulting database was 16 GB, and search response times ranged from 30 ms to 4 seconds for short prefix queries.
Notes
With Rust 1.32, the default allocator was changed to the system allocator. We have seen much better performance when using jemalloc as the global allocator.
Contributing
We will be glad if you submit issues and pull requests. You can help grow this project and start contributing by checking the issues tagged "good-first-issue". It is a good place to start!
Analytic Events
We send events to our Amplitude instance to be aware of the number of people who use MeiliSearch.
We only send the platform on which the server runs, once a day. No other information is sent.
If you do not want us to send events, you can disable these analytics by using the MEILI_NO_ANALYTICS env variable.