Datalevin is a port of Datascript in-memory Datalog database to Lightning Memory-Mapped Dat...

Category: IT Technology · Published: 4 years ago

Datalevin

Simple durable Datalog database for everyone :minidisc:

:hear_no_evil: What and why

I love Datalog, why hasn't everyone used this already?

Datalevin is a port of the Datascript in-memory Datalog database to the Lightning Memory-Mapped Database (LMDB).

The rationale is to have a simple and free Datalog query engine running on durable storage. It is our observation that many developers prefer the flavor of Datalog popularized by Datomic® over any flavor of SQL, once they get to use it. Perhaps it is because Datalog is more declarative and composable than SQL, e.g. the automatic implicit joins seem to be its killer feature.

Datomic® is enterprise-grade software, and its feature set may be overkill for some use cases. One thing that may confuse casual users is its temporal features. To keep things simple and familiar, Datalevin does not store transaction history, and behaves the same way as most other databases: when data are deleted, they are gone.

Datalevin retains the library property of Datascript, and it is meant to be embedded in applications to manage state. Because data is persistent on disk in Datalevin, application state can survive application restarts, and data size can be larger than memory.

Datalevin relies on the robust ACID transactional database features of LMDB. Designed for concurrent read-intensive workloads, LMDB is used in many projects, e.g. Cloudflare's global configuration distribution. LMDB also performs well in writing large values (> 2KB). Therefore, it is fine to store documents in Datalevin.

Independent from Datalog, Datalevin can also be used as a key-value store for EDN data. A number of optimizations are put in place. For instance, it uses a transaction pool to reuse transactions, pre-allocates read/write buffers, and so on.

Datalevin uses a covering index and has no write-ahead log, so once the data are written, they are indexed. There are no separate processes for indexing, compaction or any such database maintenance jobs that compete with your applications for resources.

:rocket: Status

Both Datascript and LMDB are mature and stable libraries. Building on top of them, Datalevin is extensively tested with property-based testing. The results of running Datascript's benchmark suite are as follows.

[Figures: benchmark charts comparing Datalevin with Datascript; images not available]

Considering that we are comparing a disk store with a memory store, the query time of Datalevin is not bad.

Writes can be a few orders of magnitude slower, as expected, because Datalevin is writing to disk while Datascript is in memory. The bulk write speed is good, writing 100K datoms to disk in less than a second; the same data can also be transacted as a whole in less than 2 seconds.

Transacting one datom or a few datoms at a time is much slower. Each transaction syncs to disk, so it is inherently slow. Also, the current implementation does a lot of reads during a transaction. Because LMDB does copy-on-write and never overwrites data that are being read, large write amplification can occur. The advice is to write data in larger batches and in fewer transactions.
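
The batching advice can be sketched in a language-neutral way. The Python below is only an illustration, not Datalevin's API: `FakeStore`, its `transact` method and the `batches` helper are hypothetical names, and a "commit" here is just a counter standing in for a transaction that syncs to disk.

```python
from itertools import islice

def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

class FakeStore:
    """Hypothetical stand-in for a durable store; counts commits only."""
    def __init__(self):
        self.commits = 0
        self.rows = []

    def transact(self, datoms):
        # In a real store, each call would end with a sync to disk.
        self.rows.extend(datoms)
        self.commits += 1

datoms = [(i, ":n", i) for i in range(1000)]

one_by_one = FakeStore()
for d in datoms:
    one_by_one.transact([d])

batched = FakeStore()
for chunk in batches(datoms, 250):
    batched.transact(chunk)
```

Both stores end up with the same 1000 datoms, but the batched one pays the per-transaction cost far fewer times.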

In short, Datalevin is quite usable for small or medium projects right now.

:floppy_disk: Differences from Datascript

Datascript, developed by Nikita Prokopov, "is built totally from scratch and is not related by any means to" Datomic®. Although a port, Datalevin differs from Datascript in more ways than the difference in data durability:

  • Datalevin is not an immutable database, and there is no "database as a value" feature. Since history is not kept, transaction ids are not stored.

  • Datoms in a transaction are committed together as a batch, rather than being saved by with-datom one at a time.

  • Respects :db/valueType. Currently, most Datomic® value types are supported, except bigint, bigdec, uri and tuple. Values with unspecified type are treated as EDN blobs, and are de/serialized with nippy.

  • Has a value leading index (VAE) for datoms with a :db.type/ref type attribute. The attribute and value leading index (AVE) is enabled for all datoms, so there is no need to specify :db/index. These are the same as in Datomic® Cloud.

  • Attributes are stored in indices as integer ids, thus attributes in index access are returned in attribute creation order, not in lexicographic order (i.e. do not expect :b to come after :a). This is the same as in Datomic®.

  • Has no features that are applicable only for in-memory DBs, such as DB as an immutable data structure, DB serialization, DB pretty print, filtered DB, etc. For now, LMDB tools can be used to work with the database files.
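
The point about attribute ordering can be illustrated with a small sketch. It is written in Python for neutrality and is not Datalevin's implementation: the `attr_ids` table, the `attr_id` helper and the datom tuples are all hypothetical, and only model the idea that an index sorted by integer attribute id follows creation order.

```python
# Hypothetical model: attributes receive integer ids in creation order,
# and the index sorts datoms by that id rather than by attribute name.
attr_ids = {}

def attr_id(attr):
    """Return the integer id for attr, assigning the next id on first use."""
    if attr not in attr_ids:
        attr_ids[attr] = len(attr_ids)
    return attr_ids[attr]

# In this example :b happens to be created before :a.
datoms = [(1, ":b", "x"), (1, ":a", "y"), (2, ":b", "z")]

# Index keyed by (entity, attribute id, value).
index = sorted((e, attr_id(a), v) for e, a, v in datoms)

id_to_attr = {i: a for a, i in attr_ids.items()}
by_id = [id_to_attr[aid] for e, aid, v in index if e == 1]
by_name = sorted(a for e, a, v in datoms if e == 1)
```

For entity 1, `by_id` lists :b before :a, while a lexicographic sort (`by_name`) lists :a first.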

:baby: Limitations

  • Attribute names have a length limitation: an attribute name cannot be more than 511 bytes long, due to LMDB's key size limit.

  • Because keys are compared bitwise, for range queries to work as expected on an attribute, its :db/valueType should be specified.

  • The maximum individual value size is 4GB. In practice, value size is determined by LMDB's ability to find large enough contiguous space on disk and Datalevin's ability to pre-allocate off-heap buffers in the JVM for them.

  • The total data size of a Datalevin database has the same limit as LMDB's, e.g. 128TB on a modern 64-bit machine that implements 48-bit address spaces.

  • There's no network interface as of now, but this may change.

  • Currently only supports Clojure on the JVM, but adding support for other Clojure-hosting runtimes is possible in the future, since bindings for LMDB exist in almost all major languages and are available on most platforms.
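
The note on range queries and bitwise key comparison can be made concrete with a short sketch. It is written in Python and both encodings are assumptions chosen for illustration, not Datalevin's actual on-disk format: `typed_key` stands for a value stored with a known type, and `untyped_key` stands for an opaque serialized blob.

```python
import struct

def typed_key(n):
    """Fixed-width big-endian encoding: byte order matches numeric order
    for non-negative integers."""
    return struct.pack(">Q", n)

def untyped_key(n):
    """Stand-in for an opaque serialized blob (here: decimal text)."""
    return str(n).encode("utf-8")

values = [2, 10, 100, 9]

# Sort the keys as raw bytes, the way a byte-comparing store would.
typed_order = [struct.unpack(">Q", k)[0] for k in sorted(map(typed_key, values))]
untyped_order = [int(k) for k in sorted(map(untyped_key, values))]
```

Here `typed_order` comes out in numeric order, while `untyped_order` does not, so a range scan over the untyped keys would return unexpected results.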

Alternatives

If you are interested in using the dialect of Datalog pioneered by Datomic®, here are your current options:

  • If you need time travel and rich features backed by the authors of Clojure, you should use Datomic®.

  • If you need an in-memory store, e.g. for single page applications in a browser, Datascript is for you.

  • If you need features such as bi-temporal graph queries, you may try Crux.

  • If you don't mind an experimental storage backend, you may try Datahike.

  • There was also Eva, a distributed store, but it is no longer in active development.

  • If you need a simple durable store with a battle-tested backend, give Datalevin a try.

Version: 0.1.13

License

Copyright © 2020 Juji Inc.

Licensed under Eclipse Public License (see LICENSE).

