Apache Druid vs. Time-Series Databases

Category: IT Technology · Published: 5 years ago


We occasionally get questions regarding how Apache Druid differs from time-series databases (TSDB) such as InfluxDB or Prometheus, and when to use each technology. This short post serves to help answer these questions.

The most important thing to keep in mind is that Druid is not a TSDB and the comparisons aren’t always apples to apples. Although Druid draws ideas from a number of TSDB concepts, it is designed for a wider range of analytic use cases than those for which a TSDB is usually employed.

Specifically, Druid is primarily deployed to:

  • Power user-facing analytics applications that require low-latency (sub-second) queries.
  • Enable ad-hoc and exploratory analytics, where numerous iterative queries are issued to understand a data pattern or anomaly.
  • Ingest complex events (data points with hundreds of fields) at rates of up to millions of events per second.
  • Support queries from hundreds or thousands of concurrent users.
  • Answer complex OLAP queries on very large data sets.

Most TSDBs are designed to collect and aggregate metrics from servers and devices. A metric is a series of timestamps, tags or attributes, and numbers. TSDBs typically store each metric series individually, partition or shard each series by time, and provide query capabilities for aggregating numbers. TSDBs work well if your use case is simply to aggregate numbers or counters, and if you don’t require complex analytics on each series.
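To make that storage model concrete, here is a minimal sketch of the layout described above: one series per metric-and-tags combination, split into time buckets. It is a toy in-memory model, not the design of any particular TSDB; the class name, the one-hour bucket width, and the metric names are all assumptions made for illustration.

```python
from collections import defaultdict

BUCKET_SECONDS = 3600  # hypothetical partition width: one hour


class ToySeriesStore:
    """Toy model of a TSDB: one series per (metric, tags), bucketed by time."""

    def __init__(self):
        # series key -> time bucket start -> list of (timestamp, value)
        self._series = defaultdict(lambda: defaultdict(list))

    def write(self, metric, tags, timestamp, value):
        key = (metric, tuple(sorted(tags.items())))
        bucket = timestamp - timestamp % BUCKET_SECONDS
        self._series[key][bucket].append((timestamp, value))

    def average(self, metric, tags, start, end):
        """Average one series over [start, end), reading only overlapping buckets."""
        key = (metric, tuple(sorted(tags.items())))
        buckets = self._series.get(key, {})
        values = [
            value
            for bucket, points in buckets.items()
            if bucket + BUCKET_SECONDS > start and bucket < end
            for ts, value in points
            if start <= ts < end
        ]
        return sum(values) / len(values) if values else None


store = ToySeriesStore()
store.write("net.latency_ms", {"host": "a"}, 10, 4.0)
store.write("net.latency_ms", {"host": "a"}, 20, 6.0)
store.write("net.latency_ms", {"host": "a"}, 4000, 100.0)  # lands in the next bucket
store.write("net.latency_ms", {"host": "b"}, 15, 50.0)     # a separate series
avg_first_hour = store.average("net.latency_ms", {"host": "a"}, 0, 3600)
```

Aggregating a single series over a time range is cheap in this layout. A question that cuts across many series, such as grouping by a tag, would have to visit every series key, which is where this model starts to strain.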

If you do need analytics, for instance to group on non-time-based tags and attributes or to slice and dice your metrics arbitrarily, you’ll need a database that is architecturally designed for that workflow. For example, consider a use case with server metrics. If you simply want to compute the average network latency of the servers across your data center, most TSDBs will serve you well. However, if you also want to perform complex slice-and-dice operations across multiple dimensions, such as computing the 95th percentile network latency of your data centers, grouped by data center location and server type and filtered by a specific set of IPs, TSDBs can struggle with these queries at scale.
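The difference between the two queries in that example can be sketched in a few lines of plain Python. This only illustrates the shape of each computation over a handful of made-up rows; it is not Druid or TSDB query syntax, and the field names, IP addresses, and the nearest-rank percentile helper are assumptions chosen for the sketch.

```python
import math
from collections import defaultdict

# Hypothetical event rows; every field name and value here is made up.
events = [
    {"ip": "10.0.0.1", "location": "us-east", "server_type": "web", "latency_ms": 10.0},
    {"ip": "10.0.0.1", "location": "us-east", "server_type": "web", "latency_ms": 30.0},
    {"ip": "10.0.0.2", "location": "us-east", "server_type": "db", "latency_ms": 5.0},
    {"ip": "10.0.0.3", "location": "eu-west", "server_type": "web", "latency_ms": 20.0},
    {"ip": "10.0.0.4", "location": "eu-west", "server_type": "web", "latency_ms": 99.0},
]


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Simple case: a single aggregate over every row.
average_latency = sum(e["latency_ms"] for e in events) / len(events)

# Slice-and-dice case: filter by IP, group on two non-time dimensions,
# then compute the 95th percentile within each group.
wanted_ips = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
groups = defaultdict(list)
for e in events:
    if e["ip"] in wanted_ips:
        groups[(e["location"], e["server_type"])].append(e["latency_ms"])

p95_by_group = {key: percentile(vals, 95) for key, vals in groups.items()}
```

The second query touches three dimensions besides time (IP, location, server type) and needs a non-additive aggregate, which is the kind of work the paragraph above says becomes difficult at scale when data is organized as individual metric series.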

Druid is a real-time analytics database that incorporates architectural ideas from TSDBs, such as time-based partitioning and fast aggregation, and also borrows from search systems and data warehouses, making it a good fit for all types of event-driven data. At heart, Druid is an OLAP engine, albeit one designed for more modern, event-driven architectures.

Because Druid also draws inspiration from data warehouses, its architecture handles complex analytic queries better than that of a typical TSDB. It offers the following advantages:

  • Column-oriented storage, ideal for multi-dimensional groupBys on columns that are not time-based.
  • Fast search and filter through inverted indexes, for fast ad-hoc slice and dice.
  • Minimal schema design, and native support for semi-structured and nested data.
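As a rough illustration of why column-oriented storage and inverted indexes help with filtering and grouping, consider the toy sketch below. It is not Druid's actual storage format: the column names and values are invented, and plain Python sets stand in for whatever compact index structure a real engine would use.

```python
from collections import defaultdict

# Hypothetical rows, stored column by column rather than row by row.
columns = {
    "location": ["us-east", "us-east", "eu-west", "eu-west"],
    "server_type": ["web", "db", "web", "web"],
    "latency_ms": [10.0, 5.0, 20.0, 40.0],
}


def build_inverted_index(column):
    """Map each distinct value to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, value in enumerate(column):
        index[value].add(row_id)
    return index


indexes = {
    name: build_inverted_index(columns[name])
    for name in ("location", "server_type")
}

# Filter: server_type = 'web' AND location = 'eu-west', resolved by
# intersecting two index entries instead of scanning the columns.
matching_rows = indexes["server_type"]["web"] & indexes["location"]["eu-west"]

# Group by location for web servers: only the columns the query names
# (server_type via its index, location, latency_ms) are ever read.
sums = defaultdict(float)
for row_id in indexes["server_type"]["web"]:
    sums[columns["location"][row_id]] += columns["latency_ms"][row_id]
```

Here the filter never scans a column, and the group-by reads two columns regardless of how many other fields each event carries. Both properties matter when events have hundreds of fields.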

To summarize, Druid tends to perform better on large data sets that require grouping and filtering on many columns by many concurrent users.

A great way to get hands-on with Druid is through a Free Imply Download or Imply Cloud Trial.

