Introducing Apache Druid 0.18.0

The Apache Druid community released Druid 0.18 on April 20th, 2020. This release contains over 200 new features, performance enhancements, bug fixes, and major documentation improvements from 42 contributors.

As always, you can visit the Apache Druid download page to download the software and read the full release notes detailing every change. This Druid release is also available as part of the Imply distribution, which includes Imply Pivot as well.

Support wider business intelligence use cases

On the path towards being a comprehensive analytics engine, Druid has picked up a few new capabilities in 0.18, including improved subqueries, JOINs, and GROUPING SETS.

Improved subqueries

Queries that have multiple stages of calculations are easier to express with subqueries. The Druid engine has improved support for subqueries in 0.18. You can find more details about subquery support in the Druid documentation.

Subqueries look like this:

[Figure: example subquery]

Note that all subqueries in a single query share the same limit of 100,000 rows by default. This is because the intermediate results live in Broker memory, which is a shared, limited resource. You should therefore avoid writing subqueries that produce large result sets.
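As a rough illustration of the shape of such a query, here is a minimal sketch that runs a two-stage aggregation with a subquery. It uses Python's built-in SQLite as a stand-in so that it is runnable without a cluster; it is not Druid itself, and the `visits` table, its columns, and the sample rows are made up for the example.

```python
import sqlite3

# SQLite is only a stand-in here; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (country TEXT, user_id TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("US", "a"), ("US", "a"), ("US", "b"), ("FR", "c"), ("FR", "c"), ("FR", "c")],
)

# Stage 1 (inner query): number of visits per user.
# Stage 2 (outer query): average visits per user, by country.
SUBQUERY_SQL = """
SELECT country, AVG(visit_count) AS avg_visits_per_user
FROM (
  SELECT country, user_id, COUNT(*) AS visit_count
  FROM visits
  GROUP BY country, user_id
) AS per_user
GROUP BY country
ORDER BY country
"""

rows = conn.execute(SUBQUERY_SQL).fetchall()
for country, avg_visits in rows:
    print(country, avg_visits)
```

The inner query here returns one row per user, which is exactly the kind of intermediate result that counts against the subquery row limit described above.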

JOIN

In relational databases, normalized schemas are commonplace. This means that analytical engines must be able to merge data across tables. SQL JOINs provide this capability.

Prior to 0.18.0, Druid supported some JOIN-like features, such as lookups and semi-joins in SQL. Druid 0.18.0 supports real joins for the first time, including INNER, LEFT, and CROSS joins.

With JOIN and subqueries, you can now express many common queries used in business intelligence use cases.

In 0.18, JOIN support is limited to lookups, inline queries, and subquery data sources. We are planning to add support for more data sources in the future. Let us know in the community what you want to see.
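To make the difference between INNER and LEFT joins concrete, here is a small sketch that joins a fact table against a key-value table playing the role of a lookup. Again, SQLite stands in for Druid so the example can run anywhere; the table names, the `k`/`v` column names, and the data are assumptions made for illustration.

```python
import sqlite3

# SQLite stand-in; "country_lookup" mimics a key/value lookup table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (country_code TEXT, amount INTEGER)")
conn.execute("CREATE TABLE country_lookup (k TEXT, v TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("US", 10), ("FR", 5), ("US", 7), ("ZZ", 1)],
)
conn.executemany(
    "INSERT INTO country_lookup VALUES (?, ?)",
    [("US", "United States"), ("FR", "France")],
)

# INNER JOIN: rows without a matching lookup key ("ZZ") are dropped.
INNER_SQL = """
SELECT c.v AS country, SUM(s.amount) AS total
FROM sales AS s
INNER JOIN country_lookup AS c ON s.country_code = c.k
GROUP BY c.v
ORDER BY c.v
"""

# LEFT JOIN: every sales row is kept; unmatched keys get a NULL country.
LEFT_SQL = """
SELECT s.country_code, c.v AS country, SUM(s.amount) AS total
FROM sales AS s
LEFT JOIN country_lookup AS c ON s.country_code = c.k
GROUP BY s.country_code, c.v
ORDER BY s.country_code
"""

inner_rows = conn.execute(INNER_SQL).fetchall()
left_rows = conn.execute(LEFT_SQL).fetchall()
print(inner_rows)
print(left_rows)
```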

GROUPING SETS

In 0.18, Druid has also introduced GROUPING SETS. For example, GROUP BY GROUPING SETS ( (country, city), (country), () ) produces a result set that contains three groups of rows: an aggregation (e.g., a sum) at the country-city level, then the same aggregation at the country level, where city is null, followed by a grand total.

This is especially useful when a report needs multiple layers of roll-up, for example from city, to state, to region level. It is also very efficient, as the data only needs to be scanned once.
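The single-scan idea can be sketched in plain Python. This is only an emulation of what GROUPING SETS ( (country, city), (country), () ) computes, not how Druid implements it; the rows and column names are invented for the example.

```python
from collections import defaultdict

# Hypothetical input rows: (country, city, amount).
rows = [
    ("US", "NYC", 10),
    ("US", "SF", 5),
    ("FR", "Paris", 7),
]

columns = ("country", "city")
grouping_sets = [("country", "city"), ("country",), ()]

# One pass over the data updates every grouping set at once.
totals = defaultdict(int)
for country, city, amount in rows:
    record = {"country": country, "city": city}
    for grouping_set in grouping_sets:
        # Columns that are not part of the grouping set are reported as None (null).
        key = tuple(record[c] if c in grouping_set else None for c in columns)
        totals[(grouping_set, key)] += amount

for (grouping_set, (country, city)), total in totals.items():
    print(country, city, total)
```

The grouping set is kept as part of the dictionary key so that a null produced by the roll-up is never confused with a null that might appear in the data itself.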

Please try out these features and let us know your feedback.

Query laning and dynamic prioritization

In multi-tenant environments with heterogeneous workloads, resource contention can lead to short-running queries having to wait in a queue for a long time. In Druid 0.18, we’ve introduced query laning and dynamic prioritization, giving you more control over how resources in a cluster are allocated. See more details here.

With laning, the broker examines and classifies a query and assigns it a lane. You also have the option to manually assign a lane. You can specify the maximum amount of resources a lane can use, ensuring some capacity is left to handle other workloads, such as short-running queries.

Automatic query prioritization determines the priority of a query based on how expensive it is. It takes into account how much data is scanned, how far back in history the query is trying to read, and a few other factors to estimate the cost of the query.
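A threshold-based version of this idea might look like the sketch below. The `QueryInfo` fields, the thresholds, and the size of the penalty are illustrative assumptions, not Druid's actual parameters or defaults.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class QueryInfo:
    interval_start: datetime  # start of the time range the query reads
    interval_end: datetime    # end of the time range the query reads
    segment_count: int        # rough proxy for how much data is scanned
    priority: int = 0


def adjusted_priority(
    query,
    now,
    max_lookback=timedelta(days=30),
    max_duration=timedelta(days=7),
    max_segments=100,
    penalty=5,
):
    """Lower the priority of queries that look expensive (hypothetical rule)."""
    expensive = (
        now - query.interval_start > max_lookback
        or query.interval_end - query.interval_start > max_duration
        or query.segment_count > max_segments
    )
    return query.priority - penalty if expensive else query.priority


now = datetime(2020, 4, 20)
recent = QueryInfo(datetime(2020, 4, 19), datetime(2020, 4, 20), segment_count=3)
historical = QueryInfo(datetime(2019, 1, 1), datetime(2019, 1, 2), segment_count=3)

print(adjusted_priority(recent, now))
print(adjusted_priority(historical, now))
```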

This is another step towards smarter resource allocation and it gives cluster administrators another tool in their toolbox.

SQL dynamic parameters

Druid now supports dynamic parameters for SQL. This can be used to enhance security, because parameter strings are correctly escaped to avoid SQL injection. It also reduces the need to pass large blobs of data into the query planner. Because those blobs no longer need to be parsed by the planner at planning time, query planning becomes faster and needs less memory.
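As a sketch of what a parameterized request could look like, the snippet below builds a JSON body with `?` placeholders in the query text and the values passed separately in a `parameters` list. The helper name, the `visits` table, and the values are made up, and the snippet only builds the body without sending it; check the Druid SQL API documentation for the exact request format before relying on it.

```python
import json


def build_sql_request(query, parameters):
    """Build a JSON request body with the values kept out of the SQL text.

    `parameters` is a list of (sql_type, value) pairs, matched to the `?`
    placeholders in order. This helper is hypothetical.
    """
    return json.dumps(
        {
            "query": query,
            "parameters": [{"type": t, "value": v} for t, v in parameters],
        }
    )


payload = build_sql_request(
    "SELECT COUNT(*) FROM visits WHERE country = ? AND city = ?",
    [("VARCHAR", "US"), ("VARCHAR", "O'Fallon")],
)
print(payload)
```

Note that the value containing a quote character never appears in the SQL string, so there is nothing for the caller to escape by hand.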

Roaring bitmaps as default

Druid supports two bitmap types, Roaring and CONCISE. Since Roaring bitmaps provide a better out-of-the-box experience (faster query speed in general), the default bitmap type has been switched to Roaring.
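If you would rather keep CONCISE bitmaps for newly written segments, the bitmap type can be set explicitly in the ingestion spec. The fragment below is a sketch that assumes the setting lives under `indexSpec` in the tuning config; verify the exact location and field names against the ingestion documentation for your Druid version.

```python
import json

# Assumed fragment of a tuningConfig; only the bitmap setting is shown.
tuning_config_fragment = {
    "indexSpec": {
        "bitmap": {"type": "concise"}  # use "roaring" for the new default
    }
}

print(json.dumps(tuning_config_fragment, indent=2))
```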

Other items

For a full list of all new functionality in Druid 0.18.0, head over to the Apache Druid download page and check out the release notes!

