Introducing Apache Druid 0.18.0

The Apache Druid community released Druid 0.18 on April 20th, 2020. This release contains over 200 new features, performance enhancements, bug fixes, and major documentation improvements from 42 contributors.

As always, you can visit the Apache Druid download page to download the software and read the full release notes detailing every change. This Druid release is also available as part of the Imply distribution, which includes Imply Pivot as well.

Support wider business intelligence use cases

On the path towards being a comprehensive analytics engine, Druid has picked up a few new capabilities in 0.18, including improved subqueries, JOINs, and GROUPING SETS.

Improved subqueries

Queries that have multiple stages of calculation are easier to express with subqueries, and the Druid engine has improved support for them in 0.18. You can find more details about subquery support in the Druid documentation.

Subqueries look like this:

[Image: example SQL query with a subquery]
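
As a rough illustration of the shape, the sketch below runs a two-stage query: the inner query totals a measure per user, and the outer query aggregates over those totals. The table and column names (edits, user_name, added) are made up for this example, and Python's built-in sqlite3 module is used only as a local stand-in so the query can be executed. Against Druid, SQL text of the same kind would be sent to the broker instead.

```python
import sqlite3

# Hypothetical table and columns; SQLite is only a stand-in for Druid here.
SUBQUERY_SQL = """
SELECT COUNT(*) AS users, AVG(total_added) AS avg_added_per_user
FROM (
  SELECT user_name, SUM(added) AS total_added
  FROM edits
  GROUP BY user_name
) AS per_user
"""


def run_subquery_demo():
    """Load a few sample rows and run the two-stage query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edits (user_name TEXT, added INTEGER)")
    conn.executemany(
        "INSERT INTO edits VALUES (?, ?)",
        [("alice", 10), ("alice", 30), ("bob", 20)],
    )
    # Stage 1 (inner): alice -> 40, bob -> 20. Stage 2 (outer): count and average.
    row = conn.execute(SUBQUERY_SQL).fetchone()
    conn.close()
    return row


print(run_subquery_demo())  # (2, 30.0)
```

The inner result set is the intermediate table: it is computed first and then fed to the outer aggregation.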

Note that, by default, all subqueries in a single query share one limit of 100,000 rows. This is because the intermediate table lives in the broker's memory, which is a shared, limited resource. You should therefore avoid writing subqueries that output large result sets.

JOIN

In relational databases, normalized schemas are commonplace. This means that analytical engines must be able to merge data across tables. SQL JOINs provide this capability.

Prior to 0.18.0, Druid supported some JOIN-like features, such as lookups and semi-joins in SQL. Druid 0.18.0 supports real joins for the first time, including INNER, LEFT, and CROSS joins.

With JOIN and subqueries, you can now express many common queries used in business intelligence use cases.


In 0.18, JOIN support is limited to lookups, inline queries, and subquery data sources. We plan to add support for more data sources in the future. Let us know in the community what you would like to see.
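
A minimal sketch of a join against a lookup is shown below. The query text follows the shape Druid SQL uses for lookups, where a lookup appears under the lookup schema with a key column k and a value column v. The sales table, the store_to_product lookup, and the sample rows are assumptions for illustration. SQLite (via Python's sqlite3) is again only a local stand-in, with an attached in-memory database named lookup imitating the schema.

```python
import sqlite3

# Hypothetical datasource and lookup names; SQLite stands in for Druid.
JOIN_SQL = """
SELECT store_to_product.v AS product, SUM(sales.revenue) AS revenue
FROM sales
INNER JOIN lookup.store_to_product ON store_to_product.k = sales.store
GROUP BY store_to_product.v
ORDER BY product
"""


def run_join_demo():
    """Join a fact table to a key/value lookup and total revenue per product."""
    conn = sqlite3.connect(":memory:")
    # An attached database named "lookup" imitates Druid's lookup schema.
    conn.execute("ATTACH DATABASE ':memory:' AS lookup")
    conn.execute("CREATE TABLE lookup.store_to_product (k TEXT, v TEXT)")
    conn.execute("CREATE TABLE sales (store TEXT, revenue INTEGER)")
    conn.executemany(
        "INSERT INTO lookup.store_to_product VALUES (?, ?)",
        [("s1", "apples"), ("s2", "pears")],
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("s1", 5), ("s1", 7), ("s2", 3), ("s9", 100)],
    )
    rows = conn.execute(JOIN_SQL).fetchall()
    conn.close()
    return rows


print(run_join_demo())  # [('apples', 12), ('pears', 3)]
```

Store s9 has no entry in the lookup, so the INNER join drops its row. A LEFT join would keep it with a null product.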

GROUPING SETS


In 0.18, Druid has also introduced GROUPING SETS. For example, GROUP BY GROUPING SETS ( (country, city), (country), () ) produces a result set containing three sets of rows: an aggregation (e.g., a sum) broken down at the country-city level, then at the country level (where city is null), followed by a grand total.

This is especially useful when a report needs multiple layers of roll-up, for example from city, to state, to region level. It is also very efficient, because the data only needs to be scanned once.
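
To make the result shape concrete, here is a small pure-Python emulation of what GROUPING SETS computes for a SUM. It is not how Druid implements the feature. The function name, the sample rows, and the sales measure are all hypothetical. Columns that a grouping set leaves out are reported as None, mirroring the nulls described above.

```python
from collections import defaultdict


def grouping_sets_sum(rows, grouping_sets, all_dims, measure):
    """Emulate SUM(measure) ... GROUP BY GROUPING SETS (...) in one pass."""
    totals = [defaultdict(int) for _ in grouping_sets]
    for row in rows:  # each input row is read once and feeds every grouping set
        for bucket, dims in zip(totals, grouping_sets):
            # Dimensions outside the current grouping set become None (null).
            key = tuple(row[d] if d in dims else None for d in all_dims)
            bucket[key] += row[measure]
    results = []
    for bucket in totals:
        for key, total in bucket.items():
            results.append(key + (total,))
    return results


sample = [
    {"country": "US", "city": "NYC", "sales": 10},
    {"country": "US", "city": "SF", "sales": 5},
    {"country": "FR", "city": "Paris", "sales": 7},
]
# Equivalent in spirit to:
#   GROUP BY GROUPING SETS ( (country, city), (country), () )
for line in grouping_sets_sum(
    sample, [("country", "city"), ("country",), ()], ("country", "city"), "sales"
):
    print(line)
```

With this sample, the output lists three country-city rows, then two country rows with city set to None, then one grand-total row where both columns are None.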

Please try out these features and let us know your feedback.

Query laning and dynamic prioritization

In multi-tenant environments with heterogeneous workloads, resource contention can lead to short-running queries having to wait in a queue for a long time. In Druid 0.18, we’ve introduced query laning and dynamic prioritization, giving you more control over how resources in a cluster are allocated. More details are available in the Druid documentation.

With laning, the broker examines and classifies a query and assigns it a lane. You also have the option to manually assign a lane. You can specify the maximum amount of resources a lane can use, ensuring some capacity is left to handle other workloads, such as short-running queries.
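
The idea of capping a lane can be sketched with a toy admission check. This is a simplified model, not Druid's scheduler: the LaneLimiter class, the lane name "low", and the slot counts are all assumptions made for illustration.

```python
class LaneLimiter:
    """Toy model: a shared pool of query slots plus per-lane caps."""

    def __init__(self, total_slots, lane_limits):
        self.total_slots = total_slots
        self.lane_limits = dict(lane_limits)
        self.running_total = 0
        self.running_by_lane = {lane: 0 for lane in self.lane_limits}

    def try_acquire(self, lane=None):
        """Return True if a query in the given lane may start now."""
        if self.running_total >= self.total_slots:
            return False  # the whole pool is busy
        limit = self.lane_limits.get(lane)
        if limit is not None:
            if self.running_by_lane[lane] >= limit:
                return False  # this lane has used up its share
            self.running_by_lane[lane] += 1
        self.running_total += 1
        return True

    def release(self, lane=None):
        """Free the slot held by a finished query."""
        self.running_total -= 1
        if lane in self.lane_limits:
            self.running_by_lane[lane] -= 1


limiter = LaneLimiter(total_slots=4, lane_limits={"low": 2})
print(limiter.try_acquire("low"), limiter.try_acquire("low"))  # True True
print(limiter.try_acquire("low"))  # False: the lane cap is reached
print(limiter.try_acquire())       # True: unlaned queries still get a slot
```

Because the "low" lane can hold at most two of the four slots, the remaining capacity stays available for other workloads.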

Automatic query prioritization determines the query priority based on how expensive a query is. It takes account of how much data is scanned, how far back in history the query is trying to read from, and a few other factors to estimate the cost of a query.
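
One way to picture this is a threshold rule: if a query reaches too far back, spans too long an interval, or touches too many segments, its priority is lowered. The sketch below is an illustration under those assumptions only. The function name, the threshold values, and the penalty are hypothetical and are not Druid's actual configuration keys or defaults.

```python
from datetime import datetime, timedelta


def adjusted_priority(
    base_priority,
    interval_start,
    interval_end,
    segment_count,
    now,
    max_age=timedelta(days=30),      # hypothetical: how far back is "recent"
    max_duration=timedelta(days=7),  # hypothetical: longest "cheap" interval
    max_segments=100,                # hypothetical: most segments for a "cheap" query
    penalty=5,                       # hypothetical: priority reduction
):
    """Lower the priority of a query that crosses any cost threshold."""
    too_old = now - interval_start > max_age
    too_long = interval_end - interval_start > max_duration
    too_many = segment_count > max_segments
    if too_old or too_long or too_many:
        return base_priority - penalty
    return base_priority


now = datetime(2020, 4, 20)
print(adjusted_priority(0, now - timedelta(hours=1), now, 3, now))  # 0
print(adjusted_priority(0, now - timedelta(days=90), now, 3, now))  # -5
```

A lower number here means the query is treated as less urgent, so the cheap one-hour query keeps its priority while the 90-day query is demoted.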

This is another step towards smarter resource allocation and it gives cluster administrators another tool in their toolbox.

SQL dynamic parameters

Druid now supports dynamic parameters for SQL. This can enhance security, because parameter strings are correctly escaped to avoid SQL injection. It also reduces the need to pass large blobs of data into the query planner. Because data blobs don’t need to be parsed during planning, query planning becomes faster and requires less memory.
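
As a sketch of how this looks from a client, the snippet below builds the JSON body for a request to Druid's SQL HTTP endpoint (/druid/v2/sql), with question marks as placeholders in the query and a separate parameters list of typed values. The helper name build_sql_request, the wikipedia datasource, and the column names are assumptions for illustration, and no request is actually sent.

```python
import json


def build_sql_request(query, params):
    """Build a SQL request body with typed, positional dynamic parameters."""
    return {
        "query": query,
        "parameters": [{"type": sql_type, "value": value} for sql_type, value in params],
    }


# The user-supplied value travels separately from the query text, so it is
# never spliced into the SQL string, even if it looks like SQL.
untrusted_channel = "x' OR '1'='1"
request_body = build_sql_request(
    "SELECT COUNT(*) FROM wikipedia WHERE channel = ? AND added > ?",
    [("VARCHAR", untrusted_channel), ("INTEGER", 100)],
)
print(json.dumps(request_body, indent=2))
```

The placeholders are filled in order, so the first parameter binds to channel and the second to added.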

Roaring bitmaps as default

Druid supports two bitmap types, Roaring and CONCISE. Since Roaring bitmaps provide a better out-of-the-box experience (generally faster queries), the default bitmap type has been switched to Roaring.

Other items

For a full list of all new functionality in Druid 0.18.0, head over to the Apache Druid download page and check out the release notes!

