ClickHouse的发版速度是众所周知的快
在最近,他们正式发出了18.12.13-2018-09-10版本
相关的CHANGELOG更是多的吓人
为了能够更好的使用新版特性,特做了详细的介绍
其中新特性部分,为人工翻译、校对,毕竟这部分内容是重点,后面为Google翻译
新特性列表(按照GitHub中CHANGELOG顺序)
支持Decimal
- Added the DECIMAL(digits, scale) data type (Decimal32(scale), Decimal64(scale), Decimal128(scale)). To enable it, use the setting allow_experimental_decimal_type. #2846 #2970 #3008 #3047
SELECT * FROM data_type_families WHERE name LIKE '%De%' ┌─name───────┬─case_insensitive─┬─alias_to─┐ │ Decimal32 │ 1 │ │ │ Decimal64 │ 1 │ │ │ Decimal128 │ 1 │ │ │ Decimal │ 1 │ │ └────────────┴──────────────────┴──────────┘
新的WITH ROLLUP修饰符GROUP BY,替代语法 GROUP BY ROLLUP(…)
- New WITH ROLLUP modifier for GROUP BY (alternative syntax: GROUP BY ROLLUP(…)). #2948
JOIN查询会把*解析为字段
- In requests with JOIN, the star character expands to a list of columns in all tables, in compliance with the SQL standard. You can restore the old behavior by setting - asterisk_left_columns_only to 1 on the user configuration level. Winter Zhang
JOIN支持table functions(remote/merge/numbers/url)
- Added support for JOIN with table functions. Winter Zhang
终端支持tab自动补全
- Autocomplete by pressing Tab in clickhouse-client. Sergey Shcherbin
终端支持ctrl c取消输入
- Ctrl+C in clickhouse-client clears a query that was entered. #2877
可指定默认的JOIN行为
- Added the join_default_strictness setting (values: “, ‘any’, ‘all’). This allows you to not specify ANY or ALL for JOIN. #2982
server log关联查询ID
- Each line of the server log related to query processing shows the query ID. #2482
2018.09.15 23:01:12.934700 [ 275 ] {b7a52046-e852-41c9-9e13-54533917bc56} <Debug> executeQuery: (from xx.xx.80.34:37066, user: user) select * from numbers(100)
2018.09.15 23:01:12.934984 [ 275 ] {b7a52046-e852-41c9-9e13-54533917bc56} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2018.09.15 23:01:12.935029 [ 86 ] <Trace> SystemLog (system.query_log): Flushing system log
2018.09.15 23:01:12.935036 [ 275 ] {b7a52046-e852-41c9-9e13-54533917bc56} <Debug> executeQuery: Query pipeline:
Expression
Expression
Limit
Numbers
2018.09.15 23:01:12.935391 [ 277 ] {b7a52046-e852-41c9-9e13-54533917bc56} <Trace> ThreadStatus: Thread 277 exited
2018.09.15 23:01:12.935449 [ 87 ] <Trace> SystemLog (system.query_thread_log): Flushing system log
2018.09.15 23:01:12.935517 [ 275 ] {b7a52046-e852-41c9-9e13-54533917bc56} <Information> executeQuery: Read 100 rows, 800.00 B in 0.001 sec., 146631 rows/sec., 1.12 MiB/sec.
2018.09.15 23:01:12.935557 [ 275 ] <Debug> MemoryTracker: Peak memory usage (total): 1.00 MiB.
2018.09.15 23:01:12.935572 [ 275 ] <Information> TCPHandler: Processed in 0.001 sec.
终端可以直接print日志
- Now you can get query execution logs in clickhouse-client (use the send_logs_level setting). With distributed query processing, logs are cascaded from all the servers. #2482
SELECT *
FROM system.settings
WHERE name = 'send_logs_level'
┌─name────────────┬─value─┬─changed─┬─description──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ send_logs_level │ none │ 0 │ Send server text logs with specified minumum level to client. Valid values: 'trace', 'debug', 'info', 'warning', 'error', 'none' │
└─────────────────┴───────┴─────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.002 sec.
xx.xx.xx.xx. :) set send_logs_level = 'trace';
SET send_logs_level = 'trace'
Ok.
0 rows in set. Elapsed: 0.001 sec.
xx.xx.xx.xx. :) select * from system.settings where name = 'send_logs_level' ;
SELECT *
FROM system.settings
WHERE name = 'send_logs_level'
[xx.xx.xx.xx.] 2018.09.15 23:30:51.158294 {dbecff24-3edc-4584-8379-02cb23aedcb2} [ 275 ] <Debug> executeQuery: (from 127.0.0.1:59056, user: user) select * from system.settings where name = 'send_logs_level'
[xx.xx.xx.xx.] 2018.09.15 23:30:51.159056 {dbecff24-3edc-4584-8379-02cb23aedcb2} [ 275 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
[xx.xx.xx.xx.] 2018.09.15 23:30:51.159152 {dbecff24-3edc-4584-8379-02cb23aedcb2} [ 275 ] <Debug> executeQuery: Query pipeline:
Expression
Expression
Filter
One
┌─name────────────┬─value─┬─changed─┬─description──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ send_logs_level │ trace │ 1 │ Send server text logs with specified minumum level to client. Valid values: 'trace', 'debug', 'info', 'warning', 'error', 'none' │
└─────────────────┴───────┴─────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
[xx.xx.xx.xx.] 2018.09.15 23:30:51.159654 {dbecff24-3edc-4584-8379-02cb23aedcb2} [ 398 ] <Trace> ThreadStatus: Thread 398 exited
[xx.xx.xx.xx.] 2018.09.15 23:30:51.159803 {dbecff24-3edc-4584-8379-02cb23aedcb2} [ 275 ] <Information> executeQuery: Read 178 rows, 25.15 KiB in 0.001 sec., 127014 rows/sec., 17.53 MiB/sec.
# 分布式查询的日志也会打到当前终端
记录setting行为到query_log
- The system.query_log and system.processes (SHOW PROCESSLIST) tables now have information about all changed settings when you run a query (the nested structure of the Settings data). - Added the log_query_settings setting. #2482
SELECT * FROM system.query_log ORDER BY event_time DESC LIMIT 1 Row 1: ────── type: 1 event_date: 2018-09-15 event_time: 2018-09-15 23:42:17 query_start_time: 2018-09-15 23:42:17 query_duration_ms: 0 read_rows: 0 read_bytes: 0 written_rows: 0 written_bytes: 0 result_rows: 0 result_bytes: 0 memory_usage: 0 query: select * from system.metrics exception: stack_trace: is_initial_query: 1 user: user query_id: 0881d528-a79c-4bd7-8c0a-38ce270b95f1 address: �� M� port: 33384 initial_user: user initial_query_id: 0881d528-a79c-4bd7-8c0a-38ce270b95f1 initial_address: �� M� initial_port: 33384 interface: 2 os_user: client_hostname: client_name: client_revision: 0 client_version_major: 0 client_version_minor: 0 client_version_patch: 0 http_method: 1 http_user_agent: Go-http-client/1.1 quota_key: revision: 54407 thread_numbers: [] ProfileEvents.Names: [] ProfileEvents.Values: [] Settings.Names: ['max_threads','use_uncompressed_cache','background_pool_size','load_balancing','log_queries','readonly','max_memory_usage'] Settings.Values: ['48','0','64','random','1','1','32212254720']
记录线程数
- The system.query_log and system.processes tables now show information about the number of threads that are participating in query execution (see the thread_numbers column). #2482
SELECT thread_numbers FROM system.query_log ORDER BY event_time DESC LIMIT 10 ┌─thread_numbers─┐ │ [] │ │ [88] │ │ [] │ │ [77,173] │ │ [] │ │ [] │ │ [77,174] │ │ [] │ │ [77,171] │ │ [] │ └────────────────┘ SELECT thread_numbers FROM system.processes LIMIT 10 ┌─thread_numbers─┐ │ [77] │ └────────────────┘
增加进程统计信息
-
Added ProfileEvents counters that measure the time spent on reading and writing over the network and reading and writing to disk, the number of network errors, and the time spent - waiting when network bandwidth is limited. #2482
-
Added ProfileEventscounters that contain the system metrics from rusage (you can use them to get information about CPU usage in userspace and the kernel, page faults, and context -
switches), as well as taskstats metrics (use these to obtain information about I/O wait time, CPU wait time, and the amount of data read and recorded, both with and without page c ache). -#2482
-
The ProfileEvents counters are applied globally and for each query, as well as for each query execution thread, which allows you to profile resource consumption by query in detail. #2482
SELECT * FROM system.processes LIMIT 10 Row 1: ────── is_initial_query: 1 user: user query_id: 7e65449f-8899-4c5f-8859-20eeae32d1b1 address: 127.0.0.1 port: 38610 initial_user: user initial_query_id: 7e65449f-8899-4c5f-8859-20eeae32d1b1 initial_address: 127.0.0.1 initial_port: 38610 interface: 1 os_user: root client_hostname: xx.xx.xx.xx. client_name: ClickHouse client client_version_major: 18 client_version_minor: 12 client_version_patch: 13 client_revision: 54407 http_method: 0 http_user_agent: quota_key: elapsed: 0.000800632 is_cancelled: 0 read_rows: 0 read_bytes: 0 total_rows_approx: 0 written_rows: 0 written_bytes: 0 memory_usage: 880 peak_memory_usage: 880 query: select * from system.processes limit 10 thread_numbers: [181] ProfileEvents.Names: ['Query','SelectQuery','ReadCompressedBytes','CompressedReadBufferBlocks','CompressedReadBufferBytes','IOBufferAllocs','IOBufferAllocBytes','ContextLock','RWLockAcquiredReadLocks'] ProfileEvents.Values: [1,1,36,1,10,1,57,3,1] Settings.Names: ['max_threads','use_uncompressed_cache','background_pool_size','load_balancing','log_queries','max_memory_usage'] Settings.Values: ['48','0','64','random','1','64424509440']
新增每个查询执行线程的信息
- Added the system.query_thread_log table, which contains information about each query execution thread. Added the log_query_threads setting. #2482
SELECT * FROM system.query_thread_log ORDER BY event_time DESC LIMIT 3 Row 1: ────── event_date: 2018-09-15 event_time: 2018-09-15 22:29:17 query_start_time: 2018-09-15 22:29:17 query_duration_ms: 4 read_rows: 2178 read_bytes: 1446750 written_rows: 0 written_bytes: 0 memory_usage: 27136 peak_memory_usage: 44996168 thread_name: ParalInputsProc thread_number: 252 os_thread_id: 19255 master_thread_number: 74 master_os_thread_id: 9227 query: select * from system.query_thread_log order by event_time desc limit 10 is_initial_query: 1 user: user query_id: 55740e27-e796-4b0f-a9ff-530363f91d76 address: �� port: 52456 initial_user: user initial_query_id: 55740e27-e796-4b0f-a9ff-530363f91d76 initial_address: �� initial_port: 52456 interface: 1 os_user: root client_hostname: xx.xx.xx.xx. client_name: ClickHouse client client_revision: 54407 client_version_major: 18 client_version_minor: 12 client_version_patch: 13 http_method: 0 http_user_agent: quota_key: revision: 54407 ProfileEvents.Names: ['FileOpen','ReadBufferFromFileDescriptorRead','ReadBufferFromFileDescriptorReadBytes','ReadCompressedBytes','CompressedReadBufferBlocks','CompressedReadBufferBytes','IOBufferAllocs','IOBufferAllocBytes','MarkCacheMisses','CreatedReadBufferOrdinary','DiskReadElapsedMicroseconds','ContextLock','RealTimeMicroseconds','UserTimeMicroseconds','SystemTimeMicroseconds','SoftPageFaults'] ProfileEvents.Values: [80,118,231033,229113,46,1101082,80,42862276,40,40,1135,2,4328,890,2671,678]
新增 system.metrics and system.events
- The system.metrics and system.events tables now have built-in documentation. #3016
SELECT * FROM system.metrics ┌─metric───────────────────────────────────┬──────value─┬─description────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ Query │ 1 │ Number of executing queries │ │ Merge │ 0 │ Number of executing background merges │ │ PartMutation │ 0 │ Number of mutations (ALTER DELETE/UPDATE) │ │ ReplicatedFetch │ 0 │ Number of data parts fetching from replica │ │ ReplicatedSend │ 0 │ Number of data parts sending to replicas │ │ ReplicatedChecks │ 0 │ Number of data parts checking for consistency SELECT * FROM system.events LIMIT 10 ┌─event───────────────────────────────────┬────value─┬─description────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ Query │ 127 │ Number of queries started to be interpreted and maybe executed. Does not include queries that are failed to parse, that are rejected due to AST size limits; rejected due to quota limits or limits on number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries. │ │ SelectQuery │ 124 │ Same as Query, but only for SELECT queries. │ │ FileOpen │ 54504 │ Number of files opened. │ │ Seek │ 47 │ Number of times the 'lseek' function was called.
新增arrayEnumerateDense函数
- Added the arrayEnumerateDense function. Amos Bird
SELECT arrayEnumerateDense([(1, 2), (3, 4), (1, 2), (1, 2), (2, 3), (2, 3)]) ┌─arrayEnumerateDense(array(tuple(1, 2), tuple(3, 4), tuple(1, 2), tuple(1, 2), tuple(2, 3), tuple(2, 3)))─┐ │ [1,2,1,1,3,3] │ └──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
新增arrayCumSumNonNegativ/arrayDifference函数
-
Added the arrayCumSumNonNegative and arrayDifference functions. Aleksey Studnev
-
似乎还没上,无法使用。
新增retention函数
- Added the retention aggregate function. Sundy Li
SELECT
uid,
retention(date = '2018-08-06', date = '2018-08-07', date = '2018-08-08') AS r
FROM retention_test
WHERE date IN ('2018-08-06', '2018-08-07', '2018-08-08')
GROUP BY uid
ORDER BY uid ASC
LIMIT 3
┌─uid─┬─r───────┐
│ 0 │ [1,1,1] │
│ 1 │ [1,1,1] │
│ 2 │ [1,1,1] │
└─────┴─────────┘
states函数可以使用加号云算法
- Now you can add (merge) states of aggregate functions by using the plus operator, and multiply the states of aggregate functions by a nonnegative constant. #3062 #3034
CREATE TABLE add_aggregate
(
a UInt32,
b UInt32
)
ENGINE = Memory
INSERT INTO add_aggregate VALUES(1, 2);
INSERT INTO add_aggregate VALUES(3, 1);
SELECT minMerge(x)
FROM
(
SELECT minState(a) + minState(b) AS x
FROM add_aggregate
)
┌─minMerge(x)─┐
│ 1 │
└─────────────┘
虚拟列
- Tables in the MergeTree family now have the virtual column _partition_id. #3089
SELECT _partition_id FROM test.partition_id ORDER BY _partition_id ASC ┌─_partition_id─┐ │ 197004 │ │ 197004 │ │ 197007 │ │ 197007 │ │ 197010 │ │ 197010 │ │ 201809 │ │ 201809 │ │ 201809 │ └───────────────┘ SELECT * FROM test.partition_id ORDER BY d ASC ┌──────────d─┬───x─┐ │ 1970-04-11 │ 1 │ │ 1970-04-11 │ 1 │ │ 1970-07-20 │ 2 │ │ 1970-07-20 │ 2 │ │ 1970-10-28 │ 3 │ │ 1970-10-28 │ 3 │ │ 2018-09-13 │ 100 │ │ 2018-09-14 │ 100 │ │ 2018-09-15 │ 100 │ └────────────┴─────┘
Bug修复:
- 修复了
Dictionary表的问题(抛出Size of offsets doesn't match size of column或Unknown compression method异常)。此错误出现在版本18.10.3中。 #2913 - 修复了合并
CollapsingMergeTree表时如果其中一个数据部分为空(这些部分是在合并期间形成或者ALTER DELETE所有数据都已删除)的错误,并且该vertical算法用于合并。 #3049 - 在修正了比赛条件
DROP或TRUNCATE用于Memory与同时表SELECT,这可能导致服务器崩溃。此错误出现在1.1.54388版本中。 #3038 - 修复了在返回错误时插入
Replicated表时数据丢失的可能性Session is expired(可以通过ReplicatedDataLoss度量检测到数据丢失)。版本1.1.54378中发生此错误。 #2939 #2949 #2964 - 修复了一段时间内的段错误
JOIN ... ON。 #3000 - 修复了
WHERE表达式完全由限定列名称组成时的错误搜索列名称,例如WHERE table.column。 #2994 - 修复了在执行分布式查询时发生的“未找到列”错误,如果从远程服务器请求包含带有子查询的IN表达式的单个列。 #3087
- 修复了
Block structure mismatch in UNION stream: different number of columns分布式查询发生的错误,如果其中一个分片是本地分区而另一个分片不是,并且PREWHERE触发了移动优化。 #2226 #3037 #3055 #3065 #3073 #3090 #3093 - 修复了
pointInPolygon非凸多边形的某些情况的函数。 #2910 - 修正了
nan与整数比较时的错误结果。 #3024 - 修复了
zlib-ng库中可能导致段错误的错误。 #2854 - 修复了插入带有
AggregateFunction列的表时的内存泄漏,如果聚合函数的状态不简单(单独分配内存),以及单个插入请求是否导致多个小块。 #3084 - 修复了同时创建和删除相同
Buffer或MergeTree表格时的竞争条件。 - 修复了比较由某些非平凡类型组成的元组(例如元组)时出现段错误的可能性。 #2989
- 修复了运行某些
ON CLUSTER查询时出现段错误的可能性。 张冬 - 修复了数组元素
arrayDistinct函数中的错误Nullable。 #2845 #2937 - 该
enable_optimize_predicate_expression选项现在可以正确支持案例SELECT *。 张冬 - 修复了重新初始化ZooKeeper会话时的段错误。 #2917
- 修复了使用ZooKeeper时潜在的阻塞问题。
- 修复了在a中添加嵌套数据结构的错误代码
SummingMergeTree。 - 在为聚合函数的状态分配内存时,正确地考虑了对齐,这使得在实现聚合函数的状态时可以使用需要对齐的操作。 晨星-XC
安全修复:
- 安全使用ODBC数据源。与ODBC驱动程序的交互使用单独的
clickhouse-odbc-bridge进程。第三方ODBC驱动程序中的错误不再导致服务器稳定性或漏洞问题。 #2828 #2879 #2886 #2893 #2921 - 修复了
catBoostPool表函数中文件路径的错误验证问题。 #2894 - 系统表(的内容
tables,databases,parts,columns,parts_columns,merges,mutations,replicas,和replication_queue)根据用户的配置访问数据库过滤(allow_databases)。 张冬
向后不兼容的变化:
- 在使用JOIN的请求中,星形字符扩展为所有表中的列列表,符合 SQL 标准。您可以通过
asterisk_left_columns_only在用户配置级别设置为1 来恢复旧行为。
构建更改:
- 现在大多数集成测试都可以通过提交来运行。
- 代码样式检查也可以通过提交运行。
- 在
memcpy上CentOS7 / Fedora的建设时,实施正确选择。 Etienne Champetier - 使用clang进行构建时,
-Weverything除了常规警告外,还添加了一些警告-Wall-Wextra -Werror。 #2957 - 调试构建使用
jemalloc调试选项。 - 用于与ZooKeeper交互的库的接口被声明为abstract。 #2950
以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网
猜你喜欢:- 解析 Angular 7 的十大特性
- 最全的 Swift 4 新特性解析
- java8新特性Optional深度解析
- java8新特性Optional深度解析
- java8新特性function和lambda深度解析
- npm5 新版功能特性解析及与 yarn 评测对比
本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
Python Algorithms
Magnus Lie Hetland / Apress / 2010-11-24 / USD 49.99
Python Algorithms explains the Python approach to algorithm analysis and design. Written by Magnus Lie Hetland, author of Beginning Python, this book is sharply focused on classical algorithms, but it......一起来看看 《Python Algorithms》 这本书的介绍吧!