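Miller 5.3.0 has been released. Miller is a tool like sed, awk, cut, join, and sort for working with name-indexed data such as CSV, TSV, and tabular JSON.
This release includes documentation improvements and bug fixes, along with the following features: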
- Comment strings in data files: mlr --skip-comments allows you to filter out input lines starting with #, for all file formats. Likewise, mlr --skip-comments-with X lets you specify the comment string X. Comments are only supported at the start of a data line. mlr --pass-comments and mlr --pass-comments-with X allow you to forward comments to program output as they are read.
- The count-similar verb lets you compute cluster sizes by cluster labels.
- While Miller DSL arithmetic gracefully overflows from 64-bit integer to double-precision float, there are now the integer-preserving arithmetic operators .+ .- .* ./ .// for those times when you want integer overflow.
- There is a new bitcount function: for example, echo x=0xf0000206 | mlr put '$y=bitcount($x)' produces x=0xf0000206,y=7.
- Issue 158: mlr -T is an alias for --nidx --fs tab, and mlr -t is an alias for mlr --tsvlite.
- The mathematical constants π and e have been renamed from PI and E to M_PI and M_E, respectively. (It's annoying to get a syntax error when you try to define a variable named E in the DSL, when A through D work just fine.) This is a backward incompatibility, but not enough of one to justify calling this release Miller 6.0.0.
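To make three of these items concrete, here is a minimal Python sketch of the behavior the release notes describe: dropping comment lines, adding integers with 64-bit wraparound, and counting set bits. The function names skip_comments, wrapping_add, and bitcount_model are hypothetical and chosen for illustration; this models the described semantics and is not Miller's implementation.

```python
# Illustrative Python models of three items from the release notes.
# These are assumptions about the described behavior, not Miller's code.

def skip_comments(lines, prefix="#"):
    """Drop lines that begin with the comment prefix (start of line only)."""
    return [line for line in lines if not line.startswith(prefix)]


def wrapping_add(a, b):
    """Add two integers and wrap the result into the signed 64-bit range."""
    total = (a + b) & 0xFFFFFFFFFFFFFFFF  # keep only the low 64 bits
    # Reinterpret the top bit as the sign bit (two's complement).
    return total - (1 << 64) if total >= (1 << 63) else total


def bitcount_model(x):
    """Count the bits set to 1 in a non-negative integer."""
    return bin(x).count("1")


if __name__ == "__main__":
    print(skip_comments(["# a note", "a=1,b=2", "a=3,b=4"]))
    print(wrapping_add(2**63 - 1, 1))   # largest 64-bit integer plus one wraps around
    print(bitcount_model(0xF0000206))   # same input as the bitcount example above
```

The last call uses the value from the release notes, 0xf0000206, whose binary form has seven 1 bits, which agrees with the y=7 shown there.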
Example:
before
county,tiv_2011,tiv_2012,line,construction
SEMINOLE,22890.55,20848.71,Residential,Wood
MIAMI DADE,1158674.85,1076001.08,Residential,Masonry
PALM BEACH,1174081.5,1856589.17,Residential,Masonry
MIAMI DADE,2850980.31,2650932.72,Commercial,Reinforced Masonry
HIGHLANDS,23006.41,19757.91,Residential,Wood
HIGHLANDS,49155.16,47362.96,Residential,Wood
DUVAL,1731888.18,2785551.63,Residential,Masonry
ST. JOHNS,29589.12,35207.53,Residential,Wood
after
$ mlr --icsv --opprint --barred \
  put '$tiv_delta = $tiv_2012 - $tiv_2011; unset $tiv_2011, $tiv_2012' \
  then sort -nr tiv_delta flins.csv
+------------+-------------+----------------+
| county     | line        | tiv_delta      |
+------------+-------------+----------------+
| Duval      | Residential | 1053663.450000 |
| Palm Beach | Residential | 682507.670000  |
| St. Johns  | Residential | 5618.410000    |
| Highlands  | Residential | -1792.200000   |
| Seminole   | Residential | -2041.840000   |
| Highlands  | Residential | -3248.500000   |
| Miami Dade | Residential | -82673.770000  |
| Miami Dade | Commercial  | -200047.590000 |
+------------+-------------+----------------+
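For readers who want to check the arithmetic, the following is a rough Python equivalent of the put and sort steps in the command above, applied to the rows listed under "before". The names FLINS_CSV and tiv_deltas are hypothetical, and Decimal is used here only to keep the two-decimal values exact; this is a sketch of the same computation, not a description of how Miller works internally.

```python
import csv
import io
from decimal import Decimal

# The input rows from the "before" listing, inlined for a self-contained example.
FLINS_CSV = """county,tiv_2011,tiv_2012,line,construction
SEMINOLE,22890.55,20848.71,Residential,Wood
MIAMI DADE,1158674.85,1076001.08,Residential,Masonry
PALM BEACH,1174081.5,1856589.17,Residential,Masonry
MIAMI DADE,2850980.31,2650932.72,Commercial,Reinforced Masonry
HIGHLANDS,23006.41,19757.91,Residential,Wood
HIGHLANDS,49155.16,47362.96,Residential,Wood
DUVAL,1731888.18,2785551.63,Residential,Masonry
ST. JOHNS,29589.12,35207.53,Residential,Wood
"""


def tiv_deltas(csv_text):
    """Compute tiv_delta per record, drop the two source fields, sort descending."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        # pop() removes the field, mirroring the unset in the Miller expression.
        delta = Decimal(rec.pop("tiv_2012")) - Decimal(rec.pop("tiv_2011"))
        rec["tiv_delta"] = delta
        rows.append(rec)
    # Numeric sort, largest delta first (the role of sort -nr in the command).
    rows.sort(key=lambda r: r["tiv_delta"], reverse=True)
    return rows


if __name__ == "__main__":
    for rec in tiv_deltas(FLINS_CSV):
        print(rec["county"], rec["line"], rec["tiv_delta"])
```

The deltas and their ordering agree with the tiv_delta column in the table above; county names are printed as they appear in the input.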
[Notice] This article is reposted from the OSChina open source community [http://www.oschina.net]