- License: LGPL
- Language: Ruby
- Operating system: Cross-platform
- Homepage: http://www.kiba-etl.org/
- Documentation: https://github.com/thbar/kiba
Introduction
Kiba is a lightweight ETL (extract, transform, load) framework for Ruby.
Job definition (xxx.etl):
# standard libraries used below (Date.strptime and YAML.load)
require 'date'
require 'yaml'

# declare a ruby method here, for quick reusable logic
def parse_french_date(date)
  Date.strptime(date, '%d/%m/%Y')
end

# or better, include a ruby file which loads reusable assets
# eg: commonly used sources / destinations / transforms, under unit-test
require_relative 'common'

# declare a pre-processor: a block called before the first row is read
pre_process do
  # do something
end

# declare a source where to take data from (you implement it - see notes below)
source MyCsvSource, 'input.csv'

# declare a row transform to process a given field
transform do |row|
  row[:birth_date] = parse_french_date(row[:birth_date])
  # return to keep in the pipeline
  row
end

# declare another row transform, dismissing rows conditionally by returning nil
transform do |row|
  row[:birth_date].year < 2000 ? row : nil
end

# declare a row transform as a class, which can be tested properly
transform ComplianceCheckTransform, eula: 2015

# before declaring a definition, maybe you'll want to retrieve credentials
config = YAML.load(IO.read('config.yml'))

# declare a destination - like source, you implement it (see below)
destination MyDatabaseDestination, config['my_database']

# declare a post-processor: a block called after all rows are successfully processed
post_process do
  # do something
end

Run the job:
bundle exec kiba my-data-processing-script.etl
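
The job definition refers to classes that you implement yourself. The following is a minimal sketch of what they might look like, assuming Kiba's usual conventions: a source responds to `each` and yields one row at a time, a class-based transform responds to `process(row)` and returns the row (or `nil` to drop it), and a destination responds to `write(row)` and `close`. The CSV layout, the `:eula_year` column, the compliance rule, and the `MyArrayDestination` class (an in-memory stand-in for `MyDatabaseDestination`) are assumptions made for illustration and do not come from the Kiba documentation.

```ruby
require 'csv'

# Source: receives the arguments given after the class name in the job
# file ('input.csv' above) and yields each row as a hash with symbol keys.
class MyCsvSource
  def initialize(input_file)
    @input_file = input_file
  end

  def each
    CSV.foreach(@input_file, headers: true, header_converters: :symbol) do |row|
      yield row.to_h
    end
  end
end

# Class-based transform: returns the row to keep it, nil to dismiss it.
# The rule below (match on a hypothetical :eula_year column) is an assumption.
class ComplianceCheckTransform
  def initialize(options = {})
    @eula = options.fetch(:eula)
  end

  def process(row)
    row[:eula_year].to_i == @eula ? row : nil
  end
end

# Destination: a hypothetical in-memory stand-in for MyDatabaseDestination.
# A real destination would open a connection in initialize and release it in close.
class MyArrayDestination
  attr_reader :rows

  def initialize(rows = [])
    @rows = rows
  end

  def write(row)
    @rows << row
  end

  def close
    # nothing to release for an in-memory array
  end
end
```

Because these are plain Ruby classes with no dependency on Kiba itself, they can be unit-tested in isolation, for example by calling `each`, `process`, and `write` directly with sample rows.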
