An Introduction to the Lambda Architecture

Category: IT Technology · Published: 5 years ago


The Lambda Architecture is a software design pattern that aims to unify data processing. It handles large quantities of data by applying both batch and stream processing. Combining the two methods is how the pattern addresses typical obstacles such as latency, throughput, and fault tolerance.

It is used for high-availability online applications in which data must stay valid despite processing delays. Batch processing generates precise and complete views, while views of the most recent online data are provided at the same time.

Functionality

The Lambda Architecture has three main components, which together handle two main tasks: ingesting and processing newly incoming data, and answering queries on the existing data. Incoming data is handed to both the batch layer and the speed layer for further processing.
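The hand-off to both layers can be sketched as follows. This is a minimal illustration only: the `BatchLayer` and `SpeedLayer` classes, the `dispatch` function, and the record fields are hypothetical names, not part of any particular framework.

```python
from dataclasses import dataclass, field


@dataclass
class BatchLayer:
    """Hypothetical batch layer: stores raw records for later recomputation."""
    master_data: list = field(default_factory=list)

    def ingest(self, record: dict) -> None:
        self.master_data.append(record)


@dataclass
class SpeedLayer:
    """Hypothetical speed layer: holds recent records for low-latency views."""
    recent: list = field(default_factory=list)

    def ingest(self, record: dict) -> None:
        self.recent.append(record)


def dispatch(record: dict, batch: BatchLayer, speed: SpeedLayer) -> None:
    # Every incoming record is handed to both layers.
    batch.ingest(record)
    speed.ingest(record)


batch, speed = BatchLayer(), SpeedLayer()
dispatch({"user": "alice", "clicks": 3}, batch, speed)
```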

Batch Layer

The batch layer manages the master data set, an append-only, immutable collection that contains only raw data. It is typically run on a distributed processing system that can handle massive amounts of data at once.
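To make "append-only" and "immutable" concrete, here is a small in-memory sketch. The `Event` fields and the `MasterDataset` class are assumptions chosen for illustration. A real master data set would live in distributed storage rather than in a Python list.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Event:
    """One raw, immutable fact. The field names are illustrative."""
    user: str
    page: str
    timestamp: int


class MasterDataset:
    """Append-only store: records can be added and read, never changed or removed."""

    def __init__(self) -> None:
        self._events = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def __iter__(self) -> Iterator[Event]:
        # Iterate over a snapshot so readers never see a half-written state.
        return iter(tuple(self._events))

    def __len__(self) -> int:
        return len(self._events)


master = MasterDataset()
master.append(Event("alice", "/home", 1))
master.append(Event("bob", "/home", 2))
```

Because `Event` is declared with `frozen=True`, assigning to a field of a stored event raises an error, and the class deliberately offers no update or delete method.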

It gains its accuracy by processing all available data when it generates views. Because views are precomputed from the complete data set, errors in earlier views can be corrected by recomputation. The output is typically generated with map-reduce.

Map-reduce is a technique that divides a large data set into subsets. A specific function is then applied to each subset (the map step), and the partial results are combined to form the output (the reduce step).
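The split, map, and combine steps can be shown with a single-process sketch that counts page views. The function names and sample records are made up for the example. In a real batch layer the map step would run in parallel on many machines rather than in a list comprehension.

```python
from collections import Counter
from functools import reduce


def split(records, chunk_size):
    """Divide the full data set into subsets of at most chunk_size records."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Map step: count page views within a single subset."""
    return Counter(page for _user, page in chunk)


def merge(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


# Illustrative raw data: (user, page) pairs.
records = [("alice", "/home"), ("bob", "/home"), ("alice", "/docs"),
           ("carol", "/home"), ("bob", "/docs")]

partials = [map_chunk(chunk) for chunk in split(records, chunk_size=2)]
page_views = reduce(merge, partials, Counter())
```

With this sample data, `page_views` ends up with 3 views for `/home` and 2 for `/docs`.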

This output is usually stored in a read-only database, where an update fully replaces the existing precomputed views. The batch layer also allows older data sets to be reprocessed. Analysing them makes it possible to optimize the processing function used in the map-reduce step.
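One way to picture a read-only view store with full replacement is sketched below. `BatchViewStore` is a hypothetical class, not the API of any specific database. It only shows that a new batch run swaps in a complete view instead of patching the old one.

```python
from types import MappingProxyType


class BatchViewStore:
    """Hypothetical read-only view store; the only write is a full replacement."""

    def __init__(self):
        self._view = MappingProxyType({})

    def replace(self, new_view: dict) -> None:
        # Swap in the complete new view; the previous one is discarded, not patched.
        self._view = MappingProxyType(dict(new_view))

    def get(self, key, default=0):
        return self._view.get(key, default)


store = BatchViewStore()
store.replace({"/home": 3, "/docs": 2})   # result of a first batch run
store.replace({"/home": 5})               # a later run replaces everything
```

After the second call, `/docs` is no longer present because the earlier view was dropped as a whole.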

Speed Layer

The speed layer processes data streams in real time. Because of this, it guarantees neither that its data is accurate nor that corrupt data has been fixed. It aims to minimize latency while providing real-time views of the most recent data. Its main purpose is thus to fill the gaps caused by the batch layer's lag in providing views of the most recent data. The output of the speed layer may be discarded once the batch layer's calculations are finished.
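A rough sketch of this behaviour follows. The `SpeedLayer` class, its method names, and the `covered_until` parameter are assumptions made for the example. It keeps only events that no batch view covers yet and drops them once a batch run has caught up.

```python
from collections import Counter


class SpeedLayer:
    """Hypothetical real-time layer holding events not yet covered by a batch view."""

    def __init__(self):
        self._events = []  # (timestamp, page) pairs

    def on_event(self, timestamp: int, page: str) -> None:
        self._events.append((timestamp, page))

    def realtime_view(self) -> Counter:
        # Computed from the small recent buffer only, so it stays fast.
        return Counter(page for _ts, page in self._events)

    def on_batch_complete(self, covered_until: int) -> None:
        # Events up to covered_until are now in the batch views; discard them here.
        self._events = [(ts, page) for ts, page in self._events if ts > covered_until]


speed = SpeedLayer()
speed.on_event(1, "/home")
speed.on_event(2, "/docs")
speed.on_event(3, "/home")
before = speed.realtime_view()
speed.on_batch_complete(covered_until=2)
after = speed.realtime_view()
```

Here `before` counts all three events, while `after` contains only the one event that arrived after the batch run's cut-off.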

Serving Layer

The serving layer combines the output of the batch and speed layers. As the entry point for queries, it receives them and responds to them. It has access to the complete data set because it can use the precomputed views or build views from the processed data.
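Combining both outputs at query time might look like the sketch below. The `query` function and the sample numbers are illustrative. Merging by simple addition assumes that the metric is a count and that the two views cover non-overlapping time ranges.

```python
def query(page: str, batch_view: dict, realtime_view: dict) -> int:
    """Answer a query by merging the precomputed batch view with the real-time view."""
    return batch_view.get(page, 0) + realtime_view.get(page, 0)


# Illustrative views: the batch view covers older data, the real-time view the rest.
batch_view = {"/home": 120, "/docs": 45}
realtime_view = {"/home": 3, "/pricing": 1}

home_views = query("/home", batch_view, realtime_view)
pricing_views = query("/pricing", batch_view, realtime_view)
```

For metrics that are not additive, such as averages or distinct counts, the merge step would need a different combination rule.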

