An Intro into the Lambda Architecture

栏目: IT技术 · 发布时间: 4年前

内容简介：The Lambda Architecture itself is a software design pattern, aiming to unify data processing. Its design enables it to process substantial quantities of data by applying both methods of batch and stream processing. A combination of these methods is used as

The Lambda Architecture itself is a software design pattern, aiming to unify data processing. Its design enables it to process substantial quantities of data by applying both methods of batch and stream processing. A combination of these methods is used as the patterns architecture approaches typical obstacles like latency, throughput and fault-tolerance.

It is used for high availability online applications, where, due to time delays, data validity is required. Generating precise and complete views by using batch processing and providing views of online data is done simultaneously.

Functionality

The Lambda Architecture has three main components, which are responsible for two main tasks. To interact and process newly incoming data and to react to queries on the existing data source. The incoming data sets will be handed off to the batch and the speed layer for further processing.

Batch Layer

The batch layer is responsible for taking care of the master data set. The master data set consists of an append-only, immutable set which only contains raw data. This is done by using a distributed processing system, which may handle massive amounts of data at once.

It gains its accuracy by being able to process all available data whilst generating views. By precomputing views based on the complete data set it is able to eliminate any error in the raw data. The output is typically generated by using map-reduce.

Map-reduce is a technique which takes a large data set and divides it into subsets. A specific function is then performed on each subset. These subsets are combined to form the output.

This output is usually stored in a read-only database, where updates fully delete the existing precomputed views. The batch layer allows the processing of older data sets. By analysing these it is possible to optimize the processing function used in the map-reduce action.

Speed Layer

The speed layer processes data streams in real-time. Therefore it neither guarantees its data to accurate nor to have fixed corrupt data. It attempts to minimize latency whilst granting real-time views into the most recent data. Thus its main purpose is to fill any gaps in the data caused by the batch layer’s lag in providing views based on the most recent data. The output of the speed layer may be thrown away after the calculations of the batch layers are finished.

Serving Layer

The serving layer combines the output from both batch and speed layer. As the initial entry point, it receives queries and responds to them. The complete data set is already available as it can use precomputed views or build them based on the processed data.

以上就是本文的全部内容，希望本文的内容对大家的学习或者工作能带来一定的帮助，也希望大家多多支持码农网

查看所有标签

猜你喜欢:

An Intro into the Lambda Architecture

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

啊哈C语言！逻辑的挑战（修订版）

啊哈磊 / 电子工业出版社 / 2017-1 / 49

《啊哈C语言！逻辑的挑战（修订版）》是一本非常有趣的编程启蒙书，《啊哈C语言！逻辑的挑战（修订版）》从中小学生的角度来讲述，没有生涩的内容，取而代之的是生动活泼的漫画和风趣幽默的文字。配合超萌的编程软件，《啊哈C语言！逻辑的挑战（修订版）》从开始学习与计算机对话到自己独立制作一个游戏，由浅入深地讲述编程的思维。同时，与计算机展开的逻辑较量一定会让你觉得很有意思。你可以在茶余饭后阅读《啊哈C语言！逻......一起来看看《啊哈C语言！逻辑的挑战（修订版）》这本书的介绍吧!

码农工具