Submitted by wtanaka on Sun, 2016-12-18 13:58
Apache Beam is a programming model and an associated SDK for writing programs that process large or possibly unbounded data sets. Beam code is reminiscent of lazy evaluation in functional programming languages, where you can operate on infinitely large data structures.
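To make the lazy-evaluation analogy concrete, here is a minimal sketch in plain Python. It does not use Beam at all, and the names `naturals` and `squares_of_evens` are made up for illustration. It only shows the general idea of describing a computation over an endless sequence and then pulling out a finite piece of the result.

```python
import itertools


def naturals():
    """Yield 0, 1, 2, ... forever; each value is produced only on demand."""
    n = 0
    while True:
        yield n
        n += 1


def squares_of_evens(numbers):
    """Lazily describe a filter-and-map step over a possibly infinite stream."""
    return (n * n for n in numbers if n % 2 == 0)


# Composing the steps does no work yet. Elements are computed only when
# islice asks for them, so the infinite source is never fully materialized.
first_five = list(itertools.islice(squares_of_evens(naturals()), 5))
print(first_five)  # [0, 4, 16, 36, 64]
```

The generators here play a role loosely similar to a pipeline description: they say what should happen to each element without requiring the whole input to exist up front.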
For example, the following code reads lines forever from a Kafka topic, groups the lines into windows of 10 seconds, counts the occurrences of each word within each window, and outputs the words with their counts to another Kafka topic.