Stream Processing Framework in Python – Faust

I was looking for a highly scalable streaming framework in python. I was using spark streaming till now for reading data from streams with heavy through puts. But somehow I felt spark a little heavy as the minimum system requirement is high.

Last day I was researching on this and found one framework called Faust. I started exploring the framework and my initial impression is very good.

This framework is capable of running in distributed way. So we can run the same program in multiple machines. This will enhance the performance.

I tried executing the sample program present in their website and it worked properly. The same program is pasted below. I have used CDH Kafka 4.1.0. The program worked seamlessly.

To execute the program, I have used the following command.

python worker -l info

The above program reads the data from Kafka and prints the message. This framework is not just about reading messages in parallel from streaming sources. This has integrations with an embedded key-value data store RockDB. This is opensourced by Facebook and is written in C++.