Advanced Spark Structured Streaming - Aggregations, Joins, Checkpointing
I wrote a blog post that walks through several advanced Spark Structured Streaming topics.
It covers:
- setting up a Kafka server
- producing messages with Kafka
- consuming tweets with Spark Structured Streaming
- watermarking messages
- parsing JSON data
- performing aggregation queries on the stream of data
- analyzing execution plans of queries
- upserting data to Snowflake
- checkpointing a structured stream
You can find the full blog post here.
This post is licensed under CC BY 4.0 by the author.