data-engineering 25

How I halved the runtime of my PostgreSQL dbt model using DuckDB Mar 3, 2024
Modern data engineering stack Mar 18, 2023
Important skills for data engineers Mar 18, 2023
General guidelines for design of batch jobs Mar 18, 2023
Data pipeline design anti-patterns Mar 18, 2023
Data extraction and transformation design patterns Mar 18, 2023
My view on responsibilities of a modern data engineer Jan 25, 2020
Apache Spark Presentation May 13, 2018
My articles for Sonra Intelligence May 11, 2018
Loading Data into Snowflake Data Warehouse and performance of joins Mar 16, 2018
My favorite features of Snowflake Data Warehouse Mar 14, 2018
Using Spark Structured Streaming to upsert Kafka messages into a database Feb 11, 2018
Clustering keys Snowflake Feb 11, 2018
Advanced Spark Structured Streaming - Aggregations, Joins, Checkpointing Feb 11, 2018
Writing UDAFs on Snowflake Feb 7, 2018
Apache Airflow for data pipelines and ETL management Feb 1, 2018
Ingesting realtime tweets using Apache Kafka, Tweepy and Python Nov 11, 2017
Implementing the Speed Layer of Lambda Architecture using Spark Structured Streaming Nov 11, 2017
Implementing the Serving Layer of Lambda Architecture using Redshift Nov 11, 2017
Implementing the Batch Layer of Lambda Architecture using S3, Redshift and Apache Kafka Nov 11, 2017
Introduction to Lambda Architecture Nov 10, 2017
Window functions in PostgreSQL Nov 3, 2017
T-SQL Window functions syntax Sep 30, 2017
Spark vs Pandas benchmark: Why you should use Spark 2.1 only for really big data Aug 27, 2017
How to fix 'Task not serializable' issues in Apache Spark Jun 12, 2017
