Architecting and implementing a data lakehouse-based Parquet export system to replace deprecated infrastructure, migrating Parquet Exporter notebooks to production Spark jobs with Prometheus and Grafana monitoring.
Building scalable data pipelines with Apache Spark and Databricks to process Delta tables from S3, generating hourly Parquet exports and orchestrating workflows through Airflow DAGs for reliable automation.
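As a minimal sketch of the hourly export layout described above (the bucket name, path scheme, and function name are illustrative assumptions, not the production code), each run can write to a Hive-style `date=/hour=` partition path, which is the same layout Spark's `partitionBy("date", "hour")` would produce:

```python
from datetime import datetime, timezone

def hourly_export_path(base: str, ts: datetime) -> str:
    """Build a Hive-style partition path for one hourly Parquet export.

    `base` and the date=/hour= scheme are illustrative; in the real
    pipeline an Airflow DAG would pass the scheduled run timestamp.
    """
    return f"{base}/date={ts:%Y-%m-%d}/hour={ts:%H}"

# Example: the export for the 03:00 UTC run on 2024-05-01
path = hourly_export_path(
    "s3://analytics-exports/parquet",
    datetime(2024, 5, 1, 3, tzinfo=timezone.utc),
)
print(path)  # s3://analytics-exports/parquet/date=2024-05-01/hour=03
```

Keying partitions to the scheduled timestamp keeps runs idempotent: rerunning an Airflow task for a given hour overwrites the same path instead of duplicating data.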