• Developed a proof of concept for a streaming ETL pipeline processing 25 TB monthly; delivered the end-to-end implementation on AWS using Apache Spark and Databricks Delta Lake (sketch 1 below).
• Led recovery from a critical outage of a server housing time-critical data relied on by ~700 customers. Built an emergency pipeline with AWS and Databricks Auto Loader to restore service within 72 hours, then personally onboarded 20 engineers onto the new system, saving >800 engineer-hours (sketch 2 below).
• Used the Spark UI, Structured Streaming metrics, and Ganglia to analyze, monitor, and document pipeline metrics: data accuracy and completeness, processing time, throughput, error and rejection rates, and resource utilization (sketch 3 below).
• Delivered 75 SQL views and a bulk data download feature, enabling 12 stakeholders to access machine data (sketch 4 below).
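
Sketch 1 — a minimal illustration of the streaming-ETL-to-Delta-Lake pattern from the first bullet, not the production pipeline; the S3 paths, schema, and column names are hypothetical placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_timestamp
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("streaming-etl-poc").getOrCreate()

# Hypothetical schema for incoming machine events.
schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_time", StringType()),
])

# Extract: stream raw JSON files from a landing bucket (hypothetical path).
raw = spark.readStream.schema(schema).json("s3://example-bucket/landing/")

# Transform: parse timestamps and drop malformed records.
cleaned = (raw
           .withColumn("event_time", to_timestamp(col("event_time")))
           .filter(col("device_id").isNotNull()))

# Load: append into a Delta Lake table, checkpointed for fault tolerance.
(cleaned.writeStream
 .format("delta")
 .option("checkpointLocation", "s3://example-bucket/checkpoints/etl-poc")
 .outputMode("append")
 .start("s3://example-bucket/delta/machine_data"))
```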
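
Sketch 2 — a minimal illustration of a Databricks Auto Loader ingestion stream of the kind used in the recovery pipeline (Auto Loader runs only on Databricks); bucket, schema, and checkpoint locations are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("autoloader-recovery").getOrCreate()

# Auto Loader ("cloudFiles") incrementally discovers new files as they land.
stream = (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation",
                  "s3://example-bucket/schemas/recovery")  # inferred-schema store
          .load("s3://example-bucket/incoming/"))          # hypothetical source

# Drain the existing backlog, then stop -- useful when recovering from an outage.
(stream.writeStream
 .format("delta")
 .option("checkpointLocation", "s3://example-bucket/checkpoints/recovery")
 .trigger(availableNow=True)
 .start("s3://example-bucket/delta/recovered_data"))
```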
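
Sketch 3 — a minimal illustration of capturing the Structured Streaming metrics listed above with a query listener (available in PySpark 3.4+); which fields were actually documented is an assumption drawn from the bullet.

```python
from pyspark.sql import SparkSession
from pyspark.sql.streaming import StreamingQueryListener

spark = SparkSession.builder.getOrCreate()

class PipelineMetricsListener(StreamingQueryListener):
    """Logs per-batch throughput and latency for running streaming queries."""

    def onQueryStarted(self, event):
        print(f"query started: {event.id}")

    def onQueryProgress(self, event):
        p = event.progress
        # Rows processed, throughput, and stage durations per micro-batch.
        print(f"batch={p.batchId} rows={p.numInputRows} "
              f"rows/sec={p.processedRowsPerSecond} durations_ms={p.durationMs}")

    def onQueryIdle(self, event):
        pass

    def onQueryTerminated(self, event):
        print(f"query terminated: {event.id}")

spark.streams.addListener(PipelineMetricsListener())
```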
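
Sketch 4 — a minimal illustration of a stakeholder-facing SQL view plus a bulk CSV export; the view, table, and column names are hypothetical placeholders, and a registered Delta table machine_data is assumed.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Publish a persistent view over a registered table (hypothetical names).
spark.sql("""
    CREATE OR REPLACE VIEW machine_data_daily AS
    SELECT device_id,
           DATE(event_time) AS event_date,
           COUNT(*)         AS readings,
           AVG(reading)     AS avg_reading
    FROM machine_data
    GROUP BY device_id, DATE(event_time)
""")

# Bulk download path: export the view's contents as a single CSV extract.
(spark.table("machine_data_daily")
 .coalesce(1)
 .write.mode("overwrite")
 .option("header", True)
 .csv("s3://example-bucket/exports/machine_data_daily"))
```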