Real-Time Fraud Signals Pipeline
Kafka -> Spark Structured Streaming -> Delta -> dbt -> Streamlit. Exactly-once processing, anomaly detection, dbt tests.
- Apache Kafka
- Spark Structured Streaming
- Delta Lake
- dbt
- Streamlit
- Python
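A minimal sketch of the anomaly-detection idea behind the pipeline, in plain Python. This is illustrative only: the rolling-window size, z-score threshold, and warm-up count are assumptions for the example, not the repo's actual logic (which runs inside Spark Structured Streaming).

```python
from collections import deque
from statistics import mean, stdev

def make_zscore_detector(window=50, threshold=3.0):
    """Flag a transaction amount as anomalous when its z-score against
    a rolling window of recent amounts exceeds `threshold`.
    (Parameters are illustrative defaults, not production-tuned.)"""
    history = deque(maxlen=window)

    def score(amount):
        anomalous = False
        if len(history) >= 10:  # wait for a minimal baseline before scoring
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(amount - mu) / sigma > threshold:
                anomalous = True
        history.append(amount)  # update the rolling window either way
        return anomalous

    return score

detect = make_zscore_detector()
amounts = [20, 22, 19, 21, 20, 23, 18, 20, 21, 22, 19, 5000]
flagged = [a for a in amounts if detect(a)]
print(flagged)  # the outlier 5000 is flagged; the baseline amounts are not
```

In the streaming version, the same rolling statistics live in keyed state per account rather than a closure, and flagged events land in a Delta table that dbt tests downstream.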
Senior Data Engineer · Analytics Engineering · Fraud & Revenue Data Platforms
I build production-grade data platforms that reduce fraud loss, improve revenue visibility, and turn ambiguous business problems into reliable analytics systems.
Kafka -> Spark Structured Streaming -> Delta -> dbt -> Streamlit. Exactly-once processing, anomaly detection, dbt tests.
Medallion (Bronze/Silver/Gold) lakehouse over synthetic CDR data. Airflow + MinIO + Iceberg + Great Expectations + dbt.
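The Bronze/Silver/Gold layering can be sketched in plain Python on toy CDR records. Field names here (`caller`, `duration_s`) are hypothetical; the real pipeline does this with Spark over Iceberg tables, orchestrated by Airflow, with Great Expectations enforcing the validation step.

```python
def to_silver(bronze_rows):
    """Silver: parse and validate raw bronze rows, dropping malformed ones.
    (The real pipeline quarantines rejects; here we simply skip them.)"""
    silver = []
    for row in bronze_rows:
        try:
            silver.append({
                "caller": row["caller"].strip(),
                "duration_s": int(row["duration_s"]),
            })
        except (KeyError, ValueError):
            continue  # failed validation
    return silver

def to_gold(silver_rows):
    """Gold: aggregate to a decision-ready metric (total minutes per caller)."""
    totals = {}
    for row in silver_rows:
        totals[row["caller"]] = totals.get(row["caller"], 0) + row["duration_s"]
    return {caller: secs / 60 for caller, secs in totals.items()}

bronze = [
    {"caller": " alice ", "duration_s": "120"},
    {"caller": "bob", "duration_s": "not-a-number"},  # dropped at silver
    {"caller": "alice", "duration_s": "60"},
]
print(to_gold(to_silver(bronze)))  # -> {'alice': 3.0}
```

The point of the layering is that each hop narrows the contract: Bronze keeps raw fidelity, Silver guarantees types and validity, Gold exposes metrics safe for BI consumption.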
CLI that uses Claude to suggest Spark/Snowflake query rewrites and partition strategies. Benchmarked against a 50-query corpus.
I'm a senior data and analytics engineer with six years at AT&T building production data platforms on Azure and Databricks. My work spans PySpark ingestion, dbt analytics models, BI delivery, streaming systems, and data quality. I specialize in turning ambiguous fraud, revenue, and operational problems into reliable pipelines and decision-ready analytics.
I'm open to senior data engineering, analytics engineering, and data platform consulting conversations.