From legacy bottlenecks to platforms that move in minutes.
Senior Data Engineer with 5+ years designing and delivering end-to-end data platforms across fintech, e-commerce, and enterprise environments — scalable, cost-efficient pipelines on AWS, Snowflake, Spark, dbt & Python, with a consistent track record of turning slow, legacy processes into fast, reliable ones.
Leading the modernization of a business-critical platform — migrating ELT from Spark-based AWS Glue to Airbyte + dbt + Redshift, integrating ~10 GB/day from Oracle, SQL Server, OpenEdge & IBM DB2.
Re-engineered legacy Spark pipelines to eliminate memory bottlenecks, cutting a critical pipeline from 8 h → 30 min (16× faster).
Replaced legacy SSIS packages with Airflow DAGs for observability, maintainability & scheduling control.
Rolling out Claude (skills, agents & MCP servers) as a developer-productivity and incident-response accelerator across dbt, Terraform, SQL & Airflow.
Consolidated large on-prem & Amazon RDS Oracle databases into Snowflake — including ~3 TB tables · 50M+ rows/day — via ETL, ELT & Reverse-ETL.
Redesigned the data model from scratch (~20 models across ~50 tables), cutting reporting time from 10 h → 5 min.
Built Snowflake stored procedures and high-volume load patterns with Snowflake + Airflow + Airbyte; enhanced pipelines with Snowpark via Notebooks, Worksheets & Streamlit.
Built end-to-end pipelines for a conflict-of-interest application — ~10 sources at ~40 GB/day into a .NET app.
Architected the data layer with AWS Neptune (entity-relationship logic) and OpenSearch (fast per-company lookups).
Implemented a data lake with Apache Spark + Iceberg on AWS Glue & Athena; orchestrated with Step Functions, Glue & Lambda. Earlier: documented IBM DataStage ETL and automated XML extraction in Python.
Built data marts in Snowflake for BI, Data Science & Operations — ~60 GB/day · peak ~150M rows.
Migrated batch ingestion from Apache NiFi to Spark on AWS Glue — 6 h → 20 min.
Built ETL/ELT for the legal & regulatory reporting behind RappiPay's banking-license process in Colombia — contributing to a new bank that remains in operation today. Added Apache Kafka streaming for near-real-time statements.
Built Python, R & Bash geospatial pipelines, including a census-cleaning script that did in 1 day what took weeks and COVID-19 automation that took a process from 15 h → 15 min.
Developed on-demand geospatial tooling for internal clients and supported ArcGIS / ArcGIS Online through demos and technical talks.
Beyond the pipelines.
Away from the data platforms, I'm happiest in motion: traveling to new places, hiking trails, and playing sport. I currently train CrossFit at an intermediate level and run regularly. Sport demands the same discipline I bring to engineering: show up, measure, improve.
Books I recommend.
A short shelf on stoicism, mindset, and the practice of staying present.