Stan Kudrow • Data Platform Engineer

I'm a data platform engineer with 10+ years of experience designing and building the systems that move, transform, and surface data at scale — across AI, fintech, media, and high-scale analytics environments.

These days I'm drawn to the intersection of data engineering and AI: agentic pipelines, schema governance with human-in-the-loop oversight, and evaluation frameworks that make it safe to iterate quickly on complex transformations.

I care about building platform foundations that teams can actually trust — reliable, observable, and easy to extend, so engineers spend less time firefighting and more time building.

Staff Data Engineer · Figure

2021 – Present

Architected and owned the company's core data platform — a multi-stage lakehouse with standardized ingestion, automated schema evolution, and self-serve onboarding for new data sources. Later led the development of agentic systems that apply LLMs to propose and validate data transformations end-to-end, turning a manual, error-prone process into one that scales without proportional headcount.

Apache Iceberg · Airflow · Kubernetes · Pulumi · LLMs / Agents

Software Engineer · Microsoft

2020 – 2021

Defined the engineering patterns and infrastructure standards for a large-scale data platform migration, establishing the technical foundation the team built on. Drove the performance work that made the new platform a clear step forward — not just a lateral move.

PySpark · Azure · Terraform

Lead Data Engineer · Vice Media

2017 – 2020

Led the modernization of Vice Media's data infrastructure — introducing the orchestration tooling, building the company's foundational data lakes, and driving the migration away from legacy warehouses. Left the data org with a platform that could scale with the business rather than against it.

Airflow · AWS · Snowflake

Agentic ETL Proposal Pipeline

Built an agentic pipeline that runs inference over raw tables to surface schema improvement opportunities (type narrowing, timestamp normalization, redundant column detection), generates a natural-language justification for each proposal, and routes approved changes directly into Airflow and Iceberg workflows. Turned schema governance from a manual review process into a continuously running, self-improving system.

New database onboarding: 5 days → 1 day

Apache Iceberg · Airflow · Python · LLMs / Agents
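
As a rough illustration of the type-narrowing check mentioned in the Agentic ETL Proposal Pipeline above, here is a minimal Python sketch. It is not the production pipeline: `Proposal`, `propose_narrowing`, and the target type names are hypothetical, and it assumes a column arrives as a sample of string values with `None` for nulls. The real system also involves model inference and human approval, which this sketch leaves out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Proposal:
    """A hypothetical schema-change proposal for one column."""
    column: str
    current_type: str
    proposed_type: str
    reason: str


def _all_parse(values, parser):
    """Return True if `parser` accepts every value without raising."""
    try:
        for value in values:
            parser(value)
    except (ValueError, TypeError):
        return False
    return True


def propose_narrowing(column, values):
    """Propose the narrowest type that fits every non-null sampled string value.

    Returns None when the sample is empty or no narrower type fits.
    """
    present = [v for v in values if v is not None]
    if not present:
        return None

    # Candidates are tried from narrowest to widest.
    candidates = [
        ("bigint", int, "integers"),
        ("double", float, "floating-point numbers"),
        ("timestamp", datetime.fromisoformat, "ISO 8601 timestamps"),
    ]
    for type_name, parser, label in candidates:
        if _all_parse(present, parser):
            reason = f"all {len(present)} sampled values parse as {label}"
            return Proposal(column, "string", type_name, reason)
    return None


if __name__ == "__main__":
    print(propose_narrowing("user_id", ["1", "42", None, "7"]))
```

In a setup like the one described, the `reason` field would be the place for a generated justification, and a proposal would only reach a workflow after a reviewer approves it.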

Agentic Evaluation Framework

Built an evaluation system that uses agents to seed test data into Iceberg tables, propose schema and transformation changes, execute full pipelines, and validate outputs automatically. This made it much faster and safer to iterate on complex data transformations. The verifiable output loop also unlocked agentic coding workflows by giving AI agents a reliable signal to iterate against.

Enabled safe, automated iteration on LLM-driven pipelines at scale

Apache Iceberg · Airflow · Python · LLMs / Agents

Unified Data Platform

Designed and built a platform to unify data scattered across 100+ microservices into a single, queryable warehouse. The system automated schema evolution, infrastructure provisioning, and onboarding — so adding a new service went from a week of manual work to a two-hour self-serve process.

New service onboarding: 1 week → 2 hours

BigQuery · Python · Airflow · PostgreSQL

Custom Event Tracking Platform

Replaced a costly third-party event tracking platform with a custom-built system on AWS — Kinesis for ingestion, API Gateway and Lambda for collection, and Redshift for storage and analysis. Gave the analytics team full ownership of their event data pipeline while dramatically cutting infrastructure spend.

Infrastructure costs reduced by 75%

AWS · Kinesis · Lambda · API Gateway · Redshift · Node.js
