Selected systems / technical notes

Representative examples across production AI, data platforms, analytics, privacy-aware infrastructure, commerce automation, and ML data systems.

These notes are based on public career history and sanitized project descriptions. Company names provide employment context only; confidential implementation details are omitted.

Technical Focus

Production AI and data systems

Recurring problem spaces from applied engineering work: governed analytics, durable data platforms, reliable LLM workflows, self-service reporting, and commerce data systems.

Applied AI and Analytics Systems

Production AI and analytical systems that turn ambiguous business, editorial, commerce, and operational signals into governed, reviewable decisions.

  • Conversational analytics, anomaly investigation, classification, and defect-detection workflows
  • LLM workflows with structured outputs, guardrails, review gates, and observable failure modes
  • Retrieval, classification, multimodal verification, and privacy-aware NLP patterns

Data Platform Strategy

Practical architecture and sequencing for fragmented pipelines, warehouse models, and ownership boundaries.

  • Architecture and sequencing across BigQuery, dbt, Airflow, streaming, and PySpark
  • Data quality, lineage, and model contracts for shared datasets
  • Operating model for roadmap, support, and stakeholder intake

LLM Workflow Reliability

Auditable AI workflows with deterministic checks, structured outputs, tool use, human review, and production observability.

  • Guardrails for confidence, safety, retries, and failure investigation
  • Evaluation harnesses for SQL generation, anomaly narratives, page verification, and classification
  • Review gates for multimodal, classification, agentic, and privacy-sensitive systems

Analytics Engineering and Self-Service

Semantic models, dashboard contracts, and enablement practices that let analysts and business teams answer repeat questions with less ad hoc support.

  • Reusable dbt models and metrics with tested definitions
  • Self-service paths that keep sensitive logic governed
  • Training and documentation for engineers, analysts, and operators

Commerce and Operational Data Systems

Commerce and operational reporting systems where inconsistent source signals, delayed updates, and reconciliation gaps make the data hard to trust.

  • Reconciliation tests between source systems and reporting contracts
  • Data models that balance explainability, operational ownership, and reporting trust
  • Workflow checks that make source-system drift visible before it compounds

Selected Work

Production systems across AI, data, and ML infrastructure

The examples below emphasize constraints, architecture, role, and outcomes rather than confidential implementation detail.

AI Analytics

Conversational Analytics Agent

A governed natural-language analytics agent that turns business questions into validated BigQuery analysis across Ads, Editorial, Commerce, and operations workflows.

Context
At Hearst Magazines, business teams needed faster analytical answers from shared warehouse data without bypassing metric ownership, access expectations, or reviewable SQL.
Problem
Naive question-to-SQL was not enough: schema names were ambiguous, metric definitions lived across multiple layers, and users expected follow-up questions, not one-shot query generation.
Constraints
The workflow had to stay governed: no unsafe SQL execution, no unverified answers, and graceful handling of zero-result or ambiguous questions.
Architecture
Built a LangGraph workflow on Vertex AI/Gemini and BigQuery with schema metadata retrieval using embeddings, keyword search, business-term matching, and fuzzy search. Added dry-run validation, SQL safety checks, retries, error-driven tool use, zero-result investigation, and AI judge verification.
Role
Led the architecture and implementation path from prototype behavior to a governed internal workflow, including retrieval design, agent state, validation loops, and single-turn and multi-turn interaction patterns.
Outcome
Conceived and built the initial prototype, then expanded it into a governed analytics agent used across Ads, Editorial, and Commerce workflows while preserving source-of-truth data boundaries and answer traceability.
Demonstrates
One slice of broader applied AI work: metadata-grounded retrieval, SQL safety, stateful agents, evaluation discipline, and governed self-service analytics.

Focus

  • AI analytics
  • conversational BI
  • SQL safety
  • semantic retrieval
  • evals

Stack

  • LangGraph
  • Vertex AI/Gemini
  • BigQuery
  • embeddings
  • schema metadata
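
The validation loop described above pairs deterministic SQL safety checks with dry-run validation before anything executes. A minimal sketch of the deterministic gate is below; function and rule names are illustrative, not the production implementation, and the real workflow would follow a clean result with a BigQuery dry-run job.

```python
import re

# Statements an analytics agent should never execute against the warehouse.
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|MERGE|DROP|TRUNCATE|ALTER|CREATE|GRANT)\b",
    re.IGNORECASE,
)

def check_sql_safety(sql: str) -> list[str]:
    """Return a list of violations; an empty list means the query may
    proceed to the next gate (e.g. a warehouse dry-run validation)."""
    violations = []
    if FORBIDDEN.search(sql):
        violations.append("contains a non-SELECT statement")
    if ";" in sql.rstrip().rstrip(";"):
        violations.append("contains multiple statements")
    if not re.match(r"\s*(SELECT|WITH)\b", sql, re.IGNORECASE):
        violations.append("does not start with SELECT or WITH")
    return violations

# A generated query that tries to mutate state is rejected outright.
print(check_sql_safety("DELETE FROM ads.daily_spend"))
```

Keeping the gate as a pure function makes it trivial to unit-test and to run before spending any warehouse quota on a dry run.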

Applied AI Systems

AI Anomaly Analysis and Classification Systems

Applied AI and statistical analysis workflows that classify content, investigate performance anomalies, and surface operational changes with structured outputs, rules, and reviewable evidence.

Context
At Hearst Magazines, editorial, commerce, and business teams needed faster ways to understand performance changes and classify content without turning every exception into a manual analytics request.
Problem
Performance anomalies, content classification edge cases, and traffic-channel shifts required a mix of statistical detection, business context, and reviewable AI outputs rather than a single model or dashboard.
Constraints
The workflows had to avoid noisy conclusions, preserve business-rule overrides, support human review, and expose enough evidence for operators and stakeholders to trust the result.
Architecture
Led technical direction for a multi-stage LLM anomaly-analysis platform with deterministic guardrails, and built LLM-based article classification using structured outputs, business-rule overrides, and content signals. Complemented the AI workflows with statistical tests, seasonality-aware anomaly detection, and traffic channel shift detection.
Role
Owned architecture and sequencing across AI workflow design, deterministic validation, model-output structure, business-rule integration, and the handoff path from detected signal to actionable investigation.
Outcome
Created reusable patterns for turning performance changes and content ambiguity into reviewed, structured signals that business, editorial, and commerce teams could act on.
Demonstrates
Applied AI beyond chat: anomaly reasoning, LLM classification, statistical detection, business-rule integration, and production workflows where evidence and reviewability matter.

Focus

  • anomaly analysis
  • LLM classification
  • statistical detection
  • business rules
  • reviewable AI

Stack

  • LLM workflows
  • structured outputs
  • statistical tests
  • seasonality models
  • content signals
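
One common way to make anomaly detection seasonality-aware, as described above, is to compare a metric against its own weekday history rather than a global baseline. The sketch below shows that pattern with a z-score test and a structured label; the names and thresholds are illustrative, not the production system.

```python
from statistics import mean, stdev

def weekday_zscore(history: list[float], today: float) -> float:
    """Compare today's value against the same weekday's history.

    `history` holds prior observations for the same weekday (e.g. the
    last eight Mondays), which absorbs weekly seasonality before the
    usual z-score test is applied.
    """
    mu, sigma = mean(history), stdev(history)
    return 0.0 if sigma == 0 else (today - mu) / sigma

def classify(z: float, threshold: float = 3.0) -> str:
    # Emit a stable category rather than a raw number, so downstream
    # LLM narratives and business-rule overrides can key off it.
    if z >= threshold:
        return "spike"
    if z <= -threshold:
        return "drop"
    return "normal"

mondays = [120.0, 118.0, 125.0, 121.0, 119.0, 122.0, 120.0, 123.0]
print(classify(weekday_zscore(mondays, today=60.0)))  # a clear drop vs. Monday history
```

The structured label is the handoff point: deterministic detection decides that something changed, and the reviewable AI layer explains why.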

Data Platform

Shared Data Platform and Self-Service Analytics

A shared analytics foundation for high-volume digital media and commerce data, built to reduce repeated requests and increase safe self-service.

Context
At Hearst Magazines, a large analytics environment processed 10TB+ of data daily, including 5TB+ of clickstream events, while many teams depended on repeated custom SQL and a small group of specialists.
Problem
Analytics demand was growing faster than the platform operating model. Definitions drifted, pipeline ownership was fragmented, and business users needed governed self-service instead of ad hoc ticket queues.
Constraints
The work had to improve reliability without stopping delivery: existing reporting could not break, teams had different skill levels, and source systems spanned batch, streaming, warehouse, and transformation layers.
Architecture
Led platform architecture across BigQuery, Airflow, dbt, Kinesis, and PySpark. Established modeling standards, semantic-layer patterns, data quality checks, ownership practices, and reusable datasets for common analytical paths.
Role
Set roadmap and standards while remaining hands-on in implementation, stakeholder intake, model design, pipeline delivery, training, and migration planning.
Outcome
Reduced analytics backlog by 60%, delivered 100+ pipelines and data models in six months, and helped engineers and analysts adopt safer self-service practices.
Demonstrates
Data platform leadership at scale: technical architecture, operating model, education, and delivery discipline moving together.

Focus

  • data platform
  • semantic layer
  • self-service analytics
  • data quality
  • enablement

Stack

  • BigQuery
  • Airflow
  • dbt
  • Kinesis
  • PySpark
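
The data quality checks mentioned above are the kind of declarative contracts dbt expresses as generic tests (not_null, unique). A pure-Python sketch of the same idea, with illustrative table and column names:

```python
def check_not_null(rows: list[dict], column: str) -> list[dict]:
    """Rows violating a not-null contract on `column`."""
    return [r for r in rows if r.get(column) is None]

def check_unique(rows: list[dict], column: str) -> list[dict]:
    """Rows whose `column` value has already been seen."""
    seen: set = set()
    failures = []
    for r in rows:
        if r[column] in seen:
            failures.append(r)
        seen.add(r[column])
    return failures

orders = [
    {"order_id": 1, "channel": "web"},
    {"order_id": 1, "channel": "app"},   # duplicate key
    {"order_id": 2, "channel": None},    # missing channel
]
print(len(check_unique(orders, "order_id")), len(check_not_null(orders, "channel")))
```

Returning the failing rows, not just a pass/fail flag, is what makes a shared dataset debuggable by the team that owns it.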

AI Workflow Automation

AI Commerce Defect Detection

A reviewable AI workflow that detects commerce catalog and retailer-page defects before operational issues compound.

Context
At Hearst Magazines, commerce operations depended on product availability, retailer content, and catalog state staying aligned across systems that changed outside direct control.
Problem
Manual review did not scale, rules alone missed visual and contextual failures, and operators needed actionable signals rather than noisy alerts.
Constraints
The system had to tolerate unstable web pages, partial extraction, retailer variation, unavailable products, visual ambiguity, and the need for human-reviewable evidence.
Architecture
Combined async web extraction, deterministic rules validation, structured outputs, and gated multimodal review. Gemini screenshot verification produced existence, availability, and confidence signals rather than opaque pass/fail labels.
Role
Designed the workflow boundaries, validation stages, confidence schema, and review path so AI would be used where visual reasoning added value and deterministic checks would handle known cases.
Outcome
Created a defect-detection loop that lets operators prioritize likely catalog and retailer-page issues instead of manually inspecting every product page.
Demonstrates
Practical multimodal AI system design: use rules where possible, use LLM vision where useful, and expose confidence and evidence for operational decisions.

Focus

  • multimodal review
  • catalog validation
  • operator workflow
  • confidence scoring

Stack

  • Gemini
  • async extraction
  • multimodal review
  • rules engine
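
The rules-first gating described above can be sketched as a small decision function plus a structured verdict: deterministic checks settle the clear cases, and only ambiguous pages escalate to a multimodal screenshot check. The field names and rules here are illustrative assumptions, not the production schema.

```python
from dataclasses import dataclass

@dataclass
class PageSignal:
    """Structured verdict for one retailer-page check.

    Confidence and evidence travel with the label so operators can
    triage instead of trusting an opaque pass/fail flag.
    """
    product_found: bool
    available: bool
    confidence: float
    evidence: str

def needs_vision_review(extracted: dict) -> bool:
    # Deterministic rules first: a clear price plus an in-stock marker
    # means no multimodal call is needed. Missing or conflicting
    # signals escalate to screenshot verification.
    has_price = extracted.get("price") is not None
    stock = extracted.get("stock_status")
    if has_price and stock == "in_stock":
        return False
    if stock == "discontinued":
        return False  # rules are already decisive
    return True

print(needs_vision_review({"price": 19.99, "stock_status": "in_stock"}))  # False
print(needs_vision_review({"price": None, "stock_status": None}))         # True
```

Gating this way keeps the expensive vision step reserved for the cases where visual reasoning actually adds value.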

Privacy Data Infrastructure

Consumer-Scale Event and NLP Privacy Systems

High-scale event processing and privacy-oriented data systems supporting safe analytics over large consumer-product datasets.

Context
At Meta, event and product datasets supported analytics, product decisions, and privacy-sensitive workflows across rapidly changing consumer systems.
Problem
Teams needed safer analytics over high-volume data while reducing storage cost, detecting sensitive information earlier, and preserving continuity during an organizational pivot.
Constraints
The systems had to handle 5B+ daily events, support analytics across 60+ NoSQL collections, maintain backward compatibility, and avoid exposing sensitive personal information through analytical workflows.
Architecture
Worked on real-time NLP-based PII detection, safe analytics patterns over NoSQL-derived datasets, cumulative table design, and a backward-compatible data model that could support changing product and organizational requirements.
Role
Contributed to data modeling, pipeline design, privacy-aware analytics infrastructure, and migration support inside large-scale product data environments.
Outcome
Supported privacy-aware analytics at consumer scale, enabled safer access patterns across broad NoSQL-derived data, and reduced storage by 65% through cumulative table design.
Demonstrates
Experience with high-scale event systems, privacy-sensitive data engineering, storage-efficient modeling, and migration work where compatibility matters.

Focus

  • event data
  • PII detection
  • privacy-safe analytics
  • NoSQL analytics
  • storage optimization

Stack

  • real-time event pipelines
  • NLP classification
  • NoSQL-derived datasets
  • cumulative tables
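
Cumulative table design, mentioned above, folds each day's events into one running row per entity so that only the current state is stored and scanned, rather than every daily snapshot. A pure-Python sketch of the fold, with illustrative field names:

```python
def merge_day(cumulative: dict, daily: dict, day: str) -> dict:
    """Fold one day's per-user event counts into the cumulative table.

    `cumulative` maps user_id -> {"total": int, "last_seen": str}.
    Keeping only this running state, instead of retaining every daily
    snapshot, is what makes the design storage-efficient.
    """
    out = dict(cumulative)
    for user_id, count in daily.items():
        prev = out.get(user_id, {"total": 0, "last_seen": None})
        out[user_id] = {"total": prev["total"] + count, "last_seen": day}
    return out

state: dict = {}
state = merge_day(state, {"u1": 3, "u2": 1}, "2024-01-01")
state = merge_day(state, {"u1": 2}, "2024-01-02")
print(state["u1"])  # {'total': 5, 'last_seen': '2024-01-02'}
```

In a warehouse this same fold is typically a full-outer-join of yesterday's cumulative table with today's partition.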

ML Infrastructure

Fintech and E-Commerce ML Data Infrastructure

Machine-learning data infrastructure for fraud, risk, and marketplace workflows, built to shorten model iteration cycles and improve production responsiveness.

Context
Point Predictive and 1stDibs both operated in environments where model performance, data availability, and response time directly affected fraud, risk, marketplace, and analytics workflows.
Problem
Model lifecycle steps were too slow, derived datasets were not yet centralized, and scaling constraints limited how quickly the team could improve and serve analytical signals.
Constraints
The platform needed reliable orchestration, warehouse-backed derived data, streaming ingestion, batch processing, and model-supporting datasets without disrupting active business workflows.
Architecture
Built infrastructure across AWS Step Functions, Redshift, PySpark on EMR, and Kinesis Firehose. Helped establish the first derived-data warehouse and production paths for model lifecycle data, and worked on ML platform modernization including SageMaker migration and AWS ML service adoption.
Role
Worked across data engineering and ML infrastructure, connecting ingestion, transformation, warehouse modeling, and model-supporting datasets into a more durable platform.
Outcome
Reduced model lifecycle time from weeks to hours, improved model performance by 20%, delivered 5x faster response times, and increased scalability by 10x.
Demonstrates
Ability to build ML data infrastructure where orchestration, derived data, model iteration, and service responsiveness are all part of the same system.

Focus

  • ML data infrastructure
  • fraud analytics
  • model lifecycle
  • derived warehouse
  • ML platform modernization

Stack

  • AWS Step Functions
  • Redshift
  • PySpark
  • EMR
  • Kinesis Firehose
  • SageMaker
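
Orchestration layers like Step Functions express retry semantics declaratively: a bounded number of attempts with backoff between them. A minimal Python sketch of that retry behavior for a single pipeline step (the step and variable names are illustrative):

```python
import time

def run_with_retry(step, max_attempts: int = 3, backoff_s: float = 0.0):
    """Run one pipeline step with bounded retries and linear backoff,
    mirroring the shape of an orchestrator's Retry policy."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(backoff_s * attempt)

calls = {"n": 0}
def flaky_extract():
    # Simulates a source that fails twice before succeeding.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient source error")
    return "rows"

print(run_with_retry(flaky_extract))  # succeeds on the third attempt
```

Steps that are retried this way must be idempotent, which is why derived-data builds are usually written as overwrite-by-partition rather than append.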

Financial Data Systems

Post-Trade and Financial Data Systems

Financial data and post-trade systems spanning reference data, derivatives clearing migration, reporting infrastructure, cloud migration, and developer tooling.

Context
Earlier financial-technology work at Barclays spanned post-trade systems, reference data, derivatives workflows, client reporting, and platform modernization.
Problem
Financial data systems required accuracy, auditability, migration discipline, and user-facing reporting tools while supporting complex securities and derivatives workflows.
Constraints
Work had to fit regulated environments, on-prem infrastructure, SQL Server-backed systems, legacy integration points, data quality expectations, operational reporting needs, and production change-management practices.
Architecture
Built ETL pipelines for Enterprise Security Master data, supported derivatives clearing migration and post-trade technology, contributed to cloud migration and DevOps work from on-prem systems, and built self-service tooling including a SQL generator, visualization platform, and client reporting infrastructure.
Role
Contributed as an engineer across delivery, migration, automation, reporting, and tooling efforts, with earlier algo-trading internship work providing exposure to market-facing systems.
Outcome
Delivered financial-data pipelines and workflow tools that improved reporting access, migration readiness, and operational support across post-trade and reference-data domains.
Demonstrates
Foundation in disciplined financial data engineering: ETL reliability, regulated workflows, reporting infrastructure, and practical tools for technical and business users.

Focus

  • financial data
  • post-trade systems
  • ETL
  • reporting infrastructure
  • migration

Stack

  • SQL Server
  • ETL pipelines
  • reporting infrastructure
  • SQL tooling
  • on-prem infrastructure
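
Self-service SQL tooling of the kind mentioned above lives or dies on identifier validation: user-chosen tables and columns must never be interpolated as free text. A minimal sketch of that guard (the function name and schema are hypothetical, not the original tool):

```python
import re

# Plain SQL identifiers only: letters, digits, underscores.
IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def build_select(table: str, columns: list[str], limit: int = 100) -> str:
    """Build a simple SELECT, refusing anything that is not a plain
    identifier so user input cannot smuggle in extra SQL."""
    for name in [table, *columns]:
        if not IDENT.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    return f"SELECT {', '.join(columns)} FROM {table} LIMIT {int(limit)}"

print(build_select("trades", ["trade_id", "notional"]))
# SELECT trade_id, notional FROM trades LIMIT 100
```

Values, as opposed to identifiers, would go through parameterized queries; the allow-list above covers the part parameters cannot.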

Contact

Continue the conversation

Email or LinkedIn are usually best for notes, context, or continuing a conversation. Calendly is there when a scheduled chat is easier.