Sushant Shambharkar

Building AI Infrastructure, Data Platforms & Production ML Systems

10+ years building distributed systems, real-time data platforms, machine learning infrastructure, and AI applications across finance, cloud, analytics, and AdTech.

From streaming pipelines and feature stores to RAG platforms and multi-agent systems — I focus on systems that survive production realities: scale, latency, reliability, observability, and business impact.

Google: Senior Software Engineer
Goldman Sachs: Senior Software Engineer
Angel One: Tech Lead, AI Labs
Amagi: Tech Lead, AdTech
Slintel: Lead Software Engineer
Persistent: Software Engineer

Career & Impact

10+

Years Building Production Systems

IIT Bombay

M.Tech Computer Science

99.51

GATE Percentile

30+

Engineers Mentored

100+

Technical Interviews

Business Outcomes

35%

Improvement in customer engagement

Angel One

60%

Reduction in advisor response latency

Angel One

30%

Increase in ad CTR

Amagi

20%

Increase in viewer retention

Amagi

200%

Increase in dashboard creation efficiency

Slintel

3x

Faster dashboard performance

Slintel

25%

Reduction in production failures

Google

10x

Improvement in processing performance

Persistent

About

I build systems at the intersection of data engineering and AI engineering. Over the last decade I have worked on streaming platforms, ML infrastructure, real-time decision systems, large-scale NLP systems, retrieval architectures, and production AI applications.

I care less about demos and benchmarks and more about reliability, observability, scalability, latency, and measurable business outcomes. The hardest problems in production AI are data problems — quality, drift, lineage, freshness. I treat AI pipelines like data pipelines: schema-enforced, tested, monitored.

Currently I lead AI engineering at Angel One, building agentic platforms and ML infrastructure that process millions of decisions daily. Previously I built B2B intent intelligence at Slintel, real-time ad optimization at Amagi, experimentation infrastructure at Google Cloud, and NLP platforms at Goldman Sachs.

Core Expertise

AI Systems

  • RAG Pipelines
  • Agentic AI
  • LLMOps
  • AI Evaluation
  • Prompt Engineering

Data Platforms

  • Apache Kafka
  • Apache Flink
  • Delta Lake
  • Snowflake
  • Databricks

Cloud & Infrastructure

  • Kubernetes
  • Docker
  • Terraform
  • AWS
  • GCP

Engineering

  • Python
  • FastAPI
  • Distributed Systems
  • System Design
  • MLOps

How I Think About Engineering

Reliability Before Novelty

Production systems must survive node failures, data skew, latency spikes, and bad deployments before they earn the right to use the latest model architecture.
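
One small building block behind that kind of resilience is retrying transient failures with exponential backoff and jitter. The sketch below is illustrative only: `call_with_retries` and its parameters are hypothetical names, and a production version would usually retry only on specific exception types and enforce an overall deadline.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay_s: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponential backoff and full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Full jitter: wait a random time up to the exponential cap so that
            # many clients retrying at once do not hit the dependency in lockstep.
            sleep(random.uniform(0, base_delay_s * 2 ** (attempt - 1)))
```

The `sleep` parameter is injectable so the backoff schedule can be tested without actually waiting.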

AI Systems Are Data Systems

The hardest problems in production AI are data problems — quality, drift, lineage, freshness. Treat AI pipelines like data pipelines: schema-enforced, tested, monitored.
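
As a rough illustration of "schema-enforced, tested, monitored", here is a minimal sketch of a typed feature row with range and freshness checks. The `FeatureRow` fields, the five-minute freshness window, and the `validate` helper are assumptions made for the example, not a description of any specific platform.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FeatureRow:
    """Hypothetical schema for one row entering a model pipeline."""
    user_id: str
    click_rate: float
    event_time: datetime


def validate(
    row: FeatureRow,
    now: datetime,
    max_age: timedelta = timedelta(minutes=5),
) -> list[str]:
    """Return a list of violations; an empty list means the row passes."""
    errors = []
    if not row.user_id:
        errors.append("user_id is empty")
    # The chained comparison is also False for NaN, so NaN is rejected here.
    if not 0.0 <= row.click_rate <= 1.0:
        errors.append(f"click_rate out of range: {row.click_rate}")
    if now - row.event_time > max_age:
        errors.append("row is stale")
    return errors


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    row = FeatureRow("u1", 0.2, now - timedelta(minutes=1))
    print(validate(row, now))
```

Counting violations per rule over time gives a cheap quality and freshness signal that can feed the same dashboards as the rest of the system.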

Latency Is A Feature

Every millisecond added to an AI pipeline is a tax on user experience. Design for sub-50ms p99 inference from day one. Caching, batching, and quantization are not afterthoughts.
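
Caching is often the cheapest of those three levers. Below is a minimal sketch of an in-process cache with a time-to-live, assuming that slightly stale predictions are acceptable for the use case. `TTLCache` is a hypothetical helper written for this example; it has no size bound or eviction, which a real deployment would need.

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_s seconds."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        # key -> (time stored, value)
        self._store: Dict[Hashable, Tuple[float, object]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], object]) -> object:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl_s:
            return hit[1]  # fresh entry: skip the expensive call
        value = compute()
        self._store[key] = (now, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_s=30)
    # The lambda stands in for a slow model call.
    score = cache.get_or_compute("user-42", lambda: 0.73)
    print(score)
```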

Observability First

If you cannot measure it, you cannot debug it. Every system I build ships with structured logging, distributed tracing, and real-time metrics dashboards before the first user hits it.
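
For the structured logging part, a minimal sketch using only the Python standard library might look like this. `JsonFormatter`, `timed`, and the `fields` attribute are names invented for the example; tracing and metrics would come from separate tooling.

```python
import json
import logging
import time
from contextlib import contextmanager


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # "fields" is a custom attribute attached through the `extra` argument.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


@contextmanager
def timed(logger: logging.Logger, operation: str, **fields):
    """Log how long the wrapped block took, together with caller-supplied fields."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info(
            operation,
            extra={"fields": {**fields, "elapsed_ms": round(elapsed_ms, 3)}},
        )


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    log = logging.getLogger("rag")
    log.setLevel(logging.INFO)
    log.addHandler(handler)
    with timed(log, "retrieve", request_id="r-1"):
        time.sleep(0.01)
```

Because every line is valid JSON with stable keys, the same logs can be filtered by `request_id` or aggregated by `elapsed_ms` without regex parsing.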

Measure Before Optimizing

Benchmarks and intuition are not substitutes for production measurements. Instrument, collect baselines, then optimize the bottlenecks the data reveals.
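
Collecting a baseline can start very small. The sketch below times a callable repeatedly and reports nearest-rank percentiles. Both helpers are hypothetical names for this example, and timings taken in a single local process are only a starting point, not a replacement for measurements under production traffic.

```python
import math
import time
from typing import Callable, List, Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def measure_ms(fn: Callable[[], object], runs: int = 200) -> List[float]:
    """Call fn `runs` times and return each call's wall-clock latency in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


if __name__ == "__main__":
    samples = measure_ms(lambda: sum(range(10_000)))
    print(f"p50={percentile(samples, 50):.3f} ms  p99={percentile(samples, 99):.3f} ms")
```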

Automation Over Repetition

Manual deployments, manual evaluation, manual rollbacks — every manual step is a risk. CI/CD for ML, automated guardrails, and self-healing infrastructure are table stakes.
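
An automated guardrail can be as simple as a promotion gate that a CI job runs after evaluation. The following is a sketch under assumed names and thresholds: `EvalResult`, the one-point accuracy tolerance, and the 50 ms latency budget are illustrative values, not settings from any real pipeline.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EvalResult:
    """Hypothetical summary of one evaluation run."""
    accuracy: float
    p99_latency_ms: float


def should_promote(
    candidate: EvalResult,
    baseline: EvalResult,
    max_accuracy_drop: float = 0.01,
    latency_budget_ms: float = 50.0,
) -> Tuple[bool, List[str]]:
    """Return (ok, reasons). The candidate is promoted only if every check passes."""
    reasons = []
    if candidate.accuracy < baseline.accuracy - max_accuracy_drop:
        reasons.append("accuracy regression")
    if candidate.p99_latency_ms > latency_budget_ms:
        reasons.append("p99 latency over budget")
    return (not reasons, reasons)


if __name__ == "__main__":
    ok, reasons = should_promote(
        candidate=EvalResult(accuracy=0.91, p99_latency_ms=42.0),
        baseline=EvalResult(accuracy=0.90, p99_latency_ms=45.0),
    )
    print(ok, reasons)
```

Returning the reasons alongside the decision makes a blocked deployment explain itself in the CI log.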

Case Studies

Production AI systems I designed, built, and shipped.

Real-Time Ad Decisioning Platform

AI-powered real-time ad optimization system for streaming media, processing millions of ad decisions per second.

Python, TensorFlow, Kubernetes, Redis

Context & Problem

Traditional ad insertion systems couldn't optimize for viewer engagement in real-time, leading to poor ad performance and viewer churn.

Approach & Architecture

Built an ML-powered decision engine that analyzes viewer behavior, content context, and advertiser goals to optimize ad placement in real-time.

Lessons Learned

  • Latency is everything in ad tech: decisions must be made in under 50 ms
  • A/B testing at scale requires careful statistical rigor
  • Feature engineering for real-time systems is fundamentally different from batch
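
On the statistical rigor point, one common check for a click-through experiment is a two-proportion z-test. The sketch below is a textbook version written for illustration, not the platform's actual analysis. It assumes independent impressions and a single look at the data, so repeated peeking or multiple variants would need corrections on top.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    clicks_a: int, views_a: int, clicks_b: int, views_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in click-through rate, B minus A."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both arms share one CTR.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if se == 0:
        raise ValueError("standard error is zero; CTR is 0 or 1 in both arms")
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(clicks_a=100, views_a=10_000, clicks_b=150, views_b=10_000)
    print(f"z={z:.2f}, p={p:.4f}")
```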

B2B Intent Intelligence Platform

NLP-powered system for analyzing customer behavior and generating actionable intelligence for B2B sales teams.

Python, Transformers, Elasticsearch, FastAPI

Context & Problem

Sales teams spent hours manually analyzing customer interactions to identify opportunities, missing critical signals in the noise.

Approach & Architecture

Developed a transformer-based system that automatically extracts intent signals, sentiment, and opportunity scores from customer communications.

Lessons Learned

  • Context window limitations require careful text chunking strategies
  • Domain-specific fine-tuning outperforms generic models by 30%+
  • Explainability is crucial for user trust in AI recommendations
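
To make the chunking lesson concrete, here is a minimal sketch of fixed-size chunks with overlap, so that a signal spanning a chunk boundary still appears whole in at least one chunk. It assumes whitespace tokens for simplicity; a real system would count tokens with the model's own tokenizer, and the sizes shown are arbitrary.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[str], chunk_size: int = 200, overlap: int = 40
) -> List[List[str]]:
    """Split tokens into chunks of at most chunk_size, each sharing `overlap` tokens with the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the input
    return chunks


if __name__ == "__main__":
    text = "the customer asked about pricing and wants a demo next week"
    for chunk in chunk_tokens(text.split(), chunk_size=5, overlap=2):
        print(" ".join(chunk))
```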

Leadership & Community

30+ Engineers Mentored

Guided junior and mid-level engineers across multiple organizations

100+ Technical Interviews

Conducted interviews for engineering roles across Google, Goldman Sachs, and startups

Speaker on AI & NLP

Frequent speaker and mentor on GenAI, NLP, and distributed ML systems

Open Source Contributor

Published LangGraph templates and built autonomous AI agent infrastructure

Open Source

OpenCode

Active

AI coding agent with MCP server architecture. Multi-agent orchestration, tool execution, and autonomous code generation.

Go, MCP, Agentic AI
GitHub

Dotfiles & Homelab

Active

Personal infrastructure-as-code: Kubernetes homelab, CI/CD pipelines, monitoring stack, and self-hosted AI workloads.

Kubernetes, Terraform, Ansible, Docker
GitHub

LangGraph Templates

Published

Open-source templates for enterprise-grade GenAI agents using LangGraph patterns.

Python, LangGraph, LLM
GitHub

Featured Writing

Jan 10, 2025 · 2 min read

Building Production RAG Systems: Lessons from the Trenches

A deep dive into building retrieval-augmented generation systems that actually work at scale, covering architecture decisions, embedding strategies, and production pitfalls.

#RAG #LLM #Production ML

Jan 2025 · 2 min read

Agentic AI: From Research to Production

How to take agentic AI systems from research prototypes to production deployments with proper reliability, monitoring, and observability.

Current Focus

Building

  • OpenCode AI coding agent with MCP server architecture
  • Multi-agent orchestration framework for production workflows
  • Personal AI knowledge base with RAG pipelines

Learning

  • Multi-agent orchestration frameworks
  • Advanced prompt engineering & LLM fine-tuning
  • Kubernetes operators & custom controllers

Reading

  • Building LLM Applications with Prompt Engineering
  • Designing Machine Learning Systems (Chip Huyen)
  • Reinforcement Learning: An Introduction

Activity

June 2026
OpenCode AI Agent · Jun 15
sushantdev.com v2, Living Platform · Jun 14
Building Production RAG Pipelines · Jun 10
Multi-Agent Orchestration · Jun 8
Survey of Agent Frameworks · Jun 5

Available for technical leadership

Looking for an AI infrastructure lead who can architect and deliver production systems?