Why I Build AI Systems
I build AI systems that have to work outside notebooks: agentic retrieval stacks, evaluators for SWE
agents, cloud-native data platforms, and developer tooling with Docker-backed isolation. My strongest
work lives where language models, runtime evidence, and production constraints meet.
I use AI-assisted coding tools such as Codex and Claude Code as force multipliers, while staying hands-on
in architecture, implementation, debugging, and evaluation.
95% retrieval accuracy on agentic RAG
50% MRR lift from chunking and ranking tests
470K+ records processed in PySpark analytics pipelines
25K/min CDC throughput in containerized banking data pipeline
What I Build
AI Systems
Agentic RAG, LangGraph orchestration, prompt design, vector search, embedding evaluation, and LLM-driven
product features that need measurable behavior.
Data Platforms
PySpark pipelines, lakehouse patterns, Databricks, Snowflake, Kafka, Airflow, and cloud-native data
services across AWS, Azure, and GCP.
Developer Tooling
Preflight merge validation, Docker sandboxes, eval harnesses, runtime evidence collection, and
reproducible testing for AI-assisted engineering workflows.
Selected Stack
Python
SQL
LangGraph
LangChain
OpenAI API
AWS Bedrock
Vertex AI
Docker
Kubernetes
PySpark
Databricks
Snowflake
Kafka
Airflow
dbt
MLflow
Power BI
Tableau
Current North Star
Design AI workflows that are trustworthy under budget pressure: systems that can retrieve, reason,
validate, and stop themselves when the evidence says a path is weak.
Experience Snapshot
Graduate training at UIUC, production AI engineering in industry, and independent systems work focused on
evaluation, orchestration, and reliable execution.
Jul 2025 - Present
AI Engineering Intern | Data Science Research Services, UIUC
- Architected agentic RAG with LangGraph decision trees and Qwen3-32B, reaching 95% retrieval accuracy.
- Improved retrieval quality by 50% through A/B tests on chunking, hybrid search, and custom scoring.
- Built async ETL and ingestion pipelines across Azure, AWS Bedrock, MongoDB, and PostgreSQL vector stores.
- Created operational dashboards over 8,700+ events to translate system behavior into decisions.
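One of the ranking techniques commonly used for the hybrid-search experiments described above is reciprocal rank fusion, which merges a keyword ranking and a vector ranking into one list. The sketch below is illustrative only, not the production scoring logic; the constant k=60 comes from the original RRF formulation.

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse ranked result lists (e.g. BM25 + vector search) into one ranking.

    Each input is a list of document IDs ordered best-first. k dampens the
    influence of top ranks; 60 is the conventional default.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse a keyword ranking and a vector ranking over the same corpus.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "a", "d"]])
```

Documents ranked highly by both retrievers rise to the top even when neither retriever alone put them first.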
Apr 2023 - Jun 2024
Associate Data Scientist | Apptware Pvt. Ltd.
- Built PySpark and SQL pipelines over 470,000+ records and delivered decision-ready Power BI dashboards.
- Optimized low-latency inference pipelines on AWS Lambda, cutting speech-to-text (STT) latency by 80% for IVR workloads.
- Developed Pix2Pix and YOLO pipelines across healthcare and edge-device use cases.
- Deployed always-on data extraction and analytics services on AWS for 20+ enterprise clients.
Sep 2022 - Apr 2023
Data Science Intern | Apptware Pvt. Ltd.
- Built GPT-4 tool-calling workflows with SQL execution, using Chainlit and LangChain.
- Fine-tuned Falcon-7B and LLaMA with LoRA and QLoRA for applied NLP systems.
- Used OCR, topic models, and few-shot classification to structure unstructured enterprise documents.
Education
M.S. Information Science, UIUC (3.96 GPA)
B.E. Computer Engineering, SPPU (3.86 GPA)
Focus: distributed systems, AI design, and cloud architecture.
Current Focus
Building AI systems that can evaluate themselves under constraints: runtime evidence, early stopping,
trace quality, and budget-aware agent control.
Hands-On Engineering
I use Codex and Claude Code to move faster, but I still own the architecture, implementation, evaluation
setup, and systems tradeoffs directly.
RAG eval
Docker
AWS/GCP/Azure
Vector DBs
PySpark
Agentic systems
A/B testing
Kubernetes
Featured Project Architectures
A selection of systems where I combined AI components, data infrastructure, and explicit control over how
information moves through a workflow.
Featured architecture
Multi-Agent Insurance Processing System
- Designed a six-agent LangGraph workflow with OpenAI function calling and structured routing for insurance assistance.
- Separated supervisor logic, domain tools, RAG access, and observability to keep behavior auditable and reproducible.
- Used agent boundaries intentionally so the system could escalate, clarify, and constrain tool usage instead of hallucinating free-form behavior.
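The supervisor-plus-bounded-agents pattern described above can be sketched in plain Python (the real system uses LangGraph nodes and OpenAI function calling; the agent names and routing rule here are illustrative, not the project's actual node names):

```python
# Schematic supervisor routing: the supervisor inspects the request and
# dispatches to a bounded set of domain agents instead of free-form generation.

def claims_agent(state):
    state["response"] = f"claims: handling {state['request']}"
    return state

def policy_agent(state):
    state["response"] = f"policy: handling {state['request']}"
    return state

def clarify_agent(state):
    # Escalation path: ask the user rather than guess.
    state["response"] = "Could you clarify whether this concerns a claim or a policy?"
    return state

AGENTS = {"claims": claims_agent, "policy": policy_agent}

def supervisor(state):
    """Route to a known agent; anything unrecognized escalates to clarification."""
    request = state["request"].lower()
    for name, agent in AGENTS.items():
        if name in request:
            return agent(state)
    return clarify_agent(state)

result = supervisor({"request": "question about my claims status"})
```

The key design point is that routing is explicit and closed-world: an unmatched request lands in the clarification path instead of being answered speculatively.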
High-throughput data platform
Containerized Banking Data Architecture
- Built a CDC pipeline from PostgreSQL through Debezium, Kafka, MinIO/S3, Snowflake, dbt, and Airflow.
- Designed for continuous throughput of roughly 25,000 transactions and account updates per minute.
- Added Docker-based orchestration and GitHub Actions to keep local development and pipeline automation aligned.
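The Debezium leg of the pipeline above is driven by a Kafka Connect connector definition. The sketch below shows the general shape of such a registration payload; the hostnames, credentials, and table list are placeholders, not the project's actual configuration.

```python
import json

# Illustrative Debezium PostgreSQL source connector for Kafka Connect.
connector = {
    "name": "banking-postgres-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "********",
        "database.dbname": "banking",
        "topic.prefix": "banking",
        "table.include.list": "public.transactions,public.accounts",
        "plugin.name": "pgoutput",
    },
}

payload = json.dumps(connector)
# Registration is typically a POST of this payload to the Kafka Connect
# REST endpoint, e.g. http://localhost:8083/connectors.
```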
Current Systems Work
The projects below reflect the direction I want to keep pushing: reliable AI systems, better evaluation
signals, and developer tools that act on evidence instead of hype.
Azure + lakehouse + CV
Automated Insurance Claim Audit Pipeline
- Built an end-to-end Azure Databricks pipeline using Bronze -> Silver -> Gold layers under a constrained 4GB cluster budget.
- Used PySpark, Hive Metastore, MLflow, and an Xception-based CV model that reached 94% accuracy.
- Translated multimodal insurance inputs into business-ready decisions with explicit rule handling and model monitoring hooks.
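The Bronze -> Silver -> Gold flow above can be illustrated with plain Python records; the real pipeline expresses the same steps as PySpark DataFrame transforms, and the field names here are made up for the example.

```python
# Bronze: raw ingested claims, exactly as landed.
bronze = [
    {"claim_id": "c1", "amount": "1200.50", "status": "open"},
    {"claim_id": "c2", "amount": "not_a_number", "status": "open"},
    {"claim_id": "c3", "amount": "300.00", "status": "closed"},
]

def to_silver(rows):
    """Silver: validate and type-cast; drop records that fail parsing."""
    silver = []
    for row in rows:
        try:
            silver.append({**row, "amount": float(row["amount"])})
        except ValueError:
            continue  # in a real pipeline, quarantined for review
    return silver

def to_gold(rows):
    """Gold: aggregate validated records into business-ready metrics."""
    open_claims = [r for r in rows if r["status"] == "open"]
    return {
        "open_claim_count": len(open_claims),
        "open_claim_total": sum(r["amount"] for r in open_claims),
    }

gold = to_gold(to_silver(bronze))
```

Each layer narrows the data contract: Bronze preserves everything, Silver guarantees types, and Gold exposes only decision-ready aggregates.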
Solo developer | Docker-backed validation
Preflight
Building a CLI-first merge validation system that provisions isolated sandboxes, deploys candidate code
changes, runs targeted checks, collects runtime evidence, and emits structured merge recommendations.
The goal is to block unsafe merges with localizable, evidence-backed findings rather than
static-analysis guesswork.
preflight run --sandbox-backend docker \
  --enable-load-probe \
  --enable-failure-injection
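The check-to-verdict step described above might aggregate results roughly as follows. This is a sketch under assumed names; the types and verdict rule are hypothetical, not Preflight's actual internals.

```python
from dataclasses import dataclass, field

@dataclass
class CheckResult:
    name: str
    passed: bool
    evidence: str  # e.g. a log excerpt or probe measurement

@dataclass
class MergeRecommendation:
    verdict: str
    findings: list = field(default_factory=list)

def recommend(results):
    """Block the merge if any check failed; attach evidence for each failure."""
    failures = [r for r in results if not r.passed]
    if failures:
        return MergeRecommendation(
            verdict="block",
            findings=[f"{r.name}: {r.evidence}" for r in failures],
        )
    return MergeRecommendation(verdict="allow")

rec = recommend([
    CheckResult("load-probe", True, "p95 latency 120ms"),
    CheckResult("failure-injection", False, "retry loop never terminates"),
])
```

Keeping the evidence string attached to each finding is what makes a "block" verdict localizable instead of a bare failure flag.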
Research implementation | official-harness aware
TracePop / SWE-Agent Evaluation
Working on early termination policies for SWE-agent trajectories under bounded test-time budgets.
Exploring cheap read-only prefixes, early-signal tests, LLM-as-judge signals, and non-LLM models using
trajectory features such as file exploration breadth, reasoning depth, no-progress patterns, and
ideation behavior.
Stage I: prefix culling -> stable pool -> revision -> trace compression
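A budget-aware termination policy over trajectory features, as described above, can be sketched as a simple rule. The feature names and thresholds here are illustrative assumptions, not the actual policy under study.

```python
def should_terminate(steps_used, budget, no_progress_steps, files_explored):
    """Stop a trajectory when the evidence says the path is weak.

    - hard stop at the step budget;
    - stop early on a thrashing signature: several consecutive
      no-progress steps despite broad file exploration.
    """
    if steps_used >= budget:
        return True
    if no_progress_steps >= 5 and files_explored > 20:
        return True
    return False

# A thrashing trajectory is cut well before its budget is exhausted.
cut_early = should_terminate(steps_used=30, budget=100,
                             no_progress_steps=7, files_explored=25)
```

A learned policy would replace the hand-set thresholds with a model fit on trajectory features, but the decision interface stays the same.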