CS + AI/ML student

Devaansh Pathak

AI/ML and systems builder interested in reliable software, thoughtful evaluation, and practical research tools.

I work across LLM agents, reinforcement learning environments, evaluation systems, AI infrastructure, and applied engineering. This site collects my projects, writing, publications, and research notes as they develop.

/DevaanshPathak /in/devaanshpa Email CV

Profile

Research-minded engineering

I like problems where models, tools, data, and systems meet, especially when behavior needs to be measured carefully rather than only demoed.

I use this space as a working record of what I am building and learning: research prototypes, software projects, implementation notes, and longer-form writeups. The common thread is a preference for systems that can be inspected, tested, and improved over time.

Interests

Technical interests

A few areas I keep returning to while building projects and reading research.

Reliable LLM systems

Reinforcement learning environments

Evaluation pipelines and benchmarks

AI infrastructure and tooling

Full-stack product engineering

Failure analysis and debugging

Current research thread

SRE-Zero

An environment-grounded benchmark for evaluating the reliability of tool-using agents in simulated incident-response workflows. The project focuses on sequential decisions, safe tool use, partial evidence, remediation quality, and operational reliability metrics.

LLM Agents, RL Environments, Evaluation, AI Systems

Writing