Hi, my name is Harshit Soora
I'm a software engineer from College Park, Maryland — building scalable cloud, AI and HPC systems.
Let's work together.
I work across cloud-native infrastructure, AI inference platforms, HPC workflow tooling and full-stack web applications. I help ambitious teams ship reliable systems at scale; the more interesting the constraints, the better.
About my approach
Experience
A few places I've shipped work.
-
Software Engineer
Joining Google as a Software Engineer following my graduate studies at UMD. More to come.
-
Software Developer · Graduate Research Assistant
Architecting a Workflow Manager for Open OnDemand (700+ HPC sites), letting researchers visually design DAG-based computational pipelines with conditional branching and real-time job reporting. Leading development under NSF grant #2411376 with the Ohio Supercomputer Center and Texas A&M.
-
Software Developer Intern
Built an AI-powered SOAP notes platform for healthcare clients, where NLP-based auto-suggestions and real-time duplicate detection cut development time by ~40%. Architected an automated approval workflow on Azure with email orchestration and role-based routing, reaching 95% end-to-end code coverage and saving ~$100K in manual processing.
-
Software Engineer II — Cloud & AI
Architected AI inference infrastructure for Maxine Audio NIMs on Triton Inference Server and NVCF. Designed a bidirectional streaming SDK in C++ that cut model serving latency by ~40% and enabled real-time audio. Co-invented an AI-powered audio editor with prosody preservation that reaches ~20ms latency. Engineered GeForce NOW cross-platform stacks across Android, iOS, macOS and Linux, improving streaming QoS by ~30%. Built observability with Kafka, Prometheus and Kibana, enabling 50–60% faster debugging.
Open Source.
Side projects, research and experiments I've put out into the world. A few favourites below.
-
UniBench.
A universal evaluation framework for software engineering agents on issue resolution and test generation tasks. Proposes 4 novel metrics — null-resolution, edit-resilience, error-resolution@x and line-level fault localization — to measure fine-grained code reasoning beyond pass@k.
The pipeline runs CodeActAgent over 2,294 GitHub issue–PR pairs, comparing 15+ benchmarks (SWE-Bench, HumanEval, BigCodeBench, MBPP).
Get UniBench
-
Vulineage.
A web tool that reconstructs Docker image tag hierarchies and overlays vulnerability timelines, helping teams pick safer base images 3–4× faster than existing tools.
Built with interactive D3.js dashboards (icicle plots, force-directed graphs, time-series) on a Python backend hosted on AWS.
Open Vulineage
-
Carla-DDPG.
Bachelor's thesis on safe strategies for autonomous driving. Hierarchical program-controlled RL with maneuver-specific DDPG agents (lane-swap, obstacle avoidance) — 58.2% sample-efficiency improvement over flat DRL.
Validated on 10 NHTSA pre-crash scenarios in CARLA with 5 LTL safety specifications across 18,477 clauses.
Get Carla-DDPG
Publications
Things I've written.
Peer-reviewed work and pre-prints — drawn from my research at IIT Kharagpur and the University of Maryland.
-
From SWE-Bench to Universal Benchmarking: Evaluating Code-Acting LLM Agents.
Proposes a universal evaluation framework for code-acting LLM agents on issue resolution and test-generation tasks, introducing four novel metrics — null-resolution, edit-resilience, error-resolution@x and line-level fault localization — that measure fine-grained reasoning beyond pass@k. Pipeline runs CodeActAgent over 2,294 GitHub issue–PR pairs across 15+ benchmarks.
Read on arXiv
-
Hierarchical Program-Triggered Reinforcement Learning Agents For Automated Driving.
Hierarchical program-controlled reinforcement learning for autonomous driving — a structured controller invokes maneuver-specific DDPG agents (lane-swap, obstacle avoidance) to deliver a 58.2% sample-efficiency improvement over flat DRL. Validated on 10 NHTSA pre-crash scenarios in CARLA with 5 formally verified LTL safety specifications across 18,477 clauses.
Read on arXiv
Coursework & Projects
What I've studied.
Graduate and undergraduate coursework that's shaped how I build things — across AI, systems, and theory — alongside the class projects that came out of them.
- 01 Natural Language Processing
- 02 Reinforcement Learning for Robotics
- 03 Generative AI Agents (Agentic AI)
- 04 Information Visualization
- 05 Parallel Computing
- 06 Computer Vision
- 07 Algorithms
- 08 Software Engineering
- 09 Computer Networks
- 10 Database Management Systems
- 11 Operating Systems
- 12 Object-Oriented System Design
- 13 Distributed Systems & Cloud Computing
Class projects
Built along the way.
-
Vulineage — Docker Image Security Visualizer
Web-based security visualization that reconstructs Docker image tag hierarchies and overlays vulnerability timelines. Built interactive D3.js dashboards (icicle plots, force-directed graphs, time-series) on a Python backend hosted on AWS, enabling real-time querying of Docker tag lineage and CVE distributions.
-
Safe Strategies for Autonomous Driving (Carla-DDPG)
Hierarchical program-controlled RL with maneuver-specific DDPG agents (lane-swap, obstacle avoidance), achieving a 58.2% sample-efficiency improvement over flat DRL. Validated on 10 NHTSA pre-crash scenarios in CARLA with 5 formally verified LTL safety specifications across 18,477 clauses.
-
Ceph-Inspired Distributed Storage System
Architected a fault-tolerant distributed storage system with a Metadata Server (MDS), Monitor servers and primary–backup replication across four Object Storage Devices on AWS S3. Implemented a Cassandra-inspired consistency model with passive replication, heartbeat monitoring and a Dynamo-style gossip protocol, reaching 99.9% data availability with sub-2-second failure detection.
-
Secure File Transfer over Cloud
Engineered a secure cloud platform implementing hybrid encryption combining Elliptic Curve Cryptography with pseudo-randomly generated symmetric keys to prevent man-in-the-middle attacks. Custom N-part file sharding with layered encryption improved security 3–4% over existing approaches.
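As a rough illustration of the N-part sharding idea, here is a minimal Python sketch. It is not the project's code: the function names (`split_into_shards`, `build_manifest`, `reassemble`) and the manifest layout are hypothetical. The encryption layers are left out on purpose, since the exact scheme isn't described here; the per-shard random key is only a placeholder for where a symmetric key would sit.

```python
import hashlib
import secrets


def split_into_shards(data: bytes, n_parts: int) -> list[bytes]:
    """Split data into n_parts contiguous shards of near-equal size."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    base, extra = divmod(len(data), n_parts)
    shards, start = [], 0
    for i in range(n_parts):
        # The first `extra` shards absorb the remainder, one byte each.
        size = base + (1 if i < extra else 0)
        shards.append(data[start:start + size])
        start += size
    return shards


def build_manifest(shards: list[bytes]) -> list[dict]:
    """Record, per shard, its index, a fresh random key and a SHA-256 digest."""
    return [
        {
            "index": i,
            # Placeholder for a per-shard symmetric key (assumption);
            # no encryption is actually performed in this sketch.
            "key": secrets.token_bytes(32),
            "sha256": hashlib.sha256(shard).hexdigest(),
        }
        for i, shard in enumerate(shards)
    ]


def reassemble(shards: list[bytes], manifest: list[dict]) -> bytes:
    """Verify every shard against the manifest, then join them in index order."""
    for entry in manifest:
        digest = hashlib.sha256(shards[entry["index"]]).hexdigest()
        if digest != entry["sha256"]:
            raise ValueError(f"shard {entry['index']} failed its integrity check")
    return b"".join(shards[entry["index"]] for entry in manifest)
```

In a hybrid setup like the one described, each shard would be encrypted with its own symmetric key, and those keys would then be protected with the recipient's elliptic-curve public key. The digest check here only shows how a tampered shard could be rejected before reassembly.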