General Information

Full Name Vaibhav Mathur
Date of Birth 11th July 1995
Languages English, Hindi


Education

  • Jan 2021 - Dec 2022
    Master's in Computer Engineering
    New York University
  • Jun 2013 - Apr 2017
    Bachelor's in Information Technology
    Netaji Subhas Institute of Technology (University of Delhi)


Experience

  • June 2021 - Present
    Graduate Research Assistant
    CILVR Lab, New York University, New York City, US
    • Exploring Reinforcement Learning (RL) and Self-Supervised Learning (SSL) methods for Robot Learning under Prof. Lerrel Pinto.
    • Developed Regularised Optimal Transport, a novel Inverse RL method that adaptively combines an Imitation Learning prior with RL exploration. It learns many robotics tasks from a single expert demo and reaches a 90% success rate 7.8x faster than existing baselines.
    • Created a framework for Vision-based learning using RL, Sim2Real and Domain Randomisation methods to adapt a policy trained in simulation to the real world for tasks such as Pick and Place and Peg Insertion.
  • Feb 2019 - Nov 2020
    Software Development Engineer
    Zomato, Gurgaon, India
    • Built Zomato’s in-house Distributed and Highly Scalable Monitoring and Alerting Platform. The platform reduced incident response time by 80% and allowed for Preemptive Incident Detection and Alerting.
    • Created Dynamic Inter-service Discovery and Communication for microservices using Envoy and a Golang microservice for AWS ECS container discovery, enabling migration to a microservice-based architecture.
    • Set up Zomato’s Distributed Relational Database. It distributed query load and reduced query failure rates by 87%.
  • Sept 2013 - Feb 2019
    Software Engineer
    Hong Kong Shanghai Bank (HSBC), Pune, India
    • Led the team’s service migration to HSBC’s Internal Cloud Platform.
    • Created the Microservice Deployment Service in Spring Boot, which acts as an internal CI/CD tool. It reduced service deployment incidents by 53% and cut service deployment time to under 1 minute.


Projects

  • Collab-Editor
    • Simple browser-based Collaborative Text Editor.
    • Implemented it using a Replicated Growable Array, a type of Conflict-Free Replicated Data Type (CRDT), to achieve eventual consistency even if the user is not connected to the internet.
  • Sim2Real
    • Transferring an RL policy learnt in simulation to a real xArm7 robot.
    • Used an Asymmetric Actor-Critic model to learn the policy in MuJoCo simulation.
    • Used Domain Randomisation as a means of Domain Adaptation to the real world.
    • Used Hindsight Experience Replay (HER) to allow more efficient learning in a Sparse Reward setting.
  • Distributed Training
    • Created a Distributed Training Pipeline using PyTorch Distributed Data-Parallel to learn Self-Supervised embeddings from images for various downstream tasks.
  • Hydra
    • A P2P platform to generate, share, and contribute to ML Datasets and to train Deep Learning models.
    • Implemented a Distributed Hash Table for Node Discovery and a Multi-tracker approach using Raft to create a fault-tolerant archive of datasets. Used All-Reduce to run Synchronous Stochastic Gradient Descent for distributed training.
  • MARL
    • Implemented Multi-Agent Reinforcement Learning (MARL) using two Independent DQNs with information sharing to master the game of Knights Archers Zombies.

Other Interests

  • Hobbies: Playing and watching (and agonizing over) soccer, solving Rubik's Cubes, hiking.