|Full Name||Vaibhav Mathur|
|Date of Birth||11th July 1995|
- Jan 2021 - Dec 2022
Master's in Computer Engineering
New York University
- Jun 2013 - Apr 2017
Bachelor's in Information Technology
Netaji Subhas Institute of Technology (University of Delhi)
- June 2021 - Present
Graduate Research Assistant
CILVR Lab, New York University, New York City, US
- Exploring Reinforcement Learning (RL) and Self-Supervised Learning (SSL) methods for Robot Learning under Prof. Lerrel Pinto.
- Developed a novel Inverse RL method, Regularised Optimal Transport, that adaptively combines an Imitation Learning prior with RL exploration. It learns many robotics tasks from a single expert demo and reaches a 90% success rate 7.8x faster than existing baselines.
- Created a framework for Vision-based learning using RL, sim2real and Domain Randomisation methods to adapt a policy trained in Simulation to the real world for tasks such as Pick and Place and Peg Insertion.
- Feb 2019 - Nov 2020
Software Development Engineer
Zomato, Gurgaon, India
- Built Zomato’s in-house Distributed and Highly Scalable Monitoring and Alerting Platform. The platform reduced incident response time by 80% and allowed for Preemptive Incident Detection and Alerting.
- Created Dynamic Inter-service Discovery and Communication for microservices using Envoy and a Golang microservice for AWS ECS container discovery, enabling migration to a microservice-based architecture.
- Set up Zomato’s Distributed Relational Database. It distributed query load and reduced query failure rates by 87%.
- Sept 2013 - Feb 2019
Hong Kong Shanghai Bank (HSBC), Pune, India
- Led team’s service migration to HSBC’s Internal Cloud Platform.
- Created the Microservice Deployment Service in Spring Boot as an internal CI/CD tool. Reduced service deployment incidents by 53% and decreased the service deployment time to <1 min.
- Browser-based Simple Collaborative Text Editor.
- Implemented it using a Replicated Growable Array, a type of Conflict-Free Replicated Data Type (CRDT), to achieve eventual consistency even when the user is not connected to the internet.
- Transferring an RL policy learnt in simulation to a real Xarm7 robot.
- Used an Asymmetric Actor-Critic model to learn the policy in MuJoCo simulation.
- Used Domain Randomisation as a means of Domain Adaptation to the real world.
- Used Hindsight Experience Replay (HER) to allow more efficient learning in a Sparse Reward setting.
- Created a Distributed Training Pipeline for learning Self-Supervised embeddings from images for various downstream tasks using PyTorch Distributed Data-Parallel.
- A P2P platform to generate, share, and contribute to ML Datasets and to train Deep Learning models.
- Implemented a Distributed Hash Table for Node Discovery and a Multi-tracker approach using Raft to create a fault-tolerant archive of datasets. Used All-Reduce to run Synchronous Stochastic Gradient Descent for distributed training.
- Implemented Multi-Agent Reinforcement Learning (MARL) using 2 Independent DQNs with information sharing to master the game of Knights Archers Zombies.
- Hobbies: Playing and watching (and agonizing over) Soccer, solving Rubik's Cubes, Hiking.