I’m a principal scientist at NVIDIA Research, where I work on deep generative models and multimodal large language models. I also publish papers, contribute deep learning models to the community, and serve in various roles at NeurIPS, ICLR, ICCV, and CVPR.
From 2020 to 2022, I was a research scientist at Google Research, where I co-created FILM interpolation, which powers user-facing features in Google Photos and Pixel and was featured at Google I/O, and TryOnDiffusion, which launched on Google Search and was featured in 23 of Google’s biggest moments in 2023.
Previously, I was a research scientist at NVIDIA Applied Deep Learning Research, led by Bryan Catanzaro. I co-founded and led the real-time graphics technology that exploits the temporal domain to increase GPU performance in video games, which culminated in the creation of DLSS 3.0. I’ve also published on topics spanning semantic segmentation, video prediction, image inpainting, and frame interpolation; this work was featured in Fortune, Forbes, and Fast Company, and has been integrated into NVIDIA’s NGX technology.
Prior to that, I was the lead inventor and researcher on the team that launched Siemens FastSpine.v2, a technology that automatically traces, detects, and numbers the human spine in 3D CT/MRI images.
I received my PhD in Electrical Engineering from Vanderbilt University, where my thesis focused on image processing algorithms for image-guided surgery. I also hold an MS degree in Computer Vision and Robotics from Heriot-Watt University.
PhD in Electrical Engineering, 2014
Vanderbilt University
MS in Computer Vision and Robotics, 2009
Heriot-Watt University