What you can expect
As a Video AI Engineer, you’ll enhance video codecs, video generation, and real-time 3D reconstruction to improve video quality, immersion, and performance in Zoom products. You will work across our stack, developing software from the web server layer to the business application layer of our distributed, cloud-hosted backend. Working alongside leading experts in the field, you’ll deliver happiness to our users and grow your knowledge base every day.
‍
About the Team
Our global team, including members in China and Singapore, focuses on improving video quality, generative video capabilities, and real-time 3D reconstruction in Zoom products. As a Video AI Engineer, you’ll work with experts to advance video rendering, 3D reconstruction, and AI-powered communications while growing your skills.
‍
Responsibilities
- Building and developing video and generative video processing applications on both desktop and mobile systems
- Participating in research and performance evaluation of video processing, video generation, and 3D reconstruction algorithms
- Designing and developing algorithms in Zoom’s video and 3D reconstruction processing pipelines at both module and system levels
- Implementing video, neural rendering, and 3D Gaussian Splatting algorithms with modular, well-organized, and production-ready code
- Optimizing video, generative, and 3D reconstruction algorithms to achieve real-time performance on each target platform
- Customizing, integrating, and shipping deep learning models—including video generative models and 3D neural rendering models—across Mac, Windows, iOS, and Android
- Setting up test environments, developing test tools, and designing unit tests for runtime verification of video and 3D pipeline components
‍
What we’re looking for
- Hold a PhD or Master’s degree in Electrical Engineering, Computer Science, Applied Mathematics, or a related field
- Have experience with C/C++ or Objective-C, and Python, as well as talking avatar/head/portrait work (with released projects and top-conference papers)
- Have hands-on experience with video generation or video diffusion models, neural rendering techniques (e.g., NeRF, 3D Gaussian Splatting), and 3D reconstruction systems
- Have hands-on experience with machine learning techniques such as generative models, diffusion models, discriminative models, or transfer learning
- Have excellent oral, written, and interpersonal communication skills (in both Mandarin and English), as well as strong analytical and troubleshooting skills
- Have familiarity with multi-threaded programming and communication mechanisms
- Have an understanding of multimedia stream data processing flows, ideally including 3D scene or point cloud pipelines
- Must be fluent in Mandarin
‍