What you can expect
As an Audio AI Engineer, you will research and develop algorithms for accent conversion, voice conversion, speech synthesis, and speech recognition on low-latency streaming architectures. You'll prototype and refine end-to-end audio models that enhance intelligibility and naturalness while maintaining speaker identity. Working closely with product and platform teams, you'll help bring these models into real-time communication systems. You will also evaluate and optimize model performance across dimensions such as quality, latency, and scalability. Staying current with advances in speech processing, you'll contribute to innovation through patents and internal knowledge sharing.
About the Team
Zoom's Audio team develops AI-driven real-time audio features. The team is distributed worldwide, with members in the U.S., China, and Singapore.
Responsibilities
- Researching, designing, and developing algorithms for accent conversion, voice conversion, speech synthesis, and automatic speech recognition, focusing on low-latency streaming architectures.
- Prototyping end-to-end audio models that enhance intelligibility and naturalness while preserving speaker identity and expressiveness.
- Collaborating closely with product and platform teams to integrate models into real-time video and audio communication systems.
- Analyzing and optimizing model performance across speech quality, latency, robustness, and scalability dimensions.
- Staying current with the latest developments in speech processing research and contributing to the community through patents and internal knowledge sharing.
What we're looking for
- Hold a PhD or equivalent experience in a relevant field such as streaming speech processing, voice conversion, TTS, or ASR.
- Show proficiency in deep learning frameworks like PyTorch or TensorFlow.
- Demonstrate effective programming skills in Python, C/C++, or similar languages.
- Have an understanding of sequence modeling architectures (Transformers, RNNs, diffusion models, or conformers).
- Demonstrate experience developing and deploying low-latency, real-time speech or audio models with streaming architectures and optimized pipelines.
- Show familiarity with model compression and acceleration techniques, including quantization, pruning, and distillation.
- Exhibit experience working with real-time audio systems in networked communication environments.
- Have published in top-tier conferences such as ICASSP, INTERSPEECH, NeurIPS, or ICLR.
- Must be fluent in Mandarin.