
As an intern on this team, you will:
- Collaborate with engineers to analyze and optimize inference performance for language, vision, and multimodal models.
- Develop tools and utilities that help engineers understand bottlenecks across hardware platforms and workloads.
- Contribute to the design and evaluation of new techniques for efficient model serving and deployment.
- Learn about distributed systems, GPU computing, and large-scale ML workflows in a production environment.
- Work in an environment that values creativity, curiosity, and collaboration.

This internship will provide hands-on experience with the infrastructure behind Apple's AI technologies and offer insight into how research ideas are transformed into scalable, reliable systems.