New York, NY

Research Intern – Multimodal Foundation Model for Vision

Internship
Technology
Software Eng
October 27, 2025

Sony

Global electronics, entertainment & gaming conglomerate

Sony AI is seeking research interns. Our team conducts fundamental and applied research, with a focus on building next-generation foundation models for vision in a responsible manner. As a research intern, you will develop efficient and effective methodologies and prototype solutions. You will work with a productive team of world-class scientists and engineers to tackle the most challenging problems in foundation models and generative AI, including low-cost yet powerful vision foundation models (VFMs), vision-language models (VLMs), unified models, and automatic model compression, optimization, and deployment on cloud and edge. You will see your ideas not only published in papers but also improving the experience of billions of customers.

Roles and Responsibilities

Required Qualifications and Skills

Working Location

Location flexible (Tokyo, Europe, US)