What to Expect
Consider before submitting an application:
This position is expected to start around January 2026 and continue through the Winter/Spring term (ending approximately May 2026), with the possibility of continuing into Summer 2026 if an opportunity is available. We ask for a minimum of 12 weeks, full-time and on-site, for most internships. Our internship program is for students who are actively enrolled in an academic program. Recent graduates who are seeking employment and not returning to school should apply for full-time positions, not internships.
International Students: If your work authorization is through CPT, please consult your school on your ability to work 40 hours per week before applying. You must be able to work 40 hours per week on-site. Many students will be limited to part-time during the academic year.
What You’ll Do
- Work with the ML team and other data scientists to create highly scalable API services based on ML and statistical models
- Influence API design and implementation for model inference services, ensuring scalable, reliable, and efficient integration of machine learning models into production systems
- Find new ways to improve the in-house batch processing framework and workflow orchestration
- Work closely with teams across the organization to help them deploy end-to-end on existing infra and serve their models for real-time business use cases
- Learn and execute on best practices in ML modeling, handling, and usage in an enterprise setting
What You’ll Bring
- Currently pursuing a degree in Computer Science, Engineering or a related field of study and graduating in 2026
- Able to work on site in Fremont, CA
- Prior academic training in Software Engineering with a focus on building and scaling batch and real-time infra
- Strong knowledge of Python, Airflow and SQL and comfort with data wrangling
- Experience with production-level model deployments, such as pushing a Docker image behind an API to serve ML models
- Familiarity with ML models and techniques (e.g., Transformer/VLM/LLM, CNNs, residual networks, GANs, clustering, sequence models) and frameworks/libraries (e.g., TensorFlow, PyTorch, scikit-learn, JAX)
- Strong experience deploying machine learning models in production environments, particularly in NLP-, video-, and media-focused applications