What You'll Do:
Join the Inference team to ship production features that improve latency and reliability and reduce cost for model serving on our GPU platform. As an IC1, you'll implement well-scoped changes, learn our operational practices, and grow quickly with mentorship from experienced engineers.
About the Role:
- Implement well-scoped features and fixes in Python/Go/C++ for model-serving services (e.g., Triton, vLLM, TensorRT-LLM, Ray Serve).
- Write tests, code comments, and short design docs; participate in code reviews.
- Add basic metrics and dashboards; assist with alarms and runbooks.
- Follow on-call runbooks and learn incident response in a guided rotation.
- Contribute to performance experiments (e.g., request batching, concurrency, caching) with guidance.
Who You Are:
- BS/MS in CS, EE, or related field, or equivalent practical experience.
- Foundations in data structures, algorithms, and networked services.
- Experience with Python or Go (C++ a plus) and Linux fundamentals; Git/CI basics.
- Exposure to containers and Kubernetes (coursework or projects welcome).
- Curiosity about GPU inference concepts (micro-batching, KV cache, streaming).
Preferred:
- Internship or project that deployed a microservice or ML inference demo.
- Coursework/research with PyTorch or TensorFlow; simple CUDA projects a plus.
- Familiarity with Grafana/Prometheus/OpenTelemetry or similar tooling.