As an Engineer on the Data Lake Team, you will own and evolve the technical vision for our large-scale data processing systems. You will design, implement, and optimize mission-critical data pipelines and storage solutions, leveraging technologies like EMR, Spark, and Flink. You’ll ensure scalability, performance, and reliability while mentoring team members and driving technical excellence.
Please note: this role is based in Boston, MA and requires a hybrid, in-office component.
Team Tech Stack:
- Python (Node or Java)
- Apache Spark, Apache Flink
- Airflow
- Kafka, Apache Pulsar
- MySQL
- Kubernetes
- AWS (including EMR, S3, Redshift)
How You'll Make a Difference:
- Implement scalable, fault-tolerant data pipelines using distributed processing frameworks like Apache Spark and Flink on AWS EMR, optimizing for throughput and latency.
- Design batch and real-time, event-driven data workflows to process billions of data points daily, leveraging streaming technologies like Kafka and Flink (see the illustrative sketch after this list).
- Optimize distributed compute clusters and storage systems (e.g., S3, HDFS) to handle petabyte-scale datasets efficiently, ensuring performance, resource efficiency, and cost-effectiveness.
- Develop robust failure recovery mechanisms, including checkpointing, replication, and automated failover, to ensure high availability in distributed environments.
- Collaborate with cross-functional teams to deliver actionable datasets that power analytics and AI capabilities.
- Implement data governance policies and security measures to maintain data quality and compliance.
- Own the technical direction of highly visible data systems, improving monitoring, failure recovery, and performance.
- Mentor engineers, review technical documentation, and articulate phased approaches to achieving the team’s technical vision.
- Contribute to the evolution of internal data processing tools and frameworks, enhancing their scalability and usability.
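As a rough illustration of the streaming and failure-recovery work described above, here is a minimal PySpark Structured Streaming sketch: it reads events from Kafka, aggregates them in event-time windows, and writes results to S3 with a checkpoint location so the query can recover after failures. The broker address, topic name, bucket paths, and event schema are all hypothetical, and the job assumes the Spark Kafka connector package is available on the cluster (e.g., on EMR).

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import (
    DoubleType, StringType, StructField, StructType, TimestampType,
)

spark = SparkSession.builder.appName("event-pipeline-sketch").getOrCreate()

# Hypothetical event schema; field names are placeholders.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("value", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read a stream from Kafka (broker address and topic name are assumptions).
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Windowed aggregation with a watermark to bound state and handle late data.
counts = (
    events
    .withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"), col("event_type"))
    .count()
)

# Write to S3; the checkpoint location lets Spark resume from committed
# offsets and state after a failure (bucket paths are placeholders).
query = (
    counts.writeStream
    .format("parquet")
    .option("path", "s3a://example-bucket/aggregates/")
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/aggregates/")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```

A production job on EMR would add schema enforcement, metrics, and tuned cluster settings on top of this, but the checkpoint location is the piece that provides the restart-after-failure behavior called out above.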
Must Have:
- 4+ years of experience in software development, with at least 2 years focused on data engineering and distributed systems.
- Hands-on experience with Python and SQL, including backend development.
- Experience with distributed data processing frameworks such as Apache Spark and Flink.
- Proven track record of designing and implementing scalable ETL/ELT pipelines, ideally using AWS services like EMR.
- Strong knowledge of cloud platforms, particularly AWS (e.g., EMR, S3, Redshift), and optimizing data workflows in the cloud.
- Experience with data pipeline orchestration tools like Airflow.
- Familiarity with real-time data streaming technologies such as Kafka or Pulsar.
- Understanding of data modeling, database design, and data governance best practices.
- Excellent problem-solving skills and the ability to thrive in a fast-paced, collaborative environment.
- Strong communication skills with experience mentoring or leading engineering teams.
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
- You’ve already experimented with AI in work or personal projects, and you’re eager to responsibly explore new AI tools and workflows that make your work smarter and more efficient.