We are hiring a Data Engineer to help us make data-driven decisions. Data Engineering is at the heart of Lyft’s products and decision-making. As a Data Engineer at Lyft, you will develop robust data infrastructure, encompassing data transport, collection, and storage, and provide services that enable our leadership to make informed, risk-reducing decisions.
You will help architect, build, and launch scalable data pipelines to support Lyft’s growing data processing and analytics needs. Your work will provide access to business and user behavior insights, using huge amounts of Lyft data to fuel teams such as Analytics, Data Science, Engineering, and many others.
Our technology stack is built on modern technologies such as AWS, Kubernetes, and Apache Airflow. You will work with incredibly passionate and talented colleagues from software engineering, machine learning, and data science on projects that delight millions of passengers and drivers.
Responsibilities:
- Own the core data pipelines in mapping and scale up the data processing flow to meet the rapid data growth at Lyft
- Evolve the data model and data schema based on business and engineering needs
- Implement systems tracking data quality and consistency
- Develop tools supporting self-service data pipeline management (ETL)
- Tune SQL and MapReduce jobs to improve data processing performance
- Write well-crafted, well-tested, readable, maintainable code
- Participate in code reviews to ensure code quality and distribute knowledge
- Participate in on-call rotations to ensure high availability and reliability of workflows and data
- Unblock, support and communicate with internal & external partners to achieve results
Experience:
- Bachelor's degree in Computer Science, Engineering, Mathematics, Statistics, or a related field
- 2+ years of relevant professional experience
- Strong experience with Spark
- Experience with the Hadoop (or similar) ecosystem, S3, DynamoDB, MapReduce, YARN, HDFS, Hive, Spark, Presto, Pig, HBase, Parquet
- Strong skills in a scripting language (Python, Ruby, Bash)
- Good understanding of SQL engines and the ability to conduct advanced performance tuning
- Proficient in at least one SQL dialect (MySQL, PostgreSQL, SQL Server, Oracle)
- Experience with workflow management tools (Airflow, Oozie, Azkaban, UC4)
- Comfortable working directly with data and business partners to bridge Lyft’s business goals with data engineering