
Key job responsibilities
Systems Engineers with AWS will dive deep into understanding the root cause of a customer issue, investigate why a metric is trending the wrong way and consult with senior engineers. We own our services and believe in making out-of-hours support as painless as possible. To achieve this, we implement Operational Excellence best practices and strive to automate manual processes.
Systems Engineers will possess and/or develop a broad range of skills. They utilise their Linux skills to troubleshoot, innovate fixes and workarounds, keep software up-to-date and provide data and metrics that help manage the capacity and efficiency of our services. They may draw on their networking knowledge to identify and troubleshoot network connectivity issues. They communicate clearly and collaborate with others to deliver results. They are self-starters, comfortable dealing with ambiguity and change. They are customer-obsessed, always looking to understand customer pain points and find resolutions quickly and completely.
A day in the life
You'll spend a majority of your time operating and improving one of our largest software systems. Over the course of a week, you will review the operational health of the services in your team's care and, on locating any anomaly, write up an actionable bug report. As a responsible engineer, you've learned never to make changes to production systems without a plan, so you will review and then execute changes to one of the production systems in your care, following a change management process. You will also help resolve your team's backlog of operational issues. You will round off the week by writing a cool script, shared with your team, that helps get to the root cause of a hard problem you diagnosed earlier more quickly.
You will occasionally be required to participate in an "on-call" rotation to resolve incidents occurring out-of-hours.
Basic qualifications
- 1+ years of experience with Linux, using the command line and basic administration, and computer networking fundamentals
- 1+ years of experience troubleshooting operations for highly scalable Linux systems running a large variety of back-end micro-service applications
- Knowledge of scripting in a language such as Bash, Python, or Ruby, with a focus on automation and solving systems issues
Preferred qualifications
- Bachelor's degree in Computer Science or other technical degree or related experience
- Proficiency in troubleshooting and anticipating problems that affect the performance, reliability, or availability of software systems
- 1+ years of operations experience working with CI/CD pipelines and deployment systems such as Terraform, GitHub Actions, Jenkins, or others
- Experience with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar)