Role Description
- We are looking for a candidate to join a cross-functional SRE team.
- You should have cloud engineering experience and act as the SME on operations automation and monitoring: identifying toil in the team's existing systems and processes, then recommending and implementing automated solutions that reduce toil and improve the team's efficiency and effectiveness.
Your key responsibilities
- Work as part of an Agile team to define the target-state infrastructure architecture of applications from a reliability standpoint.
- Develop, improve, and maintain internal operations tools, such as deployment, monitoring, statistics, and platform management tools.
- Automate and optimize the application build and deployment process, and perform deployments to testing and production environments.
- Set up and manage CI/CD pipelines.
- Approach support proactively: seek root causes through in-depth analysis, and strive to reduce inefficiencies and manual effort.
Your skills and experience
Must have
- Good knowledge of GCP
- Hands-on experience defining and creating CUJs, SLOs, SLIs, and error budgets based on NFRs
- Strong knowledge of Infrastructure as Code (IaC): Terraform, GitHub, Docker images
- Strong hands-on experience with scripting and automation tools such as Bash, PowerShell, Python, and Ansible
- Good knowledge of container technologies such as Kubernetes
- Experience designing and implementing automated workflows
- Experience reducing toil in an SDLC or IT operations environment
- Good understanding of SCM and code-quality tools: Git, GitHub, SonarQube
- Fair understanding of ITSM processes
- Proactive and analytical mindset
Nice to have
- Fair understanding of build and release tools such as Maven, Ant, Gradle, Puppet, Jenkins, TeamCity, and uDeploy
- Knowledge of microservices
- Experience with a programming language such as Java or C#
- Understanding of CI/CD pipelines
- Understanding of the architecture and implementation of three-tier web applications