Job Description
- Optimizing and monitoring the performance, reliability, and security of AI/ML systems.
- Applying DevSecOps best practices and standards for data quality, code quality, version control, CI/CD, and documentation.
- Developing, testing, deploying, and monitoring scalable and reliable machine learning pipelines using Databricks.
- Troubleshooting and resolving issues related to data, model, and infrastructure performance and availability.
- Researching and evaluating new technologies and frameworks for improving the efficiency and effectiveness of machine learning workflows.
Qualifications
Must Have
- Bachelor's degree or higher in computer science, mathematics, or a related field.
- Experience with DevOps, MLOps, or LLMOps tools and methodologies, such as Git, Docker, Kubernetes, GitHub Actions, Airflow, Kubeflow, and Terraform.
- Proficiency in Python, SQL, and at least one machine learning framework or library, such as TensorFlow, PyTorch, or scikit-learn.
- Experience working with APIs.
- Familiarity with developing on Databricks products and services.
- Working knowledge of CI/CD.
- Strong communication, documentation, and whiteboarding skills.
- Strong interest in collaboration, learning, and driving value through ML.
- Business-level proficiency in English (written and spoken).
Strongly Preferred
- Databricks certifications.
- Experience with Databricks, MLflow, Spark, Delta Lake, and other components of the Databricks Unified Data Analytics Platform.
- Working knowledge of at least one major cloud ecosystem (AWS, Azure, or GCP).
- Azure certifications.
- Knowledge of real-world LLM use cases.
- Experience working with geographically distributed, cross-functional teams, including systems integrators and third-party companies.
- Business-level proficiency in Japanese (written and spoken).
About this role
As an MLOps Engineer, you will automate and streamline the process of integrating and maintaining machine learning models for both traditional and generative AI.