DevOps Engineer for Machine Learning Software Systems
at SenzMate in Sri Lanka, published on 27 Feb. 2024
SenzMate IoT Intelligence works to eliminate unequal access to high-quality, state-of-the-art IoT and AI technologies around the world. Our core mission has always been to invent and innovate with the most advanced technologies, helping our clients become enterprises of the future and leaders in their industries. For the last 7 years, the essence of our company and culture has been built by the incredible people of SenzMate, whose many humanitarian contributions reflect our values.
Job Overview
As a DevOps Engineer on our team, you will play a crucial role in scaling our production ML systems to high data volumes. You will provide robust infrastructure and tools that let our Machine Learning Engineers develop, evaluate, and deploy their solutions quickly and reliably.
Responsibilities:
- Identify or build tools that accelerate experimentation and evaluation for ML-heavy software systems.
- Establish model lifecycle management with tools such as MLflow.
- Implement and manage version control, continuous integration, and continuous deployment (CI/CD) systems for Machine Learning models and related software components.
- Implement and maintain monitoring, logging, and alerting solutions for our infrastructure and applications.
- Using software and systems knowledge, ensure that SenzMate’s applications are reliable, scalable, and efficient.
- Continuously optimize and automate infrastructure provisioning and deployment processes.
- Collaborate closely with the engineering team and operations to troubleshoot and resolve issues.
- Document and communicate MLOps processes, guidelines, and procedures to ensure consistency and knowledge sharing across the organization.
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 2+ years of experience in a DevOps engineering role.
- Proven experience in a B2B environment; previous MLOps experience with a large-scale system is a bonus.
- Experience working in Unix/Linux environments.
- In-depth knowledge of cloud platforms (AWS, Azure, or Google Cloud).
- Fluency in one or more programming languages, e.g. Go, Python, or Java (we use Python).
- Experience with containerization and orchestration tools (Docker, Kubernetes).
- Strong understanding of CI/CD concepts and tools (Jenkins, GitLab CI, CircleCI).
- Knowledge of infrastructure automation tools (Terraform, Ansible).
- Experience with monitoring and alerting tools (Datadog, Sentry).