Description & Requirements

Overview

We are seeking skilled MLOps Engineers to build, deploy, and maintain production-scale ML pipelines. Responsibilities include platform upgrades, model deployment, model fine-tuning, and automation of the ML lifecycle. The role requires a good understanding of AI/ML frameworks, cloud platforms, and modern automation and monitoring tools, combined with a passion for learning new technology and solving complex problems.


Key Responsibilities & Skills

ML Pipeline Automation:
  • Design, develop, and implement CI/CD pipelines for machine learning models.
  • Automate model validation and deployment processes.
  • Automate data ingestion, preprocessing, and model retraining workflows to enable continuous improvement of ML systems.
  • Implement version control for data, models, and code.
  • Optimize models for latency, memory usage, and cost, collaborating with data scientists to meet production requirements.
  • Work with data engineers to optimize data pipelines for ML.
Infrastructure Management:
  • Manage and optimize infrastructure for machine learning workloads.
  • Implement containerization and orchestration technologies (Docker, Kubernetes).
Model Monitoring and Logging:
  • Implement monitoring systems to track model performance.
  • Detect and address model drift and anomalies.
  • Create dashboards and alerts for key performance indicators.
Collaboration and Communication:
  • Collaborate with data scientists, machine learning engineers, and DevOps engineers.
  • Communicate effectively with stakeholders about ML deployment and performance.
  • Document processes and best practices.
Governance and Compliance:
  • Implement controls to ensure compliance with relevant regulations and policies.
  • Collaborate with product managers, developers, and stakeholders to translate business requirements into technical architecture.
  • Document and present architecture, along with its pros and cons, to senior IT leadership.
  • Deploy systems to detect and address model drift and anomalies.
Collaboration & Innovation:
  • Work closely with cross-functional teams (e.g., data science, DevOps, security) to ensure seamless deployment and operation of solutions.
  • Identify opportunities to leverage emerging technologies to improve existing systems.
  • Contribute to proofs-of-concept (PoCs) and pilot projects to validate new ideas.

Qualifications

Experience & Technical Skills:
  • Strong programming skills in Linux shell scripting & Python.
  • Experience with machine learning libraries (e.g., PyTorch).
  • Experience with CI/CD tools (e.g., Jenkins, GitLab CI).
  • Experience with containerization and orchestration technologies (e.g., Kubernetes).
  • Experience with cloud platforms (AWS, Azure).
  • Understanding of basic machine learning concepts and workflows.
  • Strong understanding of Linux environments.
Soft Skills:
  • Exceptional problem-solving and analytical skills.
  • Strong communication skills to articulate complex technical concepts to non-technical stakeholders.
  • Ability to work collaboratively in a fast-paced, agile environment.
  • Ability to learn and adopt new technology.
Preferred Qualifications:
  • Experience with enterprise-scale operations and deployments in industries such as finance, healthcare, and life sciences.