DVC, MLflow, Airflow

Machine Learning Engineering and MLOps for Batch Scoring

Learn how to design and build end-to-end solutions for the banking, telecom and retail industries with DVC, MLflow and Airflow

What's included? 

  • end-to-end hands-on project
  • code review
  • office hours
  • 1-1 mentorship*
  • intensive program*

  Hands-On Project

From a prototype in Jupyter Notebook to fully automated pipelines and MLOps workflow. 

Integrate DVC, MLflow, Airflow and other open source tools.
Step by step.   

  Mentorship

Do you want to apply new knowledge right away? 
Bring your project and work on it in parallel with the course program! 

We are happy to provide you with 1-1 mentorship and guidance! 

Should you enroll? 

Engineers and Developers

This course is for you if you are about to integrate DVC, MLflow and Airflow but do not have enough experience with them, or if you need to quickly figure out how to integrate them and make them work together.

Data Scientists

Improve your engineering skills for automating ML experiments, documentation and report generation. Step up to developing production solutions and speed up your ML experiments.

Team Leads

Promote high standards in ML development and fast time to market. Implement good engineering practices and automation to break down silos, improve your team's collaboration, and make it easy to deploy and operationalize ML models at scale.

Goal: to build mature and scalable MLOps

End-to-end MLOps workflow from training to deployment and monitoring

Organize team work

Strengthen team collaboration with Git and good engineering practices for ML projects

Automate ML pipelines

Automate pipelines for data preparation, experiments and model evaluation with DVC

Manage experiments and models

Ensure reproducibility, versioning and discoverability of models

Set up CI/CD

Build pipelines for automated training, testing and deployment with GitLab

Deploy to Production

Develop production solutions and run batch scoring jobs on a schedule

Monitor models and data

Streamline monitoring of deployed models, tracking their performance and identifying potential issues

Program

Module 1: MLOps for batch-scoring projects (Banking, Telecom, Retail)

- Overview of batch scoring projects in banking, telecom and retail industries
- Design MLOps workflow for batch-scoring projects
- Standards, workflow, tools and requirements
- Overview of the course examples and final project

Module 2: Organize ML projects, code and dev environments


- Organize your code & team collaboration
- Reuse project templates 
- Manage development environments
- Learn good practices for Git collaboration

Module 3: Data and artifacts versioning with DVC


- Design data management requirements for your project
- Review data versioning tools
- Set up data versioning with DVC
- Build a Data Registry for training and validation datasets

Module 4: Automate ML experiments and training pipelines with DVC


- Design automation and reproducibility requirements 
- Review training pipelines automation tools 
- Automate ML experiment pipelines with DVC
- Manage models and artifacts versioning

Module 5: Experiments Management and Metrics Tracking with DVC and MLflow


- Define requirements for Experiments Management and Metrics Tracking
- Review experiments management and metrics tracking tools
- Set up experiments versioning with DVC
- Set up metrics tracking with MLflow

Module 6: Set up CI/CD with GitLab


- Define CI/CD tasks and requirements in ML projects
- Design a CI/CD pipeline
- Set up and trigger a CI/CD pipeline: build, test, deploy

Module 7: Continuous Training with GitLab, CML and Airflow


- Define the Continuous Training workflow and requirements
- Review tools: GitLab, CML, Airflow
- Automate the Continuous Training pipeline

Module 8: Model Registry & Model Life Cycle Management


- Define a Model Life Cycle: workflow and requirements
- Design Model Life Cycle with Data Registry
- Set up a Model Registry with MLflow
- Set up a Model Registry with DVC and GTO

Module 9: Model and ML pipelines Deployment with Airflow and DVC


- Define Model and ML pipelines deployment workflow and requirements
- Design deployment pipelines
- Deploy Airflow pipelines
- Deploy model with GitLab, DVC and Airflow
- Deploy model with Flask API and Docker

Module 10: Prediction Serving (batch scoring)


- Define Prediction Serving workflow and requirements
- Design Prediction Serving pipelines
- Prediction serving with Airflow
- Prediction serving with Flask API 

Module 11: Monitoring ML systems, model performance and data drifts


- Define Monitoring workflow and requirements
- Set up model and data monitoring with Airflow
- Set up system monitoring with Prometheus
- Design Grafana dashboards for ML projects
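
As an illustration of what a data drift check in Module 11 might compute, here is a minimal sketch of the Population Stability Index (PSI), a widely used drift statistic. PSI is chosen here only as an example and is not necessarily the metric used in the course; the function name `psi`, the equal-width binning and the `eps` floor are assumptions made for this sketch.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample.

    Bins are equal-width and derived from the reference sample (an assumption
    of this sketch); values outside that range fall into the edge bins.
    """
    lo, hi = min(expected), max(expected)
    width = ((hi - lo) / bins) or 1.0  # avoid zero width for constant data

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]        # e.g. a feature at training time
shifted = [0.5 + i / 200 for i in range(100)]    # the same feature, drifted upward
drift_score = psi(reference, shifted)
```

A scheduled monitoring job could compute such a score for each feature and publish it to a dashboard; identical samples give a score of zero, and larger values indicate a bigger shift.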

Course Program

1. Organize your project & code

An overview of approaches and technologies that help organize work on Machine Learning (ML) projects, code and teamwork. Set up a repository and review the requirements for team collaboration with Git and the toolkit for tracking tasks, hypotheses and changes in an ML project.

2. Manage environment dependencies with Python virtual environments and Docker

Get to grips with Docker and docker-compose. Set up a development environment for a Machine Learning project.

3. Version data and automate pipelines with DVC

Get started with versioning data, artifacts and models, and with pipeline automation, using Data Version Control (DVC). We automate the pipeline for training models and assessing their quality. After that, you can run ML experiments with a single command!
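
To give a feel for the idea behind reproducible pipelines, here is a minimal sketch of a stage that reruns only when its inputs change, which is the general principle that tools such as DVC build on. It uses only the Python standard library and does not reproduce DVC's actual lock file format or hashing; `run_stage`, `file_hash` and the JSON lock file are hypothetical names invented for this illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_stage(name, deps, outs, command, lock_path: Path) -> bool:
    """Run `command` only if a dependency changed or an output is missing.

    Returns True if the stage was executed, False if it was skipped.
    """
    lock = json.loads(lock_path.read_text()) if lock_path.exists() else {}
    current = {str(p): file_hash(p) for p in deps}
    outputs_exist = all(p.exists() for p in outs)
    if lock.get(name) == current and outputs_exist:
        return False  # nothing changed, reuse the existing outputs
    command()
    lock[name] = current
    lock_path.write_text(json.dumps(lock, indent=2))
    return True


def demo():
    """Run a toy 'train' stage three times and report which runs executed."""
    tmp = Path(tempfile.mkdtemp())
    data, model, lock = tmp / "data.csv", tmp / "model.txt", tmp / "stages.lock.json"
    data.write_text("x,y\n1,2\n")

    def train():
        model.write_text("trained on " + file_hash(data))

    first = run_stage("train", [data], [model], train, lock)
    second = run_stage("train", [data], [model], train, lock)  # inputs unchanged
    data.write_text("x,y\n3,4\n")                               # new data version
    third = run_stage("train", [data], [model], train, lock)
    return [first, second, third]
```

In this sketch the second call is skipped because neither the data nor the output changed, while editing the data file triggers a rerun.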

4. ML experiments management and metrics tracking with DVC and MLflow

Let's add MLflow to our project! Now we have a nice UI for tracking experiment metrics and parameters, comparing experiments, visualizing the results of GridSearch, etc. DVC and MLflow are used together to manage experiments and the model lifecycle.

5. Automate pipelines with Airflow

Let's get started with Airflow! What can you use it for? How do you create pipelines? How do you integrate it with DVC and MLflow? Airflow is often used in production to run models for batch scoring on a schedule. This is a good solution for generating forecasts in batch mode.
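
As a rough sketch of what a batch scoring step can look like, the function below reads a batch of records, applies a model and writes out predictions stamped with the run date. It uses only the standard library; in an Airflow deployment a function like this would typically be wrapped in a task and triggered by the DAG's schedule, which is left out here. The `predict_proba` rule and the column names `customer_id` and `late_payments` are hypothetical stand-ins, not part of the course project.

```python
import csv
import io
from datetime import date


def predict_proba(row: dict) -> float:
    """Hypothetical stand-in for a trained model: a fixed scoring rule."""
    score = 0.1 + 0.2 * float(row["late_payments"])
    return min(score, 1.0)


def score_batch(input_csv: str, run_date: date) -> str:
    """Score every row of a CSV batch and return the predictions as CSV text."""
    reader = csv.DictReader(io.StringIO(input_csv))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["customer_id", "score", "run_date"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow({
            "customer_id": row["customer_id"],
            "score": f"{predict_proba(row):.2f}",
            "run_date": run_date.isoformat(),  # lets each scheduled run be traced
        })
    return out.getvalue()


batch = "customer_id,late_payments\nA1,0\nB2,3\n"
predictions = score_batch(batch, date(2024, 1, 1))
```

Passing the run date in as a parameter, rather than reading the clock inside the function, keeps reruns of a past batch reproducible.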

6. Set up CI/CD and MLOps for your ML solution with DVC, GitLab, Airflow & MLflow

Set up an automated CI/CD process for our Machine Learning solutions (MLOps). We apply DVC, GitLab, Airflow and MLflow. We also add monitoring of our system using Grafana and Prometheus.

Meet the instructor

Mikhail Rozhkov

Machine Learning Engineer & MLOps consultant

Co-founder of the Machine Learning REPA project. Has 7+ years of hands-on experience in Machine Learning & Data Science, leads projects and helps teams implement good tools and engineering practices.
Helped 200+ engineers from 50+ companies to design MLOps processes and integrate open source tools. Collaborated with ML teams in the US, Europe and Asia, including Fortune 500 companies.

Packages

* We will refund your money within 14 days after the start of the course if you realize that it is not suitable for you!

Professional

self-paced, pay online
Contact us!
  • Online lessons & course materials access
  • Code Examples
  • Standard Course Project (tabular data, scoring)
  • Course Discussion Chat

Mentoring

guidance for your project
Contact us!
  • Online lessons & course materials access
  • Code Examples
  • Standard Course Project (tabular data, scoring)
  • Course Discussion Chat
  • Code Review
  • Weekly Office Hours Discussion Sessions 
  • Guided Course Project

Corporate

efficient teamwork and processes
Contact us!
  • Online lessons & course materials access
  • Code Examples
  • Standard Course Project (tabular data, scoring)
  • Course Discussion Chat
  • Guided Course Project
  • Code Review
  • Weekly Office Hours Discussion Sessions 
  • Custom program (on demand)
  • Custom dataset (on demand)
  • Additional tools & integrations (on demand)
  • Intensive program (on demand)

Do you need a group workshop?

If you are interested in group workshops on the course materials for your company, please contact us!