ML-Ops: Operationalizing a Machine Learning Model, end to end

Challenges with deploying an ML model:

  1. CACE: Changing Anything Changes Everything
    An ML pipeline consists of many small components (data validation, feature selection, modeling, validation). Each component runs independently but depends on the output of the previous step. Because of this coupling, changing anything in one component changes the behavior of the whole pipeline. Ex. a change in hyperparameters, input data, or sampling method can change the model's performance and produce very different predictions.
  2. Data Drift:
    Data is at the heart of an ML model: it defines how the pipeline should behave, and a change in the data distribution is often one of the hardest things to notice. Because we live in a dynamic environment, the distribution of the data changes over time. Ex. if a new feature is added to the input data over time, we need to retrain our model to capture these new signals.
  3. Concept Drift:
    Concept drift is a change in the relationship between input and output data over time, usually because the data-generating process itself has changed. Ex. there were a lot of issues with credit card fraud detection models after the pandemic started, because normal human behavior changed.
  4. Model Degradation:
    Models are the product of code and data, both of which evolve over time. When the data shifts, the model's performance degrades (which is normal). The usual responses are either a full retrain, which takes a lot of time, or stacking a new model on top of the previous one to correct its output. Repeated stacking leads to redundant effort and increased cost.
  5. Data-Centric vs. Model-Centric approach:
    There are two approaches to improving ML performance: the data-centric approach (hold the model fixed and improve the quality of the data) and the model-centric approach (hold the data fixed and iteratively improve the model). There is a tradeoff between the two: we sometimes need complex deep learning models to pick up subtle signals from the data, but without good-quality data, even the most complex model will not be very effective. It's important to figure out the right balance between the two approaches.
  6. Intermediate Data Storage:
    When a big machine learning pipeline is created, each component depends on the output of previous components, so data has to be shared and stored in an intermediate space. It becomes important to structure the pipeline so that data is transferred efficiently across the components.
  7. Feedback loops:
    It's important to make sure the training dataset is not influenced by the output of the currently deployed model; otherwise, the model learns from its own biases and will probably amplify them. To keep the model accurate and up to date with its environment, we need to keep feeding it fresh examples.
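
Several of these challenges (data drift, concept drift, degradation) come down to noticing that live data no longer looks like the training data. As a minimal sketch, one common drift check is the Population Stability Index (PSI), shown here with the standard library only; the bin count and the 0.2 alert threshold are widespread rules of thumb rather than universal constants, and the toy distributions are made up for illustration.

```python
import math

def psi(expected, actual, bins=4):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    # internal bin edges derived from the reference (training) sample
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def frac(xs):
        counts = [0] * bins
        for x in xs:
            counts[sum(x > e for e in edges)] += 1
        # floor at a tiny value so empty bins don't blow up the log
        return [max(c / len(xs), 1e-6) for c in counts]

    return sum((a - e) * math.log(a / e)
               for e, a in zip(frac(expected), frac(actual)))

train = [float(i % 10) for i in range(100)]        # reference distribution
live = [float(i % 10) + 5.0 for i in range(100)]   # shifted live distribution

print(psi(train, train))       # identical data: PSI is 0.0, no drift
print(psi(train, live) > 0.2)  # shifted data: PSI above the usual alert level
```

A PSI below roughly 0.1 is usually read as "stable" and above 0.2 as drift worth investigating; in practice the same check is run per feature and on the model's output scores.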

Best Practices:

  1. Modular Design:
    Building a machine learning pipeline from modular components makes it easy to test and reproduce. These components can exist as libraries used to build fragments of the ML pipeline. Ex. a component for loading data, one for feature selection and engineering, one for training models, and one for validation. *Transfer Learning* helps carry what one model version has learned over to the next.
  2. Continuous Experimentation, Learning, and Deployment:
    ML pipelines operate in a dynamic environment, so data and models continuously change over time. An ML pipeline should adapt to continuous changes and upgrades to maintain the accuracy of its predictions. As mentioned above, even a small change such as the addition of a feature to the input dataset can break the whole pipeline and require retraining. ML pipelines are best automated: when the pipeline detects changes, a trigger schedules automatic retraining. Pipelines also need repeated experimentation (ex. A/B testing) and the creation of models from scratch or retraining on new data to stay relevant in a dynamic environment.
  3. Monitoring Performance:
    To detect changes in the behavior of an ML pipeline, we need to monitor the model's performance along with other properties such as bias and fairness. ML pipelines always carry the risk of degradation and drift. "Statistical Process Control" can be used to detect changes and deviations in performance. Humans are very good at inferring from visuals and absorbing summarized information, so clear visual summaries are an important part of performance monitoring.
  4. Version Control:
    Version control has been an important pillar of the evolution of software engineering, and it is similarly important and effective for tracking changes in an ML pipeline over time. It also makes switching between different versions of the model easy when required.
  5. Logging:
    Writing intermediate data files helps us compare and track the performance of machine learning experiments quantitatively. These can include multiple metric log files, which make it easy to track improvements and to detect data/concept drift.
  6. Human in the Loop:
    AI models cannot be fully trusted without proper checks. A human in the loop during the training and testing stages of the pipeline gives better results with more confidence. The combination of human and machine creates a more effective learning cycle; with proper interaction between humans and machines, the performance of the pipeline can be increased.
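
The "Statistical Process Control" idea from the monitoring practice above can be sketched with the classic 3-sigma control limits: a tracked metric that falls outside its historical mean plus or minus three standard deviations raises an alarm. The accuracy history below is invented for illustration; real monitoring would track many metrics over sliding windows.

```python
import math

def control_limits(history):
    """Return (lower, upper) 3-sigma control limits for a metric history."""
    mean = sum(history) / len(history)
    var = sum((x - mean) ** 2 for x in history) / len(history)
    sd = math.sqrt(var)
    return mean - 3 * sd, mean + 3 * sd

def out_of_control(history, new_value):
    """Flag a new observation that breaches the control limits."""
    lo, hi = control_limits(history)
    return not (lo <= new_value <= hi)

daily_accuracy = [0.91, 0.90, 0.92, 0.91, 0.90, 0.92, 0.91]
print(out_of_control(daily_accuracy, 0.91))  # normal day: False
print(out_of_control(daily_accuracy, 0.70))  # sharp drop: True, raise an alarm
```

In a production setting, an alarm like this is exactly the kind of signal that could feed the automated retraining trigger described in the continuous deployment practice.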

Existing tools for MLOps:

  1. TensorFlow Extended (TFX): An end-to-end platform for deploying production ML pipelines with TensorFlow.
  2. TorchServe: A flexible and easy-to-use tool for serving PyTorch models.
  3. AWS SageMaker: Prepare, build, train, and deploy high-quality machine learning (ML) models quickly.
  4. MLflow: An open-source platform for the machine learning lifecycle.
  5. Kubeflow: Makes deployments of ML workflows on Kubernetes simple, portable, and scalable.
  6. Cortex: Machine learning model serving infrastructure.
  7. Take your ML projects from POC to production with maximum efficiency and minimal risk.
  8. BentoML: An open-source platform for high-performance ML model serving.


References:

  1. Hidden Technical Debt in Machine Learning Systems


