#3 [MUST] Setting Up MLOps Pipeline
Having a thorough understanding of the foundations is essential before building the MLOps three-stage pipeline. If you haven’t already, I suggest reading the first article (the introduction) to catch up.
Additionally, if the Data Ingestion pipeline hasn’t been set up yet, see Article #2.1 for the tutorial.
If you haven’t built the model for the MLOps pipeline yet, please read Article #2.2 for the steps.
Lastly, if you haven’t tried running inference with your model, I recommend getting the gist of it with Article #2.3.
We are at the end of our MLOps 3-Stage pipeline project. So far, we have created separate components for each stage. In this article, we will combine them all into a single pipeline which, when executed, will perform all the operations as required.
Create a new Python file named main_pipeline.py. We will be following the folder structure below for our project:
.
|-- Animal_Data/
|   `-- images/
|       `-- animals10/
|           `-- raw-img/
|               `-- ~~ files as per animal classes ~~
|-- pipeline_files/
|   |-- data_ingestion.py
|   |-- model_development.py
|   `-- model_inference.py
|-- inference_samples/
|   `-- sample_img.png
|-- main_pipeline.py
|-- best_checkpoint.pth.tar
`-- best_model.pth
Don’t worry if your folder structure doesn’t look exactly like this. You can always create folders and move files into the right directories. I like it this way, since it keeps the files much better organized. Also, if your layout differs, adjust the file locations in the code samples accordingly.
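If you prefer to create the empty folders programmatically rather than by hand, a minimal sketch using the standard library's pathlib could look like this (the make_layout name is mine, for illustration; only the directories from the layout above are created, not the files):

```python
from pathlib import Path

def make_layout(base="."):
    """Create the empty project folders from the layout above."""
    folders = [
        "Animal_Data/images/animals10/raw-img",
        "pipeline_files",
        "inference_samples",
    ]
    for folder in folders:
        # parents=True builds the intermediate directories;
        # exist_ok=True silently skips folders that already exist
        Path(base, folder).mkdir(parents=True, exist_ok=True)
```

Run make_layout() once from the project root, then move your existing files into place.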
Importing all Functions
Firstly, in main_pipeline.py, we want to import all the previously developed functions under one roof. We can do this using “from folder_name.file import function”.
from pipeline_files.data_ingestion import data_ingestion
from pipeline_files.model_development import data_tranforms, model_training_complete
from pipeline_files.model_inference import model_infer
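If you want to check that these import paths resolve before the real stage code is in place, you can temporarily stub one of the modules. A hypothetical placeholder for pipeline_files/data_ingestion.py (the body here is mine, not from Article #2.1):

```python
# Hypothetical stub for pipeline_files/data_ingestion.py, useful to
# confirm the import wiring works before plugging in the real stage.
def data_ingestion():
    print("Data Ingestion stage: running...")
    return "ok"  # placeholder result so callers can confirm it ran
```

If the import still fails, make sure you run the command from the project root so that pipeline_files is on Python's module search path.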
After importing all the functions, we want a method that executes them but also helps in debugging in case an error occurs. A simple try-except clause will do.
Executing Pipelines
Before starting any stage, we will set a variable STAGE to the stage name. Then, if an exception occurs, we will print the STAGE along with the exception in the terminal. This helps with quick debugging.
For example, when we want to execute the data_ingestion stage, we can do it as:
STAGE = "Data Ingestion"
try:
    data_ingestion()
except Exception as e:
    print(f"Error in {STAGE} Stage: {e}")
This method is pretty basic, but it really helps a lot. We can do the same for the other stages:
STAGE = "Model Development"
try:
    data_tranforms()
    model_training_complete()
except Exception as e:
    print(f"Error in {STAGE} Stage: {e}")

STAGE = "Model Inference"
try:
    model_infer()
except Exception as e:
    print(f"Error in {STAGE} Stage: {e}")
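Since each stage repeats the same try/except shell, you could also fold the pattern into a small helper. A sketch of that idea (the run_stage name is my own, not from the earlier articles):

```python
# A small helper wrapping the repeated try/except pattern above.
def run_stage(stage_name, *stage_fns):
    """Run the functions of one pipeline stage, reporting any failure."""
    try:
        for fn in stage_fns:
            fn()
        return True  # the whole stage completed
    except Exception as e:
        print(f"Error in {stage_name} Stage: {e}")
        return False  # the stage failed; caller can decide whether to continue
```

With it, each stage becomes a single call, e.g. run_stage("Model Development", data_tranforms, model_training_complete).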
Running through CMD
Since our pipeline is ready now, we don’t want to run it from inside an editor or notebook. Instead, we will run it from the command line (CMD). To make the Python file runnable as a script, we use
if __name__ == "__main__":
    ...rest of the code here...
This will ensure that when running:
>> python main_pipeline.py
from CMD, the code inside this if block gets executed without any further steps.
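The guard works because Python sets the module-level variable __name__ to the string "__main__" only in the file being executed directly; in an imported module it holds the module’s own name instead. A tiny sketch of the check itself (the function name is mine, for illustration):

```python
def is_run_directly(module_name):
    # Python sets __name__ to "__main__" in the entry-point file,
    # and to the import name (e.g. "main_pipeline") everywhere else.
    return module_name == "__main__"

print(is_run_directly("__main__"))       # the `python main_pipeline.py` case
print(is_run_directly("main_pipeline"))  # the imported-module case
```

This is why importing main_pipeline from another file would not trigger the whole pipeline.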
Packing the Final Code
Here is what your final main_pipeline.py will look like after completion:
from pipeline_files.data_ingestion import data_ingestion
from pipeline_files.model_development import data_tranforms, model_training_complete
from pipeline_files.model_inference import model_infer

if __name__ == "__main__":
    STAGE = "Data Ingestion"
    try:
        data_ingestion()
    except Exception as e:
        print(f"Error in {STAGE} Stage: {e}")

    STAGE = "Model Development"
    try:
        data_tranforms()
        model_training_complete()
    except Exception as e:
        print(f"Error in {STAGE} Stage: {e}")

    STAGE = "Model Inference"
    try:
        model_infer()
    except Exception as e:
        print(f"Error in {STAGE} Stage: {e}")
Now open the command line in the project folder, activate the conda environment you created earlier (if you made one), and run the command shown above. It will start all the stages, execute each function, and provide you with the inference output.
Conclusion
So, this article marks the end of the two-week-long MLOps series project.
Here is the complete project if you want to look at it: Gurneet1928/AnimalVision-DirectML on GitHub, an image classification model built using PyTorch and the DirectML backend, with the secondary purpose of drawing attention to AMD + DirectML and benchmarking AMD GPUs. Note: the repo uses some additional techniques, so feel free to adapt it to your needs.
In this MLOps series, we have examined the crucial phases that turn machine learning models into scalable and dependable production systems. By thoroughly putting the MLOps 3-stage pipeline into practice, you can make sure your models operate consistently in real-world settings. Beyond this project, a full pipeline also encompasses continuous integration, continuous deployment, and strict monitoring.
The last of these, monitoring, emphasizes the significance of continuous watchfulness in controlling model drift, guaranteeing data quality, and preserving system integrity. Smooth MLOps adoption depends on each step of this pipeline, which enables machine learning models to scale efficiently.
The tactics we’ve covered will be crucial for anyone trying to stay ahead as the MLOps industry develops. The goal of this series is to provide best practices and practical insights that you can use on your own MLOps journey. Gaining proficiency in these phases will help you optimize your machine learning pipelines and ensure your models are reliable and future-proof.
Looking for a DirectML guide?
Want to build an MLOps pipeline?
Follow this playlist: MLOps Project Tutorial, curated by Gurneet Singh on Medium.
Alternative to LM Studio?
OLLAMA is here: OLLAMA — Your Local LLM Friend: Installation Tutorial, by Gurneet Singh on Medium.