Manage your Machine Learning Lifecycle with MLflow  –  Part 1

Reproducibility, good management, and experiment tracking are necessary to make it easy to test other people’s work and analysis. In this first part we will use simple examples to learn how to record and query experiments, and how to package Machine Learning models so they are reproducible and can be run on any platform using MLflow.
By Favio Vazquez, Principal

 

The Machine Learning Lifecycle Conundrum

 

Machine Learning (ML) is not easy, but creating a good workflow that you can reproduce, revisit and deploy to production is even harder. There have been many advances towards creating a good platform or management solution for ML. Note that this is not the

MLflow is an open source platform for the complete machine learning lifecycle.

MLflow is designed to work with any ML library, algorithm, deployment tool or language. It is very easy to add MLflow to your existing ML code so you can benefit from it immediately, and to share code using any ML library that others in your organization can run. MLflow is also an open source project that users and library developers can extend.

Installing MLflow

Installing MLflow is very easy. You just have to run:

pip install mlflow


That is according to the creators, but I faced several issues while installing it. So here are my recommendations (if you can run mlflow in your terminal after installing, ignore this part):

From Databricks: MLflow cannot be installed on the MacOS system installation of Python. We recommend installing Python 3 through the Homebrew package manager using brew install python. (In this case, installing mlflow is now pip3 install mlflow).

That did not work for me and I got this error:

~ ❯ mlflow
Traceback (most recent call last):
  File "/usr/bin/mlflow", line 7, in <module>
    from mlflow.cli import cli
  File "/usr/lib/python3.6/site-packages/mlflow/__init__.py", line 8, in <module>
    import mlflow.projects as projects # noqa
  File "/usr/lib/python3.6/site-packages/mlflow/projects.py", line 18, in <module>
    from mlflow.entities.param import Param
  File "/usr/lib/python3.6/site-packages/mlflow/entities/param.py", line 2, in <module>
    from mlflow.protos.service_pb2 import Param as ProtoParam
  File "/usr/lib/python3.6/site-packages/mlflow/protos/service_pb2.py", line 127, in <module>
    options=None, file=DESCRIPTOR),
TypeError: __init__() got an unexpected keyword argument 'file'


Solving that was not very easy. I’m using macOS, by the way. To fix it I needed to update the protobuf library, which I did by installing Google’s protobuf library from source:

google/protobuf
protobuf - Protocol Buffers - Google's data interchange format

Download the 3.5.1 version. I had the 3.3.1 before. Follow these steps:

Installing protoc
API for protocol buffers using modern Haskell language and library patterns.

Or try using Homebrew.
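
Before rebuilding anything, it can help to check which versions are actually installed. The following is a minimal sketch that uses only the Python standard library (Python 3.8 or later). The helper names installed_version and version_tuple are made up for this illustration, and the 3.5.1 threshold is simply the version that worked for me above, not an official minimum.

```python
import itertools
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn a version such as '3.5.1' into (3, 5, 1), keeping leading digits only."""
    parts = []
    for piece in version.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


if __name__ == "__main__":
    for name in ("protobuf", "mlflow"):
        found = installed_version(name)
        print(name, found if found else "not installed")

    proto = installed_version("protobuf")
    # Assumption: 3.5.1 is the version that fixed the error for the author.
    if proto and version_tuple(proto) < (3, 5, 1):
        print("protobuf is older than 3.5.1; upgrading may fix the TypeError")
```

If protobuf shows up as older than 3.5.1, upgrading it is the first thing I would try.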

If your installation works, run

mlflow


and you should see this:

Usage: mlflow [OPTIONS] COMMAND [ARGS]...
Options:
  --version  Show the version and exit.
  --help     Show this message and exit.
Commands:
  azureml      Serve models on Azure ML.
  download     Downloads the artifact at the specified DBFS...
  experiments  Tracking APIs.
  pyfunc       Serve Python models locally.
  run          Run an MLflow project from the given URI.
  sagemaker    Serve models on SageMaker.
  sklearn      Serve SciKit-Learn models.
  ui           Run the MLflow tracking UI.


Quickstart with MLflow

Now that you have MLflow installed let’s run a simple example.

import os
from mlflow import log_metric, log_param, log_artifact

if __name__ == "__main__":
    # Log a parameter (key-value pair)
    log_param("param1", 5)

    # Log a metric; metrics can be updated throughout the run
    log_metric("foo", 1)
    log_metric("foo", 2)
    log_metric("foo", 3)

    # Log an artifact (output file)
    with open("output.txt", "w") as f:
        f.write("Hello world!")
    log_artifact("output.txt")


Save that to train.py and then run with

python train.py


You will see the following:

Running test.py


And that’s it? Nope. MLflow also gives you a UI that you can open by running:

mlflow ui


And you will see (localhost:5000 by default):

So what have we done so far? If you look at the code you’ll see we used three things: log_param, log_metric and log_artifact. The first logs the passed-in parameter under the current run, creating a run if necessary; the second logs the passed-in metric under the current run, also creating a run if necessary; and the last logs a local file or directory as an artifact of the currently active run.

So with this simple example we learned how to save the log of params, metrics and files in our lifecycle.

If we click on the date of the run, we can see more about it.

Now if we click the metric, we can see how it got updated through the run:

And if we click the artifact we can see a preview of it:

MLflow Tracking

The MLflow Tracking component lets you log and query experiments using either REST or Python.

Each run records the following information:

Code Version: Git commit used to execute the run, if it was executed from an MLflow Project.

Start & End Time: Start and end time of the run.

Source: Name of the file executed to launch the run, or the project name and entry point for the run if the run was executed from an MLflow Project.

Parameters: Key-value input parameters of your choice. Both keys and values are strings.

Metrics: Key-value metrics where the value is numeric. Each metric can be updated throughout the course of the run (for example, to track how your model’s loss function is converging), and MLflow will record and let you visualize the metric’s full history.

Artifacts: Output files in any format. For example, you can record images (for example, PNGs), models (for example, a pickled SciKit-Learn model) or even data files (for example, a Parquet file) as artifacts.
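
To make the list above concrete, here is a toy model of what a run holds. This is only a sketch for illustration: ToyRun is a hypothetical class, not MLflow’s real data structure or API. It mirrors the rules described above: parameters are stored as strings, and every metric update is appended so the full history is kept.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyRun:
    """Hypothetical stand-in for a tracked run; not MLflow's real implementation."""
    source: str                                     # file that launched the run
    start_time: float = field(default_factory=time.time)
    end_time: Optional[float] = None
    params: dict = field(default_factory=dict)      # key -> string value
    metrics: dict = field(default_factory=dict)     # key -> list of numeric values
    artifacts: list = field(default_factory=list)   # paths of output files

    def log_param(self, key, value):
        # Both keys and values are stored as strings.
        self.params[str(key)] = str(value)

    def log_metric(self, key, value):
        # Each update is appended, so the metric's history is preserved.
        self.metrics.setdefault(key, []).append(float(value))

    def log_artifact(self, path):
        self.artifacts.append(path)

    def end(self):
        self.end_time = time.time()


if __name__ == "__main__":
    run = ToyRun(source="train.py")
    run.log_param("param1", 5)
    for value in (1, 2, 3):
        run.log_metric("foo", value)
    run.log_artifact("output.txt")
    run.end()
    print(run.params)   # {'param1': '5'}
    print(run.metrics)  # {'foo': [1.0, 2.0, 3.0]}
```

The metric history list is what lets a UI plot how a value such as a loss evolved during the run.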

Runs can optionally be organized into experiments, which group together runs for a specific task. You can create an experiment via the mlflow experiments CLI, with mlflow.create_experiment(), or via the corresponding REST parameters.

# Prints "created an experiment with ID <id>"
mlflow experiments create face-detection
# Set the ID via environment variables
export MLFLOW_EXPERIMENT_ID=<id>


And then you just launch a run:

import mlflow

# Launch a run. The experiment ID is inferred from the MLFLOW_EXPERIMENT_ID environment variable
with mlflow.start_run():
    # log_param records an input parameter; log_metric records a numeric result
    mlflow.log_param("a", 1)
    mlflow.log_metric("b", 2)
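
If you prefer not to export the variable in your shell, one option is to set it only for the process that runs your script. The sketch below uses just the standard library; env_with_experiment and launch_script are hypothetical helper names, and the only thing taken from MLflow is the MLFLOW_EXPERIMENT_ID variable name shown above.

```python
import os
import subprocess
import sys


def env_with_experiment(experiment_id, base_env=None):
    """Return a copy of the environment with MLFLOW_EXPERIMENT_ID set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MLFLOW_EXPERIMENT_ID"] = str(experiment_id)
    return env


def launch_script(script, experiment_id):
    """Run `script` in a child Python process tied to the given experiment."""
    return subprocess.run(
        [sys.executable, script],
        env=env_with_experiment(experiment_id),
        check=True,  # raise if the script exits with an error
    )


# Example (assumes train.py exists and <id> is the ID printed by the CLI):
# launch_script("train.py", "<id>")
```

This keeps your shell environment untouched, which is handy when you switch between experiments often.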


Example of Tracking:

A simple example using the Wine Quality dataset: Two datasets are included, related to red and white vinho verde wine samples, from the north of Portugal. The goal is to model wine quality based on physicochemical tests.

First download this file:

https://raw.githubusercontent.com/databricks/mlflow/master/example/tutorial/wine-quality.csv

Then, in the same folder, create the file train.py with the following content:

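import sys

import numpy as np
import pandas as pd
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

import mlflow
import mlflow.sklearn


def eval_metrics(actual, pred):
    # Root mean squared error, mean absolute error and R2 score
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2


# Read the wine-quality csv file
data = pd.read_csv("wine-quality.csv")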

# Split the data into training and test sets. (0.75, 0.25) split.
train, test = train_test_split(data)

# The predicted column is "quality" which is a scalar from [3, 9]
train_x = train.drop(["quality"], axis=1)
test_x = test.drop(["quality"], axis=1)
train_y = train[["quality"]]
test_y = test[["quality"]]

alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5

with mlflow.start_run():
    lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
    lr.fit(train_x, train_y)

    predicted_qualities = lr.predict(test_x)

    (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

    print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
    print("  RMSE: %s" % rmse)
    print("  MAE: %s" % mae)
    print("  R2: %s" % r2)

    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)

    mlflow.sklearn.log_model(lr, "model")


Here we will test the MLflow integration for SciKit-Learn too. After running it you will see this in the terminal:

Elasticnet model (alpha=0.500000, l1_ratio=0.500000):
  RMSE: 0.82224284976
  MAE: 0.627876141016
  R2: 0.126787219728
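
The three numbers above are the root mean squared error, the mean absolute error and the R2 score. As a worked illustration of what they measure, here is a plain-Python version on a tiny made-up example. The function name eval_metrics_plain and the sample values are assumptions for this sketch only, and it assumes the actual values are not all identical (otherwise R2 is undefined).

```python
import math


def eval_metrics_plain(actual, predicted):
    """Compute (rmse, mae, r2) for two equal-length lists of numbers."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)                 # root of the mean squared error
    mae = sum(abs(e) for e in errors) / n        # mean absolute error

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot                   # share of variance explained
    return rmse, mae, r2


if __name__ == "__main__":
    # Made-up qualities: errors are -1, 0 and +1.
    rmse, mae, r2 = eval_metrics_plain([3, 5, 7], [2, 5, 8])
    print("RMSE: %.4f" % rmse)  # RMSE: 0.8165
    print("MAE: %.4f" % mae)    # MAE: 0.6667
    print("R2: %.4f" % r2)      # R2: 0.7500
```

Lower RMSE and MAE are better, while an R2 closer to 1 means the model explains more of the variation in quality.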


Then run mlflow ui from the working directory that contains the mlruns directory, and point your browser to http://localhost:5000. You will see:

You will have this for each run, so you can track everything you do. The logged model also has a pkl file and a YAML file for deployment, reproduction and sharing.
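
If you are curious about what was written to disk, you can list the files under the mlruns directory. This is a small sketch using only the standard library; list_tracked_files is a hypothetical helper name. It makes no assumption about the internal layout of mlruns, which may differ between MLflow versions, and simply walks the tree.

```python
from pathlib import Path


def list_tracked_files(root="mlruns"):
    """Return the files under `root` as sorted, POSIX-style relative paths."""
    base = Path(root)
    if not base.is_dir():
        # Nothing has been logged from this working directory yet.
        return []
    return sorted(
        path.relative_to(base).as_posix()
        for path in base.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    for name in list_tracked_files():
        print(name)
```

Run it from the same working directory where you ran train.py, since that is where mlruns was created.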

Stay tuned for more

 
In the next post I’ll cover the Projects and Models APIs, where we will be able to run these models in production and build a full lifecycle.

Make sure to check the MLflow project for more:

databricks/mlflow
mlflow - Open source platform for the complete machine learning lifecycle

Thanks for reading this. I hope you found something interesting here :)

If you have questions just follow me on Twitter

Favio Vázquez (@FavioVaz) | Twitter

and LinkedIn.

Favio Vázquez — Principal data Scientist — OXXO | LinkedIn

See you there :)

 
Bio: Favio Vazquez is a physicist and computer engineer working on Data Science and Computational Cosmology. He has a passion for science, philosophy, programming, and music. Right now he is working on data science, machine learning and big data as the Principal Data Scientist at Oxxo. Also, he is the creator of Ciencia y Datos, a Data Science publication in Spanish. He loves new challenges, working with a good team and having interesting problems to solve. He is part of the Apache Spark collaboration, helping in MLlib, Core and the Documentation. He loves applying his knowledge and expertise in science, data analysis, visualization, and automatic learning to help the world become a better place.

Original. Reposted with permission.
