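A Simple Introduction to Using and Understanding MLflow

What is MLflow

MLflow's slogans are loud, but once you take it apart, what it actually provides is a management platform for the model development process, and a very lightweight one at that.

In other words, MLflow neither restricts nor cares which machine learning framework you use. It only asks that, during model development, you call the API of the MLflow SDK to register what happens along the way, so that those actions can be monitored in MLflow.

MLflow also defines standards for the process, covering logs, models and so on. The benefit is that MLflow can take charge of model management and deployment in a uniform way, publish models through a uniform model service, and manage models by version.

Configuring MLflow

I use MySQL as the storage backend for MLflow tracking: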

mlflow server --backend-store-uri   mysql+pymysql://root:baifachuan@localhost/mlflow --default-artifact-root file:./mlruns -h 0.0.0.0 -p 5000
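
The backend store URI packs several settings into one string. As a rough illustration (this is not part of MLflow), the standard library can split it into its parts. The helper name `describe_store_uri` is made up for this sketch:

```python
from urllib.parse import urlsplit


def describe_store_uri(uri: str) -> dict:
    """Split a SQLAlchemy-style database URI into its named parts."""
    parts = urlsplit(uri)
    # The scheme has the form "<dialect>+<driver>", e.g. "mysql+pymysql".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
    }


info = describe_store_uri("mysql+pymysql://root:baifachuan@localhost/mlflow")
print(info)
```

Read this way, the command above connects to the `mlflow` database on `localhost` as `root` through the PyMySQL driver, so that driver has to be installed in the environment that runs the server.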

The MySQL database then contains the following tables:

mysql> use mlflow;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+-----------------------+
| Tables_in_mlflow      |
+-----------------------+
| alembic_version       |
| experiment_tags       |
| experiments           |
| latest_metrics        |
| metrics               |
| model_version_tags    |
| model_versions        |
| params                |
| registered_model_tags |
| registered_models     |
| runs                  |
| tags                  |
+-----------------------+
12 rows in set (0.00 sec)

mysql>

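These tables store the tracking data.

Developing with MLflow

The following code is a linear regression model (an ElasticNet) that I built with sklearn, with its tracking data registered in MLflow.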

# The data set used in this example is from http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

import os
import warnings
import sys

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet

import mlflow
import mlflow.sklearn

import logging
logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)

remote_server_tracking_uri = "http://localhost:5000"
mlflow.set_tracking_uri(remote_server_tracking_uri)
mlflow.set_experiment("AirBnb Tracking")

def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2



if __name__ == "__main__":
    warnings.filterwarnings("ignore")
    np.random.seed(40)

    # Read the wine-quality csv file from the URL
    csv_url =\
        'http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv'
    try:
        data = pd.read_csv(csv_url, sep=';')
    except Exception as e:
        logger.exception(
            "Unable to download training & test CSV, check your internet connection. Error: %s", e)
        # Stop here: without the data, the code below would fail with a NameError.
        sys.exit(1)

    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)

    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
    l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5

    with mlflow.start_run():
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        predicted_qualities = lr.predict(test_x)

        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)

        mlflow.sklearn.log_model(lr, "model")
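
To make the three logged metrics concrete, here is a minimal sketch that computes them by hand without sklearn. The function name `eval_metrics_plain` and the toy numbers are made up for illustration and are not taken from the wine data set:

```python
import math


def eval_metrics_plain(actual, pred):
    """Compute RMSE, MAE and R2 for two equal-length sequences of numbers."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, pred)]
    # Root mean squared error: square root of the average squared error.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Mean absolute error: average of the absolute errors.
    mae = sum(abs(e) for e in errors) / n
    # R2: one minus (residual sum of squares / total sum of squares).
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Toy values, purely illustrative.
rmse, mae, r2 = eval_metrics_plain([5, 6, 7, 5], [5.5, 6.0, 6.0, 5.5])
print(rmse, mae, r2)
```

Lower RMSE and MAE mean smaller prediction errors, while an R2 closer to 1 means the model explains more of the variance in the target.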

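Once saved, the model is stored under mlruns in the current directory, with one subdirectory per experiment and per run, so each version of the model is kept separately.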
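
The directory layout can be sketched as follows. It is inferred from the model path used in the serve command later in this post, so treat it as an observation about this setup rather than a guaranteed contract. The helper `model_artifact_dir` is hypothetical:

```python
from pathlib import Path


def model_artifact_dir(artifact_root: str, experiment_id: str, run_id: str,
                       artifact_path: str = "model") -> Path:
    """Build <root>/<experiment_id>/<run_id>/artifacts/<artifact_path>."""
    return Path(artifact_root) / experiment_id / run_id / "artifacts" / artifact_path


# Experiment ID and run ID copied from the serve command in this post.
path = model_artifact_dir("mlruns", "1", "84bad08c43f34c78a4ada11ea0e30cf6")
print(path.as_posix())
```

The last path component, `model`, matches the second argument passed to `mlflow.sklearn.log_model` in the training script.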

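Deploying with MLflow

To deploy, MLflow needs an isolated Python runtime environment, similar in concept to virtualenv. MLflow uses conda by default, but I already had virtualenv and did not want to install the full conda distribution and pollute my environment, so I did a manual silent install of Miniconda: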

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -O ~/miniconda.sh
bash ~/miniconda.sh -b -p $HOME/miniconda

The flags are:

-b: Batch mode with no PATH modifications to ~/.bashrc. Assumes that you agree to the license agreement. Does not edit the .bashrc or .bash_profile files.
-p: Installation prefix/path.
-f: Force installation even if prefix -p already exists.

Before running the deployment, set the conda environment variable:

export MLFLOW_CONDA_HOME=/Users/fcbai/software/workspace/python/mlflow_ws/miniconda
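
A quick sanity check before serving can save a confusing failure later. The sketch below assumes the usual Miniconda layout, where the executable lives at `bin/conda` under the install prefix. The helper `find_conda` is made up for this post and is not an MLflow function:

```python
import os
from pathlib import Path


def find_conda(env=os.environ):
    """Return the conda executable under MLFLOW_CONDA_HOME, or None if absent."""
    home = env.get("MLFLOW_CONDA_HOME")
    if not home:
        return None
    # Assumption: standard Miniconda layout, <prefix>/bin/conda.
    candidate = Path(home) / "bin" / "conda"
    return candidate if candidate.is_file() else None


print(find_conda() or "conda not found under MLFLOW_CONDA_HOME")
```

If this prints the fallback message, check that the variable points at the prefix passed to `-p` during the Miniconda install.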

Then start the deployment:

mlflow models serve -m file:/Users/fcbai/software/workspace/python/mlflow_ws/mlruns/1/84bad08c43f34c78a4ada11ea0e30cf6/artifacts/model/

You can also specify the port:

mlflow models serve -m file:/Users/fcbai/software/workspace/python/mlflow_ws/mlruns/1/84bad08c43f34c78a4ada11ea0e30cf6/artifacts/model/ -p 8088

After starting the deployment, the output looks like this:

2021/01/20 10:26:06 INFO mlflow.pyfunc.backend: === Running command 'source /Users/fcbai/software/workspace/python/mlflow_ws/miniconda/bin/../etc/profile.d/conda.sh && conda activate mlflow-73c2330f885e4aee44450a5d5830493e7d78ae67 1>&2 && gunicorn --timeout=60 -b 127.0.0.1:5000 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app'
[2021-01-20 10:26:07 +0800] [52783] [INFO] Starting gunicorn 20.0.4
[2021-01-20 10:26:07 +0800] [52783] [INFO] Listening at: http://127.0.0.1:5000 (52783)
[2021-01-20 10:26:07 +0800] [52783] [INFO] Using worker: sync
[2021-01-20 10:26:07 +0800] [52801] [INFO] Booting worker with pid: 52801
^C[2021-01-20 10:27:54 +0800] [52783] [INFO] Handling signal: int
[2021-01-20 10:27:54 +0800] [52801] [INFO] Worker exiting (pid: 52801)

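This means the model server is up and serves the model over HTTP with JSON payloads. For example, against a server started with -p 8088: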

curl -X POST -H "Content-Type:application/json; format=pandas-split" \
--data '{"columns":["alcohol", "chlorides", "citric acid", "density", "fixed acidity", "free sulfur dioxide", "pH", "residual sugar", "sulphates", "total sulfur dioxide", "volatile acidity"],"data":[[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]]}' \
http://127.0.0.1:8088/invocations

The response is:

[6.379428821398614]
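
The same request can be sent from Python. This is a minimal sketch using only the standard library that mirrors the curl call above, including its `format=pandas-split` content type. The helper names `build_request` and `predict` are made up here, and newer MLflow releases may expect a different payload shape, so check the documentation for your version:

```python
import json
import urllib.request

# Column names copied from the curl example above.
COLUMNS = [
    "alcohol", "chlorides", "citric acid", "density", "fixed acidity",
    "free sulfur dioxide", "pH", "residual sugar", "sulphates",
    "total sulfur dioxide", "volatile acidity",
]


def build_request(rows, url="http://127.0.0.1:8088/invocations"):
    """Build a POST request in the pandas-split JSON layout used by the curl call."""
    body = json.dumps({"columns": COLUMNS, "data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json; format=pandas-split"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def predict(rows, url="http://127.0.0.1:8088/invocations"):
    """Send the rows to the model server and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(rows, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network; predict() does.
request = build_request([[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]])
print(request.get_method(), request.full_url)
```

With the server from the previous step running on port 8088, calling `predict` with the same row should return a list of predictions like the response above.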

That completes the whole workflow.
