# questions
s
Hi folks, we have our own MLflow server on internal S3. Below are the settings I use locally:
os.environ["MLFLOW_TRACKING_URI"] = "https://xxx.com/mlflow/"
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://s3xxx.com"
os.environ["S3_BUCKET_PATH"] = "s3://xxx/mlflow"
os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"
os.environ['MLFLOW_TRACKING_USERNAME'] = 'xxx'
os.environ['MLFLOW_TRACKING_PASSWORD'] = 'xxx'
os.environ["MLFLOW_TRACKING_SERVER_CERT_PATH"] = "C:\\xxx\\ca-bundle.crt"
EXPERIMENT_NAME = "ZeMC012"
To use this in the Kedro framework, I created an mlflow.yml file in the conf/local folder with the following content:
server:
  mlflow_tracking_uri: https://xxx.com/mlflow/
  MLFLOW_S3_ENDPOINT_URL: http://s3xxx.com
  S3_BUCKET_PATH: s3://xxx/mlflow
  AWS_ACCESS_KEY_ID: xxx
  AWS_SECRET_ACCESS_KEY: xxx
  MLFLOW_TRACKING_USERNAME: xxx
  MLFLOW_TRACKING_PASSWORD: xxx
  MLFLOW_EXPERIMENT_NAME: ZeMC012
  MLFLOW_TRACKING_SERVER_CERT_PATH: C:/xxx/ca-bundle.crt
But I got this error:
ValidationError: 8 validation errors for KedroMlflowConfig
How should I modify it?
d
Hi Shu-Chun, have you tried using the Kedro-MLflow plugin? Here's the link for more details: Kedro-MLflow Setup. It helps generate a correct `mlflow.yml` file, and as I understand, there should be multiple sections included.
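One possible reading of the 8 validation errors, as a rough sketch: if the `server` section only accepts the keys that the generated template contains, every other key in the hand-written file would be rejected. The allowed-key set below is copied from the generated mlflow.yml quoted later in this thread; treating it as exhaustive is an assumption, and `unexpected_keys` is a hypothetical helper, not part of kedro-mlflow.

```python
# Keys found under `server` in the mlflow.yml that `kedro mlflow init` generates,
# as quoted later in this thread. Treating them as the only accepted keys is an
# assumption about how KedroMlflowConfig validates the section.
ALLOWED_SERVER_KEYS = {
    "mlflow_tracking_uri",
    "mlflow_registry_uri",
    "credentials",
    "request_header_provider",
}

# The hand-written `server` section from the first message (placeholder values).
handwritten_server = {
    "mlflow_tracking_uri": "https://xxx.com/mlflow/",
    "MLFLOW_S3_ENDPOINT_URL": "http://s3xxx.com",
    "S3_BUCKET_PATH": "s3://xxx/mlflow",
    "AWS_ACCESS_KEY_ID": "xxx",
    "AWS_SECRET_ACCESS_KEY": "xxx",
    "MLFLOW_TRACKING_USERNAME": "xxx",
    "MLFLOW_TRACKING_PASSWORD": "xxx",
    "MLFLOW_EXPERIMENT_NAME": "ZeMC012",
    "MLFLOW_TRACKING_SERVER_CERT_PATH": "C:/xxx/ca-bundle.crt",
}


def unexpected_keys(section: dict, allowed: set) -> list:
    """Return the keys of `section` that are not in `allowed`, sorted."""
    return sorted(key for key in section if key not in allowed)


extra = unexpected_keys(handwritten_server, ALLOWED_SERVER_KEYS)
print(len(extra))  # 8
```

The count matches the number of reported errors, which suggests (but does not prove) that each upper-case key triggered one error.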
s
@Dmitry Sorokin After I used `kedro mlflow init` to generate mlflow.yml, I don't see these parameters in the template:
MLFLOW_S3_ENDPOINT_URL: http://s3xxx.com
S3_BUCKET_PATH: s3://xxx/mlflow
MLFLOW_TRACKING_USERNAME: xxx
MLFLOW_TRACKING_PASSWORD: xxx
MLFLOW_TRACKING_SERVER_CERT_PATH: C:/xxx/ca-bundle.crt
Where and how should I put those parameters? I still get these error messages:
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)
MaxRetryError: HTTPSConnectionPool(host='xxx.com', port=443): Max retries exceeded with url:
/mlflow/api/2.0/mlflow/experiments/get-by-name?experiment_name=ZeMC012 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED]
certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)')))
SSLError: HTTPSConnectionPool(host='xxx.com', port=443): Max retries exceeded with url:
/mlflow/api/2.0/mlflow/experiments/get-by-name?experiment_name=ZeMC012 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED]
certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)')))
MlflowException: API request to https://xxx.com/mlflow/api/2.0/mlflow/experiments/get-by-name failed with exception
HTTPSConnectionPool(host='dad-rbg.icp.infineon.com', port=443): Max retries exceeded with url:
/mlflow/api/2.0/mlflow/experiments/get-by-name?experiment_name=ZeMC012 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED]
certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)')))
d
@Shu-Chun Wu, you can add those settings manually under the `tracking` section. It seems the errors are occurring because the connection to the MLflow server wasn't properly established, likely due to a missing `MLFLOW_TRACKING_SERVER_CERT_PATH`.
s
What do you mean by the `tracking` section? Which file should I add `MLFLOW_TRACKING_SERVER_CERT_PATH` to?
d
It looks like you should try to split them into two groups. Some variables, like `MLFLOW_S3_ENDPOINT_URL`, `S3_BUCKET_PATH`, and `MLFLOW_TRACKING_SERVER_CERT_PATH`, should remain OS environment variables, as they were originally. The credentials for MLflow tracking (username and password) should be specified in `mlflow.yml` under the `credentials` section (as shown in the manual: Kedro Data Catalog - Dataset Access Credentials). Alternatively, you could try specifying them as environment variables as well.
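As a minimal sketch of the first group, the variables could be set from Python before any MLflow call is made, for example at the top of the script or notebook that starts the Kedro run. The values are the placeholders from this thread, and `export_env` is a hypothetical helper name; where exactly this code should live in a given project is an assumption.

```python
import os

# Placeholder values copied from this thread, not real settings.
ENV_ONLY = {
    "MLFLOW_S3_ENDPOINT_URL": "http://s3xxx.com",
    "S3_BUCKET_PATH": "s3://xxx/mlflow",
    "MLFLOW_TRACKING_SERVER_CERT_PATH": "C:/xxx/ca-bundle.crt",
}


def export_env(settings: dict) -> None:
    """Set each variable unless the surrounding shell already defines it."""
    for name, value in settings.items():
        os.environ.setdefault(name, value)


export_env(ENV_ONLY)
```

Using `setdefault` keeps any value already exported in the shell, so a machine-specific certificate path is not overwritten.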
s
@Dmitry Sorokin But after I run `kedro mlflow init`, the generated mlflow.yml file contains:
# All credentials needed for mlflow must be stored in credentials .yml as a dict
# they will be exported as environment variable
# If you want to set some credentials,  e.g. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
# > in `credentials.yml`:
# your_mlflow_credentials:
#   AWS_ACCESS_KEY_ID: 132456
#   AWS_SECRET_ACCESS_KEY: 132456
# > in this file `mlflow.yml`:
# credentials: mlflow_credentials
This mixes up AWS credentials and MLflow credentials, which is not clear to me. Do I need both? Currently, in mlflow.yml, I have:
server:
  mlflow_tracking_uri: https://xxx.com/mlflow/
  mlflow_registry_uri: null
  credentials: mlflow_credentials
  request_header_provider:
    type: null
    pass_context: False
    init_kwargs: {}
And in credentials.yml, I have:
mlflow_credentials:
   MLFLOW_TRACKING_USERNAME: xxx
   MLFLOW_TRACKING_PASSWORD: xxx
Both mlflow.yml and credentials.yml are in the `conf/local` folder. I even have the S3 credentials in credentials.yml, but they are not read anywhere. I also still don't know how to make it read my certificate file:
MLFLOW_TRACKING_SERVER_CERT_PATH: C:/xxx/ca-bundle.crt
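One way to find out whether these values actually reach the Python process is a small check run just before the Kedro session starts. This is only a sketch: `preflight` and the `REQUIRED` list are hypothetical, and it assumes everything, including the certificate path, is expected to arrive as an environment variable, whether set in the shell or exported from the credentials dict as the template comment describes.

```python
import os
from pathlib import Path
from typing import Mapping

# Variable names taken from this thread; which ones a given setup really
# needs is an assumption.
REQUIRED = [
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "MLFLOW_S3_ENDPOINT_URL",
    "MLFLOW_TRACKING_SERVER_CERT_PATH",
]


def preflight(env: Mapping = os.environ) -> list:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    cert = env.get("MLFLOW_TRACKING_SERVER_CERT_PATH")
    if cert and not Path(cert).is_file():
        problems.append(f"certificate bundle not found: {cert}")
    return problems


for problem in preflight():
    print(problem)
```

If the certificate variable shows up as not set, that would be consistent with the `CERTIFICATE_VERIFY_FAILED` errors above, since the client would then have no custom CA bundle to verify against.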