Which of the following describes concept drift?
D
A machine learning engineer is monitoring categorical input variables for a production machine
learning application. The engineer believes that missing values are becoming more prevalent in more
recent data for a particular value in one of the categorical input variables.
Which of the following tools can the machine learning engineer use to assess their theory?
B
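The answer options are not reproduced above, so the following is only an illustrative sketch. A two-way chi-squared test on a contingency table of time period by missing/present is one common way to check whether missingness has changed. The counts, the helper name chi_square_2x2, and the 5% significance level are assumptions made for the example.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of period and missingness
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = time period, columns = (missing, present)
counts = [
    [20, 980],  # older data
    [60, 940],  # recent data
]

# Chi-square critical value for 1 degree of freedom at alpha = 0.05
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841

stat = chi_square_2x2(counts)
missingness_changed = stat > CRITICAL_VALUE_DF1_ALPHA_05
print(f"chi-square = {stat:.3f}, reject independence: {missingness_changed}")
```

A statistic above the critical value suggests that the share of missing values differs between the two periods, which is the engineer's theory.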
A data scientist is using MLflow to track their machine learning experiment. As a part of each MLflow
run, they are performing hyperparameter tuning. The data scientist would like to have one parent
run for the tuning process with a child run for each unique combination of hyperparameter values.
They are using the following code block:
The code block is not nesting the runs in MLflow as they expected.
Which of the following changes does the data scientist need to make to the above code block so that
it successfully nests the child runs under the parent run in MLflow?
E
A machine learning engineer wants to log feature importance data from a CSV file at path
importance_path with an MLflow run for model model.
Which of the following code blocks will accomplish this task inside of an existing MLflow run block?
A.
B.
C. mlflow.log_data(importance_path, "feature-importance.csv")
D. mlflow.log_artifact(importance_path, "feature-importance.csv")
E. None of these code blocks can accomplish the task.
A
Which of the following is a simple, low-cost method of monitoring numeric feature drift?
B
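As a rough illustration of a low-cost approach, summary statistics of a numeric feature can be compared between a baseline window and a recent window. This is a minimal sketch, not necessarily the option the answer key refers to. The helper names, the sample values, and the 10% relative tolerance are assumptions.

```python
import statistics


def summarize(values):
    """Collect a few cheap summary statistics for a numeric feature."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def flag_shifts(baseline, recent, rel_tol=0.10):
    """Return the names of statistics whose relative change exceeds rel_tol."""
    base, new = summarize(baseline), summarize(recent)
    flagged = []
    for name, base_value in base.items():
        denom = abs(base_value) or 1.0  # avoid dividing by zero
        if abs(new[name] - base_value) / denom > rel_tol:
            flagged.append(name)
    return flagged


# Hypothetical feature values from two time windows
baseline_temps = [21.0, 23.5, 22.0, 24.5, 23.0, 22.5]
recent_temps = [15.0, 16.5, 14.0, 17.5, 16.0, 15.5]

shifted = flag_shifts(baseline_temps, recent_temps)
print("statistics that moved:", shifted)
```

Tracking these values over time is cheap to compute and easy to plot, which is why it is often used as a first line of monitoring before heavier statistical tests.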
A data scientist has developed a model to predict ice cream sales using the expected temperature
and expected number of hours of sun in the day. However, the expected temperature is dropping
beneath the range of the input variable on which the model was trained.
Which of the following types of drift is present in the above scenario?
E
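The scenario above, where an input falls outside the range seen during training, can be detected with a simple range check. The sketch below is an assumption-laden example: the function name and the temperature values are hypothetical.

```python
def fraction_out_of_range(training_values, incoming_values):
    """Share of incoming values that fall outside the training min/max."""
    lo, hi = min(training_values), max(training_values)
    outside = [v for v in incoming_values if v < lo or v > hi]
    return len(outside) / len(incoming_values)


# Hypothetical expected temperatures (degrees Celsius)
training_temps = [18.0, 22.0, 25.0, 30.0, 33.0]
incoming_temps = [12.0, 15.0, 19.0, 17.0]

share = fraction_out_of_range(training_temps, incoming_temps)
print(f"{share:.0%} of incoming temperatures are outside the training range")
```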
A data scientist wants to remove the star_rating column from the Delta table at the location path. To
do this, they need to load in data and drop the star_rating column.
Which of the following code blocks accomplishes this task?
D
Which of the following operations in Feature Store Client fs can be used to return a Spark DataFrame
of a data set associated with a Feature Store table?
A
A machine learning engineer is in the process of implementing a concept drift monitoring solution.
They are planning to use the following steps:
1. Deploy a model to production and compute predicted values
2. Obtain the observed (actual) label values
3. _____
4. Run a statistical test to determine if there are changes over time
Which of the following should be completed as Step #3?
D
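The answer options are not shown, so the blank step is read here, as an assumption, as computing an evaluation metric from the predicted and observed values for each time window. The sketch below illustrates that reading with mean absolute error. The weekly batches and their values are hypothetical.

```python
def mean_absolute_error(predicted, observed):
    """Average absolute difference between predictions and actual labels."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical weekly batches of (predicted values, observed labels)
weekly_batches = [
    ([10.0, 12.0, 11.0], [10.5, 11.5, 11.0]),
    ([10.0, 12.0, 11.0], [11.0, 13.0, 12.5]),
    ([10.0, 12.0, 11.0], [13.0, 15.0, 14.0]),
]

# One metric value per window; this series is what a later statistical
# test or trend check would examine for changes over time.
weekly_mae = [mean_absolute_error(p, o) for p, o in weekly_batches]

for week, mae in enumerate(weekly_mae, start=1):
    print(f"week {week}: MAE = {mae:.3f}")
```

A steadily rising error while the input distributions stay stable is a typical sign that the relationship between inputs and label has changed.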
Which of the following is a reason for using Jensen-Shannon (JS) distance over a
Kolmogorov-Smirnov (KS) test for numeric feature drift detection?
D
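For context, JS distance is symmetric and, with base-2 logarithms, bounded between 0 and 1, so it can be compared against a fixed threshold regardless of sample size. The following is a minimal sketch of the computation on binned data. The bin proportions are hypothetical and the function is written from the standard definition, not taken from any particular library.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions (base-2 logs)."""

    def kl(a, b):
        # Kullback-Leibler divergence, skipping empty bins in `a`
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # Mixture distribution; nonzero wherever p or q is nonzero
    m = [(x + y) / 2 for x, y in zip(p, q)]
    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # Guard against tiny negative values from floating-point rounding
    return math.sqrt(max(divergence, 0.0))


# Hypothetical histogram proportions for one numeric feature (same bins)
baseline_hist = [0.10, 0.40, 0.40, 0.10]
recent_hist = [0.30, 0.40, 0.20, 0.10]

distance = js_distance(baseline_hist, recent_hist)
print(f"JS distance = {distance:.4f}")
```

Identical histograms give a distance of 0 and histograms with no overlap give a distance of 1, which makes a drift threshold straightforward to set.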
A data scientist is utilizing MLflow to track their machine learning experiments. After completing a
series of runs for the experiment with experiment ID exp_id, the data scientist wants to
programmatically work with the experiment run data in a Spark DataFrame. They have an active
MLflow Client client and an active Spark session spark.
Which of the following lines of code can be used to obtain run-level results for exp_id in a Spark
DataFrame?
B
A data scientist has developed and logged a scikit-learn random forest model model, and then they
ended their Spark session and terminated their cluster. After starting a new cluster, they want to
review the feature_importances_ of the original model object.
Which of the following lines of code can be used to restore the model object so that
feature_importances_ is available?
A
Which of the following is a simple statistic to monitor for categorical feature drift?
C
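Simple statistics commonly tracked for a categorical feature include the mode, the number of unique values, and the share of missing values. The sketch below computes all three. It does not assume which one the answer key selects, and the sample data and helper name are hypothetical.

```python
from collections import Counter


def categorical_summary(values):
    """Mode, unique-value count, and missing share for a categorical feature."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "mode": counts.most_common(1)[0][0] if counts else None,
        "unique_values": len(counts),
        "missing_share": (len(values) - len(present)) / len(values),
    }


# Hypothetical feature values from two time windows (None = missing)
baseline = ["red", "red", "blue", "green", "red", None]
recent = ["blue", "blue", None, "blue", None, "red"]

baseline_stats = categorical_summary(baseline)
recent_stats = categorical_summary(recent)
print("baseline:", baseline_stats)
print("recent:  ", recent_stats)
```

A change in the mode, a new or vanished category, or a rising missing share between windows is a cheap signal that the feature deserves a closer look.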
Which of the following is a probable response to identifying drift in a machine learning application?
A
A data scientist has computed updated feature values for all primary key values stored in the Feature
Store table features. In addition, feature values for some new primary key values have also been
computed. The updated feature values are stored in the DataFrame features_df. They want to
replace all data in features with the newly computed data.
Which of the following code blocks can they use to perform this task using the Feature Store Client
fs?
A)
B)
C)
D)
E)
E