Amazon AWS Certified Machine Learning Engineer - Associate practice test

Last exam update: Nov 18, 2025
Page 1 out of 6. Viewing questions 1-15 out of 85

Question 1

Case Study
A company is building a web-based AI application by using Amazon SageMaker. The application will
provide the following capabilities and features: ML experimentation, training, a
central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The
training data is stored in Amazon S3.
The company needs to use the central model registry to manage different versions of models in the
application.
Which action will meet this requirement with the LEAST operational overhead?

  • A. Create a separate Amazon Elastic Container Registry (Amazon ECR) repository for each model.
  • B. Use Amazon Elastic Container Registry (Amazon ECR) and unique tags for each model version.
  • C. Use the SageMaker Model Registry and model groups to catalog the models.
  • D. Use the SageMaker Model Registry and unique tags for each model version.
Answer:

C
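
As a rough illustration of option C, the sketch below assembles request parameters for the SageMaker CreateModelPackageGroup and CreateModelPackage API operations (exposed in boto3 as `create_model_package_group` and `create_model_package`). The group name, image URI, and S3 path are hypothetical placeholders, and the helper functions only build dictionaries, so nothing is sent to AWS.

```python
def model_group_request(group_name, description):
    """Parameters for sagemaker_client.create_model_package_group(**request)."""
    return {
        "ModelPackageGroupName": group_name,
        "ModelPackageGroupDescription": description,
    }


def model_version_request(group_name, image_uri, model_data_url):
    """Parameters for sagemaker_client.create_model_package(**request).

    Each call registers a new version inside the group; the registry
    assigns the version number itself.
    """
    return {
        "ModelPackageGroupName": group_name,
        "ModelApprovalStatus": "PendingManualApproval",
        "InferenceSpecification": {
            "Containers": [{"Image": image_uri, "ModelDataUrl": model_data_url}],
            "SupportedContentTypes": ["text/csv"],
            "SupportedResponseMIMETypes": ["text/csv"],
        },
    }


# Hypothetical names, for illustration only.
group = model_group_request(
    "web-ai-app-models", "Models for the web-based AI application"
)
version = model_version_request(
    "web-ai-app-models",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest",
    "s3://example-bucket/model/model.tar.gz",
)
print(group["ModelPackageGroupName"], version["ModelApprovalStatus"])
```

With a real boto3 SageMaker client, the two dictionaries would be passed as keyword arguments to the corresponding client methods.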



Question 2

Case Study
A company is building a web-based AI application by using Amazon SageMaker. The application will
provide the following capabilities and features: ML experimentation, training, a
central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The
training data is stored in Amazon S3.
The company is experimenting with consecutive training jobs.
How can the company MINIMIZE infrastructure startup times for these jobs?

  • A. Use Managed Spot Training.
  • B. Use SageMaker managed warm pools.
  • C. Use SageMaker Training Compiler.
  • D. Use the SageMaker distributed data parallelism (SMDDP) library.
Answer:

B
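
A minimal sketch of how a warm pool is requested, assuming the training job is created through the CreateTrainingJob API: the `ResourceConfig` section accepts a `KeepAlivePeriodInSeconds` value that asks SageMaker to retain the provisioned instances after the job ends so that a matching follow-up job can reuse them. The instance type and keep-alive value below are illustrative assumptions, and the helper only builds a dictionary.

```python
def resource_config(instance_type, instance_count, keep_alive_seconds):
    """Build the ResourceConfig section of a CreateTrainingJob request.

    A non-zero KeepAlivePeriodInSeconds requests a managed warm pool;
    0 means the instances are released when the job finishes.
    """
    if keep_alive_seconds < 0:
        raise ValueError("keep-alive period cannot be negative")
    return {
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "VolumeSizeInGB": 30,
        "KeepAlivePeriodInSeconds": keep_alive_seconds,
    }


# Illustrative values: keep instances warm for 30 minutes between jobs.
config = resource_config("ml.m5.xlarge", 1, 1800)
print(config["KeepAlivePeriodInSeconds"])
```

In the SageMaker Python SDK, the equivalent setting is the `keep_alive_period_in_seconds` argument of an estimator.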



Question 3

Case Study
A company is building a web-based AI application by using Amazon SageMaker. The application will
provide the following capabilities and features: ML experimentation, training, a
central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The
training data is stored in Amazon S3.
The company must implement a manual approval-based workflow to ensure that only approved
models can be deployed to production endpoints.
Which solution will meet this requirement?

  • A. Use SageMaker Experiments to facilitate the approval process during model registration.
  • B. Use SageMaker ML Lineage Tracking on the central model registry. Create tracking entities for the approval process.
  • C. Use SageMaker Model Monitor to evaluate the performance of the model and to manage the approval.
  • D. Use SageMaker Pipelines. When a model version is registered, use the AWS SDK to change the approval status to "Approved."
Answer:

D
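
The approval step in option D can be sketched as follows, assuming the model version was registered with a status of PendingManualApproval. The helper builds the parameters for the UpdateModelPackage API operation (boto3 `update_model_package`); the ARN is a hypothetical placeholder and no AWS call is made here.

```python
VALID_STATUSES = {"Approved", "Rejected", "PendingManualApproval"}


def approval_request(model_package_arn, status, note=""):
    """Parameters for sagemaker_client.update_model_package(**request)."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unsupported approval status: {status}")
    request = {
        "ModelPackageArn": model_package_arn,
        "ModelApprovalStatus": status,
    }
    if note:
        # Optional free-text reason recorded with the status change.
        request["ApprovalDescription"] = note
    return request


# Hypothetical ARN, for illustration only.
request = approval_request(
    "arn:aws:sagemaker:us-east-1:123456789012:model-package/web-ai-app-models/1",
    "Approved",
    "Reviewed by the ML lead",
)
print(request["ModelApprovalStatus"])
```

A deployment stage can then filter on the approval status so that only approved versions reach production endpoints.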



Question 4

Case Study
A company is building a web-based AI application by using Amazon SageMaker. The application will
provide the following capabilities and features: ML experimentation, training, a
central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The
training data is stored in Amazon S3.
The company needs to run an on-demand workflow to monitor bias drift for models that are
deployed to real-time endpoints from the application.
Which action will meet this requirement?

  • A. Configure the application to invoke an AWS Lambda function that runs a SageMaker Clarify job.
  • B. Invoke an AWS Lambda function to pull the sagemaker-model-monitor-analyzer built-in SageMaker image.
  • C. Use AWS Glue Data Quality to monitor bias.
  • D. Use SageMaker notebooks to compare the bias.
Answer:

A



Question 5

HOTSPOT
A company stores historical data in .csv files in Amazon S3. Only some of the rows and columns in the
.csv files are populated. The columns are not labeled. An ML
engineer needs to prepare and store the data so that the company can use the data to train ML
models.
Select and order the correct steps from the following list to perform this task. Each step should be
selected one time or not at all. (Select and order three.)
• Create an Amazon SageMaker batch transform job for data cleaning and feature engineering.
• Store the resulting data back in Amazon S3.
• Use Amazon Athena to infer the schemas and available columns.
• Use AWS Glue crawlers to infer the schemas and available columns.
• Use AWS Glue DataBrew for data cleaning and feature engineering.

Answer:


Explanation:
Step 1: Use AWS Glue crawlers to infer the schemas and available columns.
Step 2: Use AWS Glue DataBrew for data cleaning and feature engineering.
Step 3: Store the resulting data back in Amazon S3.


Question 6

HOTSPOT
An ML engineer needs to use Amazon SageMaker Feature Store to create and manage features to
train a model.
Select and order the steps from the following list to create and use the features in Feature Store.
Each step should be selected one time. (Select and order three.)
• Access the store to build datasets for training.
• Create a feature group.
• Ingest the records.

Answer:


Explanation:
Step 1: Create a feature group.
Step 2: Ingest the records.
Step 3: Access the store to build datasets for training.
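
The first two steps above can be sketched as request parameters for the CreateFeatureGroup operation (boto3 SageMaker client, `create_feature_group`) and the PutRecord operation (boto3 `sagemaker-featurestore-runtime` client, `put_record`). The feature group name, feature names, S3 URI, and role ARN are hypothetical, and the helpers only build dictionaries.

```python
def feature_group_request(name, role_arn, offline_s3_uri):
    """Step 1: parameters for sagemaker_client.create_feature_group(**request)."""
    return {
        "FeatureGroupName": name,
        "RecordIdentifierFeatureName": "customer_id",
        "EventTimeFeatureName": "event_time",
        "FeatureDefinitions": [
            {"FeatureName": "customer_id", "FeatureType": "String"},
            {"FeatureName": "event_time", "FeatureType": "String"},
            {"FeatureName": "monthly_spend", "FeatureType": "Fractional"},
        ],
        "OnlineStoreConfig": {"EnableOnlineStore": True},
        "OfflineStoreConfig": {"S3StorageConfig": {"S3Uri": offline_s3_uri}},
        "RoleArn": role_arn,
    }


def put_record_request(name, record):
    """Step 2: parameters for featurestore_runtime.put_record(**request)."""
    return {
        "FeatureGroupName": name,
        "Record": [
            {"FeatureName": key, "ValueAsString": str(value)}
            for key, value in record.items()
        ],
    }


# Hypothetical values, for illustration only.
create = feature_group_request(
    "customers",
    "arn:aws:iam::123456789012:role/ExampleFeatureStoreRole",
    "s3://example-bucket/feature-store/",
)
ingest = put_record_request(
    "customers",
    {"customer_id": "c-001", "event_time": "2025-01-01T00:00:00Z", "monthly_spend": 42.5},
)
# Step 3 (not shown): query the offline store to build a training dataset.
print(len(create["FeatureDefinitions"]), len(ingest["Record"]))
```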


Question 7

HOTSPOT
A company wants to host an ML model on Amazon SageMaker. An ML engineer is configuring a
continuous integration and continuous delivery (CI/CD) pipeline in AWS CodePipeline to deploy the
model. The pipeline must run automatically when new training data for the model is uploaded to an
Amazon S3 bucket.
Select and order the pipeline's correct steps from the following list. Each step should be selected one
time or not at all. (Select and order three.)
• An S3 event notification invokes the pipeline when new data is uploaded.
• An S3 Lifecycle rule invokes the pipeline when new data is uploaded.
• SageMaker retrains the model by using the data in the S3 bucket.
• The pipeline deploys the model to a SageMaker endpoint.
• The pipeline deploys the model to SageMaker Model Registry.

Answer:


Explanation:
Step 1: An S3 event notification invokes the pipeline when new data is uploaded.
Step 2: SageMaker retrains the model by using the data in the S3 bucket.
Step 3: The pipeline deploys the model to a SageMaker endpoint.


Question 8

HOTSPOT
An ML engineer is building a generative AI application on Amazon Bedrock by using large language
models (LLMs).
Select the correct generative AI term from the following list for each description. Each term should
be selected one time or not at all. (Select three.)
• Embedding
• Retrieval Augmented Generation (RAG)
• Temperature
• Token

Answer:


Explanation:
Step 1: Token
Step 2: Embedding
Step 3: Retrieval Augmented Generation (RAG)


Question 9

HOTSPOT
An ML engineer is working on an ML model to predict the prices of similarly sized homes. The model
will base predictions on several features. The ML engineer will use the following feature engineering
techniques to estimate the prices of the homes:
• Feature splitting
• Logarithmic transformation
• One-hot encoding
• Standardized distribution
Select the correct feature engineering techniques for the following list of features. Each feature
engineering technique should be selected one time or not at all. (Select three.)

Answer:


Explanation:
Step 1: One-hot encoding
Step 2: Feature splitting
Step 3: Standardized distribution
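
The list of features for this question is not reproduced above, so the following is only a generic sketch of the three selected techniques. The example inputs (a building material, a sale date, and a list of areas) are assumptions chosen for illustration, not the features from the question.

```python
from statistics import mean, pstdev


def one_hot(value, categories):
    """One-hot encoding: one 0/1 column per known category."""
    return [1 if value == category else 0 for category in categories]


def split_date(date_string):
    """Feature splitting: break a YYYY-MM-DD string into separate features."""
    year, month, day = (int(part) for part in date_string.split("-"))
    return {"year": year, "month": month, "day": day}


def standardize(values):
    """Standardization: rescale to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(value - mu) / sigma for value in values]


# Hypothetical inputs, for illustration only.
material = one_hot("brick", ["wood", "brick", "stone"])
sold_on = split_date("2021-07-15")
areas = standardize([1.0, 2.0, 3.0])
print(material, sold_on, areas)
```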


Question 10

Case study
An ML engineer is developing a fraud detection model on AWS. The training dataset includes
transaction logs, customer profiles, and tables from an on-premises MySQL database. The
transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally,
many of the features have interdependencies. The algorithm is not capturing all the desired
underlying patterns in the data.
Which AWS service or feature can aggregate the data from the various data sources?

  • A. Amazon EMR Spark jobs
  • B. Amazon Kinesis Data Streams
  • C. Amazon DynamoDB
  • D. AWS Lake Formation
Answer:

A



Question 11

Case study
An ML engineer is developing a fraud detection model on AWS. The training dataset includes
transaction logs, customer profiles, and tables from an on-premises MySQL database. The
transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally,
many of the features have interdependencies. The algorithm is not capturing all the desired
underlying patterns in the data.
After the data is aggregated, the ML engineer must implement a solution to automatically detect
anomalies in the data and to visualize the result.
Which solution will meet these requirements?

  • A. Use Amazon Athena to automatically detect the anomalies and to visualize the result.
  • B. Use Amazon Redshift Spectrum to automatically detect the anomalies. Use Amazon QuickSight to visualize the result.
  • C. Use Amazon SageMaker Data Wrangler to automatically detect the anomalies and to visualize the result.
  • D. Use AWS Batch to automatically detect the anomalies. Use Amazon QuickSight to visualize the result.
Answer:

C



Question 12

Case study
An ML engineer is developing a fraud detection model on AWS. The training dataset includes
transaction logs, customer profiles, and tables from an on-premises MySQL database. The
transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally,
many of the features have interdependencies. The algorithm is not capturing all the desired
underlying patterns in the data.
The training dataset includes categorical data and numerical data. The ML engineer must prepare the
training dataset to maximize the accuracy of the model.
Which action will meet this requirement with the LEAST operational overhead?

  • A. Use AWS Glue to transform the categorical data into numerical data.
  • B. Use AWS Glue to transform the numerical data into categorical data.
  • C. Use Amazon SageMaker Data Wrangler to transform the categorical data into numerical data.
  • D. Use Amazon SageMaker Data Wrangler to transform the numerical data into categorical data.
Answer:

C



Question 13

Case study
An ML engineer is developing a fraud detection model on AWS. The training dataset includes
transaction logs, customer profiles, and tables from an on-premises MySQL database. The
transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally,
many of the features have interdependencies. The algorithm is not capturing all the desired
underlying patterns in the data.
Before the ML engineer trains the model, the ML engineer must resolve the issue of the imbalanced
data.
Which solution will meet this requirement with the LEAST operational effort?

  • A. Use Amazon Athena to identify patterns that contribute to the imbalance. Adjust the dataset accordingly.
  • B. Use Amazon SageMaker Studio Classic built-in algorithms to process the imbalanced dataset.
  • C. Use AWS Glue DataBrew built-in features to oversample the minority class.
  • D. Use the Amazon SageMaker Data Wrangler balance data operation to oversample the minority class.
Answer:

D
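
To show what oversampling the minority class means, here is a small pure-Python sketch of random oversampling. It is not Data Wrangler's implementation; the row layout, the `label` key, and the fixed seed are assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(rows, label_key, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(row[label_key] for row in rows)
    target = max(counts.values())
    balanced = list(rows)
    for label, count in counts.items():
        pool = [row for row in rows if row[label_key] == label]
        # Sample with replacement to close the gap to the majority class.
        balanced.extend(rng.choice(pool) for _ in range(target - count))
    return balanced


# Hypothetical transactions: 8 legitimate rows and 2 fraudulent rows.
rows = [{"amount": i, "label": "legit"} for i in range(8)]
rows += [{"amount": 900 + i, "label": "fraud"} for i in range(2)]
balanced = random_oversample(rows, "label")
print(Counter(row["label"] for row in balanced))
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into validation or test data.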



Question 14

Case study
An ML engineer is developing a fraud detection model on AWS. The training dataset includes
transaction logs, customer profiles, and tables from an on-premises MySQL database. The
transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally,
many of the features have interdependencies. The algorithm is not capturing all the desired
underlying patterns in the data.
The ML engineer needs to use an Amazon SageMaker built-in algorithm to train the model.
Which algorithm should the ML engineer use to meet this requirement?

  • A. LightGBM
  • B. Linear learner
  • C. K-means clustering
  • D. Neural Topic Model (NTM)
Answer:

B



Question 15

A company has deployed an XGBoost prediction model in production to predict if a customer is likely
to cancel a subscription. The company uses Amazon SageMaker Model Monitor to detect deviations
in the F1 score.
During a baseline analysis of model quality, the company recorded a threshold for the F1 score. After
several months of no change, the model's F1 score decreases significantly.
What could be the reason for the reduced F1 score?

  • A. Concept drift occurred in the underlying customer data that was used for predictions.
  • B. The model was not sufficiently complex to capture all the patterns in the original baseline data.
  • C. The original baseline data had a data quality issue of missing values.
  • D. Incorrect ground truth labels were provided to Model Monitor during the calculation of the baseline.
Answer:

A
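
The effect described in option A can be illustrated with a small F1 calculation. The labels, predictions, and the 0.8 threshold below are made-up values: the model's behavior stays the same, but the relationship between inputs and true outcomes shifts in the later window, so the score falls below the baseline threshold.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


THRESHOLD = 0.8  # hypothetical baseline threshold

# Baseline window: predictions agree closely with the ground truth.
baseline_f1 = f1_score([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 1, 0, 0, 0, 0, 0])

# Later window: customer behavior changed, so the same model misses most churners.
recent_f1 = f1_score([1, 0, 1, 1, 0, 1, 1, 0], [1, 1, 0, 0, 0, 0, 0, 1])

print(round(baseline_f1, 3), round(recent_f1, 3), recent_f1 < THRESHOLD)
```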

