An AI practitioner trained a custom model on Amazon Bedrock by using a training dataset that
contains confidential data. The AI practitioner wants to ensure that the custom model does not generate inference responses
based on confidential data.
How should the AI practitioner prevent responses based on confidential data?
A
Explanation:
When a model is trained on a dataset containing confidential or sensitive data, the model may
inadvertently learn patterns from this data, which could then be reflected in its inference responses.
To ensure that a model does not generate responses based on confidential data, the most effective
approach is to remove the confidential data from the training dataset and then retrain the model.
Explanation of Each Option:
Option A (Correct): "Delete the custom model. Remove the confidential data from the training
dataset. Retrain the custom model." This option is correct because it directly addresses the core issue:
the model has been trained on confidential data. The only way to ensure that the model does not
produce inferences based on this data is to remove the confidential information from the training
dataset and then retrain the model from scratch. Simply deleting the model and retraining it ensures
that no confidential data is learned or retained by the model. This approach follows the best
practices recommended by AWS for handling sensitive data when using machine learning services
like Amazon Bedrock.
Option B: "Mask the confidential data in the inference responses by using dynamic data
masking." This option is incorrect because dynamic data masking is typically used to mask or
obfuscate sensitive data in a database. It does not address the core problem of the model
being trained on confidential data. Masking data in inference responses does not prevent the model
from using confidential data it learned during training.
Option C: "Encrypt the confidential data in the inference responses by using Amazon
SageMaker." This option is incorrect because encrypting the inference responses does not prevent the
model from generating outputs based on confidential data. Encryption only secures the data at rest
or in transit but does not affect the model's underlying knowledge or training process.
Option D: "Encrypt the confidential data in the custom model by using AWS Key Management Service
(AWS KMS)." This option is incorrect as well because encrypting the data within the model does not
prevent the model from generating responses based on the confidential data it learned during
training. AWS KMS can encrypt data, but it does not modify the learning that the model has already
performed.
AWS AI Practitioner Reference:
Data Handling Best Practices in AWS Machine Learning: AWS advises practitioners to carefully handle
training data, especially when it involves sensitive or confidential information. This includes
preprocessing steps like data anonymization or removal of sensitive data before using it to train
machine learning models.
Amazon Bedrock and Model Training Security: Amazon Bedrock provides foundational models and
customization capabilities, but any training involving sensitive data should follow best practices, such
as removing or anonymizing confidential data to prevent unintended data leakage.
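As an illustration of the remediation in Option A, the data-removal step before retraining might look like the following sketch. The regex patterns and sample records are hypothetical; a production pipeline would rely on a dedicated PII-detection capability (such as Amazon Comprehend PII detection) rather than hand-written patterns.

```python
import re

# Illustrative patterns for confidential fields; real pipelines would use
# a dedicated PII-detection tool instead of hand-rolled regexes.
CONFIDENTIAL_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),    # SSN-like numbers
    re.compile(r"\b\d{16}\b"),               # card-like numbers
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),  # email addresses
]

def is_confidential(record: str) -> bool:
    """Return True if the record matches any confidential pattern."""
    return any(p.search(record) for p in CONFIDENTIAL_PATTERNS)

def scrub_dataset(records: list[str]) -> list[str]:
    """Drop records containing confidential data before retraining."""
    return [r for r in records if not is_confidential(r)]

clean = scrub_dataset([
    "Customer praised the checkout flow.",
    "Contact jane@example.com about the refund.",
    "SSN on file: 123-45-6789",
])
```

Only after the dataset is scrubbed this way would the custom model be retrained from scratch, so that no confidential pattern survives in the model's weights.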
Which feature of Amazon OpenSearch Service gives companies the ability to build vector database
applications?
C
Explanation:
Amazon OpenSearch Service (formerly Amazon Elasticsearch Service) has introduced capabilities to
support vector search, which allows companies to build vector database applications. This is
particularly useful in machine learning, where vector representations (embeddings) of data are often
used to capture semantic meaning.
Scalable index management and nearest neighbor search capability are the core features enabling
vector database functionalities in OpenSearch. The service allows users to index high-dimensional
vectors and perform efficient nearest neighbor searches, which are crucial for tasks such as
recommendation systems, anomaly detection, and semantic search.
Here is why option C is the correct answer:
Scalable Index Management: OpenSearch Service supports scalable indexing of vector data. This
means you can index a large volume of high-dimensional vectors and manage these indexes in a cost-
effective and performance-optimized way. The service leverages underlying AWS infrastructure to
ensure that indexing scales seamlessly with data size.
Nearest Neighbor Search Capability: OpenSearch Service's nearest neighbor search capability allows
for fast and efficient searches over vector data. This is essential for applications like product
recommendation engines, where the system needs to quickly find the most similar items based on a
user's query or behavior.
AWS AI Practitioner Reference:
According to AWS documentation, OpenSearch Service's support for nearest neighbor search using
vector embeddings is a key feature for companies building machine learning applications that
require similarity search.
The service uses Approximate Nearest Neighbors (ANN) algorithms to speed up searches over large
datasets, ensuring high performance even with large-scale vector data.
The other options do not directly relate to building vector database applications:
A . Integration with Amazon S3 for object storage is about storing data objects, not vector-based
searching or indexing.
B . Support for geospatial indexing and queries is related to location-based data, not vectors used in
machine learning.
D . Ability to perform real-time analysis on streaming data relates to analyzing incoming data
streams, which is different from the vector search capabilities.
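To make the nearest neighbor idea concrete, the sketch below performs an exact brute-force similarity search over a tiny in-memory vector index. OpenSearch Service uses Approximate Nearest Neighbors (ANN) structures to avoid this linear scan at scale, but the computation it approximates is the same; the document IDs and embeddings here are fabricated.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest_neighbor(query, index):
    """index: dict of id -> embedding. Returns the id closest to the query."""
    return max(index, key=lambda doc_id: cosine_similarity(query, index[doc_id]))

# Toy 3-dimensional embeddings; real embeddings have hundreds of dimensions.
index = {
    "doc-1": [1.0, 0.0, 0.0],
    "doc-2": [0.0, 1.0, 0.0],
    "doc-3": [0.9, 0.1, 0.0],
}
best = nearest_neighbor([1.0, 0.0, 0.1], index)
```

A semantic search application would embed the user's query with the same model used to embed the documents, then run exactly this kind of lookup against the k-NN index.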
A company wants to display the total sales for its top-selling products across various retail locations
in the past 12 months.
Which AWS solution should the company use to automate the generation of graphs?
C
Explanation:
Amazon QuickSight is a fully managed business intelligence (BI) service that allows users to create
and publish interactive dashboards that include visualizations like graphs, charts, and tables.
Amazon Q in QuickSight is the natural language query capability of the service. It enables users to ask
questions about their data in natural language and receive visual responses such as graphs.
Option C (Correct): "Amazon Q in Amazon QuickSight": This is the correct answer because Amazon
QuickSight Q is specifically designed to allow users to explore their data through natural language
queries, and it can automatically generate graphs to display sales data and other metrics. This makes
it an ideal choice for the company to automate the generation of graphs showing total sales for its
top-selling products across various retail locations.
Options A, B, and D are incorrect:
A . Amazon Q in Amazon EC2: Amazon EC2 is a compute service that provides virtual servers, but it is
not directly related to generating graphs or providing natural language querying features.
B . Amazon Q Developer: This is a generative AI assistant for software development tasks such as coding, debugging, and learning about AWS services; it is not designed to generate graphs from business data.
D . Amazon Q in AWS Chatbot: AWS Chatbot is a service that integrates with Amazon Chime and
Slack for monitoring and managing AWS resources, but it is not used for generating graphs based on
sales data.
AWS AI Practitioner Reference:
Amazon QuickSight Q is designed to provide insights from data by using natural language queries,
making it a powerful tool for generating automated graphs and visualizations directly from queried
data.
Business Intelligence (BI) on AWS: AWS services such as Amazon QuickSight provide business
intelligence capabilities, including automated reporting and visualization features, which are ideal
for companies seeking to visualize data like sales trends over time.
A company wants to build an interactive application for children that generates new stories based on
classic stories. The company wants to use Amazon Bedrock and needs to ensure that the results and
topics are appropriate for children.
Which AWS service or feature will meet these requirements?
C
Explanation:
Amazon Bedrock is a service that provides foundational models for building generative AI
applications. When creating an application for children, it is crucial to ensure that the generated
content is appropriate for the target audience. "Guardrails" in Amazon Bedrock provide mechanisms
to control the outputs and topics of generated content to align with desired safety standards and
appropriateness levels.
Option C (Correct): "Guardrails for Amazon Bedrock": This is the correct answer because guardrails
are specifically designed to help users enforce content moderation, filtering, and safety checks on
the outputs generated by models in Amazon Bedrock. For a children’s application, guardrails ensure
that all content generated is suitable and appropriate for the intended audience.
Option A: "Amazon Rekognition" is incorrect. Amazon Rekognition is an image and video analysis
service that can detect inappropriate content in images or videos, but it does not handle text or story
generation.
Option B: "Amazon Bedrock playgrounds" is incorrect because playgrounds are environments for
experimenting and testing model outputs, but they do not inherently provide safeguards to ensure
content appropriateness for specific audiences, such as children.
Option D: "Agents for Amazon Bedrock" is incorrect. Agents in Amazon Bedrock facilitate building AI
applications with more interactive capabilities, but they do not provide specific guardrails for
ensuring content appropriateness for children.
AWS AI Practitioner Reference:
Guardrails in Amazon Bedrock: Designed to help implement controls that ensure generated content
is safe and suitable for specific use cases or audiences, such as children, by moderating and filtering
inappropriate or undesired content.
Building Safe AI Applications: AWS provides guidance on implementing ethical AI practices, including
using guardrails to protect against generating inappropriate or biased content.
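The toy check below illustrates only the *concept* behind guardrails, denied topics plus a blocked-message fallback; the actual Guardrails for Amazon Bedrock feature is configured in the service (denied topics, content filters, and blocked messaging) and applied automatically at inference time, not hand-rolled like this.

```python
# Conceptual stand-in for a guardrail: a denied-topic list and a canned
# response returned whenever generated output touches a denied topic.
DENIED_TOPICS = {"violence", "gambling", "weapons"}
BLOCKED_MESSAGE = "Sorry, I can't help with that topic."

def apply_guardrail(model_output: str) -> str:
    """Return the output unchanged, or the blocked message if it
    mentions a denied topic (naive word-level check)."""
    words = {w.strip(".,!?").lower() for w in model_output.split()}
    if words & DENIED_TOPICS:
        return BLOCKED_MESSAGE
    return model_output

safe = apply_guardrail("Once upon a time, a brave turtle explored a garden.")
blocked = apply_guardrail("The story involved gambling at a casino.")
```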
A company has developed an ML model for image classification. The company wants to deploy the
model to production so that a web application can use the model.
The company needs to implement a solution to host the model and serve predictions without
managing any of the underlying infrastructure.
Which solution will meet these requirements?
A
Explanation:
Amazon SageMaker Serverless Inference is the correct solution for deploying an ML model to
production in a way that allows a web application to use the model without the need to manage the
underlying infrastructure.
Amazon SageMaker Serverless Inference provides a fully managed environment for deploying
machine learning models. It automatically provisions, scales, and manages the infrastructure
required to host the model, removing the need for the company to manage servers or other
underlying infrastructure.
Why Option A is Correct:
No Infrastructure Management: SageMaker Serverless Inference handles the infrastructure
management for deploying and serving ML models. The company can simply provide the model and
specify the required compute capacity, and SageMaker will handle the rest.
Cost-Effectiveness: The serverless inference option is ideal for applications with intermittent or
unpredictable traffic, as the company only pays for the compute time consumed while handling
requests.
Integration with Web Applications: This solution allows the model to be easily accessed by web
applications via RESTful APIs, making it an ideal choice for hosting the model and serving predictions.
Why Other Options are Incorrect:
B . Use Amazon CloudFront to deploy the model: CloudFront is a content delivery network (CDN)
service for distributing content, not for deploying ML models or serving predictions.
C . Use Amazon API Gateway to host the model and serve predictions: API Gateway is used for
creating, deploying, and managing APIs, but it does not provide the infrastructure or the required
environment to host and run ML models.
D . Use AWS Batch to host the model and serve predictions: AWS Batch is designed for running batch
computing workloads and is not optimized for real-time inference or hosting machine learning
models.
Thus, A is the correct answer, as it aligns with the requirement of deploying an ML model without
managing any underlying infrastructure.
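For orientation, a serverless endpoint is configured by supplying a `ServerlessConfig` (memory size and maximum concurrency) in the endpoint configuration instead of instance types. The sketch below only builds that configuration dictionary; the model and endpoint names are hypothetical, and the commented-out boto3 calls show where it would be used.

```python
# Sketch of the endpoint configuration SageMaker Serverless Inference
# expects: MemorySizeInMB and MaxConcurrency are the two serverless knobs.
def build_serverless_endpoint_config(model_name: str,
                                     memory_mb: int = 2048,
                                     max_concurrency: int = 5) -> dict:
    return {
        "EndpointConfigName": f"{model_name}-serverless-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "ServerlessConfig": {
                "MemorySizeInMB": memory_mb,
                "MaxConcurrency": max_concurrency,
            },
        }],
    }

config = build_serverless_endpoint_config("image-classifier")
# In a real deployment this dict would be passed to
# boto3.client("sagemaker").create_endpoint_config(**config),
# followed by create_endpoint(...); the web application then calls
# invoke_endpoint(...) on the sagemaker-runtime client.
```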
A company has petabytes of unlabeled customer data to use for an advertisement campaign. The
company wants to classify its customers into tiers to advertise and promote the company's products.
Which methodology should the company use to meet these requirements?
B
Explanation:
Unsupervised learning is the correct methodology for classifying customers into tiers when the data
is unlabeled, as it does not require predefined labels or outputs.
Unsupervised Learning:
This type of machine learning is used when the data has no labels or pre-defined categories. The goal
is to identify patterns, clusters, or associations within the data.
In this case, the company has petabytes of unlabeled customer data and needs to classify customers
into different tiers. Unsupervised learning techniques like clustering (e.g., K-Means, Hierarchical
Clustering) can group similar customers based on various attributes without any prior knowledge or
labels.
Why Option B is Correct:
Handling Unlabeled Data: Unsupervised learning is specifically designed to work with unlabeled
data, making it ideal for the company’s need to classify customer data.
Customer Segmentation: Techniques in unsupervised learning can be used to find natural groupings
within customer data, such as identifying high-value vs. low-value customers or segmenting based on
purchasing behavior.
Why Other Options are Incorrect:
A . Supervised learning: Requires labeled data with input-output pairs to train the model, which is
not suitable since the company's data is unlabeled.
C . Reinforcement learning: Focuses on training an agent to make decisions by maximizing some
notion of cumulative reward, which does not align with the company's need for customer
classification.
D . Reinforcement learning from human feedback (RLHF): Similar to reinforcement learning but
involves human feedback to refine the model’s behavior; it is also not appropriate for classifying
unlabeled customer data.
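A minimal k-means sketch shows how clustering discovers tiers with no labels at all. The spend figures and initial centroids below are made up; a real workload at petabyte scale would use a managed implementation (for example, SageMaker's built-in k-means) rather than this toy one-dimensional loop.

```python
def kmeans(points, centroids, iterations=10):
    """Minimal k-means on 1-D values: assign each point to the nearest
    centroid, then move each centroid to the mean of its cluster."""
    for _ in range(iterations):
        clusters = {i: [] for i in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in clusters.items()]
    return centroids, clusters

# Annual spend per customer (hypothetical); two tiers emerge without labels.
spend = [120, 150, 130, 980, 1010, 995]
centroids, clusters = kmeans(spend, centroids=[0, 1000])
```

The low-spend and high-spend customers separate into two clusters purely from the structure of the data, which is exactly the tiering the company needs.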
A company makes forecasts each quarter to decide how to optimize operations to meet expected
demand. The company uses ML models to make these forecasts.
An AI practitioner is writing a report about the trained ML models to provide transparency and
explainability to company stakeholders.
What should the AI practitioner include in the report to meet the transparency and explainability
requirements?
B
Explanation:
Partial dependence plots (PDPs) are visual tools used to show the relationship between a feature (or
a set of features) in the data and the predicted outcome of a machine learning model. They are
highly effective for providing transparency and explainability of the model's behavior to stakeholders
by illustrating how different input variables impact the model's predictions.
Option B (Correct): "Partial dependence plots (PDPs)": This is the correct answer because PDPs help
to interpret how the model's predictions change with varying values of input features, providing
stakeholders with a clearer understanding of the model's decision-making process.
Option A: "Code for model training" is incorrect because providing the raw code for model training
may not offer transparency or explainability to non-technical stakeholders.
Option C: "Sample data for training" is incorrect as sample data alone does not explain how the
model works or its decision-making process.
Option D: "Model convergence tables" is incorrect. While convergence tables can show the training
process, they do not provide insights into how input features affect the model's predictions.
AWS AI Practitioner Reference:
Explainability in AWS Machine Learning: AWS provides various tools for model explainability, such as
Amazon SageMaker Clarify, which includes PDPs to help explain the impact of different features on
the model’s predictions.
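The computation behind a PDP is simple enough to sketch: fix one feature to each value on a grid, run the model over the whole dataset with that feature overridden, and average the predictions. The toy forecast model and data below are invented purely to show the mechanics.

```python
def toy_model(features):
    """Hypothetical demand-forecast model: prediction rises with the
    discount (feature 0) and is dampened by inventory level (feature 1)."""
    discount, inventory = features
    return 100 + 40 * discount - 0.5 * inventory

def partial_dependence(model, dataset, feature_index, grid):
    """For each grid value, fix one feature across the whole dataset and
    average the predictions -- the curve a PDP plots."""
    curve = []
    for value in grid:
        preds = []
        for row in dataset:
            modified = list(row)
            modified[feature_index] = value
            preds.append(model(modified))
        curve.append(sum(preds) / len(preds))
    return curve

data = [[0.1, 200], [0.3, 120], [0.0, 300]]
pdp = partial_dependence(toy_model, data, feature_index=0,
                         grid=[0.0, 0.5, 1.0])
```

Plotting `grid` against `pdp` shows stakeholders, without any model internals, that the predicted demand climbs as the discount increases.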
Which option is a use case for generative AI models?
B
Explanation:
Generative AI models are used to create new content based on existing data. One common use case
is generating photorealistic images from text descriptions, which is particularly useful in digital
marketing, where visual content is key to engaging potential customers.
Option B (Correct): "Creating photorealistic images from text descriptions for digital marketing": This
is the correct answer because generative AI models, like those offered by Amazon Bedrock, can
create images based on text descriptions, making them highly valuable for generating marketing
materials.
Option A: "Improving network security by using intrusion detection systems" is incorrect because this
is a use case for traditional machine learning models, not generative AI.
Option C: "Enhancing database performance by using optimized indexing" is incorrect as it is
unrelated to generative AI.
Option D: "Analyzing financial data to forecast stock market trends" is incorrect because it typically
involves predictive modeling rather than generative AI.
AWS AI Practitioner Reference:
Use Cases for Generative AI Models on AWS: AWS highlights the use of generative AI for creative
content generation, including image creation, text generation, and more, which is suited for digital
marketing applications.
An AI practitioner is using a large language model (LLM) to create content for marketing campaigns.
The generated content sounds plausible and factual but is incorrect.
Which problem is the LLM having?
B
Explanation:
In the context of AI, "hallucination" refers to the phenomenon where a model generates outputs that
are plausible-sounding but are not grounded in reality or the training data. This problem often occurs
with large language models (LLMs) when they create information that sounds correct but is actually
incorrect or fabricated.
Option B (Correct): "Hallucination": This is the correct answer because the problem described
involves generating content that sounds factual but is incorrect, which is characteristic of
hallucination in generative AI models.
Option A: "Data leakage" is incorrect as it involves the model accidentally learning from data it
shouldn't have access to, which does not match the problem of generating incorrect content.
Option C: "Overfitting" is incorrect because overfitting refers to a model that has learned the training
data too well, including noise, and performs poorly on new data.
Option D: "Underfitting" is incorrect because underfitting occurs when a model is too simple to
capture the underlying patterns in the data, which is not the issue here.
AWS AI Practitioner Reference:
Large Language Models on AWS: AWS discusses the challenge of hallucination in large language
models and emphasizes techniques to mitigate it, such as using guardrails and fine-tuning.
A loan company is building a generative AI-based solution to offer new applicants discounts based on
specific business criteria. The company wants to build and use an AI model responsibly to minimize bias that could
negatively affect some customers.
Which actions should the company take to meet these requirements? (Select TWO.)
A,C
Explanation:
To build an AI model responsibly and minimize bias, it is essential to ensure fairness and
transparency throughout the model development and deployment process. This involves detecting
and mitigating data imbalances and thoroughly evaluating the model's behavior to understand its
impact on different groups.
Option A (Correct): "Detect imbalances or disparities in the data": This is correct because identifying
and addressing data imbalances or disparities is a critical step in reducing bias. AWS provides tools
like Amazon SageMaker Clarify to detect bias during data preprocessing and model training.
Option C (Correct): "Evaluate the model's behavior so that the company can provide transparency to
stakeholders": This is correct because evaluating the model's behavior for fairness and accuracy is
key to ensuring that stakeholders understand how the model makes decisions. Transparency is a
crucial aspect of responsible AI.
Option B: "Ensure that the model runs frequently" is incorrect because the frequency of model runs
does not address bias.
Option D: "Use the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) technique to ensure
that the model is 100% accurate" is incorrect because ROUGE is a metric for evaluating the quality of
text summarization models, not for minimizing bias.
Option E: "Ensure that the model's inference time is within the accepted limits" is incorrect as it
relates to performance, not bias reduction.
AWS AI Practitioner Reference:
Amazon SageMaker Clarify: AWS offers tools such as SageMaker Clarify for detecting bias in datasets
and models, and for understanding model behavior to ensure fairness and transparency.
Responsible AI Practices: AWS promotes responsible AI by advocating for fairness, transparency, and
inclusivity in model development and deployment.
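One concrete imbalance check is the disparate impact ratio: the positive-outcome rate of one group divided by that of another. The sketch below computes it by hand on fabricated applications; SageMaker Clarify reports this and related metrics automatically, so this is only meant to show what is being measured.

```python
def positive_rate(records, group):
    """Fraction of applicants in `group` who received a discount."""
    in_group = [r for r in records if r["group"] == group]
    return sum(r["approved"] for r in in_group) / len(in_group)

def disparate_impact(records, group_a, group_b):
    """Ratio of positive-outcome rates; values far below 1.0 suggest the
    first group is disadvantaged relative to the second."""
    return positive_rate(records, group_a) / positive_rate(records, group_b)

# Fabricated loan-discount decisions for two demographic groups.
applications = [
    {"group": "A", "approved": 1}, {"group": "A", "approved": 1},
    {"group": "A", "approved": 0}, {"group": "A", "approved": 1},
    {"group": "B", "approved": 1}, {"group": "B", "approved": 0},
    {"group": "B", "approved": 0}, {"group": "B", "approved": 0},
]
ratio = disparate_impact(applications, "B", "A")  # 0.25 / 0.75
```

A ratio this far from 1.0 would prompt the company to investigate the data and model behavior before deployment, which is exactly the work Options A and C describe.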
A medical company is customizing a foundation model (FM) for diagnostic purposes. The company
needs the model to be transparent and explainable to meet regulatory requirements.
Which solution will meet these requirements?
B
Explanation:
Amazon SageMaker Clarify provides transparency and explainability for machine learning models by
generating metrics, reports, and examples that help to understand model predictions. For a medical
company that needs a foundation model to be transparent and explainable to meet regulatory
requirements, SageMaker Clarify is the most suitable solution.
Amazon SageMaker Clarify:
It helps in identifying potential bias in the data and model, and also explains model behavior by
generating feature attributions, providing insights into which features are most influential in the
model's predictions.
These capabilities are critical in medical applications where regulatory compliance often mandates
transparency and explainability to ensure that decisions made by the model can be trusted and
audited.
Why Option B is Correct:
Transparency and Explainability: SageMaker Clarify is explicitly designed to provide insights into
machine learning models' decision-making processes, helping meet regulatory requirements by
explaining why a model made a particular prediction.
Compliance with Regulations: The tool is suitable for use in sensitive domains, such as healthcare,
where there is a need for explainable AI.
Why Other Options are Incorrect:
A . Amazon Inspector: Focuses on security assessments, not on explainability or model transparency.
C . Amazon Macie: Provides data security by identifying and protecting sensitive data, but does not
help in making models explainable.
D . Amazon Rekognition: Used for image and video analysis, not relevant to making models
explainable.
Thus, B is the correct answer for meeting transparency and explainability requirements for the
foundation model.
A company is building a solution to generate images for protective eyewear. The solution must have
high accuracy and must minimize the risk of incorrect annotations.
Which solution will meet these requirements?
A
Explanation:
Amazon SageMaker Ground Truth Plus is a managed data labeling service that includes human-in-
the-loop (HITL) validation. This solution ensures high accuracy by involving human reviewers to
validate the annotations and reduce the risk of incorrect annotations.
Amazon SageMaker Ground Truth Plus:
It allows for the creation of high-quality training datasets with human oversight, which minimizes
errors in labeling and increases accuracy.
Human-in-the-loop workflows help verify the correctness of annotations, ensuring that generated
images for protective eyewear meet high-quality standards.
Why Option A is Correct:
High Accuracy: Human-in-the-loop validation provides the ability to catch and correct errors in
annotations, ensuring high-quality data.
Minimized Risk of Incorrect Annotations: Human review adds a layer of quality assurance, which is
especially important in use cases like generating precise images for protective eyewear.
Why Other Options are Incorrect:
B . Amazon Bedrock: Focuses on building generative AI applications with foundation models; it is not
a data-labeling service and does not address annotation accuracy.
C . Amazon Rekognition: Provides image recognition and analysis, not a solution for minimizing
annotation errors.
D . Amazon QuickSight: A data visualization tool, not relevant to image annotation or generation
tasks.
Thus, A is the correct answer for generating high-accuracy images with minimized annotation risks.
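The quality-assurance idea behind human-in-the-loop labeling can be sketched as majority voting across annotators, with low-agreement items escalated for expert review. The label names and the agreement threshold below are illustrative; Ground Truth Plus manages this workflow (including the human workforce) as a service.

```python
from collections import Counter

def consensus_label(annotations, min_agreement=2):
    """Majority vote across human annotators; returns None when agreement
    is too low, flagging the item for expert review -- the kind of
    human-in-the-loop check a managed labeling workflow provides."""
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    return label if votes >= min_agreement else None

agreed = consensus_label(["goggles", "goggles", "face-shield"])
disputed = consensus_label(["goggles", "face-shield", "helmet"])
```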
A security company is using Amazon Bedrock to run foundation models (FMs). The company wants to
ensure that only authorized users invoke the models. The company needs to identify any
unauthorized access attempts to set appropriate AWS Identity and Access Management (IAM)
policies and roles for future iterations of the FMs.
Which AWS service should the company use to identify unauthorized users that are trying to access
Amazon Bedrock?
B
Explanation:
AWS CloudTrail is a service that enables governance, compliance, and operational and risk auditing
of your AWS account. It tracks API calls and identifies unauthorized access attempts to AWS
resources, including Amazon Bedrock.
AWS CloudTrail:
Provides detailed logs of all API calls made within an AWS account, including those to Amazon
Bedrock.
Can identify unauthorized access attempts by logging and monitoring the API calls, which helps in
setting appropriate IAM policies and roles.
Why Option B is Correct:
Monitoring and Security: CloudTrail logs all access requests and helps detect unauthorized access
attempts.
Auditing and Compliance: The logs can be used to audit user activity and enforce security measures.
Why Other Options are Incorrect:
A . AWS Audit Manager: Used for automating audit preparation, not for tracking real-time
unauthorized access attempts.
C . Amazon Fraud Detector: Designed to detect fraudulent online activities, not unauthorized access
to AWS services.
D . AWS Trusted Advisor: Provides best practice recommendations for AWS resources, not access
monitoring.
Thus, B is the correct answer for identifying unauthorized users attempting to access Amazon
Bedrock.
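As a sketch of how those logs get used, the function below scans CloudTrail-style records for denied Amazon Bedrock calls. The record layout mirrors CloudTrail's JSON schema (`eventSource`, `eventName`, `errorCode`), but the sample events and IAM ARN are fabricated.

```python
# Scan CloudTrail records for Amazon Bedrock API calls that were denied;
# the offending principals inform the next iteration of IAM policies.
def find_denied_bedrock_calls(records):
    return [
        r for r in records
        if r.get("eventSource") == "bedrock.amazonaws.com"
        and r.get("errorCode") == "AccessDenied"
    ]

events = [
    {"eventSource": "bedrock.amazonaws.com", "eventName": "InvokeModel",
     "errorCode": "AccessDenied",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/intern"}},
    {"eventSource": "bedrock.amazonaws.com", "eventName": "InvokeModel"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "errorCode": "AccessDenied"},
]
denied = find_denied_bedrock_calls(events)
```

In practice the company would query these logs through CloudTrail Lake or Amazon Athena rather than scanning them in application code.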
A company manually reviews all submitted resumes in PDF format. As the company grows, the
company expects the volume of resumes to exceed the company's review capacity. The company
needs an automated system to convert the PDF resumes into plain text format for additional
processing.
Which AWS service meets this requirement?
A
Explanation:
Amazon Textract is a service that automatically extracts text and data from scanned documents,
including PDFs. It is the best choice for converting resumes from PDF format to plain text for further
processing.
Amazon Textract:
Extracts text, forms, and tables from scanned documents accurately.
Ideal for automating the process of converting PDF resumes into plain text format.
Why Option A is Correct:
Automation of Text Extraction: Textract is designed to handle large volumes of documents and
convert them into machine-readable text, perfect for the company's need.
Scalability and Efficiency: Supports scalability to handle a growing volume of resumes as the
company expands.
Why Other Options are Incorrect:
B . Amazon Personalize: Used for creating personalized recommendations, not for text extraction.
C . Amazon Lex: A service for building conversational interfaces, not for processing documents.
D . Amazon Transcribe: Used for converting speech to text, not for extracting text from documents.
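Textract responses contain a list of `Blocks` of type `PAGE`, `LINE`, and `WORD`; converting a resume to plain text amounts to joining the `LINE` blocks. The sketch below does that over a hand-written stand-in for a real API response, so no AWS call is involved.

```python
# Join the LINE blocks of a Textract-style response into plain text.
def textract_to_plain_text(response: dict) -> str:
    lines = [b["Text"] for b in response.get("Blocks", [])
             if b.get("BlockType") == "LINE"]
    return "\n".join(lines)

# Hand-written stand-in for a DetectDocumentText response.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Jane Doe"},
        {"BlockType": "LINE", "Text": "Software Engineer, 5 years"},
        {"BlockType": "WORD", "Text": "Jane"},
    ]
}
plain = textract_to_plain_text(sample_response)
```

The real response would come from a call such as `boto3.client("textract").detect_document_text(...)` against the uploaded PDF page images.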
A company wants to use large language models (LLMs) with Amazon Bedrock to develop a chat
interface for the company's product manuals. The manuals are stored as PDF files.
Which solution meets these requirements MOST cost-effectively?
A
Explanation:
Using Amazon Bedrock with large language models (LLMs) allows for efficient utilization of AI to
answer queries based on context provided in product manuals. To achieve this cost-effectively, the
company should avoid unnecessary use of resources.
Option A (Correct): "Use prompt engineering to add one PDF file as context to the user prompt when
the prompt is submitted to Amazon Bedrock": This is the most cost-effective solution. By using
prompt engineering, only the relevant content from one PDF file is added as context to each query.
This approach minimizes the amount of data processed, which helps in reducing costs associated
with LLMs' computational requirements.
Option B: "Use prompt engineering to add all the PDF files as context to the user prompt when the
prompt is submitted to Amazon Bedrock" is incorrect. Including all PDF files would increase costs
significantly due to the large context size processed by the model.
Option C: "Use all the PDF documents to fine-tune a model with Amazon Bedrock" is incorrect. Fine-
tuning a model is more expensive than using prompt engineering, especially if done for multiple
documents.
Option D: "Upload PDF documents to an Amazon Bedrock knowledge base" is incorrect in this scenario
because Knowledge Bases for Amazon Bedrock requires additional infrastructure, such as a vector store
for document embeddings, which adds setup and storage costs beyond what simple prompt engineering incurs.
AWS AI Practitioner Reference:
Prompt Engineering for Cost-Effective AI: AWS emphasizes the importance of using prompt
engineering to minimize costs when interacting with LLMs. By carefully selecting relevant context,
users can reduce the amount of data processed and save on expenses.
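The cost-conscious pattern in Option A can be sketched as assembling one manual's extracted text into a single prompt and estimating its size before submission. The prompt template and the four-characters-per-token heuristic are rough assumptions; actual billing depends on the chosen model's tokenizer and pricing.

```python
# Build a single prompt with one manual's text as context (Option A).
def build_prompt(manual_text: str, question: str) -> str:
    return (
        "Answer the question using only the manual below.\n\n"
        f"--- MANUAL ---\n{manual_text}\n--- END MANUAL ---\n\n"
        f"Question: {question}"
    )

def estimate_tokens(prompt: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return len(prompt) // 4

prompt = build_prompt(
    "To reset the device, hold the power button for 10 seconds.",
    "How do I reset the device?",
)
# The prompt would then be submitted with a call such as
# bedrock_runtime.invoke_model(modelId=..., body=...); keeping the context
# to the one relevant manual keeps the billed input tokens small.
```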