Useful MLS-C01 Dumps (2024 V11.02)
AWS Certified Machine Learning - Specialty
Uploaded by DeaconFireQuetzal39 on coursehero.com, May 8, 2024
Useful MLS-C01 Dumps (2024 V11.02) - Best Materials for Exam Preparation
1. A Machine Learning Specialist is working with multiple data sources containing
billions of records that need to be joined.
What feature engineering and model development approach should the Specialist
take with a dataset this large?
A. Use an Amazon SageMaker notebook for both feature engineering and model
development
B. Use an Amazon SageMaker notebook for feature engineering and Amazon ML for
model development
C. Use Amazon EMR for feature engineering and Amazon SageMaker SDK for model
development
D. Use Amazon ML for both feature engineering and model development.
Answer: C
Explanation:
Amazon EMR is a service that can process large amounts of data efficiently and cost-effectively. It can run distributed frameworks such as Apache Spark, which can perform feature engineering on big data. The Amazon SageMaker SDK is a Python library for interacting with the Amazon SageMaker service to train and deploy machine learning models. EMR can write the engineered features to Amazon S3, which SageMaker can then read as training input.
References:
Amazon EMR
Amazon SageMaker SDK
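As a sketch of the second half of this flow: once EMR/Spark has written engineered features to S3, a SageMaker training job points at that S3 prefix. The helper below only builds the InputDataConfig channel dictionary that a CreateTrainingJob request would carry; the bucket and prefix names are hypothetical placeholders, not values from the question.

```python
# Hedged sketch: build the InputDataConfig channel a SageMaker
# CreateTrainingJob request uses to read Spark/EMR output from S3.
# The bucket and prefix below are illustrative placeholders.
def build_training_channel(bucket, prefix, channel_name="train"):
    return {
        "ChannelName": channel_name,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": f"s3://{bucket}/{prefix}",
                "S3DataDistributionType": "FullyReplicated",
            }
        },
        "ContentType": "text/csv",
    }

channel = build_training_channel("example-feature-bucket", "emr-output/features/")
print(channel["DataSource"]["S3DataSource"]["S3Uri"])
```

A real job would pass a list of such channels (train, validation, etc.) alongside the algorithm image and instance configuration.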
2. A Machine Learning Specialist has completed a proof of concept for a company using a small data sample, and now the Specialist is ready to implement an end-to-end solution in AWS using Amazon SageMaker. The historical training data is stored in Amazon RDS.
Which approach should the Specialist use for training a model using that data?
A. Write a direct connection to the SQL database within the notebook and pull data in.
B. Push the data from Microsoft SQL Server to Amazon S3 using an AWS Data Pipeline and provide the S3 location within the notebook.
C. Move the data to Amazon DynamoDB and set up a connection to DynamoDB within the notebook to pull data in.
D. Move the data to Amazon ElastiCache using AWS DMS and set up a connection within the notebook to pull data in for fast access.
Answer: B
Explanation:
Pushing the data from Microsoft SQL Server to Amazon S3 using an AWS Data
Pipeline and providing the S3 location within the notebook is the best approach for
training a model using the data stored in Amazon RDS. This is because Amazon
SageMaker can directly access data from Amazon S3 and train models on it. AWS
Data Pipeline is a service that can automate the movement and transformation of
data between different AWS services. It can also use Amazon RDS as a data source
and Amazon S3 as a data destination. This way, the data can be transferred
efficiently and securely without writing any code within the notebook.
References:
Amazon SageMaker
AWS Data Pipeline
3. Which of the following metrics should a Machine Learning Specialist generally use
to compare/evaluate machine learning classification models against each other?
A. Recall
B. Misclassification rate
C. Mean absolute percentage error (MAPE)
D. Area Under the ROC Curve (AUC)
Answer: D
Explanation:
Area Under the ROC Curve (AUC) is a metric that measures the performance of a binary classifier across all possible thresholds. It equals the probability that a randomly chosen positive example will be ranked higher than a randomly chosen negative example by the classifier. AUC is a good metric for comparing classification models because it is independent of the class distribution and the decision threshold, and it captures both the sensitivity (true positive rate) and the specificity (true negative rate) of the model.
References:
AWS Machine Learning Specialty Exam Guide
AWS Machine Learning Specialty Sample Questions
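The rank interpretation above can be computed directly: count, over all positive/negative pairs, how often the positive example scores higher (ties count half). This is a minimal from-scratch sketch with made-up toy scores, not any AWS implementation.

```python
# Minimal AUC via the rank interpretation: the probability that a randomly
# chosen positive example receives a higher score than a randomly chosen
# negative one, with ties counted as 0.5. Toy labels/scores are illustrative.
def auc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.2, 0.6]
print(auc(labels, scores))  # 5 of 6 pairs ranked correctly -> 0.8333...
```

An AUC of 0.5 corresponds to random ranking, 1.0 to a classifier that ranks every positive above every negative, which is why it is threshold-independent.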
4. A Machine Learning Specialist is using Amazon SageMaker to host a model for a highly available customer-facing application.
The Specialist has trained a new version of the model, validated it with historical data, and now wants to deploy it to production. To limit any risk of a negative customer experience, the Specialist wants to be able to monitor the model and roll it back, if needed.
What is the SIMPLEST approach with the LEAST risk to deploy the model and roll it
back, if needed?
A. Create a SageMaker endpoint and configuration for the new model version.
Redirect production traffic to the new endpoint by updating the client configuration.
Revert traffic to the last version if the model does not perform as expected.
B. Create a SageMaker endpoint and configuration for the new model version.
Redirect production traffic to the new endpoint by using a load balancer. Revert traffic to the last version if the model does not perform as expected.
C. Update the existing SageMaker endpoint to use a new configuration that is
weighted to send 5% of the traffic to the new variant. Revert traffic to the last version
by resetting the weights if the model does not perform as expected.
D. Update the existing SageMaker endpoint to use a new configuration that is
weighted to send 100% of the traffic to the new variant. Revert traffic to the last version by resetting the weights if the model does not perform as expected.
Answer: C
Explanation:
Updating the existing SageMaker endpoint to use a new configuration that is weighted
to send 5% of the traffic to the new variant is the simplest approach with the least risk
to deploy the model and roll it back, if needed. This is because SageMaker supports
A/B testing, which allows the Specialist to compare the performance of different
model variants by sending a portion of the traffic to each variant. The Specialist can
monitor the metrics of each variant and adjust the weights accordingly. If the new
variant does not perform as expected, the Specialist can revert traffic to the last
version by resetting the weights to 100% for the old variant and 0% for the new
variant. This way, the Specialist can deploy the model without affecting the customer
experience and roll it back easily if needed.
References:
Amazon SageMaker
Deploying models to Amazon SageMaker hosting services
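The weight shift described above maps onto SageMaker's UpdateEndpointWeightsAndCapacities API, which moves traffic between production variants on an existing endpoint without redeploying. The helper below only builds the request body; the endpoint and variant names are hypothetical, and a real call would pass this dict to boto3's sagemaker client.

```python
# Hedged sketch: request-body shape for SageMaker's
# UpdateEndpointWeightsAndCapacities call, which shifts traffic between
# production variants on a live endpoint. Names are illustrative placeholders.
def canary_weights(endpoint_name, old_variant, new_variant, new_pct):
    return {
        "EndpointName": endpoint_name,
        "DesiredWeightsAndCapacities": [
            {"VariantName": old_variant, "DesiredWeight": float(100 - new_pct)},
            {"VariantName": new_variant, "DesiredWeight": float(new_pct)},
        ],
    }

# 5% canary to the new variant; rollback is the same call with new_pct=0.
params = canary_weights("prod-endpoint", "model-v1", "model-v2", 5)
print(params["DesiredWeightsAndCapacities"])
```

Because only the weights change, rollback is a single API call rather than a redeployment, which is what makes option C the lowest-risk choice.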
5. A manufacturing company has a large set of labeled historical sales data. The
manufacturer would like to predict how many units of a particular part should be
produced each quarter.
Which machine learning approach should be used to solve this problem?
A. Logistic regression
B. Random Cut Forest (RCF)
C. Principal component analysis (PCA)
D. Linear regression
Answer: D
Explanation:
Linear regression is a machine learning approach that can be used to solve this
problem. Linear regression is a supervised learning technique that can model the
relationship between one or more
input variables (features) and an output variable (target). In this case, the input
variables could be the historical sales data of the part, such as the quarter, the
demand, the price, the inventory, etc. The output variable could be the number of
units to be produced for the part. Linear regression can learn the coefficients
(weights) of the input variables that best fit the output variable, and then use them to
make predictions for new data. Linear regression is suitable for problems that involve
continuous and numeric output variables, such as predicting house prices, stock
prices, or sales volumes.
References:
AWS Machine Learning Specialty Exam Guide
Linear Regression
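To make the coefficient-fitting idea concrete, here is a minimal one-feature least-squares fit in plain Python, with made-up toy numbers standing in for the historical sales data (a real solution would use multiple features and a library implementation).

```python
# Minimal sketch of ordinary least squares for a single feature:
# fit slope and intercept minimizing squared error, then predict.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx  # (slope, intercept)

# Toy data: units produced per quarter vs. a demand index (illustrative only).
xs = [1, 2, 3, 4]
ys = [10, 20, 30, 40]
m, b = fit_line(xs, ys)
print(m * 5 + b)  # predicted units for demand index 5 -> 50.0
```

The continuous numeric prediction (units to produce) is exactly what distinguishes this regression setup from the classification (A), anomaly detection (B), and dimensionality reduction (C) options.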
6. A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries
on this data.
Which solution requires the LEAST effort to be able to query this data?
A. Use AWS Data Pipeline to transform the data and Amazon RDS to run queries.
B. Use AWS Glue to catalogue the data and Amazon Athena to run queries.
C. Use AWS Batch to run ETL on the data and Amazon Aurora to run the queries.
D. Use AWS Lambda to transform the data and Amazon Kinesis Data Analytics to run queries.
Answer: B
Explanation:
AWS Glue is a serverless data integration service that can catalogue, clean, enrich,
and move data between various data stores. Amazon Athena is an interactive query
service that can run SQL queries on data stored in Amazon S3. By using AWS Glue
to catalogue the data and Amazon Athena to run queries, the Machine Learning
Specialist can leverage the existing data in Amazon S3 without any additional data
transformation or loading. This solution requires the least effort compared to the other
options, which involve more complex and costly data processing and storage
services.
References: AWS Glue, Amazon Athena
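Concretely, once Glue has catalogued the S3 data into a database and table, querying is one Athena API call. The helper below only assembles the parameters for Athena's StartQueryExecution operation; the database, table, and results bucket are hypothetical, and a real call would pass this dict to boto3's athena client.

```python
# Hedged sketch: parameters for Athena's StartQueryExecution API over a
# table that AWS Glue has catalogued from S3 data. All names below are
# illustrative placeholders, not values from the question.
def athena_query_params(database, sql, output_s3):
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3},
    }

params = athena_query_params(
    "manufacturing_db",
    "SELECT part_id, COUNT(*) AS n FROM sensor_logs GROUP BY part_id",
    "s3://example-athena-results/",
)
print(params["QueryString"])
```

No ETL job, cluster, or data movement is involved, which is why this pairing is the least-effort option.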
7. A Machine Learning Specialist is packaging a custom ResNet model into a Docker container so the company can leverage Amazon SageMaker for training. The Specialist is using Amazon EC2 P3 instances to train the model and needs to properly configure the Docker container to leverage the NVIDIA GPUs.
What does the Specialist need to do?
A. Bundle the NVIDIA drivers with the Docker image
B. Build the Docker container to be NVIDIA-Docker compatible
C. Organize the Docker container's file structure to execute on GPU instances.
D. Set the GPU flag in the Amazon SageMaker Create TrainingJob request body
Answer: B
Explanation:
To leverage the NVIDIA GPUs on Amazon EC2 P3 instances, the Machine Learning
Specialist needs to build the Docker container to be NVIDIA-Docker compatible.
NVIDIA-Docker is a tool that enables GPU-accelerated containers to run on Docker. It
automatically configures the container to access the NVIDIA drivers and libraries on
the host system. The Specialist does not need to bundle the NVIDIA drivers with the
Docker image, as they are already installed on the EC2 P3 instances. The Specialist
does not need to organize the Docker container’s file structure to execute on GPU
instances, as this is not relevant for GPU compatibility. The Specialist does not need
to set the GPU flag in the Amazon SageMaker Create TrainingJob request body, as
this is only required for using Elastic Inference accelerators, not EC2 P3 instances.
References: NVIDIA-Docker, Using GPU-Accelerated Containers, Using Elastic
Inference in Amazon SageMaker
8. A large JSON dataset for a project has been uploaded to a private Amazon S3 bucket. The Machine Learning Specialist wants to securely access and explore the data from an Amazon SageMaker notebook instance. A new VPC was created and assigned to the Specialist.
How can the privacy and integrity of the data stored in Amazon S3 be maintained
while granting access to the Specialist for analysis?
A. Launch the SageMaker notebook instance within the VPC with SageMaker-provided internet access enabled. Use an S3 ACL to open read privileges to the everyone group.
B. Launch the SageMaker notebook instance within the VPC and create an S3 VPC endpoint for the notebook to access the data. Copy the JSON dataset from Amazon S3 into the ML storage volume on the SageMaker notebook instance and work against the local dataset.
C. Launch the SageMaker notebook instance within the VPC and create an S3 VPC endpoint for the notebook to access the data. Define a custom S3 bucket policy to only allow requests from your VPC to access the S3 bucket.
D. Launch the SageMaker notebook instance within the VPC with SageMaker-provided internet access enabled. Generate an S3 pre-signed URL for access to data in the bucket.
Answer: C
Explanation:
The best way to maintain the privacy and integrity of the data stored in Amazon S3 is
to use a combination of VPC endpoints and S3 bucket policies. A VPC endpoint
allows the SageMaker notebook instance to access the S3 bucket without going
through the public internet. A bucket policy allows the S3 bucket owner to specify
which VPCs or VPC endpoints can access the bucket. This way, the data is protected
from unauthorized access and tampering. The other options are either insecure (A
and D) or inefficient (B).
References: Using Amazon S3 VPC Endpoints, Using Bucket Policies and User
Policies
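A common way to express "only my VPC endpoint may reach this bucket" is a bucket policy that denies any request not arriving through the endpoint, via the aws:SourceVpce condition key. The sketch below builds such a policy document; the bucket and endpoint IDs are hypothetical placeholders.

```python
import json

# Hedged sketch: S3 bucket policy denying all access except requests that
# arrive through a specific S3 VPC endpoint (aws:SourceVpce condition).
# Bucket name and endpoint ID are illustrative placeholders.
def vpce_only_policy(bucket, vpce_id):
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessThroughVPCEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }

print(json.dumps(vpce_only_policy("example-ml-data", "vpce-0123456789abcdef0"), indent=2))
```

Combined with the S3 VPC endpoint itself, this keeps all traffic between the notebook and the bucket on the AWS network and rejects requests from the public internet.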
9. Given the following confusion matrix for a movie classification model, what is the true class frequency for Romance and the predicted class frequency for Adventure?