Note: This question is part of a series of questions that use the same scenario. For your convenience, the
scenario is repeated in each question. Each question presents a different goal and answer choices, but the text of the scenario is exactly the same in each question in this series.
You plan to create a predictive analytics solution for credit risk assessment and fraud prediction in Azure
Machine Learning. The Machine Learning workspace for the solution will be shared with other users in your
organization. You will add assets to projects and conduct experiments in the workspace.
The experiments will be used to train models that will be published as web services to provide scoring.
The experiment for fraud prediction will use Machine Learning modules and APIs to train the models and will
predict probabilities in an Apache Hadoop ecosystem.
You plan to configure the resources for part of a workflow that will be used to preprocess data from files stored
in Azure Blob storage. You plan to use Python to preprocess and store the data in Hadoop.
You need to get the data into Hadoop as quickly as possible.
Which three actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
A. Create an Azure virtual machine (VM), and then configure MapReduce on the VM.
B. Create an Azure HDInsight Hadoop cluster.
C. Create an Azure virtual machine (VM), and then install an IPython Notebook server.
D. Process the files by using Python to store the data to a Hadoop instance.
E. Create the Machine Learning experiment, and then add an Execute Python Script module.
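For context on option E: in Azure Machine Learning Studio, the Execute Python Script module calls a function named azureml_main, passing up to two pandas DataFrames and expecting a tuple of DataFrames in return. The sketch below is a minimal, hypothetical preprocessing step in that shape; the specific cleaning logic (dropping incomplete rows, normalizing column names) is illustrative, not part of the scenario.

```python
import pandas as pd

# Entry point expected by the Execute Python Script module:
# the module supplies up to two input DataFrames and consumes
# a tuple of DataFrames as the module's output.
def azureml_main(dataframe1=None, dataframe2=None):
    # Hypothetical preprocessing before the data is handed
    # downstream (e.g., to a Hadoop cluster for scoring):
    # drop rows with missing values and normalize column names.
    cleaned = dataframe1.dropna()
    cleaned.columns = [c.strip().lower() for c in cleaned.columns]
    return cleaned,
```

Outside the Studio module, the same function can be exercised locally by passing it a pandas DataFrame and unpacking the returned tuple.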