Which module should you use for each requirement?

DRAG DROP
Note: This question is part of a series of questions that use the same scenario. For your convenience, the
scenario is repeated in each question. Each question presents a different goal and answer choices, but the text
of the scenario is exactly the same in each question in this series.
A travel agency named Margie’s Travel sells airline tickets to customers in the United States.
Margie’s Travel wants you to provide insights and predictions on flight delays. The agency is considering
implementing a system that will communicate to its customers as the flight departure nears about possible
delays due to weather conditions. The flight data contains the following attributes:
DepartureDate: The departure date aggregated at a per hour granularity
Carrier: The code assigned by the IATA and commonly used to identify a carrier
OriginAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s origin)
DestAirportID: An identification number assigned by the USDOT to identify a unique airport (the flight’s destination)
DepDel: The departure delay in minutes
DepDel30: A Boolean value indicating whether the departure was delayed by 30 minutes or more (a value of 1 indicates that the departure was delayed by 30 minutes or more)
The weather data contains the following attributes: AirportID, ReadingDate (YYYY/MM/DD HH),
SkyConditionVisibility, WeatherType, WindSpeed, StationPressure, PressureChange, and HourlyPrecip.
You need to remove the bias and to identify the columns in the input dataset that have the greatest predictive
power.
Which module should you use for each requirement? To answer, drag the appropriate modules to the correct requirements. Each module may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:

Answer:

Explanation:
https://gallery.cortanaintelligence.com/Experiment/Binary-Classification-Flight-delay-prediction-3
https://msdn.microsoft.com/library/azure/038d91b6-c2f2-42a1-9215-1f2c20ed1b40
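
Filter-based feature selection scores each input column against the label and keeps the highest-scoring columns. The following is a minimal sketch in plain Python, not the module's implementation. It assumes Pearson correlation as the scoring metric (one of several metrics the Filter Based Feature Selection module offers), the function names are hypothetical, and the values are made-up toy data for three of the weather columns:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def rank_features(columns, target, top_k):
    """Score every column against the target and keep the top_k names by |r|."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Made-up toy values (assumption), not real flight or weather data.
columns = {
    "WindSpeed": [5, 30, 8, 35, 6, 40],
    "StationPressure": [29.9, 30.1, 29.8, 30.0, 30.2, 29.9],
    "HourlyPrecip": [0.0, 0.4, 0.0, 0.1, 0.0, 0.5],
}
dep_del30 = [0, 1, 0, 1, 0, 1]  # label: 1 = delayed by 30 minutes or more

selected = rank_features(columns, dep_del30, top_k=2)
print(selected)
```

With this toy data, StationPressure scores lowest and is dropped. A real experiment would score all candidate columns against DepDel30.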

Leave a Reply 4

Ruchita

Use the Filter Based Feature Selection module to identify the columns that have the greatest predictive power.

rai

Filter Based Feature Selection
・Used to identify the columns in your input dataset that have the greatest predictive power.

Klaus Wolz

New 70-774 Exam Questions and Answers Updated Recently (27/Dec/2017):

NEW QUESTION 1
You have an Azure Machine Learning environment. You are evaluating whether to use R code or Python. Which three actions can you perform by using both R code and Python in the Machine Learning environment? (Each correct answer presents a complete solution. Choose three.)

A. Preprocess, cleanse, and group data.
B. Score a training model.
C. Create visualizations.
D. Create an untrained model that can be used with the Train Model module.
E. Implement feature ranking.

Answer: ABC

NEW QUESTION 2
Note: This question is part of a series of questions that use the same scenario. For your convenience, the scenario is repeated in each question. Each question presents a different goal and answer choices, but the text of the scenario is exactly the same in each question in this series.
You plan to create a predictive analytics solution for credit risk assessment and fraud prediction in Azure Machine Learning. The Machine Learning workspace for the solution will be shared with other users in your organization. You will add assets to projects and conduct experiments in the workspace. The experiments will be used for training models that will be published to provide scoring from web services. The experiment for fraud prediction will use Machine Learning modules and APIs to train the models and will predict probabilities in an Apache Hadoop ecosystem. You need to alter the list of columns that will be used for predicting fraud for an input web service endpoint. The columns from the original data source must be retained while running the Machine Learning experiment. Which module should you add after the web service input module and before the prediction module?

A. Edit Metadata
B. Import Data
C. SMOTE
D. Select Columns in Dataset

Answer: D
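
As a rough illustration of what column selection does, the sketch below projects each input row onto a chosen list of columns and leaves the original rows untouched. It is plain Python rather than the Select Columns in Dataset module itself; the function name, column names, and values are hypothetical:

```python
def select_columns(rows, keep):
    """Return new rows holding only the columns in `keep`; input rows are not modified."""
    return [{name: row[name] for name in keep} for row in rows]

# Hypothetical fraud-prediction input (assumption), as it might arrive at the endpoint.
rows = [
    {"TransactionID": 1, "Amount": 120.0, "Country": "US", "IsFraud": 0},
    {"TransactionID": 2, "Amount": 9800.0, "Country": "US", "IsFraud": 1},
]

# Only these columns are passed on to prediction.
features = select_columns(rows, ["Amount", "Country"])
print(features)
```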

NEW QUESTION 3
Note: This question is part of a series of questions that use the same or similar answer choices. An answer choice may be correct for more than one question in the series. Each question is independent of the other questions in this series. Information and details provided in a question apply only to that question.
You need to remove rows that have an empty value in a specific column. The solution must use a native module. Which module should you use?

A. Execute Python Script
B. Tune Model Hyperparameters
C. Normalize Data
D. Select Columns in Dataset
E. Import Data
F. Edit Metadata
G. Clip Values
H. Clean Missing Data

Answer: H
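
The row-removal behavior described in this question can be sketched in plain Python as follows. This is an illustration, not the Clean Missing Data module; the helper names are hypothetical, the toy rows are made up, and treating None, an empty string, and NaN as "empty" is an assumption:

```python
import math

def is_missing(value):
    """Treat None, an empty string, and NaN as missing (an assumption for this sketch)."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)

def drop_rows_missing(rows, column):
    """Keep only the rows that have a value in `column`; other columns are not checked."""
    return [row for row in rows if not is_missing(row.get(column))]

# Made-up toy rows (assumption).
rows = [
    {"AirportID": 1, "WindSpeed": 12.0},
    {"AirportID": 2, "WindSpeed": None},
    {"AirportID": None, "WindSpeed": 7.5},  # kept: only WindSpeed is checked
    {"AirportID": 4, "WindSpeed": float("nan")},
]

cleaned = drop_rows_missing(rows, "WindSpeed")
print(cleaned)
```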

NEW QUESTION 4
You need to integrate code and formatted text into an Azure Machine Learning experiment that enables interactive execution. What should you use?

A. A Jupyter notebook
B. Azure Stream Analytics
C. An Execute Python Script module
D. An Execute R Script module

Answer: A

NEW QUESTION 5
Note: This question is part of a series of questions that use the same or similar answer choices. An answer choice may be correct for more than one question in the series. Each question is independent of the other questions in this series. Information and details provided in a question apply only to that question.
You have a non-tabular file that is saved in Azure Blob storage. You need to download the file locally, access the data in the file, and then format the data as a dataset. Which module should you use?

A. Execute Python Script
B. Tune Model Hyperparameters
C. Normalize Data
D. Select Columns in Dataset
E. Import Data
F. Edit Metadata
G. Clip Values
H. Clean Missing Data

Answer: E

NEW QUESTION 6
You are performing exploratory analysis of files that are encoded in a complex proprietary format. The format requires disk intensive access to several dependent files in HDFS. You need to build an Azure Machine Learning model by using a canopy clustering algorithm. You must ensure that changes to proprietary file formats can be maintained by using the least amount of effort. Which Machine Learning library should you use?

A. MicrosoftML
B. Scikit-learn
C. SparkR
D. Mahout

Answer: D

NEW QUESTION 7
You plan to use the Data Science Virtual Machine for development, but you are unfamiliar with R scripts. You need to generate R code for an experiment. Which IDE should you use?

A. XgBoost
B. Rattle
C. Vowpal Wabbit
D. R Tools for Visual Studio

Answer: B

NEW QUESTION 8
You are building an Azure Machine Learning workflow by using Azure Machine Learning Studio. You create an Azure notebook that supports the Microsoft Cognitive Toolkit. You need to ensure that the stochastic gradient descent (SGD) configuration maximizes the samples per second and supports parallel modeling that is managed by a parameter server. Which SGD algorithm should you use?

A. DataParallelASGD
B. DataParallelSGD
C. ModelAveragingSGD
D. BlockMomentumSGD

Answer: B

NEW QUESTION 9
You are building an Azure Machine Learning experiment. You need to transform a string column that has 47 distinct values into a binary indicator column. The solution must use the One-vs-All Multiclass model. Which module should you use?

A. Select Column Transform
B. Convert to Indicator Values
C. Group Categorical Values
D. Edit Metadata

Answer: B
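
Converting a categorical column to indicator values produces one 0/1 column per distinct value, so a column with 47 distinct values yields 47 indicator columns. Here is a minimal sketch in plain Python, assuming a hypothetical helper name and a "column-value" naming scheme for the new columns; the carrier codes are toy input:

```python
def to_indicator_columns(values, prefix):
    """Expand one categorical column into a 0/1 column per distinct value."""
    categories = sorted(set(values))
    return [
        {f"{prefix}-{category}": int(value == category) for category in categories}
        for value in values
    ]

# Toy input (assumption): three distinct carrier codes across four rows.
carriers = ["AA", "DL", "AA", "UA"]
indicators = to_indicator_columns(carriers, "Carrier")
print(indicators[0])
```

Each output row has exactly one indicator set to 1, in the column that matches the row's original value.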

NEW QUESTION 10
You are analyzing taxi trips in New York City. You leverage the Azure Data Factory to create data pipelines and to orchestrate data movement. You plan to develop a predictive model for 170 million rows (37 GB) of raw data in Apache Hive by using Microsoft R Server to identify which factors contribute to the passenger tipping behavior. All of the platforms that are used for the analysis are the same. Each worker node has eight processor cores and 26 GB of memory. Which type of Azure HDInsight cluster should you use to produce results as quickly as possible?

A. Hadoop
B. HBase
C. Interactive Hive
D. Spark

Answer: C

NEW QUESTION 11
……

P.S. These New 70-774 Exam Questions Were Just Updated From The Real 70-774 Exam, You Can Get The Newest 70-774 Dumps In PDF And VCE From — https://www.passleader.com/70-774.html (45q VCE and PDF)

Good Luck!

Klaus Wolz

Besides, part of the new 45-question 70-774 dump is available here:

https://drive.google.com/open?id=1kasCFEWbbVbNGXhtQ5AfwMOgYaKryNdH

Best Regards!
