HOTSPOT
Your company has 2,000 servers.
You plan to aggregate all of the log files from the servers in a central repository that uses Microsoft Azure
HDInsight. Each log file contains approximately one million records. All of the files use the .log file name
extension.
The following is a sample of the entries in the log files.
2017-02-03 20:26:41 SampleClass3 [ERROR] verbose detail for id 1527353937
In Apache Hive, you need to create a data definition and a query capturing the number of records that have anerror level of [ERROR].
What should you do? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Explanation:
Box 1: table
Box 2: /t
Apache Hive example:
CREATE TABLE raw (line STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ‘\\t’ LINES TERMINATED BY ‘\\n’;
Box 3: count(*)
Box 4: ‘*.log’
https://cwiki.apache.org/confluence/display/Hive/CompressedStorage
Last should be ‘%.log’, I think because it is treated as a column name.