Engineering Questions with Answers - Multiple Choice Questions

MCQs on Data Flow

1 - Question

________ is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks.
a) Hive
b) MapReduce
c) Pig
d) Lucene

View Answer

Answer: b
Explanation: MapReduce is the data-processing heart of Hadoop: it divides a job into independent map and reduce tasks that run in parallel across the cluster.
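The programming model can be illustrated with a minimal in-process sketch (word count). This only shows the model's shape; real Hadoop runs the map and reduce tasks in parallel on many nodes, and the function names here are illustrative, not Hadoop API.

```python
from collections import defaultdict

def map_fn(_, line):
    # Emit (word, 1) for every word in one input line.
    for word in line.split():
        yield word, 1

def reduce_fn(word, counts):
    # Sum all counts observed for one word.
    yield word, sum(counts)

def run_job(lines):
    groups = defaultdict(list)
    for offset, line in enumerate(lines):   # "map" phase
        for k, v in map_fn(offset, line):
            groups[k].append(v)             # shuffle: group values by key
    out = {}
    for k, vs in groups.items():            # "reduce" phase
        for rk, rv in reduce_fn(k, vs):
            out[rk] = rv
    return out

print(run_job(["to be or not to be"]))
```

Because each map call and each reduce call depends only on its own input, the tasks are independent and can be distributed freely.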




2 - Question

Point out the correct statement.
a) Data locality means moving the algorithm to the data instead of the data to the algorithm
b) When processing is done on the data, the algorithm is moved across the Action Nodes rather than the data to the algorithm
c) Moving Computation is more expensive than Moving Data
d) None of the mentioned

View Answer

Answer: a
Explanation: Hadoop's data-flow framework exploits data locality: it is cheaper to move the computation to the node that stores the data than to move the data to the computation.




3 - Question

The daemons associated with the MapReduce phase are ________ and task-trackers.
a) job-tracker
b) map-tracker
c) reduce-tracker
d) all of the mentioned

View Answer

Answer: a
Explanation: MapReduce jobs are submitted to the JobTracker, which coordinates the job by scheduling tasks on the TaskTrackers.




4 - Question

The JobTracker pushes work out to available _______ nodes in the cluster, striving to keep the work as close to the data as possible.
a) DataNodes
b) TaskTracker
c) ActionNodes
d) All of the mentioned

View Answer

Answer: b
Explanation: The TaskTracker sends a periodic heartbeat to the JobTracker so the JobTracker knows whether the node is alive and available for work.




5 - Question

Point out the wrong statement.
a) The map function in Hadoop MapReduce has the following general form: map: (K1, V1) -> list(K2, V2)
b) The reduce function in Hadoop MapReduce has the following general form: reduce: (K2, list(V2)) -> list(K3, V3)
c) MapReduce has a complex model of data processing: inputs and outputs for the map and reduce functions are key-value pairs
d) None of the mentioned

View Answer

Answer: c
Explanation: MapReduce has a relatively simple model of data processing: the inputs and outputs of both the map and reduce functions are key-value pairs.
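The general forms in options a and b can be written down as type aliases. This is a sketch in Python type-hint notation (Hadoop's real API is Java's Mapper and Reducer classes); the word-count functions are only an example instance of those shapes.

```python
from typing import Callable, Iterable, Tuple, TypeVar

K1 = TypeVar("K1"); V1 = TypeVar("V1")
K2 = TypeVar("K2"); V2 = TypeVar("V2")
K3 = TypeVar("K3"); V3 = TypeVar("V3")

# map:    (K1, V1)       -> list(K2, V2)
MapFn = Callable[[K1, V1], Iterable[Tuple[K2, V2]]]

# reduce: (K2, list(V2)) -> list(K3, V3)
ReduceFn = Callable[[K2, Iterable[V2]], Iterable[Tuple[K3, V3]]]

# Example instance of these shapes: word count.
def wc_map(offset: int, line: str):
    # K1 = byte offset, V1 = line; emits K2 = word, V2 = count.
    for word in line.split():
        yield word, 1

def wc_reduce(word: str, counts: Iterable[int]):
    # K2 = word, list(V2) = counts; emits K3 = word, V3 = total.
    yield word, sum(counts)
```

Note that the output key type of map (K2) must match the input key type of reduce, because the framework groups map output by that key.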




6 - Question

The InputFormat class calls the ________ function, computes splits for each file, and then sends them to the JobTracker.
a) puts
b) gets
c) getSplits
d) all of the mentioned

View Answer

Answer: c
Explanation: The JobTracker uses the splits' storage locations to schedule map tasks that process them on the TaskTrackers.
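Conceptually, split computation carves each input file into byte ranges of at most one split size. The sketch below is hypothetical (the function name, fixed split size, and tuple format are illustrative; Hadoop's FileInputFormat.getSplits() returns FileSplit objects that also carry host locations):

```python
def get_splits(file_sizes, split_size=128 * 1024 * 1024):
    # For each file, emit (filename, start_offset, length) byte ranges
    # of at most split_size bytes -- one map task per split.
    splits = []
    for name, size in file_sizes.items():
        start = 0
        while start < size:
            length = min(split_size, size - start)
            splits.append((name, start, length))
            start += length
    return splits

# A 300-byte file with a 128-byte split size yields three splits.
print(get_splits({"log.txt": 300}, split_size=128))
```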




7 - Question

On a tasktracker, the map task passes the split to the createRecordReader() method on the InputFormat to obtain a _________ for that split.
a) InputReader
b) RecordReader
c) OutputReader
d) None of the mentioned

View Answer

Answer: b
Explanation: The RecordReader loads data from its source and converts it into key-value pairs suitable for reading by the mapper.




8 - Question

The default InputFormat is __________, which treats each line of the input as a separate value; the associated key is the line's byte offset in the file.
a) TextFormat
b) TextInputFormat
c) InputFormat
d) All of the mentioned

View Answer

Answer: b
Explanation: A RecordReader is little more than an iterator over records, and the map task uses one to generate record key-value pairs.
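The byte-offset keys produced by TextInputFormat's RecordReader can be sketched as a plain iterator. This is a simplified model, not the Hadoop class (it ignores split boundaries and compression):

```python
def line_records(data: bytes):
    # Yield (key, value) pairs the way TextInputFormat does:
    # key = byte offset of the line in the file, value = the line text.
    offset = 0
    for raw in data.splitlines(keepends=True):
        yield offset, raw.rstrip(b"\r\n").decode()
        offset += len(raw)

print(list(line_records(b"hello\nworld\n")))
```

The offsets advance by the full line length including the newline, which is why the keys are generally not consecutive integers.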




9 - Question

__________ controls the partitioning of the keys of the intermediate map-outputs.
a) Collector
b) Partitioner
c) InputFormat
d) None of the mentioned

View Answer

Answer: b
Explanation: The output of the mapper is sent to the partitioner, which decides which reducer receives each key.
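Hadoop's default is a hash partitioner: partition = (hash(key) & Integer.MAX_VALUE) % numReduceTasks. A minimal Python sketch of that idea (zlib.crc32 stands in for Java's hashCode only to get a hash that is stable across runs):

```python
import zlib

def hash_partition(key: str, num_reducers: int) -> int:
    # Mask to a non-negative value, then take it modulo the number
    # of reducers, mirroring Hadoop's HashPartitioner logic.
    return (zlib.crc32(key.encode()) & 0x7FFFFFFF) % num_reducers
```

Because the partition depends only on the key, every occurrence of a key, from any mapper, lands in the same reducer's partition.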




10 - Question

The output of the mapper is first written to local disk for the sorting and _________ process.
a) shuffling
b) secondary sorting
c) forking
d) reducing

View Answer

Answer: a
Explanation: During shuffling, each reducer fetches its partition of the sorted map output, so all values corresponding to the same key go to the same reducer.
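The sort-then-group step on the reducer side can be sketched in a few lines. This is a conceptual model only; real Hadoop merges many sorted spill files fetched over the network rather than sorting in memory:

```python
from itertools import groupby

def sort_and_group(map_output):
    # Sort the (key, value) pairs by key, then group adjacent pairs
    # with the same key, so the reducer sees each key exactly once
    # together with the full list of its values.
    for key, pairs in groupby(sorted(map_output), key=lambda kv: kv[0]):
        yield key, [v for _, v in pairs]

print(list(sort_and_group([("b", 1), ("a", 1), ("b", 2)])))
```

Sorting first is what makes the grouping a single linear pass: once the pairs are ordered, all values for one key are adjacent.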
