Engineering Questions with Answers - Multiple Choice Questions

MCQs on MapReduce Features – 2

1 - Question

____________ specifies the number of segments on disk to be merged at the same time.
a) mapred.job.shuffle.merge.percent
b) mapred.job.reduce.input.buffer.percent
c) mapred.inmem.merge.threshold
d) io.sort.factor

View Answer

Answer: d
Explanation: io.sort.factor specifies the number of segments on disk to be merged at the same time; it thereby limits the number of open files and compression codecs in use during the merge.
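As a sketch, this MR1-era property (renamed mapreduce.task.io.sort.factor in newer Hadoop releases) can be set in mapred-site.xml or per job; the value 100 below is only an illustrative override of the default of 10:

```xml
<!-- mapred-site.xml: merge up to 100 segments at once (default is 10) -->
<property>
  <name>io.sort.factor</name>
  <value>100</value>
</property>
```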

2 - Question

Point out the correct statement.
a) The number of sorted map outputs fetched into memory before being merged to disk
b) The memory threshold for fetched map outputs before an in-memory merge is finished
c) The percentage of memory relative to the maximum heap size in which map outputs may not be retained during the reduce
d) None of the mentioned

View Answer

Answer: a
Explanation: Option a describes mapred.inmem.merge.threshold: once this many sorted map outputs have been fetched into memory, they are merged and spilled to disk.
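A minimal configuration sketch for this threshold (1000 is the documented default; setting it to 0 defers merging until memory pressure triggers it):

```xml
<!-- merge in-memory map outputs to disk after this many have accumulated -->
<property>
  <name>mapred.inmem.merge.threshold</name>
  <value>1000</value>
</property>
```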

3 - Question

A map output larger than ___________ percent of the memory allocated to copying map outputs will be written directly to disk.
a) 10
b) 15
c) 25
d) 35

View Answer

Answer: c
Explanation: A map output larger than 25 percent of the memory allocated to copying map outputs is written directly to disk without first staging through memory.

4 - Question

Jobs can enable task JVMs to be reused by specifying the job configuration _________
a) mapred.job.recycle.jvm.num.tasks
b) mapissue.job.reuse.jvm.num.tasks
c) mapred.job.reuse.jvm.num.tasks
d) all of the mentioned

View Answer

Answer: c
Explanation: Jobs enable task JVM reuse by setting mapred.job.reuse.jvm.num.tasks; jobs with many short-lived tasks have reported performance improvements of over 50% from reusing JVMs.
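A hedged configuration sketch: the value -1 tells the framework to reuse each JVM for an unlimited number of tasks of the same job (the default of 1 disables reuse):

```xml
<!-- -1 = reuse the JVM for an unlimited number of tasks per job -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>-1</value>
</property>
```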

5 - Question

Point out the wrong statement.
a) The task tracker has a local directory in which it creates the localized cache and localized job
b) The task tracker can define multiple local directories
c) The Job tracker cannot define multiple local directories
d) None of the mentioned

View Answer

Answer: d
Explanation: When the job starts, the task tracker creates a localized job directory relative to the local directory specified in the configuration.
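Multiple local directories are given as a comma-separated list in mapred.local.dir, which spreads scratch I/O across disks; the paths below are purely illustrative:

```xml
<!-- comma-separated list spreads local scratch space across disks -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
```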

6 - Question

During the execution of a streaming job, the names of the _______ parameters are transformed.
a) vmap
b) mapvim
c) mapreduce
d) mapred

View Answer

Answer: d
Explanation: During execution of a streaming job, the dots (.) in the names of mapred parameters are changed to underscores (_); to read these values in a streaming mapper or reducer, use the parameter names with underscores.
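The transformation above can be sketched in Python; the helper name to_streaming_name is hypothetical, but the dot-to-underscore rule is the one Hadoop Streaming applies when exporting job parameters to the mapper/reducer environment:

```python
def to_streaming_name(param):
    """Hadoop Streaming exposes job configuration to mapper/reducer
    processes as environment variables, with '.' replaced by '_'."""
    return param.replace(".", "_")

# Inside a streaming mapper one would read, e.g.:
#   import os; job_id = os.environ.get("mapred_job_id")
print(to_streaming_name("mapred.job.id"))         # mapred_job_id
print(to_streaming_name("mapred.task.partition")) # mapred_task_partition
```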

7 - Question

The standard output (stdout) and error (stderr) streams of the task are read by the TaskTracker and logged to _________
a) ${HADOOP_LOG_DIR}/user
b) ${HADOOP_LOG_DIR}/userlogs
c) ${HADOOP_LOG_DIR}/logs
d) None of the mentioned

View Answer

Answer: b
Explanation: The TaskTracker reads the task's stdout and stderr streams and writes them to log files under ${HADOOP_LOG_DIR}/userlogs.

8 - Question

____________ is the primary interface by which user-job interacts with the JobTracker.
a) JobConf
b) JobClient
c) JobServer
d) All of the mentioned

View Answer

Answer: b
Explanation: JobClient provides facilities to submit jobs, track their progress, access component-tasks’ reports and logs, get the MapReduce cluster status information and so on.

9 - Question

The _____________ can also be used to distribute both jars and native libraries for use in the map and/or reduce tasks.
a) DistributedLog
b) DistributedCache
c) DistributedJars
d) None of the mentioned

View Answer

Answer: b
Explanation: Cached libraries can be loaded via System.loadLibrary or System.load.

10 - Question

__________ is used to filter log files from the output directory listing.
a) OutputLog
b) OutputLogFilter
c) DistributedLog
d) DistributedJars

View Answer

Answer: b
Explanation: OutputLogFilter can be passed to FileSystem.listStatus to exclude log files from the output directory listing. Separately, users can view a job's history log summary in a specified directory with the command $ bin/hadoop job -history output-dir.
