Engineering Questions with Answers - Multiple Choice Questions
MCQs on Spark with Hadoop – 2
Users can easily run Spark on top of Amazon’s __________
a) Infosphere
b) EC2
c) EMR
d) None of the mentioned
Answer: b
Explanation: Users can easily run Spark (and Shark) on top of Amazon’s EC2 using the scripts that come with Spark.
Point out the correct statement.
a) Spark enables Apache Hive users to run their unmodified queries much faster
b) Spark interoperates only with Hadoop
c) Spark is a popular data warehouse solution running on top of Hadoop
d) None of the mentioned
Answer: a
Explanation: Shark, the Hive-compatible SQL engine built on Spark, can accelerate unmodified Hive queries by as much as 100x when the input data fits into memory, and by up to 10x when the input data is stored on disk.
Spark runs on top of ___________, a cluster manager that provides efficient resource isolation across distributed applications.
a) Mesjs
b) Mesos
c) Mesus
d) All of the mentioned
Answer: b
Explanation: Mesos enables fine-grained sharing, which allows a Spark job to dynamically take advantage of idle resources in the cluster during its execution.
Which of the following can be used to launch Spark jobs inside MapReduce?
a) SIM
b) SIMR
c) SIR
d) RIS
Answer: b
Explanation: SIMR (Spark In MapReduce) launches Spark jobs inside MapReduce, so users can start experimenting with Spark and use its shell within a couple of minutes of downloading it.
Point out the wrong statement.
a) Spark is intended to replace the Hadoop stack
b) Spark was designed to read and write data from and to HDFS, as well as other storage systems
c) Hadoop users who have already deployed or are planning to deploy Hadoop YARN can simply run Spark on YARN
d) None of the mentioned
Answer: a
Explanation: Spark is intended to enhance, not replace, the Hadoop stack.
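As a rough illustration of option (c), the sketch below builds spark-submit command lines for different cluster managers. The --master and --deploy-mode options and the master URL forms are standard Spark conventions; the helper name spark_submit_command, the script name wordcount.py, the host names, and the HDFS path are hypothetical placeholders that do not come from the original text.

```python
import shlex


def spark_submit_command(app_path, master, deploy_mode="client", app_args=()):
    """Return the argument list for a spark-submit call (hypothetical helper)."""
    return [
        "spark-submit",
        "--master", master,            # selects the cluster manager
        "--deploy-mode", deploy_mode,  # "client" or "cluster"
        app_path,
        *app_args,
    ]


# Master URL forms accepted by spark-submit; hosts and ports are placeholders.
MASTERS = {
    "local": "local[*]",
    "standalone": "spark://master-host:7077",
    "yarn": "yarn",
    "mesos": "mesos://mesos-host:5050",
}

# Print one command line per cluster manager; nothing is actually launched.
for name, master in MASTERS.items():
    argv = spark_submit_command(
        "wordcount.py", master, app_args=["hdfs:///data/input.txt"]
    )
    print(name, "->", shlex.join(argv))
```

The same application file is used in every case; only the master URL changes, which is why an existing YARN deployment can host Spark without replacing the rest of the Hadoop stack.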
Which of the following languages is not supported by Spark?
a) Java
b) Pascal
c) Scala
d) Python
Answer: b
Explanation: Spark provides APIs in Java, Scala, and Python, among other languages; Pascal is not one of them.
Spark is packaged with higher level libraries, including support for _________ queries.
a) SQL
b) C
c) C++
d) None of the mentioned
Answer: a
Explanation: Spark SQL is one of the higher-level libraries packaged with Spark. These standard libraries increase developer productivity and can be seamlessly combined to create complex workflows.
Spark includes a collection of over ________ operators for transforming data and familiar data frame APIs for manipulating semi-structured data.
a) 50
b) 60
c) 70
d) 80
Answer: d
Explanation: Spark offers a collection of over 80 high-level operators, which gives it easy-to-use APIs for operating on large datasets.
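To give a feel for what such operators do, here is a plain-Python analogy of a word count built from flatMap, map, and reduceByKey style steps. It runs on ordinary Python lists and does not use Spark at all; the helper names flat_map and reduce_by_key are hypothetical stand-ins chosen to mirror the names of the corresponding Spark operators, and the sample lines are made up.

```python
from collections import defaultdict
from functools import reduce


def flat_map(func, records):
    """Apply func to each record and flatten the resulting lists."""
    return [out for record in records for out in func(record)]


def reduce_by_key(func, pairs):
    """Combine the values of each key with a two-argument function."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}


lines = ["spark runs on yarn", "spark runs on mesos"]

words = flat_map(str.split, lines)                  # like flatMap
pairs = [(word, 1) for word in words]               # like map
counts = reduce_by_key(lambda a, b: a + b, pairs)   # like reduceByKey

print(counts)
```

In Spark the same chain would be distributed across a cluster and evaluated lazily; the analogy only shows how the steps compose.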
Spark is engineered from the bottom up for performance, running up to ___________ faster than Hadoop MapReduce by exploiting in-memory computing and other optimizations.
a) 100x
b) 150x
c) 200x
d) None of the mentioned
Answer: a
Explanation: The 100x figure applies to workloads that fit in memory. Spark is fast on disk too; it set a world record in large-scale on-disk sorting in the 2014 Daytona GraySort benchmark.
Spark powers a stack of high-level tools including Spark SQL and MLlib for _________
a) regression models
b) statistics
c) machine learning
d) reproductive research
Answer: c
Explanation: MLlib is Spark's machine learning library. Spark is used at a wide range of organizations to process large datasets.