Engineering Questions with Answers - Multiple Choice Questions

MinHash Multiple Choice Questions (MCQ)

1 - Question

Which technique is used for finding similarity between two sets?
a) MinHash
b) Stack
c) Priority Queue
d) PAT Tree

Answer: a
Explanation: MinHash, also called the min-wise independent permutations scheme, is a technique used in computer science and data mining to quickly estimate how similar two sets are.




2 - Question

Who invented the MinHash technique?
a) Weiner
b) Samuel F. B. Morse
c) Friedrich Clemens Gerke
d) Andrei Broder

Answer: d
Explanation: MinHash (the min-wise independent permutations scheme), a technique for quickly estimating the similarity between two sets, was invented by Andrei Broder in 1997.




3 - Question

Which technique was first used in the AltaVista search engine to remove duplicate web pages from search results?
a) MinHash
b) Stack
c) Priority Queue
d) PAT Tree

Answer: a
Explanation: MinHash quickly estimates the similarity between two sets. It was first used in the AltaVista search engine to detect and remove duplicate web pages from search results.




4 - Question

Which technique is used for clustering documents by the similarity of their sets of words?
a) MinHash
b) Stack
c) Priority Queue
d) PAT Tree

Answer: a
Explanation: MinHash quickly estimates the similarity between two sets. It has also been applied to large-scale clustering problems, such as clustering documents by the similarity of their sets of words.




5 - Question

Which measure is used as an indicator of the similarity between two sets?
a) Rope Tree
b) Jaccard Coefficient
c) Tango Tree
d) MinHash Coefficient

Answer: b
Explanation: The Jaccard coefficient is the standard indicator of similarity between two sets. MinHash is a technique for estimating this coefficient quickly; there is no measure called the "MinHash coefficient".




6 - Question

Which of the following is defined as the ratio of the number of elements in the intersection of two sets to the number of elements in their union?
a) Rope Tree
b) Jaccard Index
c) Tango Tree
d) MinHash Coefficient

Answer: b
Explanation: The Jaccard index (Jaccard coefficient) of two sets A and B is defined as |A ∩ B| / |A ∪ B|, the number of elements in their intersection divided by the number of elements in their union. MinHash is a technique for estimating this value quickly.
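
Example: the definition can be checked directly on small sets. The following is a minimal Python sketch; the function name `jaccard` and the convention of returning 1.0 for two empty sets are assumptions made for illustration, not part of the question.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


print(jaccard({1, 2, 3}, {2, 3, 4}))  # 2 shared elements out of 4 distinct ones
print(jaccard({1, 2}, {3, 4}))        # disjoint sets: empty intersection
print(jaccard({1, 2}, {1, 2}))        # identical sets
```

The value always lies between 0 (disjoint sets) and 1 (identical sets).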




7 - Question

What is the value of the Jaccard index when the two sets are disjoint?
a) 1
b) 2
c) 3
d) 0

Answer: d
Explanation: The Jaccard index is the number of elements in the intersection of two sets divided by the number of elements in their union. Two disjoint sets have an empty intersection, so the numerator is 0 and the Jaccard index is 0.




8 - Question

When do two sets have relatively more members in common?
a) When the Jaccard index is closer to 1
b) When the Jaccard index is closer to 0
c) When the Jaccard index is closer to -1
d) When the Jaccard index is farther from 1

Answer: a
Explanation: The Jaccard index ranges from 0 to 1. It is 0 when the two sets are disjoint and 1 when they are equal, so the closer the index is to 1, the more members the two sets have in common.




9 - Question

What is the expected error when estimating the Jaccard index using the MinHash scheme with k different hash functions?
a) O(log k!)
b) O(k!)
c) O(k²)
d) O(1/√k)

Answer: d
Explanation: With k different hash functions, the MinHash estimate is the fraction of hash functions for which the two sets have the same minimum hash value. It is an average of k indicator variables, each equal to 1 with probability equal to the Jaccard index, so its expected error is O(1/√k).
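
Example: a minimal Python sketch of the k-hash-function scheme. The helper names (`h`, `minhash_signature`, `estimate_jaccard`) and the use of seeded SHA-256 as a stand-in for k independent hash functions are assumptions made for illustration; practical implementations typically use cheaper hash families.

```python
import hashlib


def h(seed: int, item) -> int:
    """Hypothetical seeded hash: one 64-bit hash function per seed."""
    data = f"{seed}:{item!r}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash_signature(items: set, k: int) -> list:
    """For each of the k hash functions, keep the minimum hash over the set."""
    return [min(h(seed, x) for x in items) for seed in range(k)]


def estimate_jaccard(sig_a: list, sig_b: list) -> float:
    """Fraction of hash functions whose minimum values agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


a = set(range(0, 150))
b = set(range(50, 200))   # true Jaccard index: 100 / 200 = 0.5
k = 400
estimate = estimate_jaccard(minhash_signature(a, k), minhash_signature(b, k))
print(estimate)           # should land near 0.5; error shrinks like 1/sqrt(k)
```

Building each signature costs k hash evaluations per element, which is the price of this variant.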




10 - Question

How many hashes are needed to estimate the Jaccard index with an expected error less than or equal to 0.05?
a) 100
b) 200
c) 300
d) 400

Answer: d
Explanation: The expected error of the MinHash estimate with k hash functions is O(1/√k). Taking the constant as 1, the requirement 1/√k ≤ 0.05 gives k ≥ 1/0.05² = 400, so 400 hashes are needed.
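
Example: the arithmetic can be reproduced with a short Python sketch. It assumes the hidden constant in O(1/√k) is 1, and the function name `hashes_needed` is hypothetical. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def hashes_needed(eps: Fraction) -> int:
    """Smallest k with 1/sqrt(k) <= eps, i.e. k >= 1/eps^2 (constant assumed to be 1)."""
    return ceil(1 / eps ** 2)


print(hashes_needed(Fraction(5, 100)))  # eps = 0.05 -> 1 / 0.0025 = 400
```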




11 - Question

In the single-hash-function variant of MinHash, what expected error do standard Chernoff bounds for sampling without replacement give for the estimator?
a) O(log k!)
b) O(k!)
c) O(k²)
d) O(1/√k)

Answer: d
Explanation: The single-hash-function variant keeps the k smallest hash values of each set instead of using k different hash functions. By standard Chernoff bounds for sampling without replacement, its estimator has expected error O(1/√k), matching the scheme with k different hash functions.




12 - Question

What is the time required by the single-hash variant of MinHash to maintain the queue of minimum hash values for a set of n elements (assuming n is much larger than k)?
a) O(log n!)
b) O(n!)
c) O(n²)
d) O(n)

Answer: d
Explanation: The single-hash variant hashes each of the n elements once and keeps only the k smallest hash values in a queue. Assuming n is much larger than k, maintaining this queue of minimum hash values takes O(n) time, whereas the variant with k different hash functions takes O(nk) time.
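
Example: a minimal Python sketch of the single-hash variant. The names `h`, `bottom_k` and `estimate_jaccard` are hypothetical, SHA-256 stands in for the single hash function, and each set is assumed to have at least k elements. `heapq.nsmallest` is used here as one convenient way to keep the k smallest values.

```python
import hashlib
import heapq


def h(item) -> int:
    """Hypothetical single 64-bit hash function."""
    return int.from_bytes(hashlib.sha256(repr(item).encode()).digest()[:8], "big")


def bottom_k(items, k: int) -> list:
    """Signature of a set: its k smallest hash values, from one pass over the set."""
    return heapq.nsmallest(k, (h(x) for x in items))


def estimate_jaccard(sig_a: list, sig_b: list, k: int) -> float:
    """Take the k smallest hashes of the union; count those present in both signatures."""
    x = heapq.nsmallest(k, set(sig_a) | set(sig_b))
    y = set(x) & set(sig_a) & set(sig_b)
    return len(y) / k


a = set(range(0, 1500))
b = set(range(500, 2000))   # true Jaccard index: 1000 / 2000 = 0.5
k = 400
estimate = estimate_jaccard(bottom_k(a, k), bottom_k(b, k), k)
print(estimate)             # should land near 0.5
```

Each element is hashed only once, which is where the saving over the k-hash-function scheme comes from.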




13 - Question

How many bits are needed to specify a single permutation from a min-wise independent family of permutations on n elements?
a) O(log n!)
b) O(n!)
c) Ω(n²)
d) Ω(n)

Answer: d
Explanation: A min-wise independent family of permutations on n elements must contain a very large number of different permutations (at least exponentially many in n), so Ω(n) bits are needed to specify a single permutation from the family.




14 - Question

MinHash has been used as a tool for association rule learning.
a) True
b) False

Answer: a
Explanation: MinHash was originally used to remove duplicate web pages from search engine results. In data mining, Cohen et al. (2001) used MinHash as a tool for association rule learning.




15 - Question

Google conducted a large-scale evaluation to compare the performance of the two techniques MinHash and SimHash.
a) True
b) False

Answer: a
Explanation: Google conducted a large-scale evaluation in 2006 to compare the performance of the MinHash and SimHash algorithms for detecting near-duplicate web pages.