You are working on a project where you need to chain together MapReduce and Pig jobs. You also need the ability to use forks, decision points, and path joins. Which ecosystem project should you use to perform these actions?
Answer : A
You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as the network fabric. Which workloads benefit the most from faster network fabric?
Answer : A
Identify two features/issues that YARN is designed to address: (Choose two)
Answer : D,E
Reference: http://www.revelytix.com/?q=content/hadoop-ecosystem (YARN, first para)
Table schemas in Hive are:
Answer : B
Explanation: http://stackoverflow.com/questions/22989592/how-to-get-hive-table-name-based-on-hdfs-location-path-with-out-connecting-to-m
Assuming a cluster running HDFS, MapReduce version 2 (MRv2) on YARN with all settings at their default, what do you need to do when adding a new slave node to cluster?
Answer : A
Explanation:
http://wiki.apache.org/hadoop/FAQ#I_have_a_new_node_I_want_to_add_to_a_running_Hadoop_cluster.3B_how_do_I_start_services_on_just_one_node.3F
Your cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. What is the result when you execute: hadoop jar SampleJar MyClass on a client machine?
Answer : A
Your cluster has the following characteristics:
-> A rack aware topology is configured and on
-> Replication is set to 3
-> Cluster block size is set to 64MB
Which describes the file read process when a client application connects to the cluster and requests a 50MB file?
Answer : B
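As a quick check of the numbers in this question: a 50MB file is smaller than the 64MB block size, so it occupies a single block, and with replication set to 3 there are three replicas of that block. A minimal Python sketch of the arithmetic follows; the function name is hypothetical and is not part of any Hadoop API.

```python
import math

def hdfs_block_layout(file_size_mb, block_size_mb, replication):
    """Return (number of blocks, total block replicas) for one file.

    Hypothetical helper for illustration only; it just does the arithmetic.
    """
    blocks = math.ceil(file_size_mb / block_size_mb)
    return blocks, blocks * replication

# Values taken from the question: 50MB file, 64MB blocks, replication 3.
blocks, replicas = hdfs_block_layout(50, 64, 3)
print(blocks, replicas)
```

For a file larger than one block the same helper shows how the counts grow, for example `hdfs_block_layout(130, 64, 3)`.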
For each YARN job, the Hadoop framework generates task log files. Where are Hadoop task log files stored?
Answer : D
You're upgrading a Hadoop cluster from HDFS and MapReduce version 1 (MRv1) to one running HDFS and MapReduce version 2 (MRv2) on YARN. You want to set and enforce a block size of 128MB for all new files written to the cluster after upgrade. What should you do?
Answer : C
You want a node to swap Hadoop daemon data from RAM to disk only when absolutely necessary. What should you do?
Answer : D
You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your Hadoop cluster isn't optimized for storing and processing many small files, you decide to do the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop Streaming.
Which data serialization system gives the flexibility to do this?
Answer : E
Explanation: Sequence files are block-compressed and provide direct serialization and deserialization of several arbitrary data types (not just text). Sequence files can be generated as the output of other MapReduce tasks and are an efficient intermediate representation for data that is passing from one MapReduce job to another.
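To make the "group small files into larger files" step concrete, here is a minimal Python sketch of the idea: many small binary payloads are stored as key/value records inside one container. This is an assumption-laden illustration only. The length-prefixed layout, the function names, and the sample file names are hypothetical and are not the real SequenceFile on-disk format. The last line also works out the total data volume from the question, assuming 1 KB = 1000 bytes.

```python
import io
import struct

def write_record(out, key, value):
    # Hypothetical layout: two 4-byte big-endian lengths, then key, then value.
    out.write(struct.pack(">II", len(key), len(value)))
    out.write(key)
    out.write(value)

def read_records(inp):
    # Yield (key, value) pairs until the container is exhausted.
    while True:
        header = inp.read(8)
        if len(header) < 8:
            return
        key_len, value_len = struct.unpack(">II", header)
        yield inp.read(key_len), inp.read(value_len)

# Stand-ins for small JPEG files (file name -> raw bytes); not real images.
images = {
    b"img_0001.jpg": b"\xff\xd8placeholder-1",
    b"img_0002.jpg": b"\xff\xd8placeholder-2",
}

container = io.BytesIO()
for name, data in images.items():
    write_record(container, name, data)

container.seek(0)
recovered = dict(read_records(container))

# 60,000,000 images at roughly 25 KB each, assuming 1 KB = 1000 bytes.
total_bytes = 60_000_000 * 25 * 1000
print(len(recovered), total_bytes)
```

The point of the design is that the file name travels with its bytes as a key, so a job reading the container can still tell the images apart even though they no longer exist as separate files.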
Which three basic configuration parameters must you set to migrate your cluster from MapReduce 1 (MRv1) to MapReduce V2 (MRv2)? (Choose three)
Answer : A,E,F
You have installed a cluster running HDFS and MapReduce version 2 (MRv2) on YARN. You have no dfs.hosts entry(ies) in your hdfs-site.xml configuration file. You configure a new worker node by setting fs.default.name in its configuration files to point to the NameNode on your cluster, and you start the DataNode daemon on that worker node. What do you have to do on the cluster to allow the worker node to join, and start storing HDFS blocks?
Answer : A
During the execution of a MapReduce v2 (MRv2) job on YARN, where does the Mapper place the intermediate data of each Map Task?
Answer : E
Which two are features of Hadoop’s rack topology? (Choose two)
Answer : B,C