Problem Scenario 96 : Your Spark application requires the extra Java options below. -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
Please replace the XXX value correctly.
./bin/spark-submit --name "My app" --master local[4] --conf spark.eventLog.enabled=false --conf XXX hadoopexam.jar
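A plausible replacement for XXX, assuming the GC options are intended for the executor JVMs (spark.driver.extraJavaOptions would be the driver-side equivalent):

./bin/spark-submit --name "My app" --master local[4] \
  --conf spark.eventLog.enabled=false \
  --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  hadoopexam.jar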
Problem Scenario 8 : You have been given the following mysql database details as well as other info.
Please accomplish the following.
1. Import the joined result of the orders and order_items tables, joined on orders.order_id = order_items.order_item_order_id.
2. Also make sure the output is partitioned into 2 files, e.g. part-00000, part-00001.
3. Also make sure you use the order_id column for sqoop to use for boundary conditions.
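A possible command, assuming the standard retail_db connection details given in Problem Scenario 5 below (the target directory name is illustrative):

sqoop import \
  --connect jdbc:mysql://quickstart:3306/retail_db \
  --username retail_dba --password cloudera \
  --query 'select * from orders o join order_items oi on o.order_id = oi.order_item_order_id where $CONDITIONS' \
  --split-by o.order_id \
  --boundary-query "select min(order_id), max(order_id) from orders" \
  --target-dir /user/cloudera/order_join \
  -m 2

Using -m 2 with order_id as the split column yields two output files, part-00000 and part-00001.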
Problem Scenario 52 : You have been given below code snippet.
val b = sc.parallelize(List(1,2,3,4,5,6,7,8,2,4,2,1,1,1,1,1))
Operation_xyz
Write a correct code snippet for Operation_xyz which will produce the output below. scala.collection.Map[Int,Long] = Map(5 -> 1, 8 -> 1, 3 -> 1, 6 -> 1, 1 -> 6, 2 -> 3, 4 -> 2, 7 -> 1)
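countByValue returns a local Map from each distinct element to its count, which matches the shape and contents of this output:

b.countByValue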
Problem Scenario 66 : You have been given below code snippet.
val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "spider", "eagle"), 2)
val b = a.keyBy(_.length)
val c = sc.parallelize(List("ant", "falcon", "squid"), 2)
val d = c.keyBy(_.length)
operation1
Write a correct code snippet for operation1 which will produce the desired output, shown below. Array[(Int, String)] = Array((4,lion))
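Only "lion" has a key (length 4) that is absent from d's keys (3, 5, 6), so subtractByKey produces exactly this array:

b.subtractByKey(d).collect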
Problem Scenario 40 : You have been given sample data as below in a file called spark15/file1.txt
3070811,1963,1096,,"US","CA",,1,
3022811,1963,1096,,"US","CA",,1,56
3033811,1963,1096,,"US","CA",,1,23
Below is the code snippet to process this file.
val field = sc.textFile("spark15/file1.txt")
val mapper = field.map(x=> A)
mapper.map(x => x.map(x=> {B})).collect
Please fill in A and B so it can generate the final output below.
Array(Array(3070811,1963,1096, 0, "US", "CA", 0,1, 0)
,Array(3022811,1963,1096, 0, "US", "CA", 0,1, 56)
,Array(3033811,1963,1096, 0, "US", "CA", 0,1, 23)
)
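One fill-in that produces this output: split on commas keeping trailing empty fields (A), then replace every empty field with 0 (B):

val field = sc.textFile("spark15/file1.txt")
val mapper = field.map(x => x.split(",", -1))                       // A: -1 keeps empty trailing fields
mapper.map(x => x.map(x => { if (x.isEmpty) 0 else x })).collect    // B: default empty fields to 0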
Problem Scenario 24 : You have been given below comma separated employee information.
Data Set:
name,salary,sex,age
alok,100000,male,29
jatin,105000,male,32
yogesh,134000,male,39
ragini,112000,female,35
jyotsana,129000,female,39
valmiki,123000,male,29
Requirements:
Use the netcat service on port 44444, and send the above data line by line using nc. Please do the following activities.
1. Create a flume conf file using the fastest channel, which writes data into the hive warehouse directory, in a table called flumemaleemployee (create the hive table as well for the given data).
2. While importing, make sure only male employee data is stored.
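A sketch of such a configuration, assuming agent name a1 and the default hive warehouse location; the regex_filter interceptor drops events containing "female", which is one way to keep only the male rows (the header line would still need to be excluded separately):

# Hive side (one possibility):
# CREATE TABLE flumemaleemployee(name string, salary int, sex string, age int)
#   ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

a1.sources = src1
a1.channels = ch1
a1.sinks = sk1

a1.sources.src1.type = netcat
a1.sources.src1.bind = localhost
a1.sources.src1.port = 44444
a1.sources.src1.interceptors = i1
a1.sources.src1.interceptors.i1.type = regex_filter
a1.sources.src1.interceptors.i1.regex = female
a1.sources.src1.interceptors.i1.excludeEvents = true
a1.sources.src1.channels = ch1

# The "fastest channel" is the memory channel
a1.channels.ch1.type = memory

a1.sinks.sk1.type = hdfs
a1.sinks.sk1.channel = ch1
a1.sinks.sk1.hdfs.path = /user/hive/warehouse/flumemaleemployee
a1.sinks.sk1.hdfs.fileType = DataStream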
Problem Scenario 41 : You have been given below code snippet.
val au1 = sc.parallelize(List (("a" , Array(1,2)), ("b" , Array(1,2))))
val au2 = sc.parallelize(List (("a" , Array(3)), ("b" , Array(2))))
Apply the Spark method which will generate the output below.
Array[(String, Array[Int])] = Array((a,Array(1, 2)), (b,Array(1, 2)), (a,Array(3)), (b,Array(2)))
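union simply concatenates the two RDDs without merging values for duplicate keys, which matches the four-element output:

au1.union(au2).collect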
Problem Scenario 22 : You have been given below comma separated employee information.
name,salary,sex,age
alok,100000,male,29
jatin,105000,male,32
yogesh,134000,male,39
ragini,112000,female,35
jyotsana,129000,female,39
valmiki,123000,male,29
Use the netcat service on port 44444, and send the above data line by line using nc. Please do the following activities.
1. Create a flume conf file using the fastest channel, which writes data into the hive warehouse directory, in a table called flumeemployee (create the hive table as well for the given data).
2. Write a hive query to read the average salary of all employees.
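A sketch of the configuration and query, assuming agent name a1 and the default hive warehouse location (the table flumeemployee would be created like the one in Problem Scenario 24, without the sex filter):

a1.sources = src1
a1.channels = ch1
a1.sinks = sk1

a1.sources.src1.type = netcat
a1.sources.src1.bind = localhost
a1.sources.src1.port = 44444
a1.sources.src1.channels = ch1

a1.channels.ch1.type = memory

a1.sinks.sk1.type = hdfs
a1.sinks.sk1.channel = ch1
a1.sinks.sk1.hdfs.path = /user/hive/warehouse/flumeemployee
a1.sinks.sk1.hdfs.fileType = DataStream

-- Hive query for the average salary:
SELECT avg(salary) FROM flumeemployee;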
Problem Scenario 5 : You have been given following mysql database details.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish following activities.
1. List all the tables using a sqoop command from retail_db.
2. Write a simple sqoop eval command to check whether you have permission to read database tables or not.
3. Import all the tables as avro files in /user/hive/warehouse/retail_cca174.db
4. Import departments table as a text file in /user/cloudera/departments.
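Possible commands for the four activities (directory names taken from the problem statement):

# 1. List tables
sqoop list-tables --connect jdbc:mysql://quickstart:3306/retail_db \
  --username retail_dba --password cloudera

# 2. Check read permission with a trivial query
sqoop eval --connect jdbc:mysql://quickstart:3306/retail_db \
  --username retail_dba --password cloudera \
  --query "select count(1) from order_items"

# 3. Import all tables as avro files
sqoop import-all-tables --connect jdbc:mysql://quickstart:3306/retail_db \
  --username retail_dba --password cloudera \
  --as-avrodatafile --warehouse-dir /user/hive/warehouse/retail_cca174.db -m 1

# 4. Import departments as a text file
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db \
  --username retail_dba --password cloudera \
  --table departments --as-textfile --target-dir /user/cloudera/departments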
Problem Scenario 57 : You have been given below code snippet.
val a = sc.parallelize(1 to 9, 3)
operation1
Write a correct code snippet for operation1 which will produce the desired output, shown below.
Array[(String, Seq[Int])] = Array((even,ArrayBuffer(2, 4, 6, 8)), (odd,ArrayBuffer(1, 3, 5, 7, 9)))
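One snippet for operation1: group the numbers by parity, which yields the even/odd pairs shown:

a.groupBy(x => if (x % 2 == 0) "even" else "odd").collect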
Problem Scenario 28 : You need to implement a near real time solution for collecting information as it is submitted in files, as shown below.
Data
echo "IBM,100,20160104" >> /tmp/spooldir2/.bb.txt
echo "IBM,103,20160105" >> /tmp/spooldir2/.bb.txt
mv /tmp/spooldir2/.bb.txt /tmp/spooldir2/bb.txt
After a few minutes
echo "IBM,100.2,20160104" >> /tmp/spooldir2/.dr.txt
echo "IBM,103.1,20160105" >> /tmp/spooldir2/.dr.txt
mv /tmp/spooldir2/.dr.txt /tmp/spooldir2/dr.txt
You have been given the below directory location (if not available then create it): /tmp/spooldir2.
As soon as a file is committed in this directory, it needs to be available in hdfs in both the /tmp/flume/primary and /tmp/flume/secondary locations.
However, note that /tmp/flume/secondary is optional: if a transaction writing to this directory fails, it need not be rolled back.
Write a flume configuration file named flume8.conf and use it to load data into hdfs with the following additional properties.
1. Spool the /tmp/spooldir2 directory.
2. The file prefix in hdfs should be events.
3. The file suffix should be .log.
4. If a file is not committed and still in use, it should have _ as a prefix.
5. Data should be written as text to hdfs.
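A sketch of flume8.conf, assuming agent name agent1; the replicating selector with channel2 marked optional makes the secondary path best-effort, so a failed write there is not rolled back:

agent1.sources = source1
agent1.channels = channel1 channel2
agent1.sinks = sink1 sink2

agent1.sources.source1.type = spooldir
agent1.sources.source1.spoolDir = /tmp/spooldir2
agent1.sources.source1.channels = channel1 channel2
agent1.sources.source1.selector.type = replicating
agent1.sources.source1.selector.optional = channel2

agent1.channels.channel1.type = memory
agent1.channels.channel2.type = memory

agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.channel = channel1
agent1.sinks.sink1.hdfs.path = /tmp/flume/primary
agent1.sinks.sink1.hdfs.filePrefix = events
agent1.sinks.sink1.hdfs.fileSuffix = .log
agent1.sinks.sink1.hdfs.inUsePrefix = _
agent1.sinks.sink1.hdfs.fileType = DataStream

agent1.sinks.sink2.type = hdfs
agent1.sinks.sink2.channel = channel2
agent1.sinks.sink2.hdfs.path = /tmp/flume/secondary
agent1.sinks.sink2.hdfs.filePrefix = events
agent1.sinks.sink2.hdfs.fileSuffix = .log
agent1.sinks.sink2.hdfs.inUsePrefix = _
agent1.sinks.sink2.hdfs.fileType = DataStream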