The code block displayed below contains an error. It is intended to add a column itemNameElements to DataFrame itemsDf that contains an array of all words in column itemName. Find the error.
Sample of DataFrame itemsDf:
+------+----------------------------------+-------------------+
|itemId|itemName                          |supplier           |
+------+----------------------------------+-------------------+
|1     |Thick Coat for Walking in the Snow|Sports Company Inc.|
|2     |Elegant Outdoors Summer Dress     |YetiX              |
|3     |Outdoors Backpack                 |Sports Company Inc.|
+------+----------------------------------+-------------------+
Code block:
itemsDf.withColumnRenamed("itemNameElements", split("itemName"))
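The error: withColumnRenamed renames an existing column rather than adding a new one, and split is called without a split pattern. A minimal sketch of the intended code, assuming words are separated by single spaces:
from pyspark.sql.functions import split
# withColumn adds the new column; split needs both the source column and a pattern
itemsDf.withColumn("itemNameElements", split("itemName", " "))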
Which of the following code blocks reads in the two-partition parquet file stored at filePath, making sure all columns are included exactly once even though each partition has a different schema?
Schema of first partition:
root
 |-- transactionId: integer (nullable = true)
 |-- predError: integer (nullable = true)
 |-- value: integer (nullable = true)
 |-- storeId: integer (nullable = true)
 |-- productId: integer (nullable = true)
 |-- f: integer (nullable = true)
Schema of second partition:
root
 |-- transactionId: integer (nullable = true)
 |-- predError: integer (nullable = true)
 |-- value: integer (nullable = true)
 |-- storeId: integer (nullable = true)
 |-- rollId: integer (nullable = true)
 |-- f: integer (nullable = true)
 |-- tax_id: integer (nullable = false)
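A sketch of one way to do this: Parquet supports schema merging, but it must be requested explicitly through the mergeSchema option when reading.
# mergeSchema unions the two partition schemas, so every column appears exactly once
spark.read.option("mergeSchema", "true").parquet(filePath)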
Which of the following code blocks generally causes a great amount of network traffic?
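As a rule of thumb, operations that move data off the executors dominate network traffic; a hedged example is collect(), which ships every row to the driver (wide transformations such as joins, which shuffle rows between executors, are the other usual suspect):
# collect() transfers the entire DataFrame from the executors to the driver
transactionsDf.collect()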
Which of the following code blocks returns about 150 randomly selected rows from the 1000-row DataFrame transactionsDf, assuming that any row can appear more than once in the returned
DataFrame?
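A minimal sketch using sample(): withReplacement=True lets a row be drawn more than once, and fraction is approximate, so roughly 150 of the 1000 rows come back.
# ~15% of 1000 rows, sampled with replacement
transactionsDf.sample(withReplacement=True, fraction=0.15)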
The code block shown below should return the number of columns in the CSV file stored at location filePath. Only lines that do not start with a # character should be read from the CSV file. Choose the answer that correctly fills the blanks in the code block to accomplish this.
Code block:
__1__(__2__.__3__.csv(filePath, __4__).__5__)
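One plausible completion of the blanks, using the reader's comment option to skip lines starting with #:
# len(...) counts the entries of the .columns list; comment='#' skips commented lines
len(spark.read.csv(filePath, comment='#').columns)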
Which of the following code blocks returns a one-column DataFrame of all values in column supplier of DataFrame itemsDf that do not contain the letter X? In the DataFrame, every value should be listed only once.
Sample of DataFrame itemsDf:
+------+--------------------+--------------------+-------------------+
|itemId|            itemName|          attributes|           supplier|
+------+--------------------+--------------------+-------------------+
|     1|Thick Coat for Wa...|[blue, winter, cozy]|Sports Company Inc.|
|     2|Elegant Outdoors ...|[red, summer, fre...|              YetiX|
|     3|   Outdoors Backpack|[green, summer, t...|Sports Company Inc.|
+------+--------------------+--------------------+-------------------+
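A sketch of a correct approach: select the column, filter out values containing X, then deduplicate.
from pyspark.sql.functions import col
# ~ negates contains; distinct() lists every surviving supplier only once
itemsDf.select("supplier").where(~col("supplier").contains("X")).distinct()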
The code block displayed below contains multiple errors. The code block should return a DataFrame that contains only columns transactionId, predError, value and storeId of DataFrame
transactionsDf. Find the errors.
Code block:
transactionsDf.select([col(productId), col(f)])
Sample of transactionsDf:
+-------------+---------+-----+-------+---------+----+
|transactionId|predError|value|storeId|productId|   f|
+-------------+---------+-----+-------+---------+----+
|            1|        3|    4|     25|        1|null|
|            2|        6|    7|      2|        2|null|
|            3|        3| null|     25|        3|null|
+-------------+---------+-----+-------+---------+----+
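The errors: col() is passed undefined Python names instead of quoted column strings, and the wrong columns are selected. A corrected sketch:
# Pass the four requested column names as strings
transactionsDf.select("transactionId", "predError", "value", "storeId")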
Which of the following code blocks reads in the JSON file stored at filePath as a DataFrame?
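A minimal sketch, assuming filePath points to a JSON Lines file:
# The DataFrameReader exposes a json method
spark.read.json(filePath)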
The code block shown below should return an exact copy of DataFrame transactionsDf that does not include rows in which values in column storeId have the value 25. Choose the answer that
correctly fills the blanks in the code block to accomplish this.
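Independent of the blanked-out form, a hedged sketch of one expression that removes those rows:
from pyspark.sql.functions import col
# Note: storeId != 25 also drops rows where storeId is null, since null comparisons yield null
transactionsDf.where(col("storeId") != 25)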
Which of the following code blocks performs an inner join between DataFrame itemsDf and DataFrame transactionsDf, using columns itemId and transactionId as join keys, respectively?
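A sketch, noting that inner is Spark's default join type:
# Join itemsDf rows to transactionsDf rows on itemId == transactionId
itemsDf.join(transactionsDf, itemsDf.itemId == transactionsDf.transactionId, "inner")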
Which of the following code blocks stores DataFrame itemsDf in executor memory and, if insufficient memory is available, serializes it and saves it to disk?
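A plausible answer sketch: the MEMORY_AND_DISK storage level keeps partitions in executor memory and writes partitions that do not fit to disk, where data is always stored serialized.
from pyspark import StorageLevel
# Same behavior as DataFrame.cache(), whose default level is MEMORY_AND_DISK
itemsDf.persist(StorageLevel.MEMORY_AND_DISK)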
The code block displayed below contains an error. The code block should display the schema of DataFrame transactionsDf. Find the error.
Code block:
transactionsDf.rdd.printSchema
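The error: printSchema is a DataFrame method, not an RDD method, and in Python it must be invoked with parentheses. Corrected:
transactionsDf.printSchema()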
Which of the following code blocks returns a new DataFrame with only the columns predError and value of every second row of DataFrame transactionsDf?
Entire DataFrame transactionsDf:
+-------------+---------+-----+-------+---------+----+
|transactionId|predError|value|storeId|productId|   f|
+-------------+---------+-----+-------+---------+----+
|            1|        3|    4|     25|        1|null|
|            2|        6|    7|      2|        2|null|
|            3|        3| null|     25|        3|null|
|            4|     null| null|      3|        2|null|
|            5|     null| null|   null|        2|null|
|            6|        3|    2|     25|        2|null|
+-------------+---------+-----+-------+---------+----+
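A hedged sketch, resting on the assumption that transactionId numbers the rows consecutively from 1 (as in the sample above), so even ids mark every second row:
from pyspark.sql.functions import col
# Keep rows with even transactionId, then project the two requested columns
transactionsDf.filter(col("transactionId") % 2 == 0).select("predError", "value")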
Which of the following code blocks sorts DataFrame transactionsDf both by column storeId in ascending and by column productId in descending order, in this priority?
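A sketch using the asc and desc column helpers:
from pyspark.sql.functions import asc, desc
# storeId ascending has priority; productId descending breaks ties
transactionsDf.sort(asc("storeId"), desc("productId"))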
Which of the following code blocks returns a copy of DataFrame itemsDf where the column supplier has been renamed to manufacturer?
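A minimal sketch:
# withColumnRenamed(existing, new) returns a new DataFrame; itemsDf itself is unchanged
itemsDf.withColumnRenamed("supplier", "manufacturer")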
The code block displayed below contains an error. The code block should return the average of the values in column value, grouped by the unique values of column storeId. Find the error.
Code block:
transactionsDf.agg("storeId").avg("value")
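The error: agg aggregates across the whole DataFrame and does not group; grouping by the unique storeId values requires groupBy. Corrected sketch:
transactionsDf.groupBy("storeId").avg("value")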
Which of the following options describes the responsibility of the executors in Spark?
Which of the following code blocks produces the following output, given DataFrame transactionsDf?
Output:
root
 |-- transactionId: integer (nullable = true)
 |-- predError: integer (nullable = true)
 |-- value: integer (nullable = true)
 |-- storeId: integer (nullable = true)
 |-- productId: integer (nullable = true)
 |-- f: integer (nullable = true)
DataFrame transactionsDf:
+-------------+---------+-----+-------+---------+----+
|transactionId|predError|value|storeId|productId|   f|
+-------------+---------+-----+-------+---------+----+
|            1|        3|    4|     25|        1|null|
|            2|        6|    7|      2|        2|null|
|            3|        3| null|     25|        3|null|
+-------------+---------+-----+-------+---------+----+
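A sketch: the tree-formatted output above is exactly what printSchema produces.
transactionsDf.printSchema()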