
Databricks-Certified-Data-Engineer-Associate Exam Questions - Navigate Your Path to Success

The Databricks Certified Data Engineer Associate (Databricks-Certified-Data-Engineer-Associate) exam is a solid choice, and candidates who pass it earn the Databricks Data Engineer Associate certification. Below are some essential facts for Databricks-Certified-Data-Engineer-Associate exam candidates:

  • TrendyCerts offers 100 questions based on the actual Databricks-Certified-Data-Engineer-Associate syllabus.
  • Our Databricks-Certified-Data-Engineer-Associate exam practice questions were last updated on Mar 02, 2025.

Sample Questions for Databricks-Certified-Data-Engineer-Associate Exam Preparation

Question 1

Which of the following commands will return the number of null values in the member_id column?

Correct Answer: C

To return the number of null values in the member_id column, the best option is to use the count_if function, which counts the number of rows that satisfy a given condition. In this case, the condition is that the member_id column is null. The other options are either incorrect or not supported by Spark SQL. Option A returns the number of non-null values in the member_id column. Options B and E will not work because there is no count_null function in Spark SQL. Option D will not work because there is no null function in Spark SQL.

Reference:

Built-in Functions - Spark SQL documentation

count_if - Spark SQL Built-in Functions
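
For readers who want to try this, here is a minimal Spark SQL sketch. The table name members is an assumption for illustration; any table with a nullable member_id column works:

    -- count_if counts the rows where the condition evaluates to true
    SELECT count_if(member_id IS NULL) AS null_member_ids
    FROM members;

    -- Equivalent approach: count(*) includes null rows, count(member_id) does not
    SELECT count(*) - count(member_id) AS null_member_ids
    FROM members;

The second query also demonstrates why Option A is wrong: count(member_id) alone skips nulls rather than counting them.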


Question 2

Which of the following must be specified when creating a new Delta Live Tables pipeline?

Correct Answer: E

Option E is the correct answer because it is the only mandatory requirement when creating a new Delta Live Tables pipeline. A pipeline is a data processing workflow containing materialized views and streaming tables declared in Python or SQL source files. Delta Live Tables infers the dependencies between these tables and ensures updates occur in the correct order. To create a pipeline, you must specify at least one notebook library to be executed, which contains the Delta Live Tables syntax; you can also include multiple libraries in different languages within the same pipeline.

The other options are optional or not applicable when creating a pipeline. Option A is not required: you can optionally provide key-value pair configurations to customize pipeline settings such as the storage location, target schema, notifications, and pipeline mode. Option B is not applicable, as the DBU/hour cost is determined by the cluster configuration, not at pipeline creation. Option C is not required: you can optionally specify a storage location for the pipeline's output data; if you leave it empty, the system uses a default location. Option D is not required: you can optionally specify a target database for the written data, in either the Hive metastore or Unity Catalog.
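
For illustration, here is a minimal sketch of what such a notebook library could contain, written in Delta Live Tables SQL. The source path, table names, and columns are assumptions for the example, not part of the exam question:

    -- A streaming live table that ingests raw JSON files with Auto Loader
    CREATE OR REFRESH STREAMING LIVE TABLE raw_orders
    AS SELECT * FROM cloud_files('/data/orders', 'json');

    -- A live table derived from raw_orders; Delta Live Tables infers this
    -- dependency and updates raw_orders first
    CREATE OR REFRESH LIVE TABLE order_counts
    AS SELECT order_date, count(*) AS order_count
    FROM LIVE.raw_orders
    GROUP BY order_date;

A pipeline becomes valid as soon as at least one such library is attached; optional settings like the storage location and target schema can be layered on afterwards.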

