In this article, we cover the most important and frequently asked questions from the DP-203: Data Engineering on Microsoft Azure certification exam. At the end, we have attached a PDF of DP-203 exam questions with answers as a free download.
We strongly recommend checking our Guide to the DP-203 Exam: Data Engineering on Microsoft Azure. It is a step-by-step guide for DP-203 exam preparation.
Skills measured in DP-203
The following skills are measured in the DP-203 certification exam:
- Design and implement data storage (15–20%)
- Develop data processing (40–45%)
- Secure, monitor, and optimize data storage and data processing (30–35%)
Link to the complete syllabus of the Microsoft DP-203 certification exam: Microsoft DP-203 syllabus
DP-203 Exam Questions
Q1. Which of the following file formats is a highly optimized option that is recommended for data storage?
· A. JSON
· B. XML
· C. Apache Parquet
Answer:
C. Apache Parquet. Parquet is a highly optimized columnar storage format and is the recommended option for analytical data storage.
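For context, a serverless SQL pool in Azure Synapse can query Parquet files in place. A minimal sketch, assuming a hypothetical storage account and path:

```sql
-- Query Parquet files directly; columnar storage lets the engine read
-- only the columns the query needs. The URL below is a placeholder.
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'https://mydatalake.dfs.core.windows.net/sales/orders/*.parquet',
    FORMAT = 'PARQUET'
) AS orders;
```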
Q2. You need to ensure that data in an Azure Storage account can still be read if the primary region becomes unavailable. Which type of data redundancy should you use?
· A. locally-redundant storage (LRS)
· B. read-access geo-redundant storage (RA-GRS)
· C. zone-redundant storage (ZRS)
· D. geo-redundant storage (GRS)
Answer:
B. read-access geo-redundant storage (RA-GRS)
Q3. You build a data warehouse in an Azure Synapse Analytics dedicated SQL pool. Analysts write a complex SELECT query that contains multiple JOIN and CASE statements to manipulate data for use in inventory reports.
You need to implement a solution that makes the dataset available for the reports, which must be published daily. The solution must minimize query times. What should you implement?
· A. result set caching
· B. a replicated table
· C. an ordered clustered columnstore index
· D. a materialized view
Answer:
D. a materialized view. Materialized views for dedicated SQL pools in Azure Synapse provide a low-maintenance way to get faster performance from complex analytical queries.
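For illustration, this is roughly what a materialized view looks like in a dedicated SQL pool; the table, view, and column names are hypothetical:

```sql
-- Pre-compute the heavy aggregation once so daily report queries read
-- the stored result instead of re-running the JOINs and CASE logic.
CREATE MATERIALIZED VIEW dbo.mvInventorySummary
WITH (DISTRIBUTION = HASH(StoreId))
AS
SELECT StoreId,
       ProductId,
       COUNT_BIG(*) AS RowCnt,
       SUM(ISNULL(Quantity, 0)) AS TotalQuantity
FROM dbo.FactInventory
GROUP BY StoreId, ProductId;
```

The SQL pool keeps the view's data up to date automatically as the base table changes, which is why it is low maintenance.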
Q4. You are designing a dimension table for a data warehouse. The table will track the value of the dimension attributes over time and preserve the history of the data by adding new rows as the data changes. Which type of slowly changing dimension (SCD) should you use?
· A. Type 1
· B. Type 2
· C. Type 3
· D. Type 6
Answer:
B. Type 2. A Type 2 SCD preserves history by adding a new row each time an attribute value changes, typically tracked with effective-date or current-flag columns.
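A minimal sketch of the Type 2 pattern in T-SQL, using hypothetical table and column names:

```sql
-- Type 2 SCD: expire the current row for customer 42, whose City changed...
UPDATE dbo.DimCustomer
SET EndDate = GETDATE(), IsCurrent = 0
WHERE CustomerId = 42 AND IsCurrent = 1;

-- ...then insert a new current row, preserving the old value as history.
INSERT INTO dbo.DimCustomer (CustomerId, City, StartDate, EndDate, IsCurrent)
VALUES (42, 'Seattle', GETDATE(), NULL, 1);
```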
Q5. You need to trigger an Azure Data Factory pipeline when a file arrives in an Azure Data Lake Storage Gen2 container.
Which resource provider should you enable?
· A. Microsoft.Sql
· B. Microsoft.EventGrid
· C. Microsoft.Automation
· D. Microsoft.EventHub
Answer:
B. Microsoft.EventGrid. Event-based triggers in Data Factory rely on Azure Event Grid, so the Microsoft.EventGrid resource provider must be registered in the subscription.
Q6. You plan to perform batch processing in Azure Databricks once daily. Which type of Databricks cluster should you use?
· A. Interactive
· B. High Concurrency
· C. Automated
Answer:
C. Automated
Q7. You have an Azure Data Factory instance that contains 10 pipelines. You need to label each pipeline with its main purpose: ingest, transform, or load. The labels must be available for grouping and filtering in the Data Factory monitoring experience.
What should you add to each pipeline?
· A. an annotation
· B. a correlation ID
· C. a resource tag
· D. a run group ID
Answer:
A. an annotation
Q8. You are designing a statistical analysis solution that will use custom proprietary Python functions on near real-time data from Azure Event Hubs. What should you recommend?
· A. Databricks
· B. Synapse Analytics
· C. Stream Analytics
· D. SQL Database
Answer:
A. Azure Databricks
Q9. You create an Azure Databricks cluster and specify an additional library to install.
When you attempt to load the library to a notebook, the library is not found.
What should you review to identify the cause of the issue?
· A. global init scripts logs
· B. workspace logs
· C. notebook logs
· D. cluster event logs
Answer:
D. cluster event logs
Q10. You need to examine the pipeline failures from the last 60 days in your Azure data factory.
What should you use?
A. the Activity log blade for the Data Factory resource
B. the Monitor & Manage app in Data Factory
C. the Resource health blade for the Data Factory resource
D. Azure Monitor
Answer:
D. Azure Monitor. Data Factory stores pipeline-run data for only 45 days, so use Azure Monitor if you want to keep that data for a longer period.
Q11. Which of the following terms refers to the scale of compute used by an Azure Synapse Analytics dedicated SQL pool?
· A. DTU
· B. RTU
· C. DWU
Answer:
C. DWU stands for Data Warehouse Unit, the measure of compute scale assigned to a Synapse dedicated SQL pool (formerly Azure SQL Data Warehouse). DTU is the compute scale unit of Azure SQL Database, and Request Units (RUs) are the compute scale unit of Azure Cosmos DB.
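For example, the compute scale of a dedicated SQL pool can be changed in T-SQL by modifying its DWU service objective; the pool name and target size below are hypothetical:

```sql
-- Run against the master database of the logical server to scale
-- the dedicated SQL pool to 400 DWUs.
ALTER DATABASE [MySqlPool] MODIFY (SERVICE_OBJECTIVE = 'DW400c');
```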
Q12. You have an Azure Synapse Analytics database that contains a dimension table named Stores, which holds information for 263 stores nationwide. Store information is retrieved in more than half of the queries issued against this database, including queries for staff, sales, and finance information per store. You want to improve the performance of these queries by configuring the table geometry of the Stores table. Which table geometry should you select?
· A. Round Robin
· B. Replicated table
· C. Non Clustered
Answer:
B. Replicated table
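A sketch of how the Stores table could be declared as a replicated table (the column list is illustrative):

```sql
-- REPLICATE caches a full copy of this small dimension table on every
-- compute node, eliminating data movement when it is joined.
CREATE TABLE dbo.Stores
(
    StoreId   INT           NOT NULL,
    StoreName NVARCHAR(100) NOT NULL,
    Region    NVARCHAR(50)  NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED INDEX (StoreId)
);
```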
Q13. Which Azure Data Factory integration runtime would be used in a data copy activity to move data from an Azure Data Lake Gen2 store to Azure Synapse Analytics?
· A. Azure IR
· B. Azure-SSIS
· C. Pipelines
· D. Self-hosted
Answer:
A. Azure IR
Q14. Is encrypted communication turned on automatically when you connect to an Azure SQL Database or Azure Synapse Analytics?
· False
· True
Answer:
True. Azure SQL Database enforces encrypted (SSL/TLS) communication at all times for all connections.
Q15. You are designing an enterprise data warehouse in Azure Synapse Analytics. You plan to load millions of rows of data into the data warehouse each day.
You must ensure that staging tables are optimized for data loading.
You need to design the staging tables.
What type of tables should you recommend?
· A. Hash-distributed table
· B. Round-robin distributed table
· C. External table
· D. Replicated table
Answer:
B. Round-robin distributed table
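As a sketch (hypothetical columns), a staging table is typically declared round-robin and as a heap, since skipping hash distribution and index maintenance makes bulk loads fastest:

```sql
-- Round-robin spreads incoming rows evenly without computing a hash,
-- and HEAP avoids index maintenance during the load.
CREATE TABLE stg.SalesLoad
(
    SaleId     BIGINT        NOT NULL,
    StoreId    INT           NOT NULL,
    SaleAmount DECIMAL(18,2) NOT NULL
)
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    HEAP
);
```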
Q16. You have an Azure Synapse Analytics dedicated SQL pool.
You need to ensure that data in the pool is encrypted at rest. The solution must NOT require modifying applications that query the data.
What should you do?
· A. Use a customer-managed key to enable double encryption for the Azure Synapse workspace.
· B. Create an Azure key vault in the Azure subscription and grant access to the pool.
· C. Enable encryption at rest for the Azure Data Lake Storage Gen2 account.
· D. Enable Transparent Data Encryption (TDE) for the pool.
Answer:
D. Enable Transparent Data Encryption (TDE) for the pool.
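TDE can be turned on with a single statement run against the master database of the server; the pool name below is hypothetical:

```sql
-- TDE encrypts and decrypts data at rest transparently inside the
-- engine, so applications and queries need no changes.
ALTER DATABASE [MySqlPool] SET ENCRYPTION ON;
```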
Q17. You are designing an enterprise data warehouse in Azure Synapse Analytics that will contain a table named Customers. Customers will contain credit card information.
You need to recommend a solution to provide salespeople with the ability to view all the entries in Customers. The solution must prevent all the salespeople from viewing or inferring the credit card information.
What should you include in the recommendation?
· A. column-level security
· B. data masking
· C. Always Encrypted
· D. row-level security
Answer:
A. column-level security
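As a sketch with hypothetical object and role names, column-level security simply grants SELECT on the permitted columns, leaving the credit card column unreadable:

```sql
-- Salespeople can read every row, but only these columns; the
-- CreditCardNumber column is excluded from the grant.
GRANT SELECT ON dbo.Customers
    (CustomerId, CustomerName, City, Country)
TO SalesRole;
```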
Q18. You have an Azure Data Lake Storage Gen2 account named adls2 that is protected by a virtual network.
You are designing a SQL pool in Azure Synapse that will use adls2 as a source.
What should you use to authenticate to adls2?
· A. a managed identity
· B. a shared access signature (SAS)
· C. an Azure Active Directory (Azure AD) user
· D. a shared key
Answer:
A. a managed identity
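A minimal sketch of a load that authenticates with the workspace managed identity (storage path and table name are hypothetical):

```sql
-- COPY uses the workspace's managed identity to reach the
-- virtual-network-protected storage account; no keys or SAS needed.
COPY INTO dbo.StagingOrders
FROM 'https://adls2.dfs.core.windows.net/data/orders/*.parquet'
WITH (
    FILE_TYPE = 'PARQUET',
    CREDENTIAL = (IDENTITY = 'Managed Identity')
);
```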
Q19. Users report slow performance when they run commonly used queries in an enterprise data warehouse in Azure Synapse Analytics. Users do not report performance changes for infrequently used queries.
Which metric should you monitor to determine the source of the performance issues?
· A. DWU percentage
· B. DWU limit
· C. Data IO percentage
· D. Cache hit percentage
Answer:
D. Cache hit percentage
Q20. By default, how many partitions will a new Event Hub have?
· A. 1
· B. 2
· C. 4
· D. 8
Answer:
C. 4
Q21. You are a Data Engineer for a company. You want to view key health metrics of your Stream Analytics jobs. Which tool in Stream Analytics should you use?
· A. Dashboards
· B. Alerts
· C. Diagnostics
· D. App
Answer:
A. Dashboards
Q22. You are developing a solution that will stream data to Azure Stream Analytics. The solution will have both streaming data and reference data. Which input type should you use for the reference data?
· A. Azure IoT Hub
· B. Azure Blob storage
· C. Azure Cosmos DB
· D. Azure Event Hubs
Answer:
B. Azure Blob storage. Stream Analytics supports Azure Blob storage and Azure SQL Database as the storage layer for reference data.
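In the Stream Analytics query language, the reference input is joined to the stream like an ordinary table; the input aliases below are hypothetical:

```sql
-- Enrich streaming events with slow-changing reference data; unlike
-- stream-to-stream joins, no time window (DATEDIFF) is required here.
SELECT
    s.DeviceId,
    s.Temperature,
    r.DeviceName
INTO Output
FROM StreamInput s
JOIN ReferenceInput r
    ON s.DeviceId = r.DeviceId;
```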
Q23. Authentication for an Event hub is defined with a combination of an Event Publisher and which other component?
· A. Storage Account Key
· B. Shared Access Signature
· C. Transport Layer Security v1.2
Answer:
B. Shared Access Signature
Q24. What is the maximum number of activities per pipeline in Azure Data Factory?
· A. 60
· B. 40
· C. 160
· D. 80
Answer:
B. 40
Q25. A company manages several on-premises Microsoft SQL Server databases.
Which data technology should you use to migrate the databases to Microsoft Azure by using a backup process of Microsoft SQL Server?
· A. Azure SQL Data Warehouse
· B. Azure SQL Database Managed Instance
· C. Azure Cosmos DB
· D. Azure SQL Database single database
Answer:
B. Azure SQL Database Managed Instance
DP-203 Exam Dumps Free Download – Link
We have prepared a PDF that contains the most frequently repeated questions from the Microsoft DP-203: Data Engineering on Microsoft Azure certification exam. This PDF was created by professionals with more than 10 years of Azure experience who have taught more than 10,000 successful students.
PDF Link – DP-203 Exam Dumps Free Download – Link