Updated Oct 30, 2021 DP-201 Exam Dumps - PDF Questions and Testing Engine [Q34-Q57] | TestBraindump

New (2021) Microsoft DP-201  Exam Dumps


Exam DP-201: Designing an Azure Data Solution

A NEW VERSION OF THIS EXAM, DP-203, IS AVAILABLE. You will be able to take this exam until it retires on August 31, 2021. Note: Exams retire at 11:59 PM Central Standard Time.

The content of this exam was updated on July 31, 2020.

Candidates for this exam are Microsoft Azure data engineers who collaborate with business stakeholders to identify and meet the data requirements to design data solutions that use Azure data services.

Azure data engineers are responsible for data-related design tasks that include designing Azure data storage solutions that use relational and non-relational data stores, batch and real-time data processing solutions, and data security and compliance solutions.

Candidates for this exam must design data solutions that use the following Azure services: Azure Cosmos DB, Azure Synapse Analytics, Azure Data Lake Storage, Azure Data Factory, Azure Stream Analytics, Azure Databricks, and Azure Blob storage.

Part of the requirements for: Microsoft Certified: Azure Data Engineer Associate


DP-201 Exam topics

Candidates should know the exam topics before they start preparing, because this helps them focus on the core material. Our DP-201 dumps cover the following topics:

  • Design data processing solutions 25-30%
  • Design Azure data storage solutions 30-35%
  • Design for data security and compliance 15-20%
  • Design for high availability and disaster recovery 20-25%

 

NEW QUESTION 34
You need to recommend the appropriate storage and processing solution.
What should you recommend?

  • A. Flush the blob cache using Windows PowerShell.
  • B. Configure the reading speed using Azure Data Studio.
  • C. Enable auto-shrink on the database.
  • D. Enable Apache Spark RDD (RDD) caching.
  • E. Enable Databricks IO (DBIO) caching.

Answer: E

Explanation:
Scenario: You must be able to use a file system view of data stored in a blob. You must build an architecture that will allow Contoso to use the DBFS file system layer over a blob store.
Databricks File System (DBFS) is a distributed file system installed on Azure Databricks clusters. Files in DBFS persist to Azure Blob storage, so you won't lose data even after you terminate a cluster.
The Databricks Delta cache, previously named Databricks IO (DBIO) caching, accelerates data reads by creating copies of remote files in nodes' local storage using a fast intermediate data format. The data is cached automatically whenever a file has to be fetched from a remote location. Successive reads of the same data are then performed locally, which results in significantly improved reading speed.
Reference:
https://docs.databricks.com/delta/delta-cache.html#delta-cache
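The caching behavior described above (fetch from the remote location once, then serve later reads locally) can be illustrated with a small read-through cache. This is only a toy model in plain Python, not Databricks code: the class `ReadThroughCache`, the function `fake_blob_store`, and the file path are all hypothetical names introduced for the illustration.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Toy read-through cache: the first read is remote, later reads are local."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote    # hypothetical remote reader
        self._local: Dict[str, bytes] = {}   # stands in for a node's local storage
        self.remote_reads = 0                # counts how often the remote store is hit

    def read(self, path: str) -> bytes:
        if path not in self._local:
            # Cache miss: fetch from the remote store and keep a local copy.
            self._local[path] = self._fetch_remote(path)
            self.remote_reads += 1
        return self._local[path]


def fake_blob_store(path: str) -> bytes:
    """Hypothetical remote store used only for this illustration."""
    return f"contents of {path}".encode()


cache = ReadThroughCache(fake_blob_store)
first = cache.read("/mnt/data/sales.parquet")
second = cache.read("/mnt/data/sales.parquet")  # served from the local copy
print(cache.remote_reads)
```

On a Databricks cluster the cache is not written by hand like this; it is typically switched on through the Spark setting `spark.databricks.io.cache.enabled`.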
Design Azure data storage solutions
Testlet 4
Case study
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.
To start the case study
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.
Overview
General Overview
ADatum Corporation is a medical company that has 5,000 physicians located in more than 300 hospitals across the US. The company has a medical department, a sales department, a marketing department, a medical research department, and a human resources department.
You are redesigning the application environment of ADatum.
Physical Locations
ADatum has three main offices in New York, Dallas, and Los Angeles. The offices connect to each other by using a WAN link. Each office connects directly to the Internet. The Los Angeles office also has a datacenter that hosts all the company's applications.
Existing Environment
Health Review
ADatum has a critical OLTP web application named Health Review that physicians use to track billing, patient care, and overall physician best practices.
Health Interface
ADatum has a critical application named Health Interface that receives hospital messages related to patient care and status updates. The messages are sent in batches by each hospital's enterprise relationship management (ERM) system by using a VPN. The data sent from each hospital can have varying columns and formats.
Currently, a custom C# application is used to send the data to Health Interface. The application uses deprecated libraries and a new solution must be designed for this functionality.
Health Insights
ADatum has a web-based reporting system named Health Insights that shows hospital and patient insights to physicians and business users. The data is created from the data in Health Review and Health Interface, as well as manual entries.
Database Platform
Currently, the databases for all three applications are hosted on an out-of-date VMware cluster that has a single instance of Microsoft SQL Server 2012.
Problem Statements
ADatum identifies the following issues in its current environment:
* Over time, the data received by Health Interface from the hospitals has slowed, and the number of messages has increased.
* When a new hospital joins ADatum, Health Interface requires a schema modification due to the lack of data standardization.
* The speed of batch data processing is inconsistent.
Business Requirements
Business Goals
ADatum identifies the following business goals:
* Migrate the applications to Azure whenever possible.
* Minimize the development effort required to perform data movement.
* Provide continuous integration and deployment for development, test, and production environments.
* Provide faster access to the applications and the data and provide more consistent application performance.
* Minimize the number of services required to perform data processing, development, scheduling, monitoring, and the operationalizing of pipelines.
Health Review Requirements
ADatum identifies the following requirements for the Health Review application:
* Ensure that sensitive health data is encrypted at rest and in transit.
* Tag all the sensitive health data in Health Review. The data will be used for auditing.
Health Interface Requirements
ADatum identifies the following requirements for the Health Interface application:
* Upgrade to a data storage solution that will provide flexible schemas and increased throughput for writing data. Data must be regionally located close to each hospital, and reads must display the most recent committed version of an item.
* Reduce the amount of time it takes to add data from new hospitals to Health Interface.
* Support a more scalable batch processing solution in Azure.
* Reduce the amount of development effort to rewrite existing SQL queries.
Health Insights Requirements
ADatum identifies the following requirements for the Health Insights application:
* The analysis of events must be performed over time by using an organizational date dimension table.
* The data from Health Interface and Health Review must be available in Health Insights within 15 minutes of being committed.
* The new Health Insights application must be built on a massively parallel processing (MPP) architecture that will support the high performance of joins on large fact tables.

 

NEW QUESTION 35
You need to design the runtime environment for the Real Time Response system.
What should you recommend?

  • A. General Purpose nodes without the Enterprise Security package
  • B. General Purpose nodes with the Enterprise Security package
  • C. Memory Optimized Nodes without the Enterprise Security package
  • D. Memory Optimized nodes with the Enterprise Security package

Answer: C

Explanation:
Scenario: You must maximize the performance of the Real Time Response system.
Testlet 2
Case study
Requirements
Business
The company identifies the following business requirements:
* You must transfer all images and customer data to cloud storage and remove on-premises servers.
* You must develop an analytical processing solution for transforming customer data.
* You must develop an image object and color tagging solution.
* Capital expenditures must be minimized.
* Cloud resource costs must be minimized.
Technical
The solution has the following technical requirements:
* Tagging data must be uploaded to the cloud from the New York office location.
* Tagging data must be replicated to regions that are geographically close to company office locations.
* Image data must be stored in a single data store at minimum cost.
* Customer data must be analyzed using managed Spark clusters.
* Power BI must be used to visualize transformed customer data.
* All data must be backed up in case disaster recovery is required.
Security and optimization
All cloud data must be encrypted at rest and in transit. The solution must support:
* parallel processing of customer data
* hyper-scale storage of images
* global region data replication of processed image data

Testlet 3
Case study
Background
Current environment
The company has the following virtual machines (VMs):

Requirements
Storage and processing
You must be able to use a file system view of data stored in a blob.
You must build an architecture that will allow Contoso to use the DBFS file system layer over a blob store.
The architecture will need to support data files, libraries, and images. Additionally, it must provide a web-based interface to documents that contain runnable commands, visualizations, and narrative text, such as a notebook.
CONT_SQL3 requires an initial scale of 35000 IOPS.
CONT_SQL1 and CONT_SQL2 must use the vCore model and should include replicas. The solution must support 8000 IOPS.
The storage should be optimized for database OLTP workloads.
Migration
* You must be able to independently scale compute and storage resources.
* You must migrate all SQL Server workloads to Azure. You must identify related machines in the on-premises environment, get disk size data usage information.
* Data from SQL Server must include zone redundant storage.
* You need to ensure that app components can reside on-premises while interacting with components that run in the Azure public cloud.
* SAP data must remain on-premises.
* The Azure Site Recovery (ASR) results should contain per-machine data.
Business requirements
* You must design a regional disaster recovery topology.
* The database backups have regulatory purposes and must be retained for seven years.
* CONT_SQL1 stores customer sales data that requires ETL operations for data analysis. A solution is required that reads data from SQL, performs ETL, and outputs to Power BI. The solution should use managed clusters to minimize costs. To optimize logistics, Contoso needs to analyze customer sales data to see if certain products are tied to specific times in the year.
* The analytics solution for customer sales data must be available during a regional outage.
Security and auditing
* Contoso requires all corporate computers to enable Windows Firewall.
* Azure servers should be able to ping other Contoso Azure servers.
* Employee PII must be encrypted in memory, in motion, and at rest. Any data encrypted by SQL Server must support equality searches, grouping, indexing, and joining on the encrypted data.
* Keys must be secured by using hardware security modules (HSMs).
* CONT_SQL3 must not communicate over the default ports.
Cost
* All solutions must minimize cost and resources.
* The organization does not want any unexpected charges.
* The data engineers must set the SQL Data Warehouse compute resources to consume 300 DWUs.
* CONT_SQL2 is not fully utilized during non-peak hours. You must minimize resource costs during non-peak hours.

 

NEW QUESTION 36
You are evaluating the use of an Azure Cosmos DB account for a new database.
The proposed account will be configured as shown in the following exhibit.

Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:
Box 1: vertices and edges
Gremlin API is selected.
You can use the Gremlin language to create graph entities (vertices and edges), modify properties within those entities, perform queries and traversals, and delete entities.
Box 2: East US
West US is selected as the primary location, and geo-redundancy is enabled.
The secondary location for West US is East US.
Note: When a storage account is created, the customer chooses the primary location for their storage account.
However, the secondary location for the storage account is fixed and customers do not have the ability to change this. The following table shows the current primary and secondary location pairings:

Reference:
https://docs.microsoft.com/en-us/azure/cosmos-db/gremlin-support
https://technet2.github.io/Wiki/blogs/windowsazurestorage/windows-azure-storage-redundancy-options-and-read

 

NEW QUESTION 37
You need to recommend the appropriate storage and processing solution.
What should you recommend?

  • A. Flush the blob cache using Windows PowerShell.
  • B. Configure the reading speed using Azure Data Studio.
  • C. Enable auto-shrink on the database.
  • D. Enable Apache Spark RDD (RDD) caching.
  • E. Enable Databricks IO (DBIO) caching.

Answer: E

Explanation:
Scenario: You must be able to use a file system view of data stored in a blob. You must build an architecture that will allow Contoso to use the DBFS file system layer over a blob store.
Databricks File System (DBFS) is a distributed file system installed on Azure Databricks clusters. Files in DBFS persist to Azure Blob storage, so you won't lose data even after you terminate a cluster.
The Databricks Delta cache, previously named Databricks IO (DBIO) caching, accelerates data reads by creating copies of remote files in nodes' local storage using a fast intermediate data format. The data is cached automatically whenever a file has to be fetched from a remote location. Successive reads of the same data are then performed locally, which results in significantly improved reading speed.
References:
https://docs.databricks.com/delta/delta-cache.html#delta-cache

 

NEW QUESTION 38
You use Azure Data Lake Storage Gen2 to store data that data scientists and data engineers will query by using Azure Databricks interactive notebooks. The folders in Data Lake Storage will be secured, and users will have access only to the folders that relate to the projects on which they work.
You need to recommend which authentication methods to use for Databricks and Data Lake Storage to provide the users with the appropriate access. The solution must minimize administrative effort and development effort.
Which authentication method should you recommend for each Azure service? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:
Databricks: Personal access tokens
To authenticate and access Databricks REST APIs, you use personal access tokens. Tokens are similar to passwords; you should treat them with care. Tokens expire and can be revoked.
Data Lake Storage: Azure Active Directory
Azure Data Lake Storage Gen1 uses Azure Active Directory for authentication.
References:
https://docs.azuredatabricks.net/dev-tools/api/latest/authentication.html
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lakes-store-authentication-using-azure-active-directory

 

NEW QUESTION 39
You plan to deploy an Azure SQL Database instance to support an application. You plan to use the DTU-based purchasing model.
Backups of the database must be available for 30 days and point-in-time restoration must be possible.
You need to recommend a backup and recovery policy.
What are two possible ways to achieve the goal? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.

  • A. Use the Standard tier and the default backup retention policy.
  • B. Use the Premium tier and the default backup retention policy.
  • C. Use the Premium tier and configure a long-term backup retention policy.
  • D. Use the Standard tier and configure a long-term backup retention policy.
  • E. Use the Basic tier and the default backup retention policy.

Answer: C,D

Explanation:
The default retention period for a database created using the DTU-based purchasing model depends on the service tier:
* Basic service tier is 1 week.
* Standard service tier is 5 weeks.
* Premium service tier is 5 weeks.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-long-term-retention

 

NEW QUESTION 40
You are evaluating data storage solutions to support a new application.
You need to recommend a data storage solution that represents data by using nodes and relationships in graph structures.
Which data storage solution should you recommend?

  • A. Data Lake Store
  • B. Cosmos DB
  • C. HDInsight
  • D. Blob Storage

Answer: B

Explanation:
For large graphs with lots of entities and relationships, you can perform very complex analyses very quickly. Many graph databases provide a query language that you can use to traverse a network of relationships efficiently.
Relevant Azure service: Cosmos DB
References:
https://docs.microsoft.com/en-us/azure/architecture/guide/technology-choices/data-store-overview
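To make "nodes and relationships" concrete, here is a minimal sketch of a graph held as vertices plus labelled edges, with a traversal that follows one relationship type. It is plain Python rather than a Cosmos DB or Gremlin query, and every vertex name, edge label, and the function `reachable` are invented for the illustration.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Vertices (nodes) carry properties; edges are labelled relationships.
# All names below are made up for this illustration.
vertices: Dict[str, Dict[str, str]] = {
    "alice": {"label": "person"},
    "bob": {"label": "person"},
    "carol": {"label": "person"},
    "contoso": {"label": "company"},
}
edges: List[Tuple[str, str, str]] = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "works_at", "contoso"),
]


def reachable(start: str, relationship: str) -> Set[str]:
    """Breadth-first traversal that follows only edges with the given label."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for source, label, target in edges:
            if source == current and label == relationship and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


print(sorted(reachable("alice", "knows")))
```

A graph database performs this kind of multi-hop traversal through its query language instead of application code, which is what makes it a natural fit for the requirement in this question.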

 

NEW QUESTION 41
A company has many applications. Each application is supported by separate on-premises databases.
You must migrate the databases to Azure SQL Database. You have the following requirements:
Organize databases into groups based on database usage.
Define the maximum resource limit available for each group of databases.
You need to recommend technologies to scale the databases to support expected increases in demand.
What should you recommend?

  • A. Database sharding
  • B. Elastic pools
  • C. Read scale-out
  • D. Managed instances

Answer: B

Explanation:
SQL Database elastic pools are a simple, cost-effective solution for managing and scaling multiple databases that have varying and unpredictable usage demands. The databases in an elastic pool are on a single Azure SQL Database server and share a set number of resources at a set price.
You can configure resources for the pool based either on the DTU-based purchasing model or the vCore-based purchasing model.
Incorrect Answers:
A: Database sharding is a type of horizontal partitioning that splits large databases into smaller components, which are faster and easier to manage.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-elastic-pool
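The reason a shared pool suits databases with varying demand can be shown with simple arithmetic. The sketch below uses made-up demand figures and a made-up pool limit (the names `hourly_demand` and `pool_limit` are hypothetical, and the units are arbitrary): when databases peak at different times, the combined peak is far lower than the sum of the individual peaks.

```python
from typing import Dict, List

# Hypothetical resource demand per database over three time slots.
hourly_demand: Dict[str, List[int]] = {
    "app_a": [60, 10, 10],
    "app_b": [10, 60, 10],
    "app_c": [10, 10, 60],
}
pool_limit = 100  # assumed maximum resources shared by the whole group

# Sizing each database separately means provisioning for every individual peak.
sum_of_peaks = sum(max(demand) for demand in hourly_demand.values())

# A pool only has to cover the highest combined demand in any one time slot.
peak_combined = max(sum(slot) for slot in zip(*hourly_demand.values()))

print(sum_of_peaks, peak_combined, peak_combined <= pool_limit)
```

With these figures the group fits inside one pool limit even though the individual peaks added together would not.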

 

NEW QUESTION 42
You discover that the highest chance of corruption or bad data occurs during nightly inventory loads.
You need to ensure that you can quickly restore the data to its state before the nightly load and avoid missing any streaming data.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Answer:

Explanation:
Step 1: Before the nightly load, create a user-defined restore point
SQL Data Warehouse performs a geo-backup once per day to a paired data center. The RPO for a geo-restore is 24 hours. If you require a shorter RPO for geo-backups, you can create a user-defined restore point and restore from the newly created restore point to a new data warehouse in a different region.
Step 2: Restore the data warehouse to a new name on the same server.
Step 3: Swap the restored database warehouse name.
References:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/backup-and-restore
Topic 3, Case study 2
Background
Current environment
The company has the following virtual machines (VMs):

Requirements
Storage and processing
You must be able to use a file system view of data stored in a blob.
You must build an architecture that will allow Contoso to use the DBFS file system layer over a blob store.
The architecture will need to support data files, libraries, and images. Additionally, it must provide a web-based interface to documents that contain runnable commands, visualizations, and narrative text, such as a notebook.
CONT_SQL3 requires an initial scale of 35000 IOPS.
CONT_SQL1 and CONT_SQL2 must use the vCore model and should include replicas. The solution must support 8000 IOPS.
The storage should be optimized for database OLTP workloads.
Migration
* You must be able to independently scale compute and storage resources.
* You must migrate all SQL Server workloads to Azure. You must identify related machines in the on-premises environment, get disk size data usage information.
* Data from SQL Server must include zone redundant storage.
* You need to ensure that app components can reside on-premises while interacting with components that run in the Azure public cloud.
* SAP data must remain on-premises.
* The Azure Site Recovery (ASR) results should contain per-machine data.
Business requirements
* You must design a regional disaster recovery topology.
* The database backups have regulatory purposes and must be retained for seven years.
* CONT_SQL1 stores customer sales data that requires ETL operations for data analysis. A solution is required that reads data from SQL, performs ETL, and outputs to Power BI. The solution should use managed clusters to minimize costs. To optimize logistics, Contoso needs to analyze customer sales data to see if certain products are tied to specific times in the year.
* The analytics solution for customer sales data must be available during a regional outage.
Security and auditing
* Contoso requires all corporate computers to enable Windows Firewall.
* Azure servers should be able to ping other Contoso Azure servers.
* Employee PII must be encrypted in memory, in motion, and at rest. Any data encrypted by SQL Server must support equality searches, grouping, indexing, and joining on the encrypted data.
* Keys must be secured by using hardware security modules (HSMs).
* CONT_SQL3 must not communicate over the default ports
Cost
* All solutions must minimize cost and resources.
* The organization does not want any unexpected charges.
* The data engineers must set the SQL Data Warehouse compute resources to consume 300 DWUs.
* CONT_SQL2 is not fully utilized during non-peak hours. You must minimize resource costs during non-peak hours.

 

NEW QUESTION 43
You are designing a real-time stream solution based on Azure Functions. The solution will process data uploaded to Azure Blob Storage.
The solution requirements are as follows:
New blobs must be processed with as little delay as possible.
Scaling must occur automatically.
Costs must be minimized.
What should you recommend?

  • A. Deploy the Azure Function in an App Service plan and use an Event Grid trigger.
  • B. Deploy the Azure Function in a Consumption plan and use a Blob trigger.
  • C. Deploy the Azure Function in an App Service plan and use a Blob trigger.
  • D. Deploy the Azure Function in a Consumption plan and use an Event Grid trigger.

Answer: B

Explanation:
Create a function, with the help of a blob trigger template, which is triggered when files are uploaded to or updated in Azure Blob storage.
You use a consumption plan, which is a hosting plan that defines how resources are allocated to your function app. In the default Consumption Plan, resources are added dynamically as required by your functions. In this serverless hosting, you only pay for the time your functions run. When you run in an App Service plan, you must manage the scaling of your function app.
References:
https://docs.microsoft.com/en-us/azure/azure-functions/functions-create-storage-blob-triggered-function

 

NEW QUESTION 44
You are planning a design pattern based on the Lambda architecture as shown in the exhibit.

Which Azure services should you use for the cold path? To answer, drag the appropriate services to the correct layers. Each service may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:
Layer 2: Azure Data Lake Storage Gen2
Layer 3: Azure SQL Data Warehouse
Azure SQL Data Warehouse can be used for batch processing.
Note: Lambda architectures use batch-processing, stream-processing, and a serving layer to minimize the latency involved in querying big data.

References:
https://azure.microsoft.com/en-us/blog/lambda-architecture-using-azure-cosmosdb-faster-performance-low-tco-l
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing
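The three Lambda layers mentioned in the note can be sketched in a few lines. This is a toy model in plain Python, not an Azure implementation: the event shape and the functions `batch_view`, `speed_view`, and `serve` are hypothetical. The cold path recomputes totals from the full stored data set, the hot path covers events that have arrived since the last batch run, and the serving step combines both views at query time.

```python
from collections import Counter
from typing import Iterable, Tuple

Event = Tuple[str, int]  # (key, count): a made-up event shape for this sketch


def batch_view(master_data: Iterable[Event]) -> Counter:
    """Cold path: recompute totals from the complete stored data set."""
    totals: Counter = Counter()
    for key, count in master_data:
        totals[key] += count
    return totals


def speed_view(recent_events: Iterable[Event]) -> Counter:
    """Hot path: totals for events that arrived after the last batch run."""
    totals: Counter = Counter()
    for key, count in recent_events:
        totals[key] += count
    return totals


def serve(batch: Counter, speed: Counter) -> Counter:
    """Serving layer: merge the batch view with the real-time view."""
    return batch + speed


stored = [("sensor_1", 5), ("sensor_2", 3), ("sensor_1", 2)]
recent = [("sensor_1", 1), ("sensor_3", 4)]
result = serve(batch_view(stored), speed_view(recent))
print(dict(result))
```

Queries read the merged result, so they see recent events without waiting for the next batch recomputation.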

 

NEW QUESTION 45
You need to recommend a security solution for containers in Azure Blob storage. The solution must ensure that only read permissions are granted to a specific user for a specific container.
What should you include in the recommendation?

  • A. public read access for blobs only
  • B. shared access signatures (SAS)
  • C. an RBAC role in Azure Active Directory (Azure AD)
  • D. access keys

Answer: B

Explanation:
With an account SAS, you can delegate access to read, write, and delete operations on blob containers, tables, queues, and file shares that are not permitted with a service SAS.
Note: A shared access signature (SAS) provides secure delegated access to resources in your storage account without compromising the security of your data. With a SAS, you have granular control over how a client can access your data. You can control what resources the client may access, what permissions they have on those resources, and how long the SAS is valid, among other parameters.
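The idea behind a signed, scoped, time-limited token can be illustrated with a short sketch. This is not the Azure SAS format or the Azure SDK: it is a simplified stand-in that uses an HMAC from the Python standard library, and the key, the token layout, and the functions `issue_token` and `is_allowed` are all hypothetical. It shows how a token can carry the resource, the permissions, and the expiry, and how tampering is detected.

```python
import hashlib
import hmac
import time
from typing import Optional

ACCOUNT_KEY = b"hypothetical-secret-key"  # stands in for a storage account key


def issue_token(container: str, permissions: str, expires_at: int) -> str:
    """Sign the granted container, permissions, and expiry with the secret key."""
    payload = f"{container}|{permissions}|{expires_at}"
    signature = hmac.new(ACCOUNT_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def is_allowed(token: str, container: str, operation: str,
               now: Optional[int] = None) -> bool:
    """Check signature, container, permission, and expiry before allowing access."""
    now = int(time.time()) if now is None else now
    try:
        tok_container, permissions, expires_at, signature = token.split("|")
        expiry = int(expires_at)
    except ValueError:
        return False  # malformed token
    payload = f"{tok_container}|{permissions}|{expires_at}"
    expected = hmac.new(ACCOUNT_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(signature, expected)
        and tok_container == container
        and operation in permissions
        and now < expiry
    )


# Read-only token for one container, valid until the (made-up) timestamp 2000.
token = issue_token("invoices", "r", expires_at=2_000)
print(is_allowed(token, "invoices", "r", now=1_000))
print(is_allowed(token, "invoices", "w", now=1_000))
```

The holder of such a token can read from the named container until the expiry time and can do nothing else, which matches the requirement of read-only access to a specific container.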

 

NEW QUESTION 46
You need to design a sharding strategy for the Planning Assistance database.
What should you recommend?

  • A. a range mapping shard map on the time column
  • B. a range mapping shard map on the binary representation of the speed column
  • C. a list mapping shard map on the location column
  • D. a list mapping shard map on the binary representation of the License Plate column

Answer: D

Explanation:
Data used for Planning Assistance must be stored in a sharded Azure SQL Database.
A shard typically contains items that fall within a specified range determined by one or more attributes of the data. These attributes form the shard key (sometimes referred to as the partition key). The shard key should be static. It shouldn't be based on data that might change.
References:
https://docs.microsoft.com/en-us/azure/architecture/patterns/sharding
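
The difference between the two shard map types named in the options can be sketched in a few lines of Python. This is an illustrative model only, not the Elastic Database client library: the `ListShardMap` and `RangeShardMap` classes, the shard names, and the sample plate values are all hypothetical.

```python
import bisect


class ListShardMap:
    """List mapping: each individual key value is assigned to one shard."""

    def __init__(self):
        self._points = {}

    def add_point(self, key, shard):
        self._points[key] = shard

    def lookup(self, key):
        return self._points[key]  # raises KeyError for an unmapped key


class RangeShardMap:
    """Range mapping: each half-open range [low, high) is assigned to one shard.
    This sketch does not check for overlapping ranges."""

    def __init__(self):
        self._lows = []
        self._entries = []  # (low, high, shard), kept sorted by low

    def add_range(self, low, high, shard):
        idx = bisect.bisect_left(self._lows, low)
        self._lows.insert(idx, low)
        self._entries.insert(idx, (low, high, shard))

    def lookup(self, key):
        idx = bisect.bisect_right(self._lows, key) - 1
        if idx >= 0:
            low, high, shard = self._entries[idx]
            if low <= key < high:
                return shard
        raise KeyError(key)


# A static key such as the binary form of a license plate never changes,
# so a row never has to move between shards.
plates = ListShardMap()
plates.add_point(b"ABC123", "shard-0")
plates.add_point(b"XYZ789", "shard-1")
```

A key that changes over time, such as speed, would force rows to migrate between shards, which is why the explanation stresses a static shard key.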

 

NEW QUESTION 47
You are designing a big data storage solution. The solution must meet the following requirements:
* Provide unlimited account sizes.
* Support a hierarchical file system.
* Be optimized for parallel analytics workloads.
Which storage solution should you use?

  • A. Azure Data Lake Storage Gen2
  • B. Azure Blob storage
  • C. Azure Cosmos DB
  • D. Apache HBase in Azure HDInsight

Answer: A

Explanation:
Azure Data Lake Storage Gen2 is optimized for parallel analytics workloads. A key mechanism that allows Azure Data Lake Storage Gen2 to provide file system performance at object storage scale and prices is the addition of a hierarchical namespace. This allows the collection of objects/files within an account to be organized into a hierarchy of directories and nested subdirectories in the same way that the file system on your computer is organized.
References:
https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-namespace
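
Why a hierarchical namespace matters can be shown with a small Python model. In a flat namespace a "directory" is only a shared name prefix, so renaming it touches every object; in a hierarchy it is a single entry. The sketch below is an in-memory analogy with hypothetical helper names and paths, not a call into any Azure SDK.

```python
def rename_flat(blobs, old_prefix, new_prefix):
    """Rename a virtual directory in a flat namespace.
    Every object under the prefix is rewritten; returns objects touched."""
    touched = 0
    for name in [n for n in blobs if n.startswith(old_prefix)]:
        blobs[new_prefix + name[len(old_prefix):]] = blobs.pop(name)
        touched += 1
    return touched


def rename_hierarchical(root, parent_path, old, new):
    """Rename a directory in a nested tree.
    Only the one entry in the parent directory changes; returns 1."""
    node = root
    for part in parent_path:
        node = node[part]
    node[new] = node.pop(old)
    return 1


# Flat layout: directories exist only as prefixes in object names.
flat = {"raw/2021/a.csv": b"1", "raw/2021/b.csv": b"2", "curated/c.csv": b"3"}

# Hierarchical layout: directories are real nodes.
tree = {"raw": {"2021": {"a.csv": b"1", "b.csv": b"2"}}, "curated": {"c.csv": b"3"}}
```

With thousands of files per directory, the flat rename grows with the number of objects, while the hierarchical rename stays a single metadata change.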

 

NEW QUESTION 48
You are planning a design pattern based on the Lambda architecture as shown in the exhibit.

Which Azure services should you use for the cold path? To answer, drag the appropriate services to the correct layers. Each service may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:


Layer 2: Azure Data Lake Storage Gen2
Layer 3: Azure SQL Data Warehouse
Azure SQL Data Warehouse can be used for batch processing.
Note: Lambda architectures use batch-processing, stream-processing, and a serving layer to minimize the latency involved in querying big data.

References:
https://azure.microsoft.com/en-us/blog/lambda-architecture-using-azure-cosmosdb-faster-performance-low-tco-l
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing
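
The three layers described in the note can be sketched as plain Python functions. This is a conceptual model under simple assumptions (events are dictionaries with a hypothetical `device` field, and the query is a per-device event count); it does not use any Azure service.

```python
from collections import Counter


def batch_view(master_dataset):
    """Cold path: recompute the view over all stored events.
    Accurate but only as fresh as the last batch run."""
    return Counter(event["device"] for event in master_dataset)


def speed_view(recent_events):
    """Hot path: count only events that arrived after the last batch run."""
    return Counter(event["device"] for event in recent_events)


def serve(batch, speed, device):
    """Serving layer: merge both views to answer a query with low latency."""
    return batch.get(device, 0) + speed.get(device, 0)


# Events already persisted in long-term storage (the batch layer's input).
stored = [{"device": "d1"}, {"device": "d1"}, {"device": "d2"}]
# Events received since the last batch computation.
recent = [{"device": "d1"}, {"device": "d3"}]

batch = batch_view(stored)
speed = speed_view(recent)
```

Here `serve(batch, speed, "d1")` combines two stored events with one recent event, which is how the serving layer hides the latency of the batch computation.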

 

NEW QUESTION 49
You are designing a solution for a company. You plan to use Azure Databricks.
You need to recommend workloads and tiers to meet the following requirements:
* Provide managed clusters for running production jobs.
* Provide persistent clusters that support auto-scaling for analytics processes.
* Provide role-based access control (RBAC) support for Notebooks.
What should you recommend? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:

Box 1: Data Engineering Only
Box 2: Data Engineering and Data Analytics
Box 3: Standard
Box 4: Data Analytics only
Box 5: Premium
Premium is required for RBAC. The Data Analytics workload on the Premium tier provides interactive workloads for analyzing data collaboratively with notebooks.
References:
https://azure.microsoft.com/en-us/pricing/details/databricks/

 

NEW QUESTION 50
You need to design network access to the SQL Server data.
What should you recommend? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:


Box 1: 8080
1433 is the default port, but we must change it as CONT_SQL3 must not communicate over the default ports.
Because port 1433 is the known standard for SQL Server, some organizations specify that the SQL Server port number should be changed to enhance security.
Box 2: SQL Server Configuration Manager
You can configure an instance of the SQL Server Database Engine to listen on a specific fixed port by using the SQL Server Configuration Manager.
References:
https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/configure-a-server-to-listen-on-a-speci
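
After the port has been changed in SQL Server Configuration Manager, a quick connectivity check can confirm that the instance listens on the new port. The Python sketch below only tests whether a TCP connection can be opened; the `is_port_open` helper is hypothetical, and the host name and port in the usage comment are taken from the scenario as assumptions.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (assumed host name and port from the scenario):
#   is_port_open("CONT_SQL3", 8080)   # expected to succeed after the change
#   is_port_open("CONT_SQL3", 1433)   # expected to fail once the default is disabled
```

A successful connection shows only that something is listening on the port; it does not prove that the listener is SQL Server.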
Topic 3, ADatum Corporation
Case study
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.
To start the case study
To display the first question in this case study, click the button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the button to return to the question.
Overview
General Overview
ADatum Corporation is a medical company that has 5,000 physicians located in more than 300 hospitals across the US. The company has a medical department, a sales department, a marketing department, a medical research department, and a human resources department.
You are redesigning the application environment of ADatum.
Physical Locations
ADatum has three main offices in New York, Dallas, and Los Angeles. The offices connect to each other by using a WAN link. Each office connects directly to the Internet. The Los Angeles office also has a datacenter that hosts all the company's applications.
Existing Environment
Health Review
ADatum has a critical OLTP web application named Health Review that physicians use to track billing, patient care, and overall physician best practices.
Health Interface
ADatum has a critical application named Health Interface that receives hospital messages related to patient care and status updates. The messages are sent in batches by each hospital's enterprise relationship management (ERM) system by using a VPN. The data sent from each hospital can have varying columns and formats.
Currently, a custom C# application is used to send the data to Health Interface. The application uses deprecated libraries and a new solution must be designed for this functionality.
Health Insights
ADatum has a web-based reporting system named Health Insights that shows hospital and patient insights to physicians and business users. The data is created from the data in Health Review and Health Interface, as well as manual entries.
Database Platform
Currently, the databases for all three applications are hosted on an out-of-date VMware cluster that has a single instance of Microsoft SQL Server 2012.
Problem Statements
ADatum identifies the following issues in its current environment:
* Over time, the data received by Health Interface from the hospitals has slowed, and the number of messages has increased.
* When a new hospital joins ADatum, Health Interface requires a schema modification due to the lack of data standardization.
* The speed of batch data processing is inconsistent.
Business Requirements
Business Goals
ADatum identifies the following business goals:
* Migrate the applications to Azure whenever possible.
* Minimize the development effort required to perform data movement.
* Provide continuous integration and deployment for development, test, and production environments.
* Provide faster access to the applications and the data and provide more consistent application performance.
* Minimize the number of services required to perform data processing, development, scheduling, monitoring, and the operationalizing of pipelines.
Health Review Requirements
ADatum identifies the following requirements for the Health Review application:
* Ensure that sensitive health data is encrypted at rest and in transit.
* Tag all the sensitive health data in Health Review. The data will be used for auditing.
Health Interface Requirements
ADatum identifies the following requirements for the Health Interface application:
* Upgrade to a data storage solution that will provide flexible schemas and increased throughput for writing data. Data must be regionally located close to each hospital, and reads must display the most recent committed version of an item.
* Reduce the amount of time it takes to add data from new hospitals to Health Interface.
* Support a more scalable batch processing solution in Azure.
* Reduce the amount of development effort to rewrite existing SQL queries.
Health Insights Requirements
ADatum identifies the following requirements for the Health Insights application:
* The analysis of events must be performed over time by using an organizational date dimension table.
* The data from Health Interface and Health Review must be available in Health Insights within 15 minutes of being committed.
* The new Health Insights application must be built on a massively parallel processing (MPP) architecture that will support the high performance of joins on large fact tables.

 

NEW QUESTION 51
You are designing a solution for the ad hoc analysis of data in Azure Databricks notebooks. The data will be stored in Azure Blob storage.
You need to ensure that Blob storage will support the recovery of the data if the data is overwritten accidentally.
What should you recommend?

  • A. Add a resource lock.
  • B. Use read-access geo-redundant storage (RA-GRS).
  • C. Enable diagnostics logging.
  • D. Enable soft delete.

Answer: D

Explanation:
Soft delete protects blob data from being accidentally or erroneously modified or deleted. When soft delete is enabled for a storage account, blobs, blob versions (preview), and snapshots in that storage account may be recovered after they are deleted, within a retention period that you specify.
Reference:
https://docs.microsoft.com/en-us/azure/storage/blobs/soft-delete-overview
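
The behavior described above can be modeled with a small in-memory store. This Python sketch is an analogy, not the Azure Storage API: the `SoftDeleteStore` class and its methods are hypothetical, and timestamps are passed in explicitly as seconds to keep the example deterministic.

```python
class SoftDeleteStore:
    """Toy blob store that keeps overwritten or deleted data for a retention period."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.live = {}     # name -> current data
        self.deleted = {}  # name -> list of (data, deleted_at), oldest first

    def put(self, name, data, now):
        if name in self.live:
            # Overwrite: keep the previous state as a soft-deleted copy.
            self.deleted.setdefault(name, []).append((self.live[name], now))
        self.live[name] = data

    def delete(self, name, now):
        self.deleted.setdefault(name, []).append((self.live.pop(name), now))

    def restore_latest(self, name, now):
        """Bring back the most recent soft-deleted copy if it is still retained."""
        versions = self.deleted.get(name, [])
        while versions:
            data, deleted_at = versions.pop()
            if now - deleted_at <= self.retention:
                self.live[name] = data
                return True
        return False


store = SoftDeleteStore(retention_seconds=7 * 86400)
store.put("report.csv", b"v1", now=0)
store.put("report.csv", b"v2", now=100)  # accidental overwrite
```

Calling `store.restore_latest("report.csv", now=200)` recovers the first version because the overwrite happened within the retention period.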

 

NEW QUESTION 52
You need to design the image processing solution to meet the optimization requirements for image tag data.
What should you configure? To answer, drag the appropriate setting to the correct drop targets.
Each source may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:


Tagging data must be uploaded to the cloud from the New York office location.
Tagging data must be replicated to regions that are geographically close to company office locations.

 

NEW QUESTION 53
You are designing a new application that uses Azure Cosmos DB. The application will support a variety of data patterns including log records and social media mentions.
You need to recommend which Cosmos DB API to use for each data pattern. The solution must minimize resource utilization.
Which API should you recommend for each data pattern? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Answer:

Explanation:


Log records: SQL
Social media mentions: Gremlin
You can store the actual graph of followers using Azure Cosmos DB Gremlin API to create vertexes for each user and edges that maintain the "A-follows-B" relationships. With the Gremlin API, you can get the followers of a certain user and create more complex queries to suggest people in common. If you add to the graph the Content Categories that people like or enjoy, you can start weaving experiences that include smart content discovery, suggesting content that those people you follow like, or finding people that you might have much in common with.
References:
https://docs.microsoft.com/en-us/azure/cosmos-db/social-media-apps
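
The "A-follows-B" graph described above can be modeled in plain Python to show the two kinds of query the explanation mentions. This is an in-memory sketch, not Gremlin or the Cosmos DB SDK; the `FollowGraph` class and the user names are hypothetical.

```python
from collections import defaultdict


class FollowGraph:
    """Users are vertices; an edge a -> b means 'a follows b'."""

    def __init__(self):
        self._follows = defaultdict(set)    # user -> users they follow
        self._followers = defaultdict(set)  # user -> users who follow them

    def add_follow(self, a, b):
        self._follows[a].add(b)
        self._followers[b].add(a)

    def followers_of(self, user):
        return set(self._followers.get(user, set()))

    def suggestions(self, user):
        """Users followed by the people `user` follows, excluding
        `user` and anyone `user` already follows."""
        already = self._follows.get(user, set())
        found = set()
        for friend in already:
            found |= self._follows.get(friend, set())
        return found - already - {user}


graph = FollowGraph()
graph.add_follow("ana", "ben")
graph.add_follow("dee", "ben")
graph.add_follow("ben", "cy")
graph.add_follow("ben", "ana")
```

The suggestion query is a two-hop traversal, which is the kind of relationship-heavy access pattern that a graph API is designed for.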

 

NEW QUESTION 54
You discover that the highest chance of corruption or bad data occurs during nightly inventory loads.
You need to ensure that you can quickly restore the data to its state before the nightly load and avoid missing any streaming data.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Answer:

Explanation:

Step 1: Before the nightly load, create a user-defined restore point
SQL Data Warehouse performs a geo-backup once per day to a paired data center. The RPO for a geo-restore is 24 hours. If you require a shorter RPO for geo-backups, you can create a user-defined restore point and restore from the newly created restore point to a new data warehouse in a different region.
Step 2: Restore the data warehouse to a new name on the same server.
Step 3: Swap the restored data warehouse name.
References:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/backup-and-restore
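
The three steps can be walked through with a toy model in which a server is a dictionary of named data warehouses. This Python sketch only illustrates the order of operations; the helper functions and names are hypothetical, and it does not model how streaming data is handled during the swap.

```python
import copy


def create_restore_point(server, name):
    """Step 1: capture the warehouse state before the nightly load."""
    return copy.deepcopy(server[name])


def restore_to_new_name(server, restore_point, new_name):
    """Step 2: restore the captured state under a new name on the same server."""
    if new_name in server:
        raise ValueError(f"{new_name} already exists")
    server[new_name] = copy.deepcopy(restore_point)


def swap_names(server, first, second):
    """Step 3: swap names so clients keep using the original name."""
    server[first], server[second] = server[second], server[first]


server = {"dw": {"inventory": [1, 2, 3]}}
point = create_restore_point(server, "dw")
server["dw"]["inventory"].append("bad row")  # nightly load corrupts the data
restore_to_new_name(server, point, "dw_restored")
swap_names(server, "dw", "dw_restored")
```

After the swap, the name `dw` points at the clean copy, and the corrupted copy remains available under `dw_restored` for inspection.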

 

NEW QUESTION 55
You need to recommend a storage solution for a sales system that will receive thousands of small files per minute. The files will be in JSON, text, and CSV formats. The files will be processed and transformed before they are loaded into an Azure data warehouse. The files must be stored and secured in folders.
Which storage solution should you recommend?

  • A. Azure SQL Database
  • B. Azure Data Lake Storage Gen2
  • C. Azure Blob storage
  • D. Azure Cosmos DB

Answer: B

Explanation:
Azure provides several solutions for working with CSV and JSON files, depending on your needs. The primary landing place for these files is either Azure Storage or Azure Data Lake Store. Azure Data Lake Storage is optimized storage for big data analytics workloads.
Incorrect Answers:
C: Azure Blob storage is a general-purpose object store for a wide variety of storage scenarios.
Blobs are stored in containers, which are similar to folders.
References:
https://docs.microsoft.com/en-us/azure/architecture/data-guide/scenarios/csv-and-json

 

NEW QUESTION 56
A company purchases IoT devices to monitor manufacturing machinery. The company uses an IoT appliance to communicate with the IoT devices.
The company must be able to monitor the devices in real-time.
You need to design the solution.
What should you recommend?

  • A. Azure Analysis Services using Azure PowerShell
  • B. Azure Data Factory instance using Azure Portal
  • C. Azure Analysis Services using Azure Portal
  • D. Azure Stream Analytics cloud job using Azure PowerShell

Answer: D

Explanation:
Stream Analytics is a cost-effective event processing engine that helps uncover real-time insights from devices, sensors, infrastructure, applications and data quickly and easily.
You can monitor and manage Stream Analytics resources with Azure PowerShell cmdlets and PowerShell scripts that execute basic Stream Analytics tasks.
Reference:
https://cloudblogs.microsoft.com/sqlserver/2014/10/29/microsoft-adds-iot-streaming-analytics-data-production-a
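
The kind of real-time aggregation a stream processing job performs can be illustrated with a tumbling window, which groups events into fixed, non-overlapping time intervals. The Python sketch below computes the average reading per device per window over an in-memory list; it is not Stream Analytics query language, and the `tumbling_average` helper, the device names, and the readings are hypothetical.

```python
from collections import defaultdict


def tumbling_average(events, window_seconds):
    """Average value per (window_start, device).
    events: iterable of (timestamp_seconds, device_id, value)."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for ts, device, value in events:
        # Each event belongs to exactly one window.
        window_start = (ts // window_seconds) * window_seconds
        bucket = sums[(window_start, device)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


readings = [
    (0, "m1", 10.0),
    (5, "m1", 20.0),
    (12, "m1", 30.0),
    (3, "m2", 4.0),
]
averages = tumbling_average(readings, window_seconds=10)
```

A real streaming job would emit each window's result as soon as the window closes rather than processing a finished list, but the grouping logic is the same.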

 

NEW QUESTION 57
......


Certification Path

The Microsoft Certified: Azure Data Engineer Associate certification includes the DP-200 and DP-201 exams.

 

Updated Verified Pass DP-201 Exam - Real Questions & Answers: https://www.testbraindump.com/DP-201-exam-prep.html