DP-100 (Microsoft Azure Data Science Solution) Exam Study Guide

Preparing for the DP-100 (Designing and Implementing a Data Science Solution on Azure) certification exam? Not sure where to start? This post is the DP-100 study guide, with links for each exam objective.

I have curated a list of articles from the Microsoft documentation for each objective of the DP-100 exam. I hope this article helps you prepare for the DP-100 certification exam. Please also share the post within your circles so it can help others prepare.

 

DP-100 Course (Online Training)

Pluralsight (Learning Path): Microsoft Azure Data Scientist Course
LinkedIn Learning (Free trial): Azure Machine Learning Development
Udemy: A-Z Machine Learning using Azure Machine Learning

 

DP-100 Practice Tests & Labs

Udemy Practice Test: 70+ Practice Exam Questions
DP-100 Labs on GitHub: Implementing an Azure Data Science Solution

 

DP-100 E-book (PDF)

Amazon: Azure Data Scientist 48 Test Prep Questions

 

DP-100 Exam Voucher

Test Voucher: Microsoft Azure Single Shot Exam Voucher ($30 OFF)

 

To view other Azure Certificate Study Guides, click here

Full Disclosure: Some of the links in this post are affiliate links. I receive a commission when you purchase through them.

 

Looking for DP-100 dumps? Read this!

Using DP-100 exam dumps can get you permanently banned from taking any future Microsoft certification exam. Read the FAQ page for more information. Instead, I strongly suggest you validate your understanding with practice questions.

 

Not Sure Which Exam Is Right for You?

Confused between AI-100 and DP-100? You are not alone. Read this blog post and choose the one that’s right for you!

 

Define and prepare the development environment (15-20%)

Select development environment

Assess the deployment environment constraints

https://docs.microsoft.com/en-us/learn/modules/register-and-deploy-model-with-amls/8-where-to-deploy-model

Analyze and recommend tools that meet system requirements

https://docs.microsoft.com/en-us/learn/modules/choose-data-science-option-in-azure/index

 

Select the development environment

https://docs.microsoft.com/en-us/learn/modules/choose-data-science-option-in-azure/index

 

Set up development environment

Create an Azure data science environment

https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/environment-setup

https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-workspace

Configure data science work environments

https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment

 

Quantify the business problem

Define technical success metrics

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/evaluate-model

Quantify risks

https://www.oliverwyman.com/our-expertise/insights/2018/dec/risk-journal-vol8-rethinking-tactics/the-risk-of-machine-learning-bias-and-how-to-prevent-it.html

https://www.oreilly.com/radar/managing-risk-in-machine-learning/

 

Prepare data for modeling (25-30%)

Transform data into usable datasets

Develop data structures

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/data-format-conversions

Design a data sampling strategy

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/partition-and-sample

Design the data preparation flow

https://azure.microsoft.com/en-us/resources/videos/preprocessing-data-in-azure-ml-studio/

 

Perform Exploratory Data Analysis (EDA)

Review visual analytics data to discover patterns and determine next steps

https://docs.microsoft.com/en-us/learn/create-data-visualizations-using-azure-databricks-and-power-bi/3-complete-labs-in-azure-databricks

https://docs.microsoft.com/en-us/learn/modules/analyze-climate-data-with-azure-notebooks/2-upload-data-and-create-scatterplot

https://docs.microsoft.com/en-us/learn/modules/analyze-climate-data-with-azure-notebooks/5-analyze-data-with-seaborn

Identify anomalies, outliers, and other data inconsistencies

https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/prepare-data

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clip-values
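
To make the outlier objective concrete, here is a minimal sketch in plain Python. It assumes the common interquartile-range rule (values beyond 1.5 x IQR from the quartiles are flagged) and then clips them to the fence values. The function names, the sample readings and the 1.5 factor are my own illustration and are not taken from the linked Clip Values module, which offers its own threshold options.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences using the IQR rule; k=1.5 is an assumed default."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """List the values that fall outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if v < low or v > high]

def clip(values, low, high):
    """Replace values below low with low and values above high with high."""
    return [min(max(v, low), high) for v in values]

# Hypothetical sensor readings with one obvious anomaly.
readings = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
low, high = iqr_bounds(readings)
outliers = find_outliers(readings)
clipped = clip(readings, low, high)
print(outliers)
```

Clipping keeps the row but caps the extreme value. Whether to clip, remove or keep an outlier depends on whether it is a data error or a genuine rare event.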

Create descriptive statistics for a dataset

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/summarize-data
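
If you want to see what a summary of a column boils down to, the sketch below computes a few common descriptive statistics with Python's standard `statistics` module. The `describe` helper and the sample list are hypothetical, and the set of statistics is my own selection rather than the exact output of the Summarize Data module.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
        "q1": q1,
        "q3": q3,
    }

# Hypothetical column of values.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
for name, value in summary.items():
    print(name, value)
```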

 

Cleanse and transform data

Resolve anomalies, outliers, and other data inconsistencies

The module ‘Preparing Input Data for Machine Learning Models’ covers most of the objectives

Standardize data formats

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/convert-to-csv

Set the granularity for data

View this Reddit discussion on data granularity in Machine Learning

Video on choosing Data Granularity


Develop models (40-45%)

Select an algorithmic approach

Determine appropriate performance metrics

https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/evaluate-model#metrics

Implement appropriate algorithms

https://docs.microsoft.com/en-us/azure/machine-learning/studio/algorithm-choice

Consider data preparation steps that are specific to the selected algorithms

Check the module ‘Identify Data-level Issues In Machine Learning Models’

 

Split datasets

Determine ideal split based on the nature of the data

https://www.freecodecamp.org/news/what-to-do-when-your-training-and-testing-data-come-from-different-distributions-d89674c6ecd8/

Determine number of splits

Coursera video on Data splitting strategies

Determine relative size of splits

https://stackoverflow.com/questions/is-there-a-rule-of-thumb-for-how-to-divide-a-dataset-into-training-and-validation

Ensure splits are balanced

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/split-data-using-split-rows#divide-a-dataset-into-two-groups (understand the concept of a stratified split)
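
The idea behind a stratified split is that each class keeps the same proportion in the training and test sets. Here is a minimal sketch of that idea in plain Python. The `stratified_split` function, the `label_of` callback and the toy rows are all hypothetical names I made up for illustration; this is not how the Split Data module is implemented.

```python
import random
from collections import defaultdict

def stratified_split(rows, label_of, test_fraction=0.25, seed=0):
    """Split rows so that each label contributes test_fraction of its rows to the test set."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    train, test = [], []
    for group in by_label.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test

# Toy imbalanced dataset: 20 positive rows and 80 negative rows.
rows = [(i, "pos") for i in range(20)] + [(i, "neg") for i in range(20, 100)]
train, test = stratified_split(rows, label_of=lambda r: r[1])

test_pos = sum(1 for r in test if r[1] == "pos")
train_pos = sum(1 for r in train if r[1] == "pos")
print(len(train), len(test), train_pos, test_pos)
```

With a plain random split, a small class can end up under-represented (or missing) in one of the sets. Splitting per class avoids that.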

 

Identify data imbalances

Resample a dataset to impose balance

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/smote
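
SMOTE creates new minority-class rows by interpolating between an existing minority row and one of its nearest minority neighbours. The sketch below is a simplified, SMOTE-style illustration of that interpolation step in plain Python. The `smote_like` function, its parameters and the sample points are hypothetical, and it is not the implementation used by the SMOTE module linked above.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-class feature vectors (two features each).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point lies on a segment between two real minority points, the new rows stay inside the region the minority class already occupies.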

Adjust performance metric to resolve imbalances

https://blogs.msdn.microsoft.com/using-roc-plots-and-the-auc-measure-in-azure-ml/
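
To see why accuracy is a poor metric on imbalanced data, here is a small sketch that compares accuracy with AUC. AUC is computed as the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one (ties count as half). The function names and the toy labels and scores are assumptions of mine for illustration only.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at the given threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(1 for y, p in zip(labels, predictions) if y == p) / len(labels)

# Toy imbalanced data: nine negatives, one positive.
labels = [0] * 9 + [1]
constant_scores = [0.1] * 10  # a "model" that always predicts the majority class

print(accuracy(labels, constant_scores))
print(roc_auc(labels, constant_scores))
```

The always-negative model looks good on accuracy because nine of ten rows are negative, while its AUC shows it cannot rank positives above negatives at all.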

Implement penalization

https://docs.microsoft.com/en-us/archive/msdn-magazine/2015/february/test-run-l1-and-l2-regularization-for-machine-learning

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/linear-regression#how-to-configure-linear-regression
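
The two links above cover L1 and L2 regularization. As a rough illustration of what an L2 penalty does, the sketch below fits a single-feature linear model with no intercept by minimizing sum((y - w*x)^2) + penalty * w^2, which has the closed-form solution w = sum(x*y) / (sum(x^2) + penalty). The `ridge_slope` name and the toy data are hypothetical, and the no-intercept, one-feature setup is a simplifying assumption.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form slope for one-feature, no-intercept regression with an L2 penalty."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)

# Toy data that lies exactly on y = 2x.
xs = [1, 2, 3]
ys = [2, 4, 6]

for penalty in (0, 1, 14):
    print(penalty, ridge_slope(xs, ys, penalty))
```

With no penalty the slope matches the data exactly. As the penalty grows, the weight shrinks toward zero, which is the trade-off regularization makes to reduce overfitting.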

 

Train the model

Select early stopping criteria

https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-tune-hyperparameters#specify-early-termination-policy
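
The linked article describes the early termination policies available for hyperparameter tuning runs in Azure Machine Learning. The sketch below is not one of those policies. It is a generic patience-based rule, which I am assuming here only to show the general idea: stop once the validation metric has failed to improve for a set number of consecutive checks. The function name, `patience`, `min_delta` and the loss values are all hypothetical.

```python
def stopping_step(metric_history, patience=3, min_delta=0.0):
    """Return the step at which training would stop, or None if it never triggers.

    Assumes a lower metric is better (for example, validation loss).
    """
    best = float("inf")
    stale = 0  # consecutive checks without improvement
    for step, value in enumerate(metric_history):
        if value < best - min_delta:
            best = value
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return step
    return None

# Hypothetical validation loss per epoch: improves, then starts to overfit.
val_loss = [0.90, 0.70, 0.60, 0.58, 0.59, 0.60, 0.61, 0.65]
stop_at = stopping_step(val_loss, patience=3)
print(stop_at)
```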

Tune hyper-parameters

https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-tune-hyperparameters

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/tune-model-hyperparameters
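
Hyperparameter tuning comes down to trying combinations of settings and keeping the one with the best validation score. Here is a minimal grid search sketch in plain Python to illustrate that loop. The `grid_search` function, the search space and `toy_objective` (a stand-in for "train a model and return its validation score") are all hypothetical; the Azure tooling linked above provides its own sampling methods and APIs, which this does not use.

```python
import itertools

def grid_search(objective, space):
    """Evaluate every combination in space and return (best_params, best_score)."""
    names = list(space)
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, combo))
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_objective(params):
    """Hypothetical stand-in for training a model and scoring it on validation data."""
    return 1.0 - (params["learning_rate"] - 0.1) ** 2 - 0.01 * abs(params["depth"] - 4)

# Hypothetical search space.
space = {"learning_rate": [0.001, 0.01, 0.1, 1.0], "depth": [2, 4, 8]}
best_params, best_score = grid_search(toy_objective, space)
print(best_params, best_score)
```

Grid search is exhaustive, so its cost grows quickly with the number of hyperparameters. Random sampling with an early termination policy is a common way to keep the cost down.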

Check the FAQs on Azure Certification

This brings us to the end of the DP-100 Study Guide.

What do you think? Let me know in the comments section if I have missed anything. I would also love to hear how your preparation is going!

If you are looking for other Azure certification exams, check out this page.

Sign up for Newsletter

Want to be notified as soon as I post? Subscribe to the RSS feed or leave your email address in the subscribe section. Share the article on your social networks so it can benefit others.
