Microsoft Azure Data Engineering Associate (DP-203) Study Guide

If you have never analyzed your datasets to see whether they would benefit from partitioning, by all means do so. You should now have the skill set to ask those basic questions. The recommended file size depends on the pool type: between 100 MB and 10 GB for a SQL pool, and between 256 MB and 100 GB for a Spark pool. Both the number of files and their size affect performance, so finding the best ratio for a given context requires testing and tuning. Over time, the amount of data in one partition might grow much larger than in the others, which means that queries reading that partition would run more slowly than others. Perhaps most brainjammer scenarios uploaded over the past few months were of a single type. If that’s the case, the partition for that scenario type would be larger than the others, so you might want to find a new way to partition, perhaps on session datetime.
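The two checks described above, file size against the recommended range and one partition outgrowing the rest, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the "twice the median" skew rule, and the sample scenario counts are hypothetical and do not come from the brainjammer dataset or any Azure API. Only the size ranges are taken from the text.

```python
import statistics
from collections import Counter

MB = 1024 ** 2
GB = 1024 ** 3

# Recommended file-size ranges quoted in the text, expressed in bytes.
RECOMMENDED_FILE_SIZE = {
    "sql": (100 * MB, 10 * GB),
    "spark": (256 * MB, 100 * GB),
}


def file_size_in_range(size_bytes, pool_type):
    """Return True if a file size falls inside the recommended range for the pool type."""
    low, high = RECOMMENDED_FILE_SIZE[pool_type]
    return low <= size_bytes <= high


def skewed_partitions(partition_keys, ratio=2.0):
    """Return partitions whose row count exceeds `ratio` times the median row count.

    The ratio is an arbitrary illustrative threshold, not a documented rule.
    """
    counts = Counter(partition_keys)
    median = statistics.median(counts.values())
    return {key: n for key, n in counts.items() if n > ratio * median}


# Hypothetical sample: one scenario type dominates recent uploads.
sample_rows = ["Meditation"] * 10 + ["TikTok"] * 12 + ["MetalMusic"] * 90

print(skewed_partitions(sample_rows))
print(file_size_in_range(50 * MB, "sql"))
print(file_size_in_range(512 * MB, "spark"))
```

If a check like this flags a partition, that is a hint to test a different partition key, such as session datetime, rather than proof that the current scheme is wrong.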

When a partition holds more than 1,000,000 rows, compression improves and, therefore, so does performance. Keep an eye on that number and make sure the row count is optimal on all partitions. You can monitor shuffling on a SQL pool by examining the execution plan for a query, which shows the amount of shuffling the query incurs. Queries that suffer from shuffling are those that contain JOINs on data that is not present on the node that executes the query.
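As a rough illustration of both ideas, the sketch below flags partitions that have not passed the 1,000,000-row mark and counts how many rows would have to move to another node for a JOIN. It is a toy model, not how a SQL pool is implemented. The function names, the sample data, and the use of a simple modulo on integer keys in place of the real hash distribution are all assumptions made for the example.

```python
ROW_THRESHOLD = 1_000_000  # row count quoted in the text


def partitions_below_threshold(row_counts, threshold=ROW_THRESHOLD):
    """Return names of partitions that do not exceed the threshold, smallest first."""
    small = [name for name, n in row_counts.items() if n <= threshold]
    return sorted(small, key=row_counts.get)


def rows_needing_movement(rows, stored_on, join_on, node_count):
    """Count rows stored on a different node than the one that owns their join key.

    Node placement is modeled as `key % node_count`, a stand-in for hash distribution.
    """
    moved = 0
    for row in rows:
        storage_node = row[stored_on] % node_count
        join_node = row[join_on] % node_count
        if storage_node != join_node:
            moved += 1
    return moved


# Hypothetical row counts per partition.
counts = {"2024-01": 2_400_000, "2024-02": 350_000, "2024-03": 1_000_000}
print(partitions_below_threshold(counts))

# Hypothetical table distributed on session_id but joined on scenario_id.
readings = [
    {"session_id": 1, "scenario_id": 1},
    {"session_id": 2, "scenario_id": 1},
    {"session_id": 3, "scenario_id": 2},
    {"session_id": 4, "scenario_id": 2},
]
print(rows_needing_movement(readings, "session_id", "scenario_id", node_count=2))
print(rows_needing_movement(readings, "scenario_id", "scenario_id", node_count=2))
```

In this model, distributing the table on the same column used in the JOIN leaves every row on the node that needs it, which is why the last call reports no movement.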

© 2025 Microsoft Azure Data Engineering Associate (DP-203) Study Guide All Rights Reserved