Skip to content

Microsoft Azure Data Engineering Associate (DP-203) Study Guide

Menu
  • Contact Us
Menu

Design Slowly Changing Dimensions – Data Sources and Ingestion

Posted on 2021-09-222024-08-05 by Benjamin Goodwin

Managing changes to the data stored on dimension tables over time is an important factor to consider and be handled by designing slowly changing dimensions (SCD). Recognize that the data on the fact table will contain a large amount of data. Making an update to data on that scale is not a prudent approach to manage changes that happen while running a business. Perhaps the model of BCI changes, or either the ELECTRODE name or its LOCATION on your head changes. How would you manage the dimension table data if the AF3 electrode were renamed to AF3i and its location were changed from Front Left to Top Left? Because it might not be feasible to update all the data in the fact table, you could instead keep track of the changes—for example, by adding two columns to the DimELECTRODE table. The first column would store the date when initially inserted, and the second column would store when the data has been modified—something like the following:

+—————+
| DimELECTRODE  |
+—————+
| ELECTRODE_ID  |
| ELECTRODE     |
| LOCATION      |
| INSERT_DATE   |
| MODIFIED_DATE |
+—————+

You might be wondering what happens when the data in the ELECTRODE and LOCATION data is changed. Do you lose that data? How would you then know that before AF3i and Top Left, existed AF3 and Front Left? This is accomplished by choosing which slowly changing dimension type is required for the given scenario (see Table 3.5).

TABLE 3.5 Slowly changing dimension types

TypeDescription
Type 1 SCDReflects latest values, overwrites historical data
Type 2 SCDPerforms versioning of dimensional data
Type 3 SCDStores both current and original data in separate columns
Type 4 SCDCreates a new table to store dimensional table data history
Type 5 SCDA combination of Type 1 and 4
Type 6 SCDA combination of Type 1, 2, and 3

The most common SCD types are 1, 2, 3, and 6.

Type 1 SCD

The two tables in Figure 3.15 represent the change in data in a Type 1 SCD scenario. Notice the values on the top table—specifically, ELECTRODE and LOCATION—which are the values that change in the update. Also note the values in the INSERT_DATE and MODIFIED_DATE columns. They are the same, which means there has been no change to these values.

FIGUER 3.15 Type 1 SCD

In the second table, note that the ELECTRODE and LOCATION values have changed, and the MODIFIED_DATE has been updated with a new date. You can see that there is no history of what the value was before the update. If you do not need this history in your scenario, then use this type. In the brainjammer scenario, is it important to know what the electrode was named before it got changed? Not really, from an analytics perspective.

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Archives

  • August 2024
  • July 2024
  • June 2024
  • May 2024
  • March 2024
  • February 2024
  • January 2024
  • December 2023
  • November 2023
  • October 2023
  • September 2023
  • August 2023
  • June 2023
  • May 2023
  • April 2023
  • March 2023
  • February 2023
  • January 2023
  • December 2022
  • November 2022
  • October 2022
  • July 2022
  • May 2022
  • March 2022
  • February 2022
  • January 2022
  • December 2021
  • November 2021
  • October 2021
  • September 2021
  • May 2021

Categories

  • ARM TEMPLATE
  • Create an Azure Data Factory
  • DATA EXPLORER POOLS
  • Design Analytical Stores
  • MANAGED PRIVATE ENDPOINTS
  • Microsoft DP-203
© 2025 Microsoft Azure Data Engineering Associate (DP-203) Study Guide All Rights Reserved