Configure an Azure Synapse Analytics Workspace Package – Data Sources and Ingestion

Posted on 2023-04-27 (updated 2024-08-05) by Benjamin Goodwin
  1. Log in to the Azure portal at https://portal.azure.com ➢ navigate to the Azure Synapse Analytics workspace you created in Exercise 3.3 ➢ on the Overview blade, click the Open link in the Open Synapse Studio tile ➢ select the Manage hub item ➢ click Upload in the Workspace Packages section ➢ upload the brainjammer.whl file, downloadable from the Chapter03/Ch03Ex05 directory on GitHub at https://github.com/benperk/ADE ➢ choose the Apache Spark Pools menu item ➢ hover over the Spark pool you created earlier ➢ select the ellipsis (…) ➢ select Packages ➢ select the Enable radio button under the Allow Session Level Packages section ➢ in the Workspace Packages section, click + Select from Workspace Packages ➢ check the box next to the brainjammer.whl package (if the package is not listed, it is likely still being applied; be patient and select the ellipsis (…) again until the .whl file is visible, as shown in Figure 3.42) ➢ consider also clicking the notification bell at the top of the page and looking for a Successfully Applied Settings notification.

FIGURE 3.42 Adding a workspace package in Azure Synapse Analytics

  2. Select the Develop hub link ➢ click the + ➢ select Notebook ➢ in the Attach To drop‐down box, select your Spark pool ➢ enter the following code snippet into the command window ➢ run the command:

    import pkg_resources
    for d in pkg_resources.working_set:
        print(d)
  3. The results show brainjammer 0.0.1 in the list of packages available on that instance. Enter the following snippet to execute the code in the csharpguitarpkg package. Figure 3.43 shows the output, and a sketch of how such a package might be built follows the figure.

    from csharpguitarpkg.brainjammer import brainjammer
    brainjammer()

FIGURE 3.43 Consuming a workspace package in Azure Synapse Analytics
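
For context, a workspace package like brainjammer.whl is a standard Python wheel. The following is a minimal, hypothetical sketch of how such a distribution might be laid out and built locally; the actual source lives in the GitHub repository referenced in the exercise, and the file names and setup.py contents here are assumptions, not the author's files:

    # setup.py -- hypothetical minimal packaging script; the real
    # brainjammer source is at https://github.com/benperk/ADE
    from setuptools import setup, find_packages

    setup(
        name="brainjammer",        # distribution name reported by pkg_resources
        version="0.0.1",
        packages=find_packages(),  # picks up the csharpguitarpkg package below
    )

    # Assumed project layout:
    #   csharpguitarpkg/
    #       __init__.py
    #       brainjammer.py         # defines the brainjammer() function
    #
    # Running `python setup.py bdist_wheel` (with setuptools and wheel
    # installed) produces dist/brainjammer-0.0.1-py3-none-any.whl, which is
    # the kind of file uploaded as a workspace package.

Note that the distribution name (brainjammer) and the importable package name (csharpguitarpkg) need not match, which is why the import statement in step 3 differs from the wheel's file name.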

Configuring workspace packages is a very powerful aspect of the platform. As long as your custom code runs with the default installed components, you can run just about any computation. You are limited only by your imagination.
You might have noticed the requirements.txt file in Figure 3.42. That file is useful for managing the versions of Python libraries that run on a pool. For example, numpy 1.19.4 is currently installed on the pools, as you can see by iterating over pkg_resources.working_set in the Spark pool notebook. If you add numpy==1.22.2 to the requirements.txt file you created in the previous exercise and upload it, that version of the library will be downloaded and installed on the pool instances from that point on. You can confirm this by iterating over pkg_resources.working_set again after the change has been successfully applied.
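
As a concrete illustration, the requirements.txt entry described above is a single pinned line, and a notebook cell along these lines (a sketch, not part of the exercise) can confirm the active version once the pool settings are applied:

    import pkg_resources

    # The requirements.txt uploaded to the pool contains one pin per line, e.g.:
    #   numpy==1.22.2

    # Report the numpy version active on this Spark pool instance;
    # expect 1.19.4 before the change and 1.22.2 after it is applied.
    print(pkg_resources.get_distribution("numpy"))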
