You have an Azure subscription that contains the following resources

Category : Microsoft Azure Data Engineering | Sub Category : Practice Assessment for Exam DP-203 - Data Engineering on Microsoft Azure | By Prasad Bonam Last updated: 2023-09-10 03:18:33 Viewed : 288


You have an Azure subscription that contains the following resources:

  • An Azure Stream Analytics job named Job1 that is configured to use six Scale Units (SUs)
  • An Azure event hub named Hub1 that contains a single partition
  • An event hub named Hub2 that contains 12 partitions

Job1 reads data from Hub1 and writes data to Hub2.

You need to ensure that Job1 can run parallelized.

Which two methods can you use? Each correct answer presents a complete solution.


Two methods parallelize the processing: create a separate job that repartitions the input into a new event hub that has 12 partitions and change Job1 to read from that event hub, or repartition the input within Job1.

Changing the number of SUs, whether to 36 or to 12, will not improve performance, because Job1 reads from a single partition and the job is therefore not parallelized.
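The reasoning above can be sketched as a small check. This is a simplified model only, not an Azure SDK type or the Stream Analytics engine: the class and method names are hypothetical, and the rule it encodes is the assumption that a job can run fully in parallel only when its input has more than one partition and the input and output partition counts match. The SU count is stored but deliberately never consulted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class JobTopology:
    """Hypothetical model of a streaming job (illustration only)."""
    input_partitions: int
    output_partitions: int
    streaming_units: int

    def can_parallelize(self) -> bool:
        # Simplified rule: only the partition layout matters here.
        # streaming_units is intentionally ignored, because adding
        # resources does not create additional partitions to work on.
        return (
            self.input_partitions > 1
            and self.input_partitions == self.output_partitions
        )


# Job1 as described: Hub1 has 1 partition, Hub2 has 12, six SUs.
job1 = JobTopology(input_partitions=1, output_partitions=12, streaming_units=6)

# Changing only the SU count leaves the partition layout untouched.
more_sus = replace(job1, streaming_units=36)

# Reading from a repartitioned, 12-partition input changes the layout.
repartitioned = replace(job1, input_partitions=12)

print(job1.can_parallelize())           # False
print(more_sus.can_parallelize())       # False
print(repartitioned.can_parallelize())  # True
```

In this model, only the third configuration passes, which mirrors why the SU options are wrong answers and the repartitioning options are correct.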

Use repartitioning to optimize Azure Stream Analytics jobs | Microsoft Learn

Get started with Azure Stream Analytics - Training | Microsoft Learn


To ensure that Job1 can run parallelized and make use of its configured Scale Units (SUs), the single-partition input from Hub1 has to be repartitioned. You can use either of the following methods:

  1. Repartition the input into a new event hub by using a separate job:

    • Create a second Stream Analytics job that reads from Hub1 and writes the data, partitioned by a suitable key, to a new event hub that has 12 partitions.
    • Change Job1 to read from the new event hub instead of Hub1. The input and output of Job1 then both have 12 partitions, so the partition counts align and Job1 can process the partitions in parallel.
  2. Repartition the input within Job1:

    • Add a repartitioning step at the start of the Job1 query, for example by using the PARTITION BY ... INTO clause on the input, so that the steps that follow run across multiple partitions.
    • Reading from Hub1 still happens on a single partition, but the query logic after the repartitioning step can run in parallel.

Changing only the partition count of Hub2 or the number of SUs does not help, because the bottleneck is the single partition of Hub1. Each of the two methods is a complete solution; the choice between them depends on whether you prefer to manage an extra job and event hub or to keep the repartitioning inside the Job1 query.
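As a rough illustration of what repartitioning the input achieves, the sketch below distributes events across partitions by hashing a key. It is an assumption-laden toy: the device IDs are made-up sample data, the function names are hypothetical, and zlib.crc32 is only a stand-in for whatever hashing Event Hubs or Stream Analytics actually uses.

```python
import zlib
from collections import Counter


def assign_partition(key: str, partition_count: int) -> int:
    """Map a key to a partition with a stable hash (stand-in only)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


def distribute(keys, partition_count):
    """Count how many events land in each partition."""
    return Counter(assign_partition(k, partition_count) for k in keys)


# Made-up sample keys; a real job would use a field from the events.
device_ids = [f"device-{i}" for i in range(100)]

single = distribute(device_ids, 1)   # like Hub1: one partition
spread = distribute(device_ids, 12)  # like a 12-partition event hub

print(len(single), "partition in use")
print(len(spread), "partitions in use")
```

With one partition, every event lands in the same place, so there is nothing to process side by side. Once the events are spread by key over 12 partitions, each partition holds its own subset and can be handled independently.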
