Sunday, 17 September 2017

Parallel Programming: Introducing the Task Parallel Library

Introduction To Parallel Programming

Parallel programming is a programming model in which an application's work is split into pieces that run at the same time on multiple cores, processors, or computers in order to improve performance. Spreading the pieces this way can reduce the overall time needed to complete the work and/or improve the user's experience. However, this is not always the case: there are several pitfalls you need to be aware of, and we will discuss them throughout this series.
The idea of parallelism is not new, but years ago multi-core processors were scarce, so mostly research labs were able to take advantage of it. Thankfully, times have changed and processors have greatly evolved. Today's mobile devices have more than one core, and PCs with four or eight cores are common.
Still, developers are not as aware of the Task Parallel Library (TPL) or of the advantages of parallel programming as they could be. Therefore, the main goal of this series is to introduce the TPL and show how much easier it is to use than the classic threading model. In the end, you should be able to incorporate it into your current projects or use it in new ones, where it makes sense.

When To Go Parallel

Parallelism is, at its core, a performance play; that is its key benefit. Even though there are scenarios where concurrent execution is the clear solution, you can't automatically assume that dividing the workload over several cores will outperform sequential execution, so a lot of measurement is involved. For this purpose, the Stopwatch class from the System.Diagnostics namespace usually meets all the requirements.
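To make the measuring advice concrete, here is a minimal sketch that times the same work done sequentially and split across two tasks (the Task type itself is introduced later in this article). The MeasureDemo class, the SumRange workload, and the value of N are made up for illustration; the timings depend entirely on your machine, so none are shown.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class MeasureDemo
{
    // Hypothetical workload: sums the integers in [from, to).
    public static long SumRange(int from, int to)
    {
        long sum = 0;
        for (int i = from; i < to; i++)
            sum += i;
        return sum;
    }

    static void Main()
    {
        const int N = 100000000;

        // Time the sequential version.
        Stopwatch sw = Stopwatch.StartNew();
        long sequential = SumRange(0, N);
        sw.Stop();
        Console.WriteLine("Sequential: {0} ms", sw.ElapsedMilliseconds);

        // Time the same work split into two halves, one task each.
        sw.Restart();
        Task<long> lower = Task.Factory.StartNew<long>(() => SumRange(0, N / 2));
        Task<long> upper = Task.Factory.StartNew<long>(() => SumRange(N / 2, N));
        long parallel = lower.Result + upper.Result;
        sw.Stop();
        Console.WriteLine("Two tasks:  {0} ms", sw.ElapsedMilliseconds);

        // Both approaches must produce the same sum.
        Console.WriteLine("Results match: {0}", sequential == parallel);
    }
}
```

Run each variant several times, in a release build, before drawing conclusions; a single run can be misleading.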
"Going parallel" is not a cure for everything since there are some caveats you should definitely be aware of. With these in mind, you should be able to make the proper and educated decision whether to stay sequential or dive into parallel programming.
  • Additional overhead: there is always some overhead, since the TPL has to create, schedule, and manage the tasks. If you have only a small amount of work to do, running it concurrently may not outperform the sequential version.
  • Data coordination: if your pieces of work need to access and alter the same data or resources, you will need to add some kind of coordination. The more coordination you need, the worse the parallel performance will be. However, if the pieces are independent and isolated from each other, there is nothing to worry about.
  • Scaling: the TPL will usually take care of scaling for you. Still, there are cases when you need to tune it yourself. Many options are available, but their effect depends on the hardware design and its limits, so be ready to play the "hit or miss" game. Adding cores can bring a significant performance improvement, but the extra cores might not be fully used, and past a certain point more cores won't improve performance at all.
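As an illustration of the data coordination caveat, the following sketch has several tasks increment one shared counter. It is only one possible approach: it uses Interlocked.Increment from System.Threading, and the CoordinationDemo class and its parameters are hypothetical. A plain counter++ in the loop would risk lost updates, because the read, add, and write steps of different tasks can interleave.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CoordinationDemo
{
    // Starts taskCount tasks; each increments the shared counter perTask times.
    public static int CountWithInterlocked(int taskCount, int perTask)
    {
        int counter = 0;
        Task[] tasks = new Task[taskCount];

        for (int t = 0; t < taskCount; t++)
        {
            tasks[t] = Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < perTask; i++)
                {
                    // Atomic increment: the coordination that keeps the count correct.
                    Interlocked.Increment(ref counter);
                }
            });
        }

        // Block until every task has finished before reading the counter.
        Task.WaitAll(tasks);
        return counter;
    }

    static void Main()
    {
        Console.WriteLine("Counter: {0}", CountWithInterlocked(4, 100000));
    }
}
```

The atomic increment is not free. If each task can instead count into its own local variable and the partial counts are added up at the end, the coordination cost largely disappears.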

Beginning With The Task Parallel Library

Support for parallel programming in the .NET Framework is not new; it has been there since the very first version, 1.0. We refer to this as the classic threading model. Even though it works well, managing all the parallel aspects by hand is complicated, so applications often end up with unexpected results.
The TPL, on the other hand, is built on top of the classic threading features and manages many of these aspects for you, so you need to write considerably less code to achieve the same behavior.
We start with the basics of the Task class, which can be considered the heart of the entire library. Please note that to use the TPL's features, you need to import the proper namespace in your code.
    using System.Threading.Tasks;

Creating And Starting New Task

In the simplest scenario, to create and start a task you just need to provide its body, which represents the workload you want to run in parallel, by passing in a System.Action delegate. There are several ways to declare the task's body. These are listed below and demonstrated in the first example.
  • Using an Action delegate
  • Using an anonymous function
  • Using a lambda expression
For the sake of simplicity, our pieces of workload running concurrently will be represented by a simple HelloConsole method that will print one line to the console.
    static void HelloConsole()
    {
        Console.WriteLine("Hello Task");
    }
After creating a new instance of the Task class and passing the workload you want to perform as the constructor argument, you just need to call the instance's Start() method to begin execution.
The following example shows the three options for declaring the Task object along with the console output.
    static void Main(string[] args)
    {
        //Action delegate
        Task task1 = new Task(new Action(HelloConsole));

        //anonymous function
        Task task2 = new Task(delegate
        {
            HelloConsole();
        });

        //lambda expression
        Task task3 = new Task(() => HelloConsole());

        task1.Start();
        task2.Start();
        task3.Start();

        Console.WriteLine("Main method complete. Press any key to finish.");
        Console.ReadKey();
    }
 
 Image 1: Creating and running simple tasks
 
Note: If you have some simple and short-lived tasks, you can start them directly using the Task.Factory.StartNew() method without having to explicitly create the object.
    Task.Factory.StartNew(() =>
    {
        HelloConsole();
    });
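The examples so far keep the process alive with Console.ReadKey(), so the main thread may print its message before the tasks have printed theirs. If you need to block until the work has finished instead, one option is the Task.WaitAll() method (or Wait() on a single task). The sketch below assumes a hypothetical WaitDemo class and StartGreetings helper.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitDemo
{
    // Starts `count` short tasks and returns them so the caller can wait on them.
    public static Task[] StartGreetings(int count)
    {
        Task[] tasks = new Task[count];
        for (int i = 0; i < count; i++)
        {
            // StartNew creates the task and schedules it in one call.
            tasks[i] = Task.Factory.StartNew(() => Console.WriteLine("Hello Task"));
        }
        return tasks;
    }

    static void Main()
    {
        Task[] tasks = StartGreetings(3);

        // Blocks the calling thread until all of the tasks have completed.
        Task.WaitAll(tasks);

        Console.WriteLine("All tasks complete.");
    }
}
```

Because of the WaitAll() call, "All tasks complete." is printed only after the three greetings.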

Setting Task State

If you need to perform the same workload on a different set of data, or just need to provide some parameter to the task, you pass in a System.Action<object> together with an object representing the data or parameters. This is very similar to supplying your console application with command line arguments. The following example shows this by providing a simple string argument that is printed to the console during the workload execution.
    static void Main(string[] args)
    {
        //Action delegate
        Task task1 = new Task(new Action<object>(HelloConsole), "Task 1");

        //anonymous function
        Task task2 = new Task(delegate(object obj)
        {
            HelloConsole(obj);
        }, "Task 2");

        //lambda expression
        Task task3 = new Task((obj) => HelloConsole(obj), "Task 3");

        task1.Start();
        task2.Start();
        task3.Start();

        Console.WriteLine("Main method complete. Press any key to finish.");
        Console.ReadKey();
    }
We have also slightly altered the HelloConsole method that now accepts an object argument that will be printed to the console.
    static void HelloConsole(object message)
    {
        Console.WriteLine("Hello: {0}", message);
    }
 
 
Image 2: Setting state/supplying a parameter
 
Please note that the order in which the tasks complete might differ on your machine, since it depends on how the tasks are scheduled and how fast each one executes.
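The state argument is especially handy when tasks are created in a loop. A lambda that captures the loop variable of a for loop directly shares that single variable with the loop, so by the time a task runs the value may already have changed. Passing the value as state hands each task its own copy. The following sketch illustrates this; the StateDemo class and the SquareViaState method are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class StateDemo
{
    // Each task receives its own index as state and fills in one array slot.
    public static int[] SquareViaState(int count)
    {
        int[] results = new int[count];
        Task[] tasks = new Task[count];

        for (int i = 0; i < count; i++)
        {
            // The current value of i is boxed and stored with the task,
            // so later changes to i do not affect it.
            tasks[i] = new Task(state =>
            {
                int index = (int)state;
                results[index] = index * index;
            }, i);
            tasks[i].Start();
        }

        // Wait for all tasks before reading the results.
        Task.WaitAll(tasks);
        return results;
    }

    static void Main()
    {
        int[] squares = SquareViaState(5);
        Console.WriteLine(string.Join(", ", squares));
    }
}
```

Each task writes only to its own array element, so no additional coordination is needed here.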

Getting A Task's Result

To get a result from a task, you need to create an instance of Task<T> instead of a plain Task, where T is the type of the result that will be returned. Returning the result works the same as in any other C# method: you use the "return" keyword. Finally, to fetch the result, you read the Result property. Note that reading this property blocks the calling thread until the task has completed.
    static void Main(string[] args)
    {
        //creating the task
        Task<int> task1 = new Task<int>(() =>
        {
            int result = 1;

            //multiplies 1 * 2 * ... * 9, i.e. computes 9!
            for (int i = 1; i < 10; i++)
                result *= i;

            return result;
        });

        //starting the task
        task1.Start();

        //waiting for result - printing to the console
        Console.WriteLine("Task result: {0}", task1.Result);

        Console.WriteLine("Main method complete. Press any key to finish.");
        Console.ReadKey();
    }
 
Image 3:  Getting a result from a task
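State and results can be combined: Task<T> also has a constructor that takes a Func<object, T> and a state object. As a rough sketch, the following code starts one task per input value and then collects the results; the ResultDemo class, the Factorial helper, and the input values are chosen purely for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ResultDemo
{
    // Hypothetical workload: n! computed with a simple loop.
    public static long Factorial(int n)
    {
        long result = 1;
        for (int i = 2; i <= n; i++)
            result *= i;
        return result;
    }

    static void Main()
    {
        int[] inputs = { 5, 10, 15 };
        Task<long>[] tasks = new Task<long>[inputs.Length];

        // One task per input; the input travels to the task as its state.
        for (int k = 0; k < inputs.Length; k++)
        {
            tasks[k] = new Task<long>(state => Factorial((int)state), inputs[k]);
            tasks[k].Start();
        }

        // Reading Result blocks until the corresponding task has finished.
        for (int k = 0; k < inputs.Length; k++)
            Console.WriteLine("{0}! = {1}", inputs[k], tasks[k].Result);
    }
}
```

The results are printed in input order, because the second loop reads them in that order regardless of which task finishes first.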

Cancelling A Task

If we have more complex tasks that take some time to complete, we need a way to cancel them before they finish. For this purpose, the TPL works with cancellation tokens. To be able to cancel a started task, we pass an instance of CancellationToken to the task's constructor.
    Task task = new Task(() =>
    {
        //task's body
    }, token);
Acquiring this token is a two-step process.
First, we need to create an instance of CancellationTokenSource (both types live in the System.Threading namespace):
    CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
Next, to get the required CancellationToken instance, we read the CancellationTokenSource.Token property:
    CancellationToken token = cancellationTokenSource.Token;
Finally, once the token has been acquired and passed to the task's constructor, we simply call the Cancel() method of the CancellationTokenSource to request cancellation.
    cancellationTokenSource.Cancel();
Calling the Cancel() method won't cancel the task immediately. Cancellation is cooperative: in the body of the task you need to monitor the token by checking its IsCancellationRequested property. Once it is true, a cancellation was requested, and you can stop the work either by returning from the body or by throwing an OperationCanceledException.
The following example shows a basic use of cancellation tokens to cancel a running task.
    static void Main(string[] args)
    {
        //creating the cancelation token
        CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
        CancellationToken token = cancellationTokenSource.Token;

        //creating the task
        Task task = new Task(() =>
        {
            for (int i = 0; i < 100000; i++)
            {
                if (token.IsCancellationRequested)
                {
                    Console.WriteLine("Cancel() called.");
                    return;
                }

                Console.WriteLine("Loop value {0}", i);
            }
        }, token);

        Console.WriteLine("Press any key to start task");
        Console.WriteLine("Press any key again to cancel the running task");
        Console.ReadKey();

        //starting the task
        task.Start();

        //reading a console key
        Console.ReadKey();

        //canceling the task
        Console.WriteLine("Canceling task");
        cancellationTokenSource.Cancel();

        Console.WriteLine("Main method complete. Press any key to finish.");
        Console.ReadKey();
    }
 
Image 4: Cancelling a task
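The two ways of stopping are not equivalent. A task that simply returns ends up in the RanToCompletion state, whereas a task that throws an OperationCanceledException for the same token it was created with ends up in the Canceled state. As a variation on the example above, the following sketch uses the token's ThrowIfCancellationRequested() method, which checks the token and throws in one call. The CancelDemo class and the RunAndCancel method are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancelDemo
{
    // Starts a task, cancels it once it is running, and reports its final status.
    public static TaskStatus RunAndCancel()
    {
        CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
        CancellationToken token = cancellationTokenSource.Token;
        ManualResetEventSlim started = new ManualResetEventSlim(false);

        Task task = new Task(() =>
        {
            started.Set(); // signal that the body is running
            while (true)
            {
                // Throws OperationCanceledException once Cancel() has been called.
                token.ThrowIfCancellationRequested();
                Thread.Sleep(10); // stand-in for a unit of real work
            }
        }, token);

        task.Start();
        started.Wait(); // make sure the task body is executing
        cancellationTokenSource.Cancel();

        try
        {
            task.Wait();
        }
        catch (AggregateException)
        {
            // Waiting on a canceled task throws; the cancellation is wrapped here.
        }

        return task.Status;
    }

    static void Main()
    {
        Console.WriteLine("Final status: {0}", RunAndCancel());
    }
}
```

Code that waits on the task can use this to tell a canceled task apart from one that finished its work.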

Summary

The Task Parallel Library is built on the classic threading model and greatly simplifies the management of concurrent workloads. As a result, it reduces the amount of code we need to write and thus helps to prevent typical problems associated with the older threading model.
