Wednesday, February 13, 2013

What is the exact difference between parallel and concurrent programming?

Parallelism is about speeding things up, whereas concurrency is about dealing with simultaneity or nondeterminism.

When we talk about parallel programming, typically we are interested in reducing execution times by taking advantage of hardware’s ability to do more than one thing at once, whether by vectorization, instruction-level parallelism (superscalar architectures), or multiprocessing (multiple cores or processors). Parallel programming does not imply a particular programming model, and many techniques and languages for parallel programming are at various stages of development or adoption.

One model that seems promising is data parallelism, in which uniform operations over aggregate data can be sped up by partitioning the data and computing on the partitions simultaneously. For example, High Performance Fortran offers parallel loops, where an operation such as adding two arrays element-wise can be sped up by vectorizing a loop or splitting it between several processing units that handle different portions of the arrays simultaneously. Data Parallel Haskell, a research language, extends this model to allow parallelizing operations on ragged arrays of arrays. MapReduce is also very similar to the data parallel model. The nice thing about data parallelism is that it’s usually deterministic and implicit—you don’t have to write a program that says “do these four things at the same time,” but rather one that says “do this to that whole array,” and the compiler or run-time system works out how to partition things. The less nice thing is that data parallelism usually only applies when you have a big array that you want to treat more or less uniformly.
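To make that concrete, here is a minimal sketch in Java (standing in for the data-parallel languages above, since Java’s parallel streams offer a modest version of the same idea): one uniform operation over whole arrays, with the partitioning left to the runtime. The array contents and size are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class ParallelAdd {
    public static void main(String[] args) {
        int n = 1_000_000;
        double[] a = new double[n], b = new double[n], c = new double[n];
        Arrays.fill(a, 1.0);   // stand-in data
        Arrays.fill(b, 2.0);

        // One uniform operation over the whole arrays; the runtime decides
        // how to partition the index range across worker threads.
        IntStream.range(0, n)
                 .parallel()
                 .forEach(i -> c[i] = a[i] + b[i]);

        System.out.println(c[0]);   // 3.0
    }
}
```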

Another approach to parallelism is futures or fork/join parallelism, in which some computation whose value is not needed right away is started in parallel with the “main” computation, and then when the main computation needs the result of the side computation, it may have to wait for it to finish. Futures were introduced, as far as I can tell, in MultiLisp. An expression e could instead be written (FUTURE e), and it would mean the same thing, but might run in parallel until its value is actually needed. Another language that provides a similar model is Cilk, a C-like language that has a parallel procedure call mechanism that looks a lot like futures. Finally, my favorite language for this style of parallel programming, since I use it on a daily basis, is parallel make, which, after computing the dependency graph for your desired build products, runs build stages in parallel when it can. Futures are often good when you have a computation that can be broken into pieces whose values depend on each other in well-defined ways, but aren’t necessarily uniform enough for data parallelism to work. Futures are notionally deterministic, in the sense that you shouldn’t be able to observe whether a future is run in parallel or whether the program waits for it to finish before proceeding, but actual implementations’ approach to side effects may result in nondeterminism. For example, parallel make should give the same results as serial make, but if two different build stages access some file in common that you don’t mention in the recipe, they can interfere with one another and produce a different result.
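Here is roughly what a future buys you, sketched with Java’s CompletableFuture rather than MultiLisp or Cilk; expensiveComputation and otherWork are hypothetical placeholders for the side and main computations.

```java
import java.util.concurrent.CompletableFuture;

public class FutureSketch {
    public static void main(String[] args) {
        // Start the side computation; it may run in parallel with what follows.
        CompletableFuture<Long> future =
            CompletableFuture.supplyAsync(FutureSketch::expensiveComputation);

        long other = otherWork();          // the "main" computation proceeds

        // Only here do we wait, and only if the future hasn't finished yet.
        long combined = other + future.join();
        System.out.println(combined);
    }

    static long expensiveComputation() { return 42L; }  // placeholder
    static long otherWork()            { return 1L;  }  // placeholder
}
```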

Another way to get parallelism is through explicit concurrent programming, though that’s probably not the best way to get parallelism and not the most important use of concurrency.

In concurrent programming, a program is typically factored into multiple threads of control with distinct responsibilities and purposes, which may run simultaneously or take turns. This is often about dealing with concurrency and nondeterminism in the world, or about reducing latency. For example, your web browser needs to deal with events from the UI and events from the network while also performing some computation such as rendering a page. While it is possible to program that kind of thing as a single thread of control that checks for various events and does other computations, it’s often easier, conceptually, to think of it as several different processes cooperating on different aspects of the whole.
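As a toy sketch of that factoring (in Java, with the actual browser work replaced by print statements), each responsibility gets its own thread of control rather than a clause in one big polling loop:

```java
public class TwoResponsibilities {
    public static void main(String[] args) throws InterruptedException {
        // One thread of control per responsibility.
        Thread network = new Thread(() -> {
            // pretend: wait for bytes from the network, hand them off
            System.out.println("network: got response");
        }, "network");

        Thread renderer = new Thread(() -> {
            // pretend: lay out and paint the current page
            System.out.println("renderer: painted frame");
        }, "renderer");

        network.start();
        renderer.start();
        network.join();
        renderer.join();
    }
}
```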

Most programming models for concurrency differ along two main axes: how processes are scheduled and how they cooperate. For the former, one end of the axis is preemptive multithreading, wherein several independent instruction streams run either simultaneously on different execution units or are given turns by a scheduler. This is the scheduling model used by Java and pthreads, for example. At the other end are coroutines, which yield control to one another at well-defined points of the programmer’s choosing. This is the idea behind Ruby’s fibers. Other models for scheduling include event loops with callbacks, as used by some GUI libraries, and stream transducers in dataflow programming. One advantage of preemptive multithreading is that it can reduce latency, and it doesn’t require programmer effort to figure out when to transfer control, but a big disadvantage is that it can make coordination more difficult.
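For contrast with preemptive threads, here is a toy single-threaded event loop with callbacks, written in Java only as a neutral notation: tasks take turns at well-defined points (when the previous callback returns), so nothing is preempted, but a slow callback delays everything queued behind it.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class EventLoopSketch {
    public static void main(String[] args) {
        // A single thread of control; "tasks" run one at a time, to completion.
        Queue<Runnable> tasks = new ArrayDeque<>();

        tasks.add(() -> System.out.println("handle click"));
        tasks.add(() -> System.out.println("handle network reply"));
        tasks.add(() -> System.out.println("repaint"));

        while (!tasks.isEmpty()) {
            tasks.poll().run();   // each callback runs to completion before the next
        }
    }
}
```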

Coordination—how multiple threads of control cooperate to get something done—is to me the more interesting axis. Probably the most common model, and the hardest to use, is shared memory with explicit synchronization. This is the model used by pthreads and, to some extent, Java. In this model, mutable objects may be shared by multiple threads running at the same time, which is a common source of bugs. Synchronization mechanisms such as locks can prevent interference, but they introduce problems of their own. Java improves on this model a bit with monitors, which make some synchronization less explicit, but it’s still pretty difficult to get right.
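The canonical small example of why this model is tricky is a shared counter: count++ is a read-modify-write, so two unsynchronized threads can lose updates. A minimal Java sketch, with an explicit lock:

```java
public class SharedCounter {
    private long count = 0;
    private final Object lock = new Object();

    // Without the synchronized block, two threads doing count++ at the
    // same time can lose updates: read-modify-write is not atomic.
    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedCounter c = new SharedCounter();
        Runnable work = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };

        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();

        System.out.println(c.count);   // reliably 200000 only because of the lock
    }
}
```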

Because of the problems with shared-memory concurrency, two other models for concurrent communication have been gaining traction lately: transactional memory and message passing. In a transactional memory system, memory appears to be shared between threads, but it provides atomic transactions, whereby a thread may make several accesses to shared mutable variables and have guaranteed isolation. This has been implemented in software, originally in Concurrent Haskell, and more recently in other languages such as Clojure. In message-passing concurrency, threads no longer share mutable variables, but communicate by explicitly sending messages. The poster language for message-passing concurrency is Erlang, and another language with a slightly different approach is Concurrent ML.
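Since standard Java has no built-in transactional memory, here is a sketch of the message-passing style only, approximating an Erlang-style mailbox with a blocking queue; each thread keeps its own state private, and the queue is the only thing shared.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MessagePassing {
    public static void main(String[] args) throws InterruptedException {
        // The mailbox is the only shared object; everything else stays
        // private to its owning thread.
        BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();

        Thread producer = new Thread(() -> {
            try {
                mailbox.put("hello");
                mailbox.put("world");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        System.out.println(mailbox.take());   // blocks until a message arrives
        System.out.println(mailbox.take());
        producer.join();
    }
}
```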


All that said, most modern programming languages support both concurrency and parallelism, often via multiple paradigms. For example, C#’s native model for concurrency is shared memory with synchronization, but there are libraries for both message passing and software transactional memory. In some sense, shared memory is the least-common denominator on which other concurrency approaches and parallelism may be implemented (though tricky or missing memory models can make it harder to get right than it seems). Many programs can benefit from both parallelism and concurrency. For example, the web browser I mentioned earlier may be able to speed up particular tasks, such as rendering, by parallelizing them, at the same time that it uses concurrency to deal with concurrency in the world.
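To see why shared memory can serve as that substrate, here is a toy channel built from nothing but Java’s built-in monitors (a lock plus condition waiting); a message-passing library is, at bottom, a more careful version of this.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// A toy illustration of the "least-common denominator" point: a
// message-passing channel built out of nothing but shared memory,
// a lock, and condition waiting.
public class TinyChannel<T> {
    private final Queue<T> buffer = new ArrayDeque<>();

    public synchronized void send(T message) {
        buffer.add(message);
        notifyAll();                 // wake any receiver blocked in receive()
    }

    public synchronized T receive() throws InterruptedException {
        while (buffer.isEmpty()) {
            wait();                  // release the lock and block until notified
        }
        return buffer.remove();
    }
}
```

A receiver blocks in wait() until a sender’s notifyAll() wakes it, which is essentially what a blocking queue does internally, with more care about interruption and fairness.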

Another thing that I should mention is that this answer deals almost entirely with concurrency and parallelism in software. Both are fundamental to hardware as well, but I’m less qualified to speak on that. Suffice it to say that modern processors are both highly concurrent, with interrupts and such, and highly parallel, with pipelines and multiple functional units.
