Parallel computing in a nutshell

Over the last 20 years, a set of programming tools has been developed and maintained by thousands of experts, and it can be used for free to build awesome applications. Learn about these technologies and unlock your potential.

Nowadays, every programmer should know how to assemble a parallel program in order to excel in a highly connected and competitive world, where efficiency is the first priority in most computational applications: games, physical simulations, industrial applications, mathematical computing, augmented reality, computer vision, robotics, cinematographic rendering, customized prostheses, intelligent houses, medical tools and many other scientific and industrial requirements that aim to enhance the human experience and extend human life.

The art of writing efficient programs is known as High Performance Computing (HPC). Its main objective is to reduce computing time using the least amount of resources, such as memory and instructions. This can be achieved through software techniques, hardware optimizations, or both. On one hand, hardware optimizations for HPC are mainly developed and promoted by manufacturers; on the other hand, software best practices are proposed by scientific communities.

Modern computers are built around a Random Access Memory (RAM) and a Central Processing Unit (CPU). A CPU may have more than one core, and each core is an independent processor in its own right. The RAM is the memory space where a program's instructions read and store data most of the time. CPUs also embed a very fast memory known as the cache; using it correctly speeds up the execution of programs. A computer program is composed of sequences of instructions known as threads: a serial program executes its threads one after another in a queue, while a parallel program can execute several threads at the same time. Each core can execute multiple threads, but it is advisable to assign only one thread to each core. Most modern computers are also equipped with a Graphics Processing Unit (GPU), and some GPUs can be programmed to execute thousands of threads at the same time; these are known as General Purpose GPUs (GPGPUs).

If you are thinking of developing a parallel program, I recommend using the C programming language, since with it you can exploit your hardware better than with any other language. If you are used to coding in an object-oriented paradigm, you could use C++ as well, but keep in mind that the tools I explain in the following paragraphs are designed to be used from C, so you will either need a binding that encapsulates their functionality in C++ classes or have to mix your object-oriented code with C-like routines.

There are five main ways to develop a parallel program:

  1. Using shared memory (one CPU with multiple cores using the same RAM).
  2. Using distributed memory (multiple CPUs, each with its own RAM).
  3. Using a GPGPU.
  4. Creating a hybrid with shared and distributed memory.
  5. Creating a hybrid using CPUs and GPGPUs.

The shared memory scheme allows multiple threads to access the same memory space, the RAM. This scheme is common in modern desktops, laptops and notebooks, which have a single processor with multiple cores and a single RAM. These programs can be written using POSIX threads or OpenMP. The first option is implemented in the pthreads library and gives you a high degree of control over every thread of your program; you should use pthreads only once you have mastered the C programming language and have a solid understanding of your hardware and operating system. For beginners, I recommend OpenMP, which is easier to use and, in most cases, produces programs as efficient as those written with pthreads.
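To give a taste of how simple the shared memory scheme can be, here is a minimal OpenMP sketch in C; the array size and values are chosen only for illustration. Compiled with an option such as gcc -fopenmp, the loop iterations are split automatically among the available cores.

    /* Minimal OpenMP sketch: parallel sum of an array.
       Compile, for example, with: gcc -fopenmp sum.c -o sum */
    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        enum { N = 1000000 };
        static double a[N];
        double sum = 0.0;

        /* Fill the array serially. */
        for (int i = 0; i < N; i++)
            a[i] = 1.0;

        /* Each thread adds its own chunk of the array; OpenMP combines
           the partial sums through the reduction clause. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += a[i];

        printf("sum = %f (threads available: %d)\n", sum, omp_get_max_threads());
        return 0;
    }

The equivalent pthreads program would need explicit thread creation, work splitting and joining, which is why OpenMP is the friendlier starting point.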

The distributed memory scheme is used in clusters of computers, when you need to perform computations across several machines, each with its own RAM. This scheme is commonly used for huge problems, where the work must be divided into many parts to be completed in a reasonable amount of time. The best option for developing a program with this scheme is the Message Passing Interface (MPI), a standardized programming interface for sending and receiving data between computers.
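As a sketch of the distributed memory scheme, the following minimal MPI program in C (the file and program names are my own) starts several processes, possibly on different machines, and lets each one report its rank; a real program would go on to exchange data with calls such as MPI_Send and MPI_Recv.

    /* Minimal MPI sketch: each process reports its rank.
       Compile with: mpicc hello.c -o hello
       Run with:     mpirun -np 4 ./hello */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);                 /* Start the MPI environment.   */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* Which process am I?          */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* How many processes in total? */

        printf("Process %d of %d is working on its own part of the problem\n",
               rank, size);

        MPI_Finalize();                         /* Shut down MPI cleanly.       */
        return 0;
    }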

GPGPUs are mainly used to perform the same computation over different data, as in computer graphics, where the same calculations are executed for every pixel, of which there may be thousands of millions. GPGPUs have shown their efficiency in several scientific applications, such as the N-body problem, the Fourier transform and some medical applications. The most widely used programming interfaces for GPGPUs are CUDA and OpenCL: CUDA is maintained by Nvidia, while OpenCL is developed by a community of scientists and industry experts. GPGPUs cannot access the RAM directly; instead, they have to use their own embedded memory, so data must be copied between the two.
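As a rough illustration of how a GPGPU runs thousands of threads over different data, here is a minimal CUDA C sketch (the array size and scaling factor are chosen only for the example) that launches one thread per array element. Note the explicit copies between the RAM and the GPU memory, since the GPGPU cannot access the RAM directly.

    /* Minimal CUDA C sketch: one thread per array element.
       Compile with: nvcc scale.cu -o scale */
    #include <stdio.h>
    #include <stdlib.h>

    __global__ void scale(float *x, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  /* global thread index     */
        if (i < n)
            x[i] *= 2.0f;                               /* each thread doubles one element */
    }

    int main(void)
    {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *h = (float *)malloc(bytes);
        for (int i = 0; i < n; i++) h[i] = 1.0f;

        float *d;
        cudaMalloc((void **)&d, bytes);                     /* allocate GPU memory */
        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);    /* copy RAM -> GPU     */

        scale<<<(n + 255) / 256, 256>>>(d, n);              /* launch n threads    */

        cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);    /* copy GPU -> RAM     */
        printf("x[0] = %f\n", h[0]);

        cudaFree(d);
        free(h);
        return 0;
    }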

A programming expert could exploit these technologies to develop hybrid programs that solve specific problems. Nothing prevents us from creating a "farm" of computers with the best CPUs and a "farm" of computers with the best GPGPUs working together toward some "unattainable" objective, such as the complete simulation of an animal cell, a complete map of the connections between the neurons of a human brain, or finding a cure for some cancer; just imagine the possibilities.

The goal of this first article is to serve as a quick guide for getting started with parallel computing and to give an overview of the HPC world. Comments and observations are welcome.
