Linux Programming - Multiple Processes vs. Multiple Threads
Would you prefer multiple threads in one application, or multiple processes? This question is asked in many job interviews for embedded system developers. But it is rarely asked during system architecture design, where I strongly believe it should be raised and design decisions made based on the answer. We make fantastic UML diagrams and boast about the design patterns used in the software, but this aspect is not considered seriously. It is an important decision that will have a major impact on the performance and security architecture of the product.
My choice would be multiple processes rather than one big monolithic application with hundreds of threads. I would model the system on its data flow and use processes even when there are strong data dependencies, relying on a low-overhead IPC mechanism such as shared memory. Threads would be reserved for operating on different data streams and channels originating from the same source. In that sense I am a big fan of the Google Android framework, where there is a separate application and/or service for each unique functional feature.
Security considerations:
In the case of the Linux security framework SELinux (https://github.com/SELinuxProject), access controls operate at the application level, which means at the process level. If SELinux is enabled, the policy defines which resources a process may access and which operations on them are allowed. It is not possible to assign different privileges to different threads of the same process. For example, if a process A has permission to access a file or socket, then all threads within that process, not only the main thread, are permitted that access. So, if a single monolithic application is used, enforcing security will be difficult.
Single-core considerations:
If one looks at this problem from an execution perspective, there is no difference between threads and processes in Linux. All POSIX threads on Linux have system scope (process scope is not supported), which means every thread competes both with the other threads of its own process and with the threads of every other process in the system for execution time. A thread is selected for execution based on the scheduling algorithm and priorities. The priorities of threads and processes are normalized (1-99 for real-time and 100-139 for normal) and treated the same way.
If the same problem is seen from a memory perspective, there are big differences: all threads share the same virtual address space for code, data, BSS and heap (only the stack is private), whereas two processes share nothing by default. When two processes are in a parent-child hierarchy, the code section is shared, and data is shared until the child process modifies it (copy-on-write paging). Inter-process communication (IPC) mechanisms must be used if data is to be exchanged between two processes, and IPC overhead is a big concern when either the data volume is high or the data delivery latency is critical.
When we design our applications we tend to choose the multiple-thread approach because it is easier to design with global functions, global data structures, global semaphores, heap buffers, and so on: we don't have to think much upfront about the data flow model, since everything is accessible from everywhere. But the same advantage works as a drawback. For example, a single null-pointer dereference in a not-so-critical function will crash the entire application, and one buffer-overflow vulnerability in some support or utility function makes the entire application vulnerable to security attacks. So, it is preferable to design according to the data flow needs. If there is no dependency between two independent tasks, the logic should be split and isolated in separate processes; for example, printing-related functionality can be isolated in a process separate from the main clinical application. In cases where data dependencies are high, it is still viable to use multiple processes with shared-memory IPC at minimal overhead.
Multi-core considerations:
The multi-core Linux scheduler can pick any ready-to-run thread and queue it for execution on any core, based on the load on each core. When two threads sharing global data run on different cores, the result is dirty caches and subsequent cache-line invalidation at the lowest-level caches (L1), which increases context switch times. Another aspect to consider is thread migration: assigning a thread that previously ran on a busy core to an idle core incurs big overheads. Setting CPU affinity, which restricts the scheduler to a set of cores on which a given process or thread may run, is one way of reducing this overhead. But setting the same CPU affinity for all threads of a process is counterproductive, since the other available cores go underutilized. So, for multi-core processors it is better to design the system as multiple processes (applications) with data kept as local as possible.
Conclusion:
Considering the points discussed above, the multiple-process approach should be the default choice in any new design unless it is proven that the IPC overheads cannot be accommodated even with shared memory, because it is easier to combine two processes into a single process than to split a single-process application into multiple applications.
Well written and informative. Multiple processes definitely help in scaling up the system. Completely agree on the part that it needs to be considered during design.
Great article coming from your in depth experience. I would love to see how containerisation and virtualisation would add a different dimension to this approach.
Thanks for sharing your knowledge