Difficulties of a DPDK Implementation
In this article, I discuss the difficulties that teams encounter when they use DPDK instead of the Linux kernel network stack. This is not an exhaustive list.
When you use DPDK, your code bypasses the Linux kernel, and packet processing typically happens in user space. (Please note that I will not go into what DPDK is.)
Why would you use DPDK?
The one and only reason is performance. Others can argue and provide more reasons, but in my opinion this is the main reason, if not the only one. The performance you can achieve with a DPDK implementation could be several times what you would get otherwise.
Now, let's talk about what you lose, what to watch out for, and what difficulties you will face when you decide to use DPDK.
1) Experience of Developers:
Most developers who work at the system or network level on Linux are comfortable with Linux system calls, so there is little to no learning curve. When you ask these developers to build software on DPDK, they have to learn not only the DPDK libraries and APIs but also a somewhat different programming model. Understanding and implementing DPDK-based applications was much harder when DPDK first came out. It has matured since then, and there are now many sample applications that did not exist years ago, which makes learning DPDK a lot easier than it used to be.
Developers will most likely receive packets by polling rather than by waiting for interrupts. They will have to understand how to pin threads to cores, how to allocate mapped memory for packet buffers (mbufs), how to pass mbufs between processes without copying the packet data, and how to move packets through the pipeline efficiently without tying up more cores than necessary.
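To make the "move mbufs around without copying" point concrete, here is a minimal sketch of a single-producer, single-consumer ring that hands packet buffers from one pipeline stage to the next by pointer. It is plain C11 and does not use DPDK; DPDK ships its own ring library (rte_ring) for this job. The names `pkt_buf`, `ptr_ring`, `ring_enqueue` and `ring_dequeue` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u /* must be a power of two so we can mask instead of mod */

/* Hypothetical packet buffer; stands in for a DPDK mbuf. */
struct pkt_buf {
    size_t len;
    unsigned char data[64];
};

/* Single-producer, single-consumer ring. It stores pointers only,
 * so the packet data itself is never copied between stages. */
struct ptr_ring {
    _Atomic size_t head; /* next slot to write, owned by the producer */
    _Atomic size_t tail; /* next slot to read, owned by the consumer */
    struct pkt_buf *slots[RING_SIZE];
};

void ring_init(struct ptr_ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side. Returns false when the ring is full. */
bool ring_enqueue(struct ptr_ring *r, struct pkt_buf *p)
{
    assert(p != NULL); /* NULL is reserved as the "empty" result of dequeue */
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return false;
    r->slots[head & (RING_SIZE - 1)] = p;
    /* Release: the slot write must be visible before the new head is. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns NULL when the ring is empty. */
struct pkt_buf *ring_dequeue(struct ptr_ring *r)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return NULL;
    struct pkt_buf *p = r->slots[tail & (RING_SIZE - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return p;
}
```

A receive stage would call `ring_enqueue` from its polling loop and a worker stage would call `ring_dequeue` from its own loop on another core. In a real deployment the ring and the buffers would have to live in memory that both sides can see, which this sketch does not set up.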
Weak programmers should not be put on this project.
2) Linux utilities:
When using DPDK, the network port/interface is hijacked by DPDK, and the kernel is no longer aware of it. This means you lose ifconfig, ip, tcpdump, and similar tools on that interface. So, either use open source DPDK tools or build your own for debugging, looking at counters, and so on.
3) ARP Support:
No kernel, no ARP. So, take this into consideration. You can implement your own, starting from sample code.
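As a rough idea of what "implement your own" involves, the sketch below answers an ARP request for our own IPv4 address by rewriting the received frame in place. It works on a raw byte buffer and is not tied to DPDK; the function name `arp_make_reply` and its parameters are hypothetical. The field offsets follow the standard Ethernet/IPv4 ARP layout (14-byte Ethernet header followed by a 28-byte ARP payload). It does not handle VLAN tags, an ARP cache, or sending requests of our own, all of which a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARP_FRAME_LEN 42 /* 14-byte Ethernet header + 28-byte ARP payload */

/* If `frame` is an ARP request asking for `my_ip`, rewrite it in place
 * into the matching ARP reply and return true (caller then transmits it).
 * Otherwise leave the frame untouched and return false. */
bool arp_make_reply(uint8_t *frame, size_t len,
                    const uint8_t my_mac[6], const uint8_t my_ip[4])
{
    /* Bytes 12..21 of an Ethernet/IPv4 ARP request. */
    static const uint8_t request_hdr[10] = {
        0x08, 0x06, /* EtherType: ARP */
        0x00, 0x01, /* hardware type: Ethernet */
        0x08, 0x00, /* protocol type: IPv4 */
        6, 4,       /* hardware and protocol address lengths */
        0x00, 0x01  /* opcode: request */
    };
    uint8_t peer_mac[6], peer_ip[4];

    assert(frame != NULL && my_mac != NULL && my_ip != NULL);
    if (len < ARP_FRAME_LEN)
        return false;
    if (memcmp(frame + 12, request_hdr, sizeof request_hdr) != 0)
        return false;
    if (memcmp(frame + 38, my_ip, 4) != 0) /* target IP is not ours */
        return false;

    memcpy(peer_mac, frame + 22, 6); /* sender MAC of the request */
    memcpy(peer_ip, frame + 28, 4);  /* sender IP of the request */

    memcpy(frame, peer_mac, 6);      /* Ethernet destination */
    memcpy(frame + 6, my_mac, 6);    /* Ethernet source */
    frame[21] = 2;                   /* opcode: reply */
    memcpy(frame + 22, my_mac, 6);   /* sender MAC: us */
    memcpy(frame + 28, my_ip, 4);    /* sender IP: us */
    memcpy(frame + 32, peer_mac, 6); /* target MAC: the requester */
    memcpy(frame + 38, peer_ip, 4);  /* target IP: the requester */
    return true;
}
```

In a DPDK application this check would sit early in the receive path, so that ARP requests are answered and everything else continues down the pipeline.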
4) IP Stack:
No IP stack without the kernel, so the kernel's TCP termination and IPsec support are gone too. If your application requires these, you might want to consider an alternative, kernel-based implementation. You could argue that you can support them by integrating various open source components, building the missing portions, and gluing things together. Sure, you can, but that is extra work.
5) Unix/Linux System Calls:
When using DPDK, all packet processing happens in user space instead of the kernel, so Linux system calls are not used on the data path. One benefit is that the cost of the system call is avoided. That cost includes the switch from user mode to kernel mode (the trap) and the copying of data from user space to the kernel and back. Polling can also save the cost of NIC interrupts. The disadvantage is that the protection the Linux kernel provides is lost.
6) Shim Layer:
Another consideration: what if you want to run your applications in an environment that does not have DPDK support? Would you rewrite them? If you envision such a scenario, consider a shim/API layer between DPDK and your application code. This could become a big effort if your application depends heavily on DPDK constructs.
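One possible shape for such a shim is a small table of function pointers that the application calls instead of calling DPDK directly. The sketch below is an assumption about how you might structure it, not a prescribed design: `pkt_io_ops`, `pkt_io`, `forward_once` and the in-memory `mem_port` backend are all hypothetical names. A DPDK backend would implement the same two callbacks on top of calls such as rte_eth_rx_burst and rte_eth_tx_burst; that backend is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend-neutral packet I/O interface (the shim). */
struct pkt_io_ops {
    int (*rx_burst)(void *ctx, void **pkts, int max); /* returns count received */
    int (*tx_burst)(void *ctx, void **pkts, int n);   /* returns count sent */
};

struct pkt_io {
    const struct pkt_io_ops *ops;
    void *ctx; /* backend-private state */
};

/* Application code: forwards one burst and knows nothing about the backend.
 * A real application would also free the packets that tx_burst did not take. */
int forward_once(struct pkt_io *in, struct pkt_io *out)
{
    void *burst[32];
    int n = in->ops->rx_burst(in->ctx, burst, 32);
    if (n <= 0)
        return 0;
    return out->ops->tx_burst(out->ctx, burst, n);
}

/* A trivial in-memory backend, useful for unit tests without any NIC. */
#define MEM_PORT_CAP 64
struct mem_port {
    void *queue[MEM_PORT_CAP];
    int count;
};

int mem_rx(void *ctx, void **pkts, int max)
{
    struct mem_port *p = ctx;
    assert(p->count >= 0 && p->count <= MEM_PORT_CAP);
    if (max <= 0)
        return 0;
    int n = p->count < max ? p->count : max;
    memcpy(pkts, p->queue, (size_t)n * sizeof(void *));
    /* Shift the packets that were not taken to the front of the queue. */
    memmove(p->queue, p->queue + n, (size_t)(p->count - n) * sizeof(void *));
    p->count -= n;
    return n;
}

int mem_tx(void *ctx, void **pkts, int n)
{
    struct mem_port *p = ctx;
    int sent = 0;
    while (sent < n && p->count < MEM_PORT_CAP)
        p->queue[p->count++] = pkts[sent++];
    return sent;
}

const struct pkt_io_ops mem_ops = { mem_rx, mem_tx };
```

The narrower this interface stays, the cheaper it is to add a second backend later. The more DPDK types leak through it (mbuf fields, offload flags, and so on), the closer you get to the "big effort" case.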
7) Strength of Implementation:
With DPDK, if the code is elegantly written with meticulous care, you will have highly performant software, which should achieve close to line rate even with small packet sizes (given today's powerful processors and NICs). However, if sections of your software are poorly implemented, you are likely to have a debugging nightmare. The reason is that your data path no longer sits behind the kernel's protection. A bug in one program can crash another if they share memory or pass mbufs between processes, for example in a pipeline processing model.
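To put "line rate with a small packet size" into numbers, the sketch below computes packets per second and the time budget per packet for a given link speed and frame size. The 20 bytes of per-frame overhead (preamble, start-of-frame delimiter, and inter-frame gap) are standard Ethernet. The 10 Gbit/s link and 64-byte frame in the usage note are my assumptions for illustration, and the function names are hypothetical.

```c
#include <assert.h>

/* Per-frame overhead on the wire that is not part of the frame itself:
 * 7 bytes preamble + 1 byte start-of-frame delimiter + 12 bytes inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20u

/* Maximum packets per second on a link of `link_bps` bits per second
 * for frames of `frame_bytes` bytes (frame size includes the 4-byte FCS). */
double line_rate_pps(double link_bps, unsigned frame_bytes)
{
    assert(link_bps > 0.0 && frame_bytes > 0u);
    return link_bps / ((frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8.0);
}

/* Time available to process one packet, in nanoseconds, at line rate. */
double ns_per_packet(double link_bps, unsigned frame_bytes)
{
    return 1e9 / line_rate_pps(link_bps, frame_bytes);
}
```

For an assumed 10 Gbit/s link and 64-byte frames, `line_rate_pps(10e9, 64)` is about 14.88 million packets per second and `ns_per_packet(10e9, 64)` is about 67.2 ns. That budget is why a single sloppy section in the fast path can keep you from reaching line rate.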
So, ask yourself whether your software needs to be highly performant (not talking about horizontal scaling here). If the answer is "absolutely", then maybe you should consider DPDK. If the answer is "no", then there really is no reason for you to take on the extra headache of using DPDK.
Nice ..
IPsec support has since been added. Debugging is a nightmare in all multi-threaded/multi-core code.
Regarding (4), this isn't exactly true. You have the possibility to squirt the packet back into the kernel and let it be handled normally. A lot of event-driven networking code uses libevent for dispatching. libevent could be extended to use DPDK and a user-space TCP/IP (and UDP, of course) stack.