Locking Strategy in Dataplanes
Generally, a Dataplane (DP) operates on a multicore processor. One core has a thin layer which interacts with the Control Plane running outside. Let’s call this the Slowpath. The other cores run one busy loop per core which brings in packets from NIC (Rx) and sends packets back to the NIC (Tx). Let’s call this the Fastpath. There is a shared data structure between the Slowpath and the Fastpath. Let’s call it the Datamodel. Typically, the Slowpath would do read-write operations to the Datamodel based on signaling with the Control Plane. The Fastpath would consult the Datamodel which will drive the logic of switching the packet across the Fastpath. A schematic representation of this is in the following diagram
As the signaling happens on the Slowpath with the Control Plane, the Datamodel undergoes changes. So the Datamodel, depending on its complexity, can potentially be in an inconsistent state when the Fastpath tries to consult it. In the worst case it can prove fatal to the operations in the Fastpath. To prevent these problems a trivial solution is to take a global write-lock on the Datamodel at the Slowpath when the Datamodel is being modified by the Slowpath during the course of signaling with the Control Plane on a per message basis. Every time a message arrives at the Slowpath, the write-lock is taken, the Datamodel is modified and the write-lock is given up. The Fastpath takes the read-lock before beginning the processing of a message and gives up the read-lock after the processing of the message. This ensures that the Fastpath always sees a consistent state of the Datamodel during packet processing. When Slowpath is modifying the Datamodel, none of the Fastpath contexts are processing the packets and when any of the Fastpath contexts are processing the packets, the Slowpath is not modifying the Datamodel.
There is a reason why the strategy of the locking described previously was declared as a trivial solution. The write-locks by design have a higher preference otherwise the multiple readers can starve the writer. Thus, the Slowpath signaling hijacks the data processing by all the Fastpath contexts. As the signaling increases on Slowpath, the situation becomes even worse for the Fastpath contexts. If Slowpath runs at 100 % CPU load of Signaling, it means it is holding the write-lock almost all the time. This correspondingly means that the Fastpath cannot process any packet. In general if we have N cores where 1 core is running Slowpath and the other N-1 cores the Fastpath, then if C is the CPU utilization percentage of the Slowpath, then the effective capacity of the Fastpath is not N-1 cores but approximately equal to (100-C)*(N-1)/100. Consider this formula for three values of Slowpath CPU utilization. C = 0 % means no signaling, in this case the number of Fastpath cores obtained by the formula is N-1. This is intuitive in the sense that all the Fastpath cores are running at full capacity because there is no lock-contention with the Slowpath. C = 100 % means that signaling is happening at the max capacity of the Agent, in this case the number of Fastpath cores obtained by the formula is 0. This is again intuitive in the sense that all the Fastpath cores are blocked on the read-lock all the time because the write-lock is held all the time at the Slowpath. At C = 50 %, the number of Fastpath cores obtained is half of N-1. It is easy to see how the Slowpath is hijacking the data processing at the Fastpath. Can something better be done, is the question we are seeking answers to.
Let us assume that the Datamodel consists of some Records. Each Record has a key which makes it unique. Further, we take it that when the packets arrive on the Fastpath, it uses the fields in the packet to ultimately locate a Record in the Datamodel and then operates on this Record while the current packet is being processed. Let us also assume that each core would operate on a particular Record and two cores would not act on the same Record. Normally this a fair assumption since in any typical fastpath the NIC ensures that the packets of a single flow go on a single queue and a good fastpath design ensures that a single NIC queue is polled for packets on a single core. Given these assumptions, it is clear that there is no merit in locking the whole Datamodel while processing the packet. What needs to be locked at Fastpath is the Record itself. Likewise, when the Slowpath is manipulating the Record, it needs to lock that Record. Thus, each Record can be provisioned with its own lock, the RecordLock. This ensures that the lock contention between the Slowpath and the Fastpath is at a Record level and not at the Datamodel level. This is highly optimal. Say there are 10K records in the data model. If the signaling is currently modifying, say, the first 1K records and the Fastpath is operating on, say, the last 2K records, then there is no lock contention at all between the Fastpath and Slowpath because Record level locking is being used. The only case where the lock contention does happen is in the rarest of the rare scenario where the Fastpath is operating right now with a packet which is the very packet where the Signaling is also going on at the Slowpath. That rare contention too will affect only one core of the Fastpath which is handling that record and not the other cores handling the other records.
There, however, remains one problem. When the Records are added to or deleted from the Datamodel, it is the entire Datamodel which undergoes a change. The very first act performed at the Faspath, after reception of a packet, is to locate the Record in the Datamodel. So, this poses two challenges – one, how to organize the records in the Datamodel such that the Fastpath can locate it as quickly as possible, and second, how to ensure that there are no race conditions between the Fastpath locating the Record in the Datamodel and the Slowpath e.g. deleting that Record around the same time.
A typical way to ensure a quick lookup of Records in a Datamodel is to use a Hash table. The Hash would contain entries which map a key and a value. Let us say the Slowpath uses a Hash and while adding the records into the Datamodel, it adds an entry into the Hash with a key of the Record which makes the Record unique and a value which is equal to the index of the Record into the Datamodel. When the Fastpath receives a packet, it constructs the key and does a lookup into the Hash for that key, it obtains the value which is the index into the Datamodel and is thus able to access the Record using that index for further operations. We can see now that the Hash becomes a critical section datastructure. Typically, the Slowpath would do an addition, deletion and modification of records into the Hash and the Fastpath would look up the records there. To ensure that there are no race conditions, it is important to have a lock on the hash during these operations, let’s call it the HashLock. So, the HashLock acts like a global lock between Slowpath and the Fastpath. The Slowpath takes a write lock while adding or deleting entries into the Hash. The Fastpath also takes a write lock while looking for entries into the Hash. It is assumed here that the locks are held for a considerably shorter period of time than the global lock of the trivial solution introduced earlier as the lock is needed only for addition, deletion and lookup based on a key value pair by the Slowpath and the Fastpath. It is tempting to take just a read lock on the Hash at the Fastpath however this has the potential to starve the Fastpath in high signaling scenarios due to higher preference of the write lock. The whole system is still subject to the limitation of how fast the operations can be done in the Hash. Effectively one can measure how fast the operations can be done into the Hash on a single core without the locks, that will govern the whole system capacity, and this is thought to be so high that it wouldn’t affect the packets per second being handled by the Fastpath in practice.
Thus, the Hash acts as the gatekeeper. From the Fastpath perspective, if an entry is located into the Hash, then the Datamodel entry for that Record should not be acted upon by the Slowpath during the course of this message being processed. Eg. the Record in the Datamodel should not be deleted by the Slowpath while the Fastpath is still operating on the Record with this message. This can be accomplished by a sequence of lock operations with the HashLock and the RecordLock between the Fastpath and Slowpath thus –
At Fastpath –
. Packet comes in
. Take write-lock on the HashLock
(Now Slowpath will block on updating the Hash)
. Lookup the Hash with key/value pair, assume an entry found (otherwise release the write-lock on HashLock and return, Fastpath would presumably take appropriate action)
. Access the Record and take a read-lock on the RecordLock (note that we are taking the RecordLock before giving up the HashLock)
(Now the Slowpath will not be able to update the Record)
. Release the write-lock on HashLock
. Do Packet processing
. Release the read-lock on the RecordLock
At the Slowpath (Record Addition) –
. Signaling request comes in
. Add the Record in the Datamodel (since the Record entry is not in Hash yet, the Fastpath can never operate on it)
. Take write-lock on the HashLock
(Now Fastpath will block on any attempt to query the Hash)
. Add the entry of the Record into the Hash
. Release the write-lock on the HashLock
At the Slowpath (Record Modification) –
. Signaling request comes in
. Take write-lock on the RecordLock
(Now Fastpath will block on any attempt to operate on this Record)
. Modify the Record
. Release the write-lock on the RecordLock
At the Slowpath (Record Deletion) –
. Signaling request comes in
. Take write-lock on the HashLock
(Now Fastpath will block on any attempt to query the Hash)
. Remove the entry from Hash
(Now Fastpath will not find the entry into the Hash for this Record)
. Release the write-lock on the HashLock
. Take the write-lock on the RecordLock (obtaining this lock ensures Fastpath is not operating on this record)
. Drop the Record from the Datamodel (implementations may choose to release the write-lock on RecordLock if required)
Way to go Prashant! ("articulated"!, lesser mortals would normally Present ;-) )
Quite informative and interesting