Implementing performant scalable solutions in confidential computing

Confidential computing (using Trusted Execution Environments, such as Intel SGX enclaves) is a means of computing over secret data while maintaining confidence that malicious software installed on the same machine cannot see or tamper with those computations.

The traditional way to communicate with enclaves is expensive

The normal way to communicate with these enclaves is through machine-generated code, produced from an interface that you design using EDL files. These EDL (Enclave Definition Language) files follow the design patterns of the older IDL-style interfaces used in RPC, COM and CORBA. The EDL describes a set of functions that Intel's code generator (the edger8r) converts into proxy and stub code, allowing communication between host applications and enclaves.
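To give a flavour of what the edger8r consumes, here is a small illustrative EDL fragment (the function names are made up for the example): ecalls are the trusted functions the host may invoke, ocalls are the untrusted functions the enclave may call back into, and the bracketed attributes tell the generated stubs how to copy buffers across the boundary.

```edl
enclave {
    trusted {
        // ecall: host -> enclave; [in, size=len] copies the buffer in
        public int ecall_process([in, size=len] const uint8_t* buf, size_t len);
    };

    untrusted {
        // ocall: enclave -> host; [in, string] copies a NUL-terminated string out
        void ocall_log([in, string] const char* msg);
    };
};
```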

These calls, however, are expensive. For a conventional C-style call, we typically place a few parameters in registers before jumping to the function's address; at a minimum, this takes one or two clock cycles. An enclave call takes approximately 8,000 cycles. That is not a serious problem when setting up an enclave's environment, but it is prohibitive when writing high-throughput code.

Intel’s ‘switchless’ design has pros and cons

Intel's development team have been working on this challenge and have come up with the 'switchless' design. This marshals data between the host and enclave using separate worker threads, with the sender effectively suspended until the receiver completes. In a constrained environment this locks up valuable resources that cannot then be used for other purposes. The advantage is that the ocall is faster, as the thread does not need to be sanitised when crossing to the other side of the enclave boundary.

Asynchronous programming for 100% capacity of enclave threads 

However, we think the most promising approach is to use our own queues and employ an asynchronous programming style throughout the entire enclave, so that while work is being done on the other side of the enclave boundary, the thread can continue with other tasks. This maximises the efficiency of the enclave threads, allowing them to work at 100% capacity, something not possible with conventional and switchless ecall/ocall implementations.

We are investigating SPSC (single-producer, single-consumer) queues built on C++ atomic circular buffers, as we believe they are more efficient. The trick is to allocate the queues in host memory and then share their pointers with the enclave. Enclaves can read and write host memory, but not the other way around; if you set up such a queue when the thread starts, you are good to go with sending messages between the two sides.
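A minimal sketch of such a queue is below. This is an illustration of the technique, not the Secretarium implementation: a lock-free single-producer/single-consumer ring buffer using acquire/release atomics. In the enclave scenario, an instance of this would live in host memory, with its address handed to the enclave at thread start-up.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Single-producer/single-consumer ring buffer. Exactly one thread may call
// push() and exactly one thread may call pop(); Capacity must be a power
// of two so that indices can wrap with a cheap bitmask.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert((Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    bool push(const T& item) {  // producer side only
        const auto head = head_.load(std::memory_order_relaxed);
        const auto tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;  // queue is full
        buffer_[head & (Capacity - 1)] = item;
        head_.store(head + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    bool pop(T& item) {  // consumer side only
        const auto tail = tail_.load(std::memory_order_relaxed);
        const auto head = head_.load(std::memory_order_acquire);
        if (tail == head) return false;  // queue is empty
        item = buffer_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<T, Capacity> buffer_{};
    std::atomic<std::size_t> head_{0};  // next slot to write (producer-owned)
    std::atomic<std::size_t> tail_{0};  // next slot to read (consumer-owned)
};
```

The indices increase monotonically and only wrap via the bitmask, which makes the full/empty tests trivial; the release store on one side pairs with the acquire load on the other so the payload is visible before the index moves.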

Performance test results

These are the test results from modifying Intel's own switchless performance test app, passing 50,000 messages between host and enclave with 0 bytes of data per message:

[Chart: timings for conventional, switchless and circular-buffer calls]

Source: https://github.com/secretarium/demo_circular_buffer

As one can see, there are impressive performance gains with this approach. Of course, this is not a real use case, and if a large amount of data needs to be marshalled between host and enclave, secondary costs start mounting. You need to implement your own serialization logic across the queue, which adds complexity. With a large blob of serialized data, you also need to break it up into sections that fit the blocks of the queue; this complexity increases when the data being marshalled is larger than the circular buffer itself. You also have to concern yourself with race conditions if multiple threads want to work over the same channel.
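The chunking step might look something like this sketch (the `Block` layout and `kBlockSize` are illustrative assumptions, not from any SDK; a real implementation would also carry a message id and a last-block flag so the receiver can reassemble the blob and detect torn writes):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-size block matching one slot of the circular buffer.
constexpr std::size_t kBlockSize = 64;

struct Block {
    std::size_t len;                 // bytes actually used in this block
    std::uint8_t data[kBlockSize];   // payload
};

// Split a serialized blob into queue-sized blocks. The final block is
// typically partial, hence the per-block length field.
std::vector<Block> chunk(const std::vector<std::uint8_t>& blob) {
    std::vector<Block> blocks;
    for (std::size_t off = 0; off < blob.size(); off += kBlockSize) {
        Block b{};
        b.len = std::min(kBlockSize, blob.size() - off);
        std::copy_n(blob.data() + off, b.len, b.data);
        blocks.push_back(b);
    }
    return blocks;
}
```

When the blob is larger than the whole circular buffer, the sender must additionally wait for the consumer to drain slots before it can push the remaining blocks, which is exactly where the asynchronous style pays off.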

The nice thing about the edger8r and the Intel runtime is that all of this complexity is hidden from you. But this comes at the cost of performance, and of inefficient use of those precious resources in the enclave.

Don’t be frightened, JavaScript is your friend

Although it might look quite hard, bear in mind that there is a lot of existing code out there that does this all the time; the best example is JavaScript. The language is asynchronous from the ground up and is used by millions of programmers around the world for complex network programming.

Node.js, the most widely used JavaScript runtime, is built on the libuv C library, which is right in our territory, so why not adapt a version of that to work in an enclave? Or the Asio C++ library. The core of each is an executor that receives pending callbacks and dispatches them when the work is complete.
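Stripped of timers, I/O polling and thread safety, the heart of such an executor is just a queue of callbacks and a dispatch loop. A bare-bones sketch in that libuv/Asio spirit (names here are my own, not either library's API):

```cpp
#include <deque>
#include <functional>

// Minimal single-threaded executor: completed work posts a callback,
// and run() dispatches pending callbacks until none remain.
class Executor {
public:
    void post(std::function<void()> cb) {
        pending_.push_back(std::move(cb));
    }

    void run() {
        while (!pending_.empty()) {
            auto cb = std::move(pending_.front());
            pending_.pop_front();
            cb();  // a callback may post further work onto the queue
        }
    }

private:
    std::deque<std::function<void()>> pending_;
};
```

In the enclave setting, the messages popped from the host/enclave queue are what get turned into posted callbacks, so one enclave thread can interleave many in-flight requests.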

The other problem with asynchronous programming is that if your legacy code is synchronous, then with conventional C++17 or earlier it looks as though you have a massive rewrite on your hands to make it asynchronous, something that conventional organisations will baulk at.

However, with C++20 we now have coroutines: regular functions that suspend when there is a blocking call and resume when an executor has data for them. The only changes you need to make to your functions are to wrap your return values in futures, use “co_return” instead of “return”, and use “co_await” on every call into another coroutine. This radically reduces the cost of migrating your application to a much more efficient enclave implementation.

So now you’re ready to increase the performance of your enclave applications by orders of magnitude, without having to radically rewrite existing code bases.

