The efficiency of C++17 in embedded development, Part 2

Alexander Sopov

Published Sep 21, 2023

In the first article I showed that in C++17 it is efficient to return an object by value and to use std::optional instead of a success flag. Another very common case is a function that either returns a value or fails with one of several errors. On "big" systems the normal approach is "return the result; if an error happens, throw an exception". It is a good approach that helps keep the code clean, but in the embedded world exceptions are disabled in the majority of applications. So the standard scenario, again, is to return an error code and assign the actual result to a parameter passed by reference. And again, C++17 provides a better way: std::variant. It is a type-safe union that holds exactly one of several types at a time; for this scenario only two are needed: the error type and the result type. The following example shows how it can be used:

#include <variant>

enum class ErrorCode {
    Ok,
    Error,
    AnotherError
};

using ResultType = int;

bool externalCondition();
int externalValue();
void externalConsumer(int value);
void externalLog(ErrorCode result);

ErrorCode getResultOld(ResultType& value)
{
    if (externalCondition()) {
        value = externalValue();
        return ErrorCode::Ok;
    }
    return ErrorCode::Error;
}

void consumerOld()
{
    ResultType value;
    if (const auto result = getResultOld(value); result == ErrorCode::Ok)
    {
        externalConsumer(value);
    }
    else
    {
        externalLog(result);
    }
}

std::variant<ErrorCode, ResultType> getResultNew()
{
    std::variant<ErrorCode, ResultType> result;
    if (externalCondition()) {
        result = externalValue();
    } else {
        // Report the error only when no value was produced.
        result = ErrorCode::Error;
    }
    return result;
}

void consumerNew()
{
    if (const auto result = getResultNew(); std::holds_alternative<ResultType>(result))
    {
        externalConsumer(std::get<ResultType>(result));
    }
    else
    {
        externalLog(std::get<ErrorCode>(result));
    }
}

Compiled with GCC 12.2 for ESP32, this code produces:

getResultOld(int&):
        entry   sp, 32
        call8   _Z17externalConditionv
        movi.n  a8, 1
        beqz.n  a10, .L6
        call8   _Z13externalValuev
        s32i.n  a10, a2, 0
        movi.n  a8, 0
.L6:
        mov.n   a2, a8
        retw.n
consumerOld():
        entry   sp, 32
        call8   _Z17externalConditionv
        bnez.n  a10, .L11
        movi.n  a10, 1
        call8   _Z11externalLog9ErrorCode
        j       .L10
.L11:
        call8   _Z13externalValuev
        call8   _Z16externalConsumeri
.L10:
        retw.n
getResultNew():
        entry   sp, 48
        call8   _Z17externalConditionv
        movi.n  a8, 0
        movi.n  a2, 1
        beq     a10, a8, .L14
        call8   _Z13externalValuev
        mov.n   a2, a10
        movi.n  a8, 1
.L14:
        s8i     a8, sp, 4
        l32i.n  a3, sp, 4
        retw.n
consumerNew():
        entry   sp, 32
        call8   _Z17externalConditionv
        beqz.n  a10, .L18
        call8   _Z13externalValuev
        call8   _Z16externalConsumeri
        j       .L17
.L18:
        movi.n  a10, 1
        call8   _Z11externalLog9ErrorCode
.L17:
        retw.n

The resulting assembly is very similar for both styles. The C++ code of getResultNew() is a bit too explicit because current compilers, GCC especially, don't handle non-mandatory copy elision well enough on the target embedded platforms. But if you are not fighting for every byte of ROM and every CPU cycle, the shorter form also works well:

std::variant<ErrorCode, ResultType> getResultNew()
{
    if (externalCondition()) {
        return externalValue();
    }
    return ErrorCode::Error;
}
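
On the consumer side, std::get_if can replace the std::holds_alternative / std::get pair with a single lookup: it returns a pointer to the stored alternative, or nullptr if the variant holds something else. The following is a minimal sketch, not the code that was measured above. The external* functions are stubbed with hypothetical bodies so that the example is self-contained, and conditionStub, lastConsumed and lastLogged are illustration-only names that make the outcome observable.

```cpp
#include <variant>

enum class ErrorCode { Ok, Error, AnotherError };
using ResultType = int;

// Hypothetical stand-ins for the external functions used in the article.
static bool conditionStub = true;
static int lastConsumed = -1;
static ErrorCode lastLogged = ErrorCode::Ok;

bool externalCondition() { return conditionStub; }
int externalValue() { return 42; }
void externalConsumer(int value) { lastConsumed = value; }
void externalLog(ErrorCode code) { lastLogged = code; }

std::variant<ErrorCode, ResultType> getResultNew()
{
    if (externalCondition()) {
        return externalValue();
    }
    return ErrorCode::Error;
}

void consumerGetIf()
{
    const auto result = getResultNew();
    // std::get_if yields nullptr when the variant holds another alternative.
    if (const auto* value = std::get_if<ResultType>(&result)) {
        externalConsumer(*value);
    } else {
        externalLog(std::get<ErrorCode>(result));
    }
}
```

Whether this form compiles to the same instructions as consumerNew() on a given target has not been checked here; it is only a matter of style until measured.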

And finally, it's time to look at the other part of a function call: parameter passing. Once again, the well-known rule was that objects shall be passed by reference; otherwise they are pushed onto the stack and popped back. But current C++ ABIs allow small objects to be passed by value in registers. The limit depends on the ABI of the target platform; for both ESP32 and STM32 an object of up to about four 32-bit words can be passed in registers. Very common and obvious examples of such objects are std::string_view and std::span. Both contain a pointer and a size and fit in two 32-bit words. Of course, using std::span directly requires C++20, but no language restriction prevents creating the same thing in C++17. Here is an example from my GitHub repository:

#include <cstddef>
#include <cstdint>
#include <array>
#include <string_view>
#include <type_traits>

template <typename T>
class MemoryView
{
public:
    typedef T value_type;
    typedef T* pointer;
    typedef typename std::add_const<typename std::remove_const<T>::type>::type* const_pointer;
    typedef T &reference;
    typedef typename std::add_const<typename std::remove_const<T>::type>::type &const_reference;
    typedef pointer iterator;
    typedef const_pointer const_iterator;
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;

    constexpr MemoryView() : begin_(nullptr), size_(0) {}
    constexpr MemoryView(T* p, size_type s) : begin_(p), size_(s) {}

    // Converts MemoryView<C> to MemoryView<C> or MemoryView<const C>.
    template<typename C = std::remove_const_t<T>, std::enable_if_t<std::is_same_v<T, std::add_const_t<C>> || std::is_same_v<T, C>, bool> = true>
    constexpr MemoryView(const MemoryView<C> &other) : begin_(other.begin()), size_(other.size()) {}

    // data() is used because std::array iterators are not guaranteed to be raw pointers.
    template<std::size_t N, typename C = std::remove_const_t<T>, std::enable_if_t<std::is_same_v<T, std::add_const_t<C>> || std::is_same_v<T, C>, bool> = true>
    constexpr MemoryView(std::array<C, N> &ar) : begin_(ar.data()), size_(ar.size()) {}

    template<std::size_t N>
    constexpr MemoryView(T(&ar)[N]) : begin_(ar), size_(N) {}

    template<typename C = T, std::enable_if_t<std::is_same_v<char, std::remove_const_t<C>>, bool> = true>
    constexpr operator std::string_view() const{ return { begin_, size_ }; }

    T &operator[](size_type i) { return begin_[i]; }
    constexpr T &operator[](size_type i) const { return begin_[i]; }
    constexpr auto begin() const { return begin_; }
    constexpr auto end() const { return begin_ + size_; }
    constexpr auto size() const { return size_; }

private:
    T* begin_;
    size_type size_;
};

The conversions from C-style arrays and std::array are implicit to simplify usage, and so is the conversion from a view of non-const elements to a view of const elements. An implicit conversion to std::string_view is also provided when the element type is char or const char. The efficiency can be shown with the following trivial case:

const char* getEnd(size_t size, const char* begin)
{
    return begin + size;
}

const char* getEnd(std::string_view str)
{
    return str.begin() + str.size();
}

GCC 12.2 for ESP32:

getEnd(unsigned int, char const*):
        entry   sp, 32
        add.n   a2, a3, a2
        retw.n
getEnd(std::basic_string_view<char, std::char_traits<char> >):
        entry   sp, 48
        add.n   a2, a3, a2
        retw.n
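
Whether a view travels in registers depends on its size and on it being trivially copyable. The following is a minimal sketch of compile-time checks that could guard these properties in a project. It assumes a typical standard library in which std::string_view stores exactly one pointer and one length and is trivially copyable (C++17 does not guarantee either), and a target where the two fields need no padding. PointerAndSize is a hypothetical stand-in for a MemoryView-like class, not a type from the repository.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>

// Hypothetical two-field view with the same members as MemoryView<const char>.
struct PointerAndSize {
    const char* begin;
    std::size_t size;
};

// Trivially copyable types are candidates for register passing;
// the actual size limit is defined by the ABI of the target.
static_assert(std::is_trivially_copyable_v<PointerAndSize>);
static_assert(std::is_trivially_copyable_v<std::string_view>);

// Assumption: nothing beyond a pointer and a size is stored, without padding.
static_assert(sizeof(PointerAndSize) == sizeof(const char*) + sizeof(std::size_t));
static_assert(sizeof(std::string_view) == sizeof(PointerAndSize));

// Same computation as getEnd() in the article, taking the view by value.
constexpr const char* getEnd(PointerAndSize view)
{
    return view.begin + view.size;
}
```

If one of the assertions fails on a particular toolchain, that is a hint to look at the generated code before assuming that pass-by-value is free there.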

So this is the point: the generated code is effectively the same, but there is a significant difference in logic. A (properly constructed) view object is internally consistent: there is no chance to accidentally omit the size or calculate the end incorrectly. You can find other examples of embedded code written in a modern way on my GitHub page: https://github.com/gdex1974
