Function Cache in C++

Sometimes a function performs a very expensive computation, and you want to perform that computation only once and cache the result. That way, for the same sequence of arguments, the result is not recomputed but simply returned. Such a function cache is not suitable for functions with side effects. Before explaining what a function with side effects is, let us delve a little into functional programming.

In functional programming, functions fall into two categories: pure and impure.

Pure function

Pure functions are functions with no side effects. A side effect is anything in a function that causes it to yield different results for the same sequence of arguments, or to affect state outside the function itself. A pure function always yields a value, and for the same sequence of arguments it always yields the same value.

Take the example below


# include <iostream>

int add(int x, int y) {
    return x + y;
}

For such a function add(2, 3) will always yield 5 no matter where it is called.

Impure function

An impure function yields side effects: it may depend on or modify state outside itself, so the same sequence of arguments can produce different results. Some impure functions don't yield a value at all, or take no arguments and work entirely through external state. A function that relies on data from a network to perform some process would qualify as an impure function.

Our function cache will be a class that accepts in its constructor the function we want to call (a callable, be it a lambda, an std::function object, or a plain function pointer). We then overload the function call operator on the cache class so that it checks whether the result has already been computed. If it has, it returns that result; if not, it computes it, stores the value in a map, and then returns the result.

The constructor

template <typename Function>
struct Cache {
    Cache(Function func)
        : _func(func)
    {}
    
private:
    Function _func;
};        

The overloaded function call operator

template <typename... Args>
auto operator()(Args&&... args) {
    using R = decltype(_func(std::forward<Args>(args) ...));

    static_assert(!std::is_void<R>::value, "return type of function cannot be void");

    // Decay the deduced types: for lvalue arguments, Args deduces to a
    // reference type, and a static map must store values, not references.
    using Key = std::tuple<std::decay_t<Args>...>;
    Key arguments(args...);
    static std::map<Key, R> cacheMap;

    if (cacheMap.find(arguments) != cacheMap.end()) {
        return cacheMap.at(arguments);
    }
    else
    {
        R result = _func(std::forward<Args>(args) ...);
        cacheMap.emplace(arguments, result);
        return result;
    }
}


After the constructor has been designed, we overload the function call operator so that it takes a variadic pack of parameters. We take forwarding references to these parameters so that we can pass them on to the wrapped function exactly as the caller provided them; without this, every argument would be passed as an lvalue copy and the value category the caller used would be lost. We then use perfect forwarding to pass the argument pack to the function to be called. As it stands, we don't know the return type of the function, so we apply the decltype specifier to an unevaluated call of the function; this deduces the return type for us, and we keep a type alias R to it.

We then create a map that stores an std::tuple of the argument values as the key and the function's result as the mapped value. If the sequence of arguments has been encountered before, the find call succeeds and the cached result is returned; if not, the result is computed and stored in cacheMap, after which it is returned. cacheMap is made static so that it persists for the duration of the whole program.

All together the code becomes.

# include <iostream>
# include <tuple>
# include <map>
# include <chrono>
# include <thread>
# include <type_traits>

template <typename Function>
struct Cache {
    Cache(Function func)
        : _func(func)
    {}

    template <typename... Args>
    auto operator()(Args&&... args) {
        using R = decltype(_func(std::forward<Args>(args) ...));

        static_assert(!std::is_void<R>::value, "return type of function cannot be void");

        // Decay the deduced types: for lvalue arguments, Args deduces to a
        // reference type, and a static map must store values, not references.
        using Key = std::tuple<std::decay_t<Args>...>;
        Key arguments(args...);
        static std::map<Key, R> cacheMap;

        if (cacheMap.find(arguments) != cacheMap.end()) {
            return cacheMap.at(arguments);
        }
        else
        {
            R result = _func(std::forward<Args>(args) ...);
            cacheMap.emplace(arguments, result);
            return result;
        }
    }

private:
    Function _func;
};

Let's test this implementation. In this example we implement a function that adds two integers and returns their sum. A delay is added before the value is computed; that way, if we encounter the delay, the function was actually executed, but if the value comes back almost immediately, it was returned from the cache.

int main()
{
    using namespace std::literals::chrono_literals;
    Cache fcache{[](int x, int y){
        std::this_thread::sleep_for(2s);
        return x + y;
    }};

    auto now = std::chrono::system_clock::now();
    auto result = fcache(10, 10);
    std::cout << result << std::endl;
    auto afterFirstResult = std::chrono::system_clock::now();

    std::cout << "Duration: " << std::chrono::duration_cast<std::chrono::seconds>(afterFirstResult - now).count() << std::endl;

    now = std::chrono::system_clock::now();
    std::cout << fcache(10, 10) << std::endl;
    auto afterSecondResult = std::chrono::system_clock::now();

    std::cout << "Duration: " << std::chrono::duration_cast<std::chrono::seconds>(afterSecondResult - now).count() << std::endl;
}        

We create a lambda, pass it to the constructor of the cache object, and then invoke the cache object. In the first instance the duration is 2 seconds, since the function is actually called. In the second instance, since the sequence of arguments is the same, the result is not recomputed but returned immediately, so the duration is 0.


In the first case the duration was 2 seconds, and in the latter case the duration was 0, as expected. That is one way to implement a function cache in C++. Comment if you have a better implementation or if you find an issue with this design. Thank you.


Edit

Following Kicki Frisch's advice, I decided to extend the implementation of the function cache to address the memory wasted by the cache storing unused values over long periods of time. Here is an updated implementation. I will provide an explanation in a later edit of this post.


# include <iostream>
# include <tuple>
# include <map>
# include <chrono>
# include <cstdint>
# include <type_traits>


template <typename T>
struct is_non_void {
    using type = T;
    static constexpr bool value = true;
};


template<>
struct is_non_void<void> {
    static constexpr bool value = false;
};


template <typename K, typename V>
bool operator==(const std::pair<K,V> &rhs, const std::pair<K,V> &lhs) {
    return (rhs.first == lhs.first) &&
            (rhs.second == lhs.second);
}


template <typename K, typename V>
bool operator==(const std::pair<const K,V> &rhs, const std::pair<K,V> &lhs) {
    return (rhs.first == lhs.first) && 
            (rhs.second == lhs.second);
}


template <typename K, typename V>
bool operator==(const std::pair<K,V> &rhs, const std::pair<const K,V> &lhs) {
    return (rhs.first == lhs.first) &&
            (rhs.second == lhs.second);
}


template <typename Function, typename DurationType, uint64_t threshHoldDuration>
struct Cache {


    static const int64_t MAX_REF_COUNT = 0xF;


    template <typename T>
    struct Data {
        T data;
        std::chrono::system_clock::time_point lastTimeAccessed;

        bool operator==(const Data<T> &p) const {
            return this->data == p.data;
        }

        bool operator!=(const Data<T> &p) const {
            return not operator==(p);
        }
    };


    Cache(Function func)
        : _func(func)
    {
        static_assert((threshHoldDuration != 0) , "threshHoldDuration must not be zero");
    }


    /**
        Instead of sweeping the cache on every call, we sweep only at regular
        intervals (every MAX_REF_COUNT calls). This way we avoid the heavy
        penalty of looping through the whole map every time a value is
        accessed or updated.
    */
    template <typename... Args>
    auto operator()(Args&&... args) {
        using R = decltype(_func(std::forward<Args>(args) ...));

        static_assert(is_non_void<R>::value, "return type of function cannot be void");
        using ArgData = Data<R>;

        // Decay the deduced types so the static map stores argument values,
        // not references into the caller's stack frame.
        using Key = std::tuple<std::decay_t<Args>...>;
        Key arguments(args...);
        static std::map<Key, ArgData> cacheMap;

        auto iter = cacheMap.find(arguments);
        --intervalCheckCount;

        auto freeCacheItem = [&](const std::pair<Key, ArgData> &p) {
            // Erasing while iterating: advance with the iterator erase() returns.
            for (auto it = cacheMap.begin(); it != cacheMap.end(); ) {
                if (*it == p) {
                    ++it;  // never evict the entry that was just accessed
                    continue;
                }
                auto durationCount = timeSeparation(it->second.lastTimeAccessed);
                if (durationCount >= threshHoldDuration) {
                    it = cacheMap.erase(it);
                } else {
                    ++it;
                }
            }
        };

        if (iter != cacheMap.end()) {
            iter->second.lastTimeAccessed = now();
            if (intervalCheckCount <= 0) {
                auto pairToFree = std::make_pair(iter->first, iter->second);
                freeCacheItem(pairToFree);
                intervalCheckCount = MAX_REF_COUNT;
            }
            return cacheMap.at(arguments).data;
        }
        else
        {
            R result = _func(std::forward<Args>(args) ...);
            ArgData data;
            data.data = result;
            data.lastTimeAccessed = now();

            auto Pair = std::pair<Key, ArgData>{arguments, data};
            cacheMap.insert(Pair);

            if (intervalCheckCount <= 0) {
                freeCacheItem(Pair);
                intervalCheckCount = MAX_REF_COUNT;
            }
            return result;
        }
    }


private:
    typename std::conditional<std::is_function<Function>::value,
                              typename std::add_pointer<Function>::type,
                              Function>::type _func;
    int64_t intervalCheckCount{MAX_REF_COUNT};


    auto now() {
        return std::chrono::system_clock::now();
    }


    uint64_t timeSeparation(const std::chrono::system_clock::time_point &p) {
        auto nowDuration = now() - p;
        return std::chrono::duration_cast<DurationType>(nowDuration).count();
    }
};



int main()
{
    auto add = [](int x, int y) {
        return x + y;
    };


    Cache<decltype(add),std::chrono::milliseconds,1000> cache(add);
    auto&& res = cache(10, 10);


    std::cout << res << std::endl;
}        

As always, comments and suggestions are welcome. Please feel free to use the code for whatever you want. #turntabl

This is a great explanation, thanks Abdul Hameed. It’s an interesting concept. Is there any expiration on how long you cache values? E.g. we could run into memory usage issues. Is there a way to protect ourselves from using it e.g. with an impure function - or do we have to rely on the developer doing the right thing?
