Data-Parallel Types – A First Example

After providing a theoretical introduction to the new C++ 26 feature in my last article, “Data-Parallel Types (SIMD),” I would like to follow up today with a practical example.

The following introductory example is from the experimental implementation of the SIMD library. This functionality has been fully adopted in the C++ 26 draft under the name Data-parallel types (SIMD). To port the program to the C++ 26 standard, it should be sufficient to replace the header <experimental/simd> with <simd> and the namespace std::experimental with std::datapar.

#include <experimental/simd>
#include <iostream>
#include <string_view>
namespace stdx = std::experimental;
 
void println(std::string_view name, auto const& a)
{
    std::cout << name << ": ";
    for (std::size_t i{}; i != std::size(a); ++i)
        std::cout << a[i] << ' ';
    std::cout << '\n';
}
 
template<class A>
stdx::simd<int, A> my_abs(stdx::simd<int, A> x)
{
    where(x < 0, x) = -x;
    return x;
}
 
int main()
{
    const stdx::native_simd<int> a = 1;
    println("a", a);
 
    const stdx::native_simd<int> b([](int i) { return i - 2; });
    println("b", b);
 
    const auto c = a + b;
    println("c", c);
 
    const auto d = my_abs(c);
    println("d", d);
 
    const auto e = d * d;
    println("e", e);
 
    const auto inner_product = stdx::reduce(e);
    std::cout << "inner product: " << inner_product << '\n';
 
    const stdx::fixed_size_simd<long double, 16> x([](int i) { return i; });
    println("x", x);
    println("cos²(x) + sin²(x)", stdx::pow(stdx::cos(x), 2) + stdx::pow(stdx::sin(x), 2));
}
        

Before I proceed with the program, I would like to introduce the output.

First, I would like to focus on the println and my_abs functions. The println function prints the name and the content of a SIMD vector by iterating through its elements. my_abs calculates the absolute value of each element in a SIMD vector of integers; it uses where to negate only the elements that are negative.
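The where mechanism used in my_abs works for any masked assignment, not only for negation. The following is a minimal sketch, assuming the experimental header shipped with a recent libstdc++; clamp_to_limit is a hypothetical helper invented for illustration and is not part of the library.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical helper: caps every element that exceeds `limit` at `limit`
// and leaves all other elements untouched.
template <class Abi>
stdx::simd<int, Abi> clamp_to_limit(stdx::simd<int, Abi> x, int limit)
{
    // x > limit yields a mask with one boolean per element;
    // where(mask, x) restricts the assignment to the elements whose mask is true.
    stdx::where(x > limit, x) = stdx::simd<int, Abi>(limit);
    return x;
}
```

Called with the elements 0 10 20 30 and a limit of 15, this sketch yields 0 10 15 15: only the two elements selected by the mask are overwritten.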

The main function is much more interesting. In the SIMD vector a, every element is set to 1, whereas in the SIMD vector b, the lambda initializes each element to its index minus 2. On x86-64 without additional compiler flags, stdx::native_simd uses SSE2 instructions, so these SIMD vectors are 128 bits wide and hold four int elements each. Now the arithmetic begins. Vector c is the element-wise sum of a and b, d is the element-wise absolute value of c, and vector e is the element-wise square of d. Finally, stdx::reduce(e) reduces vector e to the sum of its elements.
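Because the number of elements in native_simd depends on the target, the printed values and the final sum vary between machines. The following sketch repeats the steps above with a width fixed at four elements, so the result is the same everywhere; sum_of_squares_demo and the alias int4 are hypothetical names chosen for this illustration.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical helper: the steps a..e from the program above, with four elements.
int sum_of_squares_demo()
{
    using int4 = stdx::fixed_size_simd<int, 4>;

    const int4 a = 1;                            // broadcast:  1  1  1  1
    const int4 b([](int i) { return i - 2; });   // generator: -2 -1  0  1
    int4 c = a + b;                              // sum:       -1  0  1  2
    stdx::where(c < 0, c) = -c;                  // absolute:   1  0  1  2
    const int4 e = c * c;                        // square:     1  0  1  4
    return stdx::reduce(e);                      // sum of all elements: 6
}
```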

The expression const stdx::fixed_size_simd<long double, 16> x([](int i) { return i; }) is particularly interesting. It initializes the SIMD vector x with 16 long double values from 0 to 15. This is possible if the architecture is sufficiently modern and supports AVX-512. This applies, for example, to Intel’s Xeon Phi or AMD’s Zen 4 architecture.

Similarly interesting is the line println("cos²(x) + sin²(x)", stdx::pow(stdx::cos(x), 2) + stdx::pow(stdx::sin(x), 2)). It calculates cos²(x) + sin²(x) for each element, which is 1 for all elements due to the Pythagorean trigonometric identity. All functions in <cmath>, except for the special mathematical functions, are overloaded for simd. These include basic functions such as abs, min, and max. Exponential, power, trigonometric, hyperbolic, and gamma functions can also be applied directly to SIMD vectors.
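The identity can also be checked programmatically instead of reading it off the output. The following is a minimal sketch, assuming the simd overloads of cos, sin, and abs from the experimental header; pythagorean_identity_holds, the alias float8, and the tolerance parameter are hypothetical and exist only for this illustration.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical helper: true if cos²(x) + sin²(x) deviates from 1
// by less than `tol` in every element.
bool pythagorean_identity_holds(float tol)
{
    using float8 = stdx::fixed_size_simd<float, 8>;

    // The explicit cast is needed because int -> float is not value-preserving.
    const float8 x([](int i) { return static_cast<float>(i); });  // 0 1 ... 7
    const float8 c = stdx::cos(x);   // element-wise cosine
    const float8 s = stdx::sin(x);   // element-wise sine
    const float8 err = stdx::abs(c * c + s * s - 1.0f);

    // err < tol is a mask; all_of is true only if every element passes.
    return stdx::all_of(err < tol);
}
```

The comparison uses a tolerance because the floating-point result is only approximately 1; the stream output in the program above hides this by rounding to the default precision.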

Now I would like to go into more detail about the width of the data type simd<T>.

Width of simd<T>

The width of the data type native_simd<T> is determined by the implementation at compile time. In contrast, the developer specifies the width of the data type fixed_size_simd<T>.

The class template simd has the following declaration:

template< class T, class Abi = simd_abi::compatible<T> >
class simd;
        

Here, T stands for the element type, which cannot be bool. The Abi tag determines the number of elements and how they are stored.

There are two aliases for this class template:

template< class T, int N >
using fixed_size_simd = std::experimental::simd<T, std::experimental::simd_abi::fixed_size<N>>;
		
template< class T >
using native_simd = std::experimental::simd<T, std::experimental::simd_abi::native<T>>;
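The difference between the two aliases can be checked at compile time. The following sketch assumes the experimental header; native_lanes is a hypothetical helper name, while size() and simd_size_v come from the library.

```cpp
#include <cstddef>
#include <experimental/simd>
#include <type_traits>

namespace stdx = std::experimental;

// fixed_size_simd: the developer chooses the number of elements.
static_assert(stdx::fixed_size_simd<float, 8>::size() == 8);

// The alias is only a shorter spelling of simd with the fixed_size ABI tag.
static_assert(std::is_same_v<stdx::fixed_size_simd<float, 8>,
                             stdx::simd<float, stdx::simd_abi::fixed_size<8>>>);

// native_simd: the implementation chooses the number of elements at compile time.
// Hypothetical helper that reports this number for an element type T.
template <class T>
constexpr std::size_t native_lanes()
{
    return stdx::native_simd<T>::size();
}
```

On an x86-64 build limited to SSE2, native_lanes<int>() would be expected to return 4 (128 bits divided by 32 bits per int); with wider instruction sets enabled, the value may be larger.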
        

The following ABI tags are available:

  • scalar: storing a single element
  • fixed_size: storing a specified number of elements
  • compatible: ensures ABI compatibility on the target architecture (the default)
  • native: the most efficient ABI for the target architecture
  • max_fixed_size: maximum number of elements guaranteed to be supported by fixed_size
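A few of these properties can be stated as compile-time checks. The following is a sketch under the assumption that the experimental header follows the Parallelism TS v2, which requires max_fixed_size to be at least 32; lanes_for_int and the namespace alias abi are hypothetical names used only here.

```cpp
#include <cstddef>
#include <experimental/simd>
#include <type_traits>

namespace stdx = std::experimental;
namespace abi = stdx::simd_abi;

// Hypothetical helper: number of int elements a simd holds for a given ABI tag.
template <class Abi>
constexpr std::size_t lanes_for_int()
{
    return stdx::simd_size_v<int, Abi>;
}

// scalar: exactly one element.
static_assert(lanes_for_int<abi::scalar>() == 1);

// fixed_size<N>: exactly N elements.
static_assert(lanes_for_int<abi::fixed_size<4>>() == 4);

// max_fixed_size<T> is a constant rather than a tag type.
static_assert(abi::max_fixed_size<int> >= 32);

// compatible<T> is the default ABI tag of simd<T>.
static_assert(std::is_same_v<stdx::simd<int>,
                             stdx::simd<int, abi::compatible<int>>>);
```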

What’s next?

After this initial example of data-parallel types, I would like to take a closer look at their functionality in the next article.

Great Read. Thanks Rainer. 😊 Why use Data-Parallel types(SIMD)? Performance: Exploits vector registers in CPUs. Portability: Abstracts platform-specific vector intrinsics. Ease of Use: Cleaner than writing AVX/NEON intrinsics directly. Safe Fallback: Falls back to scalar loops if hardware lacks SIMD.
