Decoding Performance: A Scalability Showdown Between CalculiX and Code_Aster

In the demanding world of Finite Element Analysis (FEA), performance is king. When tackling complex engineering problems, especially with large datasets and intricate geometries, how efficiently your software scales with increasing computational resources can make or break your analysis time. This is precisely what we set out to investigate: a deep dive into the scalability of two prominent open-source FEA solvers, CalculiX and Code_Aster.

Our journey was driven by a common pain point for many users of open-source software: the sheer number of configuration options. With variables like geometry, mesh, loads, and conditions, it’s easy to get lost in the labyrinth of possibilities. To bring clarity, we performed a rigorous study, testing over 300 variations of the same FEA problem to establish some practical guidelines for users, even if they are highly specific to our benchmark.

Understanding Scalability and Speedup

Before we dive into the results, let’s quickly define our key metrics:

  • Scalability: Simply put, scalability is about how much the solution time decreases as we increase the number of hardware resources (like CPU cores) used.
  • Speedup (S): This measures how much faster a problem is solved as the number of CPUs increases. The formula is S = T1 / TN, where T1 is the time taken with one CPU and TN is the time taken with N CPUs.
  • Efficiency (E): This tells us how effectively the speedup is achieved relative to the increase in CPUs. The formula is E = S / N = T1 / (N * TN).
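
In code, these definitions are just a couple of one-liners (a minimal Python sketch; the timings below are hypothetical placeholders, not measurements from this study):

```python
def speedup(t1, tn):
    """Speedup S = T1 / TN."""
    return t1 / tn

def efficiency(t1, tn, n):
    """Efficiency E = S / N = T1 / (N * TN)."""
    return speedup(t1, tn) / n

# Hypothetical timings: 1 CPU takes 1200 s, 8 CPUs take 200 s.
s = speedup(1200, 200)        # 6.0
e = efficiency(1200, 200, 8)  # 0.75 — each CPU is used at 75% effectiveness
```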

As we increase the number of CPUs, speedup generally increases until a certain limit is hit. Pushing beyond this limit can lead to diminishing returns, meaning low efficiency, no speedup at all, or even a reduction in performance. This limit, however, is not fixed and depends on several factors, including the software version, hardware used, and the nature of the simulation (e.g., the models involved and the mesh density).
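
This plateau is well captured by Amdahl's law, a standard model (not one used in the study itself): if a fraction p of the work is parallelizable, the best achievable speedup on N CPUs is S(N) = 1 / ((1 - p) + p/N). A quick sketch shows the diminishing returns:

```python
def amdahl_speedup(p, n):
    """Upper bound on speedup when a fraction p of the work is parallel (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the work parallelized, returns diminish quickly:
for n in (1, 2, 4, 8, 16, 64):
    print(n, round(amdahl_speedup(0.95, n), 2))
# The limit as n grows is 1 / (1 - p) = 20x, no matter how many CPUs are added.
```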

Our Testbed: Hardware and Software

To conduct our experiments, we utilized a Cloud HPC environment, specifically leveraging:

  • CPUs: AMD EPYC 7B13 @ 2.8 GHz and Intel Xeon Platinum 8581C @ 2.9 GHz. We explored configurations with varying core counts and threads per core (denoted XX-YY, where XX is the number of cores and YY the number of threads per core).
  • RAM: 8 GB per vCPU allocated.

On the software front, we tested:

  • Code_Aster 17.0: Utilizing a Singularity container compiled in-house.
  • CalculiX 2.21: Publicly available version, as well as versions compiled in-house with PARDISO (v2.19) and PARDISO with MPI (v2.18).

The FEM Model: A Consistent Benchmark

To ensure a fair comparison, we employed a single, well-defined FEA model:

  • Analysis Type: Static linear analysis.
  • Elements: 3D elements with a tetrahedral mesh.
  • Loading: Constrained with gravity force only.
  • Post-processing: Not considered; only displacement was calculated.
  • Size: 3,984,763 cells and 865,432 nodes.


We also explored different solver configurations within both Code_Aster and CalculiX, including various MUMPS and PETSc options for Code_Aster, and default, Spooles, PARDISO MultiThread, and PARDISO MPI for CalculiX.
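
For orientation, switching solvers in Code_Aster is a one-keyword change in the command file (which uses Python syntax). The fragment below is a hedged sketch following standard Code_Aster command syntax; MODELE, CHMAT, and CHARG are placeholder objects that would be defined earlier in a real command file, not definitions from this study:

```python
# Code_Aster command-file fragment (not standalone Python): the linear
# solver is chosen via the SOLVEUR keyword of MECA_STATIQUE.
RESU = MECA_STATIQUE(
    MODELE=MODELE,                # model defined earlier in the command file
    CHAM_MATER=CHMAT,             # material field assignment
    EXCIT=_F(CHARGE=CHARG),       # the gravity load in our benchmark
    SOLVEUR=_F(METHODE='MUMPS'),  # or METHODE='PETSC' for the PETSc options
)
```

For CalculiX, the thread count used by the multithreaded solvers is typically controlled through the OMP_NUM_THREADS environment variable before launching ccx.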

Key Findings and Observations:

Our extensive testing revealed some compelling trends:

1. Code_Aster Scalability with Hyperthreading: Increasing the number of threads in Code_Aster reduced solution times, but only up to a limited number of threads per core (about 4). Solver configurations such as PETS-4 and MULT-5 / MUMP-2 showed the most promising hyperthreading scalability within this range.


2. Code_Aster Scalability with Multi-Core: The time reduction with increasing CPU cores was most significant when combined with 2 threads per core. As with hyperthreading, certain solver configurations such as PETS-4 and PETS-3 demonstrated better multi-core scalability under these conditions.


3. Intel vs. AMD Performance: When comparing Intel and AMD processors, we found no major difference in computational speed for the tested configurations. The primary distinction lay in the LLC (Last Level Cache) allocation, which didn’t translate into significant performance disparities in our benchmark.


4. NITER and Scalability: Testing with NITER = 10 (both hyperthread and multicore) provided further insights into how the number of iterations affects scalability. The results generally aligned with previous observations, with specific solvers showing better scaling tendencies.


5. STAT vs. MECA Scalability: Comparing Code_Aster's MECA_STATIQUE and STAT_NON_LINE analysis operators, we observed that MECA_STATIQUE generally offered better scalability across the tested configurations.


6. CalculiX vs. Code_Aster Comparison: This was a crucial part of our study. We observed that CalculiX, particularly with its PARDISO solver (with and without MPI), often demonstrated superior scalability compared to Code_Aster in our benchmark. CalculiX with PARDISO remained effective at up to 200,000 nodes per core, while Code_Aster's thread scalability hit its limit sooner (around 2-4 threads; some solvers handled up to 100,000 nodes per core, but with significant RAM consumption).
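
To relate those nodes-per-core figures back to this benchmark's mesh, a small helper converts a per-core workload into a core count (a sketch; the function name and the interpretation of the ceilings are ours, and the limits are specific to this benchmark):

```python
import math

def cores_for_workload(num_nodes, nodes_per_core):
    """Core count at which the per-core workload drops to nodes_per_core."""
    return math.ceil(num_nodes / nodes_per_core)

NODES = 865_432  # node count of the benchmark model

# CalculiX + PARDISO stayed efficient at up to ~200k nodes per core:
print(cores_for_workload(NODES, 200_000))  # 5
# Code_Aster's better configurations reached ~100k nodes per core:
print(cores_for_workload(NODES, 100_000))  # 9
```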


Conclusions and Recommendations:

Our in-depth analysis leads to these key takeaways:

For Code_Aster:

  • MECA_STATIQUE is the preferred analysis operator for scalability when the problem is linear.
  • PETSc solvers also show good performance.
  • Be mindful of thread limits: beyond 2 threads per core (4 at most), speedups stop being proportional and RAM consumption can rise sharply.
  • Intel and AMD architectures showed comparable performance in our tests.

For CalculiX:

  • The PARDISO solver (with or without MPI) is a strong contender for achieving excellent scalability.
  • CalculiX demonstrates impressive scalability, handling up to 200,000 nodes per core effectively.

This study underscores the importance of understanding your software’s scalability characteristics and how they interact with your hardware and the specific problem you’re trying to solve. While our findings are based on a specific benchmark, they offer valuable insights for users looking to optimize their FEA workflows with open-source tools.

Have you had similar experiences with CalculiX or Code_Aster? Share your thoughts and findings in the comments below!

