High-Performance Computing in Simulation

Explore top LinkedIn content from expert professionals.

Summary

High-performance computing in simulation refers to the use of powerful supercomputers and advanced algorithms to solve complex scientific and engineering problems in virtual environments, problems that would be intractable or extremely slow on ordinary computers. This technology enables researchers to model everything from rocket exhaust plumes to molecular interactions in remarkable detail, speeding up discoveries and innovations across industries.

  • Explore automation: Streamline simulation workflows by adopting automated tools for mesh generation and data processing, reducing manual work and saving valuable time.
  • Harness parallel power: Run simulations across thousands of computing cores at once, enabling faster analysis of large-scale problems and more thorough exploration of possible solutions.
  • Integrate AI and quantum: Combine artificial intelligence and quantum-inspired methods with high-performance computing to tackle previously impossible simulation tasks, expanding the boundaries of what can be achieved in research and development.
Summarized by AI based on LinkedIn member posts

  • Nukri B.

    🇺🇸 Founder Super Protocol | PhD Nuclear Physics | Architecting Secure, Private Swarm Intelligence at Scale

    16,049 followers

    Quadrillion Calculations: How the Super Heavy Exhaust Was Simulated

    Scientists at Georgia Tech did something long considered impossible: they simulated the turbulent exhaust plumes of 33 rocket engines firing at the same time. It is the largest fluid dynamics simulation in history, exceeding one quadrillion degrees of freedom. For scale, a quadrillion is a 1 followed by 15 zeros; that is how many independent variables the El Capitan supercomputer tracked simultaneously to model how the scorching exhaust streams interact.

    Why is this needed at all? Modern rockets like SpaceX's Super Heavy use dozens of smaller engines instead of a few large ones: they are easier to manufacture, provide redundancy, and are simpler to transport. But when all of them fire together, their exhaust plumes at Mach 10 form a hellish mixture. Hot gases can reflect back toward the rocket's base and destroy it. Testing this in a wind tunnel is impossible because the conditions are too extreme, so simulation is the only option. Traditional methods, however, would require weeks of computation even for a simplified model.

    The team's trick: instead of traditional shock-wave capturing, they developed the IGR method, Information Geometric Regularization. It sounds complicated, but the essence is simple: they reformulated how the computer treats shock waves. The result was an 80× speed-up and a 25× reduction in memory usage.

    El Capitan is not just a powerful computer. It has a unique architecture with unified memory on AMD MI300A chips, where the CPU and GPU share the same physical memory. The team used all 11,136 nodes of the machine, more than 44,500 AMD accelerators. A simulation that once took weeks finished in hours, and energy consumption dropped by a factor of five.

    The most interesting part is the applications. The technique works not only for rockets: aircraft noise prediction, biomedical hydrodynamics, any high-speed flow where turbulence must be modeled without artificial viscosity. The work has been nominated for the Gordon Bell Prize, the highest award in supercomputing; the winner will be announced on November 20 in St. Louis. Ironically, El Capitan was created for nuclear weapons simulation, and its first public use is to help SpaceX avoid burning its own rocket with its own exhaust. https://lnkd.in/g-gxB4hR
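
To make the quadrillion figure concrete, here is a back-of-envelope memory check. It uses the node count from the post plus assumed per-node memory (4 MI300A APUs at 128 GB of HBM each; round numbers for illustration, not figures from the paper):

```python
# Rough scale check: does one quadrillion degrees of freedom (DOF) even
# fit in El Capitan's memory? Hardware figures below are assumptions.
NODES = 11_136                # node count quoted in the post
APUS_PER_NODE = 4             # assumed MI300A count per node
HBM_PER_APU_GB = 128          # assumed unified HBM capacity per MI300A

cluster_hbm_pb = NODES * APUS_PER_NODE * HBM_PER_APU_GB / 1e6
print(f"Cluster HBM: ~{cluster_hbm_pb:.1f} PB")          # ~5.7 PB

dof = 1e15                    # >1 quadrillion independent variables
for bytes_per_value, label in [(8, "fp64"), (4, "fp32")]:
    footprint_pb = dof * bytes_per_value / 1e15
    print(f"One copy of the state in {label}: {footprint_pb:.0f} PB")
# A single fp64 copy (~8 PB) would not fit at all, and a real solver also
# needs fluxes, residuals, and halo buffers -- which is why a 25x memory
# reduction is not a nicety but a prerequisite.
```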

  • Holger Marschall

    Turning uncertainty into informed decisions through simulation | Chief Product & Innovation Officer at IANUS Simulation | Professor at TU Darmstadt

    37,317 followers

    ✈️ NASA's Automated Meshing for High-Fidelity Aerodynamic Simulations

    Advancements in Wall-Modeled Large-Eddy Simulation (WMLES) are revolutionizing aerodynamics in the aerospace industry. At NASA Ames Research Center, a cutting-edge unstructured solver and mesher pairing within the LAVA framework is addressing one of WMLES's biggest hurdles: the creation of large-scale, high-quality meshes for complex geometries. Traditional mesh generation for high-lift configurations can take weeks and a team of specialists. With automated Voronoi grids, it's down to a couple of days, requiring just one person! These highly scalable, body-conforming grids deliver accurate, engineering-relevant results while slashing costs and time.

    🚀 Highlights:
    ➡️ High resolution: simulated the flow around half of a 60-meter-wingspan aircraft, resolving 4 mm flow details.
    ➡️ Efficient computation: ran for two days on 100 AMD "Rome" nodes of NASA's Aitken supercomputer.
    ➡️ Automation: Voronoi grids cut hands-on effort while matching the accuracy of custom-built meshes.
    ➡️ Next steps: enhancing meshing via parallel load balancing and accelerating solvers with GPUs.

    ✒️ Author credit: Victor Sousa & Emre Sozer, NASA Ames Research Center

    🎥 The video shows "surface contours of the friction coefficient experienced by NASA High-Lift Common Research model at an angle-of-attack of 19.7 degrees, where small-turbulent fluctuations are visible. Dark regions highlight locations where flow separation is likely to occur." Enjoy!

    📢 CFD experts, how do you see automated mesh generation shaping the future of simulations? Share your thoughts below!

    #CFD #Simulation #Technology #WMLES #Aerodynamics #NASAResearch #NASA #HighPerformanceComputing #Automation #CAE #Engineering #SimulationExcellence #TurbulenceModeling
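
A minimal sketch of the Voronoi-grid idea: scatter seed points with graded density near a body, then let the Voronoi cells play the role of mesh cells. This toy uses SciPy and a hypothetical circular "body"; it is an illustration of the technique, not NASA's LAVA mesher:

```python
# Toy Voronoi "meshing": seeds are denser near an assumed unit-circle
# body in a [-4, 4]^2 domain, yielding a graded, body-fitted cell layout.
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(0)

def seed_points(n: int) -> np.ndarray:
    pts = rng.uniform(-4, 4, size=(n, 2))
    r = np.linalg.norm(pts, axis=1)
    # Keep near-body points with high probability -> graded refinement.
    keep = rng.uniform(size=n) < np.exp(-(r - 1.0) ** 2)
    # Retain a sparse sample of far-field points as coarse cells.
    return np.concatenate([pts[keep], pts[~keep][: n // 10]])

seeds = seed_points(5000)
vor = Voronoi(seeds)  # each region of `vor` would be one mesh cell
print(f"{len(seeds)} seeds -> {len(vor.regions)} Voronoi cells")
```

The appeal for automation is that the whole pipeline is driven by a point distribution rather than by hand-crafted block topology, which is what lets one person replace a meshing team.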

  • Pradyumna Gupta

    Building Infinita Lab - Uber of Materials Testing | Driving the Future of Semiconductors, EV, and Aerospace with R&D Excellence | Collaborated in Gorilla Glass's Invention | Material Scientist

    20,796 followers

    A year ago we were simulating tens of thousands of atoms. Today, with exa-AMD and exascale GPU clusters, we can run millions of atoms in minutes, and thousands of materials in parallel.

    Everyone keeps bragging about "billions of atoms" simulations. But almost no one is asking the real question: what does that enable for real materials design? This isn't about doing one simulation faster. It's about automating discovery across entire chemistries at once. Instead of 5–10 candidate compounds in a PhD cycle, we can now screen 5,000, with structure, defects, dopants, and interfaces included.

    The bottleneck has officially shifted: not "can we simulate this?" but "how fast can we validate what the compute is telling us?"

    And the contrarian truth is: most people still think exascale is an academic toy, something for national labs to brag about. But it's already becoming industrial infrastructure. Battery companies, semiconductor fabs, electrolyzer startups, and hydrogen catalyst labs are all building AI + exascale loops to collapse discovery time.

    The competitive edge for a materials lab won't be how well you simulate one system. It'll be how seamlessly you tie simulation → experiment → scale, on repeat. If our R&D stack isn't ready to screen 100× more candidates by 2026, we're obsolete.

    #MaterialsScience #Exascale #HighPerformanceComputing #ComputationalMaterials #exaAMD #AIForScience #DiscoveryAutomation #R&DTransformation
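
The screening pattern described above, reduced to its skeleton: fan candidate compounds out across workers, score them, and rank the results. The scoring function here is a hypothetical stand-in for a real DFT/MD/MLIP evaluation, and the candidate formulas are made up for illustration:

```python
# Skeleton of a parallel materials-screening loop: 5,000 candidates
# evaluated concurrently instead of a handful per "PhD cycle".
from concurrent.futures import ProcessPoolExecutor

def score_candidate(formula: str) -> tuple[str, float]:
    # Placeholder score: a real pipeline would relax the structure,
    # insert defects/dopants/interfaces, and compute a target property.
    return formula, (sum(ord(c) for c in formula) % 100) / 100.0

# Hypothetical composition grid, 50 x 100 = 5,000 candidates.
candidates = [f"Li{i}Ni{j}O2" for i in range(1, 51) for j in range(1, 101)]

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(score_candidate, candidates, chunksize=100))
    top = sorted(results, key=lambda kv: kv[1], reverse=True)[:10]
    print(f"Screened {len(results)} candidates; top hit: {top[0]}")
```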

  • Rut Lineswala

    Founder & CTO | Innovating the Space of Simulations & Quantum Tech

    5,086 followers

    Space innovation just reached a new height, and this time the breakthrough happened inside a simulation! 🤯 Researchers in the United States have achieved a milestone that meaningfully shifts what we consider "feasible" in high-fidelity CFD. Using the El Capitan exascale supercomputer at Lawrence Livermore National Laboratory, the team executed one of the most detailed rocket-exhaust plume simulations ever performed, setting a new global record for CFD resolution.

    Why this matters: CFD has long been limited by compute. Capturing shock interactions, acoustic loads, and turbulent structures at extremely fine scales pushes memory, bandwidth, and solver stability to their limits. This new work pushes those limits far outward. The simulation resolved over a quadrillion degrees of freedom, representing hundreds of trillions of spatial points, each updating the flow variables at every timestep. A decade ago, the memory footprint alone would have made this impossible. On El Capitan, the job completed in hours!

    How El Capitan made it possible: with 11,136 nodes built on AMD's accelerated processors (and more than 44,500 compute engines active), the system delivered:
    ⚡ ~80× faster time-to-solution
    🧠 ~25× lower memory use
    🌱 ~5× lower energy consumption

    But hardware wasn't the only enabler. The team introduced Information Geometric Regularization (IGR), a novel numerical technique designed to stabilize extreme flow fields without erasing critical physical detail, and to overcome shock/boundary-layer interaction (SBLI) challenges. Where conventional shock-capturing schemes often force unstable oscillations or extremely small timesteps, IGR suppresses non-physical behavior while preserving sharp gradients in pressure and density. Integrated into the existing framework, it enabled the solver to operate steadily at quadrillion-scale resolution, previously considered unrealistic.

    What this means for the future of CFD: breakthroughs like this show how much progress occurs when advanced hardware meets smarter mathematics. Exascale machines demonstrate what is technically possible, but they also highlight a practical truth: most engineering teams still depend on commercial CPU/GPU infrastructure with strict memory, runtime, and cost constraints. This is the space we focus on at BQP. Rather than relying on larger supercomputers like El Capitan, we are exploring how quantum-inspired methods can accelerate the most computationally demanding parts of CFD on the hardware that industry already uses today. Our recent collaboration with NVIDIA and Classiq shows that hybrid quantum-classical techniques can deliver meaningful speedups even without exascale resources. (Full case study linked in the comments.)

    Spencer Bryngelson Florian Schaefer Abhishek Chopra Jash Minocha Aditya Singh Georgia Institute of Technology New York University

    #HPC #CFD #Quantum #DigitalTwin #SpaceX #Innovation https://lnkd.in/dNvFepaF
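
For readers wondering what "reformulating how shocks are processed" can look like: schematically (hedged, since the exact operator and coefficients live in the published IGR papers and this only shows the shape of the construction), IGR augments the pressure in the compressible Euler equations with an "entropic" field Σ obtained from an elliptic equation sourced by velocity gradients, so shocks stay sharp without artificial viscosity and without prohibitively small timesteps:

```latex
% Schematic only -- see the IGR papers for the exact system.
% Euler equations with the pressure p augmented by an entropic term Sigma:
\partial_t \rho + \nabla\cdot(\rho\mathbf{u}) = 0, \qquad
\partial_t(\rho\mathbf{u})
  + \nabla\cdot\!\big(\rho\,\mathbf{u}\otimes\mathbf{u} + (p+\Sigma)\,\mathrm{I}\big) = 0,
% where Sigma solves an elliptic problem driven by velocity gradients,
% schematically of the form
\Sigma - \alpha\,\nabla\cdot\!\Big(\tfrac{1}{\rho}\nabla\Sigma\Big)
  = \alpha\,\rho\,\Big[\operatorname{tr}\!\big((\nabla\mathbf{u})^2\big)
  + (\nabla\cdot\mathbf{u})^2\Big].
```

The practical consequence is the one both posts report: no artificial viscosity tuned per shock, and a solver stable enough to run at quadrillion-scale resolution.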

  • Keith King

    Former White House Lead Communications Engineer, U.S. Dept of State, and Joint Chiefs of Staff in the Pentagon. Veteran U.S. Navy, Top Secret/SCI Security Clearance. Over 16,000+ direct connections & 44,000+ followers.

    43,859 followers

    Headline: China's Oceanlite Supercomputer Marries AI and Quantum Science, as 37 Million Cores Simulate Molecular Quantum Chemistry

    Introduction: In a milestone achievement, Chinese researchers have fused artificial intelligence with traditional supercomputing to simulate complex quantum chemistry at molecular scale, without using a quantum computer. Using the Oceanlite supercomputer and its 37 million processing cores, the Sunway team has achieved a feat previously deemed impossible on classical machines.

    Key insights:

    1. Bridging AI and quantum physics. Quantum chemistry models the probabilistic behavior of particles such as electrons within molecules, governed by the wavefunction (Ψ). Such simulations are normally restricted to small molecules due to the exponential complexity of quantum states. To overcome this barrier, the Sunway team used neural-network quantum states (NNQS), allowing machine learning to approximate molecular wavefunctions with quantum-level accuracy.

    2. Record-breaking simulation. Researchers modeled a molecular system containing 120 spin orbitals, the largest AI-driven quantum chemistry simulation yet conducted on a classical supercomputer. The NNQS was trained to predict electron energy distributions and refined itself iteratively until it mirrored true molecular quantum behavior. This demonstrates that deep learning frameworks can replicate quantum effects at unprecedented scale.

    3. Oceanlite's engineering triumph. The experiment ran on Sunway SW26010-Pro CPUs, each chip featuring 384 cores optimized for high-performance computing (HPC). Engineers built a hierarchical communication model in which management cores coordinate millions of lightweight compute processing elements (CPEs). The run achieved 92% strong-scaling and 98% weak-scaling efficiency, indicating near-perfect hardware-software synchronization, an exceptional accomplishment in exascale computing.

    4. Strategic and scientific impact. The work marks a leap forward for China's AI and quantum research sectors, blending HPC power with neural architectures, and positions China at the frontier of simulating quantum systems without quantum hardware.

    Why it matters: This breakthrough redefines the boundary between classical and quantum computing, offering a path to simulate and design complex molecules (essential for materials science, drug discovery, and clean energy research) using today's infrastructure. It also signals China's deepening command of exascale computing and its integration with AI, setting a new global benchmark in scientific computing innovation.

    I share daily insights with 28,000+ followers and 10,000+ professional contacts across defense, tech, and policy. If this topic resonates, I invite you to connect and continue the conversation. Keith King https://lnkd.in/gHPvUttw
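
The NNQS idea can be made concrete with the classic restricted-Boltzmann-machine ansatz of Carleo and Troyer (2017): a small network assigns an amplitude to each spin configuration, and Monte Carlo sampling explores |Ψ|². The sketch below is an illustration of the general technique for a toy spin system; it is not the Sunway team's code or their molecular ansatz, and it omits the energy-minimization (training) loop:

```python
# Minimal RBM-style neural-network quantum state with Metropolis sampling.
import numpy as np

rng = np.random.default_rng(1)
n_spins, n_hidden = 12, 24

a = 0.01 * rng.standard_normal(n_spins)              # visible biases
b = 0.01 * rng.standard_normal(n_hidden)             # hidden biases
W = 0.01 * rng.standard_normal((n_spins, n_hidden))  # couplings

def log_psi(s: np.ndarray) -> float:
    """Log-amplitude of a spin configuration s in {-1, +1}^n."""
    return a @ s + np.sum(np.log(np.cosh(b + s @ W)))

# Metropolis sampling of |psi|^2: flip one spin, accept with ratio
# |psi(new)/psi(old)|^2 (amplitudes are real in this toy).
s = rng.choice([-1.0, 1.0], size=n_spins)
for _ in range(1000):
    i = rng.integers(n_spins)
    s_new = s.copy()
    s_new[i] *= -1
    if rng.uniform() < np.exp(2 * (log_psi(s_new) - log_psi(s))):
        s = s_new
print("sampled configuration:", s.astype(int))
```

Training would adjust a, b, and W to minimize the sampled energy of a molecular Hamiltonian; the exponential object (the wavefunction) never has to be stored explicitly, which is what makes the approach tractable on classical hardware.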

  • Alicia Welden

    Quantum Chemistry | Quantum Computing | AI

    3,835 followers

    Your HPC simulations probably run at <3% of peak performance. Here's why, and what SC25 revealed 👇

    1/ FLOPs don't predict scientific performance. The Top500 uses Linpack, a benchmark for dense linear algebra. But most scientific codes (MD, DFT, MLIPs, climate) are sparse, communication-heavy, memory-bound, and irregular. That's why even exascale machines deliver 0.6%–3% of peak on real workloads.

    2/ HPCG (high-performance conjugate gradient) is a more honest test for real simulation work. HPCG measures the building blocks of scientific computing: sparse matrix-vector multiply, multigrid V-cycles, communication collectives, irregular memory access. It reveals how well a machine handles real simulation patterns, not theoretical FLOPs. That's why the HPCG Top10 looks nothing like the Top500.

    3/ The actual bottleneck is data movement. Jack Dongarra said it best: "Arithmetic is inexpensive and oversubscribed." What slows your job down is memory bandwidth, interconnect latency, node-to-node communication, and data locality. Your simulation is movement-limited rather than compute-limited.

    4/ HPC systems are now fully heterogeneous. 2025 systems include AMD MI300A, NVIDIA Grace + GH200, Intel Max GPUs, ARM A64FX, and cloud-native HPC nodes. No two machines are built the same anymore; your software and workflows must be ready to adapt.

    5/ Precision is shifting. 64-bit used to dominate simulation, but mixed-precision and adaptive-precision methods are becoming practical (thanks to AI and hardware changes). The future is right-precision computing instead of "max precision by default."

    If you run scientific simulations, the key question isn't FLOPs, but rather: "How fast can I move data, and how well does my algorithm tolerate irregularity?" This will shape the next decade of scientific computing.

    Have you ever profiled your simulation to understand where it's actually limited (bandwidth? latency? compute?) What did you find?

    #HPC #Supercomputing #ScientificComputing #Top500 #SC25 #ComputationalScience #AIInfrastructure #MaterialsScience #Exascale
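
The <3%-of-peak claim follows directly from a roofline argument. The sketch below uses assumed round hardware numbers (no specific machine) and a textbook estimate of sparse matrix-vector multiply's arithmetic intensity:

```python
# Roofline-style estimate of why sparse workloads sit far below peak.
# Hardware figures are round assumptions for illustration only.
peak_tflops = 50.0      # assumed GPU fp64 peak, TFLOP/s
bandwidth_tbps = 3.0    # assumed HBM bandwidth, TB/s

# CSR sparse matrix-vector multiply: ~2 flops per nonzero (multiply-add)
# against ~12+ bytes moved per nonzero (fp64 value, int32 column index,
# plus the gathered vector entry).
ai_spmv = 2.0 / 12.0    # arithmetic intensity, flops/byte

attainable = min(peak_tflops, ai_spmv * bandwidth_tbps)
print(f"Attainable: {attainable:.2f} TFLOP/s "
      f"= {100 * attainable / peak_tflops:.1f}% of peak")
# ~0.5 TFLOP/s, i.e. ~1% of peak: bandwidth, not arithmetic, is the wall,
# which is exactly the regime HPCG measures and Linpack does not.
```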

  • Dirk Hartmann

    Head of Simcenter Technology Innovation | Full Professor TU Darmstadt | Siemens Distinguished Key Expert | Siemens Top Innovator and Inventor of the Year

    9,881 followers

    🚀 Accelerating CFD - Preprint Alert! 🚀

    Excited to share our latest preprint: "Accelerating Computational Fluid Dynamics with Transported Memory Networks" 🧠🌊

    In this work, we introduce Transported Memory Networks (TMNs), inspired by Sepp Hochreiter's LSTMs, as a novel approach to capture the effect of unresolved scales in #FluidDynamics by means of #MachineLearning. Or more boldly: to accelerate Computational Fluid Dynamics (#CFD).

    🔎 The key #Insight? 🔍 In CFD, CNNs do not just capture spatial relations; they effectively infer temporal dependencies. Convolutions collect information from further downstream, thus effectively reaching back in time ⏳. Our approach takes a memory-based perspective in which information is advected along the flow (Eulerian view). By leveraging LSTM-inspired architectures that explicitly incorporate gradient information, we demonstrate that the memory is de facto transported. The network is trained via an autoregressive approach (solver-in-the-loop), ensuring robustness and better alignment with the physics ⚙️.

    Why does this matter?
    ✅ More physically consistent #ML-extended solvers 🏗️
    ✅ Better suited for high-performance industrial CFD 🏭
    ✅ A step closer to scalable, ML-based physics solvers 🔬

    Of course, challenges remain, but we're making rapid progress toward ML-driven CFD for real-world applications! 💡 Big kudos to the main contributors Matthias Schulz and Gwendal Jouan, as well as Stefan Gavranovic and Daniel Berger, for making this possible! 🎉 Check it out! Link in the comments! And as always, feedback and thoughts are much appreciated.

    #MachineLearning #CFD #DeepLearning #FluidDynamics #AIForScience
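
A schematic reading of the "transported memory" idea, assumed from the post rather than taken from the paper's actual architecture: a per-cell memory field is first advected with the flow, then updated by an LSTM-like gated step driven by local flow features. All names and parameters below are illustrative:

```python
# Toy 1D "transported memory" loop: advect the hidden state, then apply
# a gated update. A sketch of the concept, not the paper's TMN.
import numpy as np

nx, dt, dx = 128, 0.1, 1.0
u = 0.8 * np.ones(nx)                 # advection velocity field
h = np.zeros(nx)                      # memory (hidden state), one per cell
Wf, Wi, bf, bi = 0.5, 0.5, 0.0, 0.0   # toy gate parameters

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def advect(field, u):
    """Semi-Lagrangian step: pull each cell's value from upstream."""
    src = (np.arange(nx) - u * dt / dx) % nx
    i0 = np.floor(src).astype(int)
    w = src - i0
    return (1 - w) * field[i0] + w * field[(i0 + 1) % nx]

feature = np.sin(2 * np.pi * np.arange(nx) / nx)  # stand-in flow feature
for _ in range(100):
    h = advect(h, u)                              # transport the memory
    f, i = sigmoid(Wf * feature + bf), sigmoid(Wi * feature + bi)
    h = f * h + i * np.tanh(feature)              # gated LSTM-like update
print("memory range:", h.min(), h.max())
```

The design point the post makes is visible even in this toy: because the memory moves with the material, the network does not need wide convolutions to "reach back in time" through upstream cells.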

  • Dr.-Ing. Michael Blank

    Building Particle-Based Simulation Solutions for Additive Manufacturing

    4,472 followers

    Over the past few months, I've extensively refactored my simulation software to significantly enhance GPU performance and to enable seamless integration with the concurrently developed graphical user interface. A key milestone: the software can now run full-part simulations directly from G-code input, with all parsing and processing handled internally.

    The video below demonstrates a Directed Energy Deposition (DED) simulation of a cone structure, built from a 1 mm diameter wire of a zirconium-based alloy melted by a ring-shaped laser. The simulation represents 40 minutes of real-world process time and was computed over two days on a single GPU.

    The results highlight the importance of fine-tuning process parameters, especially as the layer count increases. Identifying optimal settings throughout the build is a complex challenge, and this is where the software excels: it enables efficient parameter studies and rapid exploration of process variations, significantly accelerating the path to optimized results.

    Simulated on an AMD Ryzen 7950X3D and an NVIDIA RTX 4090. Video rendered in Blender.
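
A minimal sketch of the G-code-to-toolpath step such a pipeline needs, illustrative only and not Dr. Blank's software: extract positions from G0/G1 lines so a solver can replay the deposition path (G1 moves deposit material, G0 moves are travel):

```python
# Tiny G-code move extractor: comments are stripped, X/Y/Z words update
# a running position, and each G0/G1 line yields one toolpath segment.
import re

GCODE = """\
G1 X10.0 Y0.0 F1200 ; first segment
G1 X10.0 Y10.0
G0 X0.0 Y0.0 ; travel move, no deposition
"""

def parse_moves(text):
    moves, pos = [], {"X": 0.0, "Y": 0.0, "Z": 0.0}
    for line in text.splitlines():
        line = line.split(";")[0].strip()          # drop comments
        if not line.startswith(("G0", "G1")):
            continue
        for axis, val in re.findall(r"([XYZ])([-+]?\d*\.?\d+)", line):
            pos[axis] = float(val)
        moves.append((line.split()[0], dict(pos)))  # snapshot position
    return moves

for cmd, p in parse_moves(GCODE):
    print(cmd, p)
```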
