Niels Cautaerts’ Post

My former colleague Hossein Ghorbanfekr and I recently wrote a book on GPU computing in Python.

While many Python programmers, data scientists, and researchers rely on GPU acceleration through high-level frameworks like PyTorch, we noticed that few grasp what’s happening under the hood. Historically, low-level GPU programming was the domain of C/C++ developers, leaving Python users dependent on high-level libraries that wrap low-level code written by someone else. These days, tools like the Numba JIT compiler and the Numba-CUDA backend enable Python developers to write high-performance, low-level GPU code without switching languages.

Our book, GPU-Accelerated Computing with Python 3 and CUDA, aims to make CUDA accessible to Python programmers who want to dig one level deeper or need more control over their GPU-accelerated code. We start with the fundamentals: writing and executing CUDA kernels, managing streams, profiling performance, and understanding memory hierarchies. Everything is taught through Python, using Numba-CUDA.

We then connect these concepts to high-level libraries like CuPy and RAPIDS, which integrate seamlessly with the scientific Python ecosystem. We also cover JAX as a flexible framework for differentiable programming and machine learning on GPUs and other accelerators.

The last third of the book combines everything to address practical applications, including solving the heat equation, detecting objects in images, simulating atomic interactions, and building and training a small transformer-based language model.

This project took a lot of evenings, weekends, and holidays over 1.5 years, but we hope we have made something that will benefit other researchers, data scientists, and engineers. We’re grateful to Packt for the opportunity to bring this book to life.

The e-book is available now on Amazon (https://a.co/d/03VXXelq), and the print version will be out in a few weeks. This is not an April Fools’ joke.
#gpu #hpc #python #CUDA #numba #scientificcomputing #machinelearning #RAPIDS #cupy #JAX
