Deep Learning: Building a GPU Computer for Experimental Particle Physics
At the ATLAS group at the Niels Bohr Institute, and in experimental particle physics in general, we have several open problems where deep learning can bring a large improvement with respect to more classical approaches.
Signal vs background classification:
- Electron identification. ATLAS currently uses a likelihood-based method; we want to explore the improvement achievable with deep neural networks. We know we can do better using the current information, and probably even better by combining cluster-level information with a digital-imaging approach (convolutional neural networks).
- Boosted di-boson decays: identify real di-boson events against the multijet background passing our signal selection.
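To illustrate what the network-based classification looks like, here is a minimal NumPy sketch of a feed-forward network producing a per-event signal score. The feature set and the (randomly initialised) weights are placeholders, not the actual ATLAS likelihood inputs or a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical per-electron input features (shower shapes, track
# information, etc.); the real ATLAS variables are not listed here.
n_events, n_features = 1000, 10
X = rng.normal(size=(n_events, n_features))

# Randomly initialised weights stand in for trained parameters.
W1 = rng.normal(scale=0.1, size=(n_features, 32))
b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 1))
b2 = np.zeros(1)

def dnn_score(X):
    """Signal-vs-background score in (0, 1) from a one-hidden-layer network."""
    return sigmoid(relu(X @ W1 + b1) @ W2 + b2).ravel()

scores = dnn_score(X)  # one score per event; cut on it to select signal
```

In practice the same architecture is built in Keras and trained on labelled simulation, but the forward pass above is all the classifier does at evaluation time.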
Regression:
- Top mass fitting: Use Deep Learning to extract the mass of the top quark from leptonic variables.
- Use a DNN to convert the readout bit pattern into a drift-time distance for drift-tube gas detectors.
We have approached these problems with our current computing facilities: traditional CPU clusters. There, the total computing time needed to reach final results can exceed a week. We therefore started building a GPU-based computer to speed things up (and to be able to test more complex neural networks). This is a development machine, meant to evaluate where future investments in our computing facilities should go.
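For the drift-tube task, a rough NumPy sketch of the classical decoding the DNN would learn to refine: take the leading edge of the readout bit pattern as the drift time and apply a linear r-t relation. The TDC bin width and drift speed below are assumed values for illustration, not the real detector constants:

```python
import numpy as np

# Hypothetical readout parameters; the real TDC bin width and r-t
# relation depend on the detector and gas mixture.
TDC_BIN_NS = 0.78        # time per readout bit, in ns (assumed)
DRIFT_SPEED_UM_NS = 50.0  # linear drift speed, in um/ns (assumed)

def leading_edge_time(bit_pattern):
    """Time of the first '1' bit in the readout word, in ns."""
    hits = np.flatnonzero(np.asarray(bit_pattern))
    if hits.size == 0:
        return None  # no hit in this tube
    return hits[0] * TDC_BIN_NS

def drift_distance(bit_pattern):
    """Classical linear r-t conversion, in um; the DNN would learn a
    non-linear version of this mapping from the full bit pattern."""
    t = leading_edge_time(bit_pattern)
    return None if t is None else t * DRIFT_SPEED_UM_NS

pattern = [0, 0, 0, 1, 1, 1, 0, 0]  # example readout word
# leading edge at bin 3 -> t = 3 * 0.78 ns, r = t * 50 um/ns
```

The advantage of the DNN is that it can exploit the whole bit pattern (pulse width, multiple hits) rather than only the leading edge.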
Machine configuration:
- 2 x NVIDIA GeForce GTX 1080: 8.8 TFLOPS and 8 GB of memory each. Today a better option would be the GTX 1080 Ti or the Titan Xp (11 and 12 GB of memory, respectively).
- Intel Xeon E5-1620. Its 40 PCIe lanes are fundamental to use the full bus bandwidth and feed both GPUs at the same time.
- 32 GB of low-latency DDR memory.
- Gigabyte X99 motherboard with support for 2 x PCIe x16 (be aware that few consumer-grade boards support two full x16 slots).
- 512 GB Intel SSD (M.2) for the OS and heavily used datasets.
- 3 TB HDD to store large datasets.
- A large full-ATX tower to host at least 2 GPUs (with the possibility to upgrade to 4). Remember to leave room for cooling.
- Three large fans to dissipate the heat produced by the two GPUs, and liquid cooling for the CPU.
Total Budget Spent:
- 19,000 DKK (approximately 2,500 EUR / 2,970 USD).
- The components were bought at the beginning of 2017. Today this machine would be cheaper and, with updated GPUs, more powerful.
Software:
- OS: Ubuntu 16.04 (NVIDIA uses it for their own pre-built machines).
- Python libraries: numpy, scipy, matplotlib, scikit-learn, h5py, rootpy, root-numpy, pandas, scikit-image.
- Machine learning libraries: Keras with Theano and TensorFlow; TensorFlow is used as the Keras backend and offers good multi-GPU support out of the box.
- ROOT 6.08. This version includes an interface between Keras and TMVA.
- CUDA and cuDNN. This is the only tricky part of the installation. I will write a follow-up post with additional information on how to make it work.
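As a quick orientation until that post is ready, the typical post-install environment setup looks like the snippet below. The paths assume CUDA 8.0 (current at the time of the build) and the usual cuDNN tarball layout; adjust them to your installed versions:

```shell
# Typical post-install setup for CUDA + cuDNN on Ubuntu 16.04.
# Paths are assumptions for CUDA 8.0; adjust to your version.
export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH

# cuDNN is unpacked by hand into the CUDA tree, e.g.:
#   tar xzf cudnn-8.0-linux-x64-v5.1.tgz
#   sudo cp cuda/include/cudnn.h /usr/local/cuda-8.0/include/
#   sudo cp cuda/lib64/libcudnn* /usr/local/cuda-8.0/lib64/

# Quick sanity checks:
nvcc --version   # the compiler should report the CUDA toolkit version
nvidia-smi       # both GTX 1080 cards should appear in the listing
```

If `nvidia-smi` shows both GPUs and TensorFlow picks them up at import time, the stack is working.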