Terrill Dicki
May 13, 2026 17:28
NVIDIA’s XANI workflow slashes nanoscale imaging data analysis from 9 months to under 4 hours using Grace Blackwell Superchips.
NVIDIA has unveiled a major breakthrough in nanoscale imaging with its Accelerated X-ray Analysis for Nanoscale Imaging (XANI) workflow. Using its Grace Blackwell Superchips, the company has cut data processing time for X-ray free-electron laser (XFEL) facilities from 9 months to under 4 hours, an improvement of over 1,000x.
XFEL facilities, such as LCLS-II in the U.S. and European XFEL in Germany, generate massive datasets while probing the atomic and electronic dynamics of advanced materials like semiconductors, batteries, and catalysts. These facilities produce up to 1 million X-ray pulses per second, capturing structural shifts at the atomic level in real time. However, analyzing the resulting terabytes of multidimensional data has traditionally been a computational bottleneck.
NVIDIA’s XANI solution leverages the GB200 Grace Blackwell Superchips to accelerate this process. By combining GPU-based processing with CUDA Python and distributed computing, the team compressed the analysis of 42 terabytes of data to under 4 hours while maintaining precision. This is a stark contrast to traditional CPU-bound workflows, which often process just 10% of a dataset during experiments.
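As a rough sanity check on the headline figures, the arithmetic can be sketched as follows. The 30-day month and decimal terabytes are assumptions made here for illustration; the article does not state how the 9-month baseline was measured.

```python
# Back-of-the-envelope check of the figures quoted in the article.
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month

baseline_hours = 9 * HOURS_PER_MONTH  # "9 months" of CPU-bound analysis
accelerated_hours = 4                 # "under 4 hours", used as an upper bound
speedup = baseline_hours / accelerated_hours

dataset_tb = 42                       # dataset size quoted in the article
# Sustained rate needed to get through the dataset in 4 hours
# (assumption: decimal units, 1 TB = 1000 GB).
throughput_gb_per_s = dataset_tb * 1000 / (accelerated_hours * 3600)

print(f"speedup >= {speedup:.0f}x")                               # speedup >= 1620x
print(f"sustained throughput >= {throughput_gb_per_s:.2f} GB/s")  # >= 2.92 GB/s
```

Under these assumptions the ratio comes out at roughly 1,600x, consistent with the "over 1,000x" claim.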
Key Innovations in XANI
Several technical advancements underpin XANI’s performance:
GPU Acceleration: XANI achieved a 43x speedup on a single GPU and a 1,000x speedup on 64 GPUs compared to earlier CPU-based methods.
cuPyNumeric Libraries: New libraries, like LMFIT and multithreaded HDF5, improved GPU utilization and enabled 165x faster I/O throughput.
GPUDirect Storage (GDS): By loading data directly into GPU memory, XANI bypasses CPU bottlenecks, enabling read speeds of up to 700GB/s across 16 Grace Blackwell nodes.
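The scaling implied by these numbers can be estimated with a short calculation. This is only a sketch: it treats the quoted speedups as exact and assumes the 700GB/s aggregate read rate is spread evenly across nodes, neither of which the article confirms.

```python
# Figures quoted in the list above.
single_gpu_speedup = 43    # one GPU vs. the CPU baseline
multi_gpu_speedup = 1000   # 64 GPUs vs. the CPU baseline
num_gpus = 64

# Extra speedup gained by going from 1 GPU to 64 GPUs.
scaling_factor = multi_gpu_speedup / single_gpu_speedup
# Fraction of ideal linear scaling (64x) that this represents.
parallel_efficiency = scaling_factor / num_gpus

aggregate_read_gb_s = 700  # GPUDirect Storage read rate across the cluster
num_nodes = 16
# Assumption: reads are spread evenly over the nodes.
per_node_read_gb_s = aggregate_read_gb_s / num_nodes

print(f"1 -> 64 GPUs: {scaling_factor:.1f}x ({parallel_efficiency:.0%} of linear)")
print(f"per-node read rate: {per_node_read_gb_s:.2f} GB/s")
```

Because "1,000x" is a rounded figure, the resulting efficiency should be read as an order-of-magnitude estimate rather than a measured value.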
The workflow also introduces a distributed memory architecture that simplifies scientific computing. By swapping NumPy imports for cuPyNumeric, researchers can automatically parallelize operations across clusters without writing complex MPI code. This makes XANI accessible to fields beyond physics, including materials chemistry and quantum computing.
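The import swap described above can be illustrated with a minimal sketch. The `normalize_frames` helper and the toy array are hypothetical and are not taken from XANI; the fallback to NumPy is an assumption added so the snippet also runs on machines without cuPyNumeric installed.

```python
# Drop-in swap: the array code below is unchanged whichever module is imported.
try:
    import cupynumeric as np  # distributed, GPU-accelerated NumPy replacement
except ImportError:
    import numpy as np        # assumption: fall back so the sketch runs anywhere


def normalize_frames(frames):
    """Standardize each pixel over the frame axis (hypothetical helper)."""
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)
    return (frames - mean) / std


# Toy stand-in for detector data: 4 frames of 2x2 pixels.
frames = np.arange(16, dtype=np.float64).reshape(4, 2, 2)
normalized = normalize_frames(frames)
print(normalized.shape)
```

The point of the design is that the analysis code contains no explicit communication calls; how the work is partitioned across GPUs and nodes is left to the library's runtime.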
Scaling for Next-Gen Research
The XANI architecture is designed for scalability. With its GPU-centric distributed model, scientists can now analyze data in real time, providing live feedback during experiments. This capability could redefine how XFEL facilities operate, reducing delays between data collection and actionable insights.
Thanks to advances in nonlinear least-squares algorithms and batched GPU computation, XANI can process high-resolution imaging data down to the pixel level. The workflow’s ability to fit damped oscillations to detector data in parallel delivers faster and more precise results than before.
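To make the idea of batched per-pixel fitting concrete, here is a minimal sketch in plain NumPy on the CPU. It is not XANI's implementation: the damped-cosine model, the Levenberg-style damping scheme, and the names `model`, `jacobian`, and `fit_batch` are all assumptions chosen for illustration, and the data are synthetic and noiseless.

```python
import numpy as np


def model(p, t):
    """Damped cosine A*exp(-gamma*t)*cos(omega*t + phi); p is (n_pixels, 4)."""
    A, gamma, omega, phi = (p[:, i, None] for i in range(4))
    return A * np.exp(-gamma * t) * np.cos(omega * t + phi)


def jacobian(p, t):
    """Analytic partial derivatives of the model, shape (n_pixels, n_times, 4)."""
    A, gamma, omega, phi = (p[:, i, None] for i in range(4))
    decay = np.exp(-gamma * t)
    c = np.cos(omega * t + phi)
    s = np.sin(omega * t + phi)
    return np.stack([decay * c,           # d/dA
                     -A * t * decay * c,  # d/dgamma
                     -A * t * decay * s,  # d/domega
                     -A * decay * s],     # d/dphi
                    axis=-1)


def fit_batch(t, y, p0, n_iter=50, lam0=1e-3):
    """Damped Gauss-Newton on all pixels at once; y is (n_pixels, n_times)."""
    p = p0.copy()
    lam = np.full(len(p), lam0)
    cost = np.sum((y - model(p, t)) ** 2, axis=1)
    eye = np.eye(4)
    for _ in range(n_iter):
        r = y - model(p, t)
        J = jacobian(p, t)
        # Build one 4x4 normal-equation system per pixel.
        JTJ = np.einsum("pti,ptj->pij", J, J)
        JTr = np.einsum("pti,pt->pi", J, r)
        H = JTJ + lam[:, None, None] * eye
        step = np.linalg.solve(H, JTr[..., None])[..., 0]
        p_new = p + step
        new_cost = np.sum((y - model(p_new, t)) ** 2, axis=1)
        # Accept a step only where it lowers that pixel's cost.
        better = new_cost < cost
        p = np.where(better[:, None], p_new, p)
        cost = np.where(better, new_cost, cost)
        lam = np.where(better, lam * 0.3, lam * 10.0)
    return p


# Synthetic example: 16 "pixels", each with its own oscillation parameters.
rng = np.random.default_rng(0)
n_pixels = 16
t = np.linspace(0.0, 10.0, 200)
true_p = np.column_stack([
    rng.uniform(0.8, 1.2, n_pixels),    # amplitude
    rng.uniform(0.2, 0.4, n_pixels),    # decay rate
    rng.uniform(2.95, 3.05, n_pixels),  # angular frequency
    rng.uniform(-0.1, 0.1, n_pixels),   # phase
])
y = model(true_p, t)
p0 = np.tile([1.0, 0.3, 3.0, 0.0], (n_pixels, 1))
fitted = fit_batch(t, y, p0)
max_err = np.abs(fitted - true_p).max()
print(f"largest parameter error: {max_err:.2e}")
```

Every pixel's fit is independent, so the loop body consists only of array operations over the pixel axis. That structure is what makes this kind of problem a natural match for batched GPU execution; real detector data would also need noise handling and a strategy for choosing initial guesses.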
Implications for Scientific Discovery
NVIDIA’s XANI workflow represents a paradigm shift for high-performance computing in scientific research. By reducing analysis times from months to hours, it accelerates discoveries in materials science, quantum physics, and beyond. XFEL facilities worldwide now stand to benefit from these efficiencies, unlocking new opportunities for real-time experimentation.
For researchers, the implications are clear: advanced GPU-based systems like Grace Blackwell Superchips are becoming indispensable tools for tackling the data challenges of modern science.
Image source: Shutterstock