Low GPU utilization (<25%) when accelerating DMRG with GPU

I’m currently using a GPU to accelerate DMRG computations. My MPS is not large, so the computational scale and memory usage are small (less than 3 GB), all of which is within my expectations. However, the GPU utilization is quite low (less than 25%).

I’m not familiar with GPU computing in Julia, but I suspect the low GPU utilization comes from one of two causes:

  1. The algorithm itself doesn’t fully leverage the GPU’s capabilities;
  2. An inherent issue caused by the small computational scale.

If it’s the latter, I’m considering running other ITensor tasks in parallel to make use of the idle GPU resources, but I have no experience with parallelization within a single GPU. Of course, I’m not asking about the specific implementation details of parallelization. Since you’re familiar with ITensor’s GPU acceleration, I’d just like to ask whether this approach is feasible.

What are the bond dimensions of the MPO and MPS in your DMRG calculation? Additionally, are you conserving QNs? If so, see this post: TEBD time evolution with CUDA backend

The maximum bond dimension is 7 for the MPO and about 20 for the MPS most of the time. I am not conserving any QNs.

Those bond dimensions are very small, so I’m not surprised that you don’t see good GPU utilization in that case. In general we see that bigger tensors get better speedups on GPU, since there are more tensor elements for the GPU to parallelize over. For a rough sense of scale: with an MPS bond dimension of about 20 and a local physical dimension of 2, a two-site wavefunction tensor has only about 20 × 2 × 2 × 20 = 1,600 elements, which is far too few to keep the thousands of cores on a modern GPU busy. Regarding parallelization, for such small bond dimensions I think the only relevant strategy would be real-space parallelization ([1301.3494] Real-Space Parallel Density Matrix Renormalization Group), but that would only be effective for large system sizes, and even then the speedup would be limited by the number of real-space partitions you split your system into.
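As for running several independent ITensor tasks on one GPU to soak up idle capacity: in CUDA.jl, each Julia task is given its own CUDA stream, so independent tasks launched with `Threads.@spawn` can in principle overlap their kernels on the device. Below is a minimal sketch under that assumption, using a stand-in Heisenberg model; it assumes ITensors.jl/ITensorMPS.jl with the CUDA backend, and the function name `run_small_dmrg` and all parameter values are illustrative, not a recommendation. Whether the kernels actually overlap (and whether it helps) depends on the GPU and on how small the individual kernels are.

```julia
# Sketch (untested assumption): running several independent small DMRG
# problems concurrently on one GPU. CUDA.jl assigns each Julia task its
# own stream, so kernels from separate tasks may overlap on the device.
using ITensors, ITensorMPS, CUDA

function run_small_dmrg(N)
  sites = siteinds("S=1/2", N)
  os = OpSum()
  for j in 1:(N - 1)
    os += "Sz", j, "Sz", j + 1
    os += 0.5, "S+", j, "S-", j + 1
    os += 0.5, "S-", j, "S+", j + 1
  end
  H = cu(MPO(os, sites))                    # move the MPO to the GPU
  psi0 = cu(random_mps(sites; linkdims=10)) # move the initial MPS to the GPU
  energy, psi = dmrg(H, psi0; nsweeps=5, maxdim=20, cutoff=1e-8)
  return energy
end

# Launch independent problems as tasks; requires starting Julia with
# multiple threads, e.g. `julia -t 4`.
tasks = [Threads.@spawn run_small_dmrg(40) for _ in 1:4]
energies = fetch.(tasks)
```

Even if the kernels overlap cleanly, each task still pays its own kernel-launch overhead, which tends to dominate at these tensor sizes, so I wouldn’t expect a dramatic improvement in overall throughput.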

Thank you for your reply!