Extremely large RAM usage in TDVP

Hi,

I’m running a basic TDVP code using ITensor, and I believe it is showing extremely large RAM usage. For example, a calculation with 140 sites and maxdim set to 1024 produces the following output:

 Info: TDVP time : 2.0
After sweep 1: maxlinkdim=330 maxerr=9.99E-13 current_time=0.0 - 0.125im time=46.038
After sweep 2: maxlinkdim=338 maxerr=9.99E-13 current_time=0.0 - 0.25im time=46.171
 92.210864 seconds (414.42 M allocations: 264.396 GiB, 6.78% gc time)
 Info: TDVP time : 2.25
After sweep 1: maxlinkdim=348 maxerr=1.00E-12 current_time=0.0 - 0.125im time=47.948
After sweep 2: maxlinkdim=357 maxerr=9.99E-13 current_time=0.0 - 0.25im time=49.685
 97.634782 seconds (419.41 M allocations: 287.083 GiB, 5.80% gc time)

From a very crude estimate (the tensors are block-sparse fermionic), in the worst case where the tensors are completely saturated, I would expect
$$16\,\text{bytes (ComplexF64)} \times 1024^2\,(\text{bond dim}^2) \times 2\,(\text{physical dim}) \times 140\,(\text{sites}) \sim 5\,\text{GB}$$

of memory usage, which should be nowhere near the memory allocation indicated here.
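As a quick sanity check on that number, here is the same estimate in code (a crude dense worst case that ignores the block sparsity):

```julia
# Worst-case dense storage of the MPS: per-site tensors of size
# (bond dim) x (physical dim) x (bond dim) in ComplexF64 (16 bytes/element).
bytes_per_elem = 16      # ComplexF64
bond_dim       = 1024
phys_dim       = 2
nsites         = 140

total_bytes = bytes_per_elem * bond_dim^2 * phys_dim * nsites
println(total_bytes / 2^30, " GiB")   # ≈ 4.4 GiB, i.e. roughly 5 GB
```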

Or, if the concern is the cost of the eigendecomposition, etc., the effective local Hamiltonian should have dimension $\sim (1024 \times 2)^2$, which shouldn’t be expensive at all for something like ARPACK or other iterative solvers.

The above was run on my personal computer with a single thread. On the cluster I run the same code with multithreading (the -t N option), but I don’t think that changes the nature of the problem either, since the threads share memory. (The jobs mostly crash with an out-of-memory error for the exact same simulation parameters when given 16 GB of memory.)

Is there something I’m missing here? I’ve been following the HPC guide on not re-allocating before GC kicks in, etc.

That is reporting the output of @time, which shows the total amount of memory allocated over the course of the calculation, not the peak memory usage. Tensors are allocated and deallocated quite a lot during the calculation.
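As a minimal illustration of the difference (not ITensor-specific), the loop below allocates several GiB in total according to @time, but never holds more than about 80 MB at once:

```julia
# Each iteration allocates an ~80 MB vector that becomes garbage right away,
# so @time reports several GiB of cumulative allocations even though the
# amount of live memory at any instant stays around 80 MB.
function churn()
    s = 0.0
    for _ in 1:100
        v = rand(10^7)   # ~80 MB of Float64s, freed by the GC
        s += sum(v)
    end
    return s
end

churn()        # compile first so @time measures only the loop itself
@time churn()  # reports roughly 7.5 GiB of total allocations
```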

I see. Is there a way to monitor peak memory usage in tdvp?

For example, I came across these examples:

https://itensor.github.io/ITensors.jl/dev/examples/DMRG.html#Monitoring-the-Memory-Usage-of-DMRG

https://itensor.github.io/ITensors.jl/dev/Observer.html#observer

But this seems to be designed specifically for DMRG, and a direct copy of the code does not work with tdvp. I’m wondering if there’s a version for tdvp as well.


You can see an example of using an observer with TDVP here: ITensorMPS.jl/examples/04_tdvp_observers.jl at main · ITensor/ITensorMPS.jl · GitHub. You can add a function to the observer that records Base.summarysize of the state and reduced_operator.
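For example, something along these lines (a sketch following the pattern of that example file; the keyword names state, bond, and half_sweep and the (observer!) keyword are taken from that example and may differ between versions):

```julia
using ITensors, ITensorMPS
using Observers: observer
using Printf: @printf

# Record the in-memory size of the evolving MPS once per sweep
# (at the end of the second half-sweep, on the first bond), following
# the pattern of the linked 04_tdvp_observers.jl example.
# Once reduced_operator is passed to the observer (see the EDIT below),
# it could be recorded the same way.
function measure_state_size(; state, bond, half_sweep)
  if bond == 1 && half_sweep == 2
    gb = Base.summarysize(state) / 2^30
    @printf("MPS summarysize: %.3f GB\n", gb)
    return gb
  end
  return nothing
end

obs = observer("state_GB" => measure_state_size)

# Pass the observer to tdvp as in the linked example, e.g.:
# psi = tdvp(H, -2.0im, psi0; time_step=-0.125im, maxdim=1024,
#            cutoff=1e-12, outputlevel=1, (observer!)=obs)
```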

EDIT: I realized the current version of ITensorTDVP/ITensorMPS doesn’t pass reduced_operator to the observer, I’m fixing that in Pass reduced_operator to observer by mtfishman · Pull Request #87 · ITensor/ITensorTDVP.jl · GitHub.

Julia has a built-in function that reports the peak memory usage, called Sys.maxrss(). I learned about it from this post: https://itensor.discourse.group/t/memory-usage-in-dmrg-with-julia-1-x/1092 . You can do something like @printf "Max. RSS: %9.3f GB\n" Sys.maxrss()/2^30. A more detailed discussion is in How to track total memory usage of Julia process over time.
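For example, something like this (a sketch; the commented tdvp call and its parameters are placeholders for your own evolution step):

```julia
using Printf: @printf

# Sys.maxrss() returns the process's peak resident set size (high-water
# mark) in bytes, so printing it after each time step tracks peak memory.
for t in 0.25:0.25:2.0
    # psi = tdvp(H, -0.25im, psi; maxdim=1024, cutoff=1e-12)  # your evolution step
    @printf("t = %.2f, Max. RSS: %9.3f GB\n", t, Sys.maxrss() / 2^30)
end
```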
