Large memory usage when performing TDVP on a cluster

Hello!
I am trying to use GSE-TDVP (global subspace expansion TDVP) to perform a real-time evolution.

I perform a subspace expansion before every time step of 1-site TDVP until the maximum bond dimension reaches the upper bound (see the code below).

My Hamiltonian is just a one-dimensional Bose-Hubbard model without long-range interactions, and the evolution starts from a product state.
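For reference, here is a minimal sketch of how such a model can be set up in ITensors.jl (the values of N, d, t, and U below are placeholders for illustration, not my actual parameters):

    using ITensors

    N, d = 24, 4        # chain length and local boson cutoff (placeholders)
    t, U = 1.0, 2.0     # hopping and on-site interaction (placeholders)

    s = siteinds("Boson", N; dim=d)

    os = OpSum()
    for j in 1:(N - 1)
        os += -t, "Adag", j, "A", j + 1    # nearest-neighbor hopping
        os += -t, "Adag", j + 1, "A", j
    end
    for j in 1:N
        os += U / 2, "N", j, "N", j        # on-site U/2 * n * (n - 1)
        os += -U / 2, "N", j
    end
    HAL = MPO(os, s)

    psi = MPS(s, ["1" for _ in 1:N])       # product state, one boson per site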

When the chain length is 24 and I set the upper bound to 600, every time step of 1-site TDVP allocates about 28 GB of memory once the bond dimension has reached the upper bound.

After sweep 1: maxlinkdim=602 maxerr=0.00E+00 current_time=0.0 - 0.05im time=24.817
 24.818610 seconds (2.61 M allocations: 28.250 GiB, 1.06% gc time)
tsofar = 100.0
maxlinkdim(psi) = 602
Size of psi: 34220216 bytes 

But disaster strikes when I increase the chain length to 36 and the bond-dimension upper bound to 800: TDVP uses about 108 GB of memory, which results in an out-of-memory error, and the jobs on the cluster get killed (my cluster nodes have about 256 GB of RAM).

After sweep 1: maxlinkdim=819 maxerr=0.00E+00 current_time=0.0 - 0.05im time=122.393
122.396214 seconds (4.70 M allocations: 120.135 GiB, 2.65% gc time)
Free memory before GC is 44.067 GiB
Free memory after GC is 44.088 GiB
tsofar = 5.1000000000000005
maxlinkdim(psi) = 819
Size of psi: 140848968 bytes
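For comparison, a crude upper bound on the memory of the MPS itself, assuming every bond saturates the maximum bond dimension, local dimension 4, and 16-byte complex entries (just an estimate, not ITensors' exact storage layout):

    mps_bytes(N, chi, d) = N * chi^2 * d * 16

    mps_bytes(24, 602, 4) / 2^30   # ~0.5 GiB bound for the chain of 24
    mps_bytes(36, 819, 4) / 2^30   # ~1.4 GiB bound for the chain of 36

So the state itself is tiny compared to the figures above; as far as I understand, the GiB number printed by @time is the total amount allocated during the sweep, not the peak held at any one moment, so the allocations must come from intermediate tensors.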

I call GC.gc() to free memory after every time step of 1-site TDVP, as suggested in the thread "Large memory issue in time evolution using ITensors.jl with apply function" on ITensor Discourse, but it doesn't help much.
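The free-memory lines in the log above come from something like this (a sketch):

    free_gib() = Sys.free_memory() / 2^30

    println("Free memory before GC is $(round(free_gib(); digits=3)) GiB")
    GC.gc(true)   # force a full collection after each time step
    println("Free memory after GC is $(round(free_gib(); digits=3)) GiB")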

My Julia version is 1.8.5.

So, is there any way to handle this? It has been bothering me for a long time.
Any help would be greatly appreciated!

if maxlinkdim(psi) < max
    # Global subspace expansion: build kdim states by repeatedly applying
    # the evolution gates, then enlarge the basis of psi with them.
    phis = Vector{MPS}(undef, kdim)
    for j in 1:kdim
        prev = j == 1 ? psi : phis[j - 1]
        # phis[j] = prev - im * tstep * apply(HAL, prev; cutoff=cutoff1, method=met)
        phis[j] = apply(gates, prev; cutoff=cutoff1, method=met)
        normalize!(phis[j])
    end
    psi = ITensorTDVP.extend(psi, phis; cutoff=cutoff2)
end
phis = 0   # drop the reference so the expansion states can be freed

# One sweep of 1-site TDVP at the enlarged bond dimension
Nsite = 1
@time psi = tdvp(
    HAL,
    -im * tstep,
    psi;
    nsweeps=1,
    maxdim=max,
    cutoff=0,
    nsite=Nsite,
    outputlevel=1,
    normalize=true,
    # solver_backend="applyexp",
)
GC.gc(true)

I thought the problem might be caused by competition between different computation jobs on the same cluster node, which share the same RAM. So I want to ask: is there any way to control or constrain the resources the tdvp function uses in Julia or ITensors?

If your job is sharing resources with another one on the same node, then that can definitely constrain the amount of RAM available and lead to crashes. If possible, you should set up your job request to ask for an entire node.

Regarding ways to make Julia ITensor codes (hopefully) use less memory, we recently added a short guide here:
https://itensor.github.io/ITensors.jl/dev/faq/HPC.html
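For example, along the lines of that guide, limiting linear-algebra multithreading can reduce memory pressure when jobs share a node (a sketch; please check the guide for the current recommendations):

    using ITensors, LinearAlgebra

    BLAS.set_num_threads(1)              # avoid oversubscribing BLAS threads
    ITensors.Strided.disable_threads()   # turn off Strided.jl multithreading

Also, newer Julia versions (1.9 and later, so not 1.8.5) accept a --heap-size-hint flag, e.g. julia --heap-size-hint=100G, which makes the garbage collector collect more aggressively as the heap approaches that size.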