Newer version of TDVP seems to be much slower

Hi,

I have a question regarding the update of ITensorTDVP from v0.3 to v0.4. The algorithm seems to be much slower in the latest version: the tdvp step in the following code example is four times slower with v0.4.10 than with v0.3.0. Is this related to the newly added global subspace expansion feature? Any help or explanation would be greatly appreciated.

using Pkg
Pkg.rm("ITensorTDVP")
Pkg.add(PackageSpec(name="ITensorTDVP", version="0.4.10"))
using ITensorTDVP
using ITensors

@show(pkgversion(ITensorTDVP))
@show(pkgversion(ITensors))

N = 50
sites = siteinds("S=1/2",N)
os = OpSum()
for j=1:N-2
  os += "Sz",j,"Sz",j+2
  os += 1/2,"S+",j,"S-",j+2
  os += 1/2,"S-",j,"S+",j+2
end
H = MPO(os,sites)
psi0 = randomMPS(sites;linkdims=10)
nsweeps = 5
maxdim = 50
cutoff = [1E-10]
@time psi = ITensorTDVP.tdvp(H,-0.1im,psi0;nsweeps,maxdim,cutoff,outputlevel=1)

Sorry to hear you are finding it slower; we wouldn’t expect that. The global subspace expansion isn’t enabled by default, so it shouldn’t affect the code you are running.

Can you please format your code (Please Read: Make It Easier to Help You) so it is easier to read?

Also, could you try running:

# Compile the code first
ITensorTDVP.tdvp(H,-0.1im,psi0;nsweeps,maxdim,cutoff,outputlevel=0)
# Now run and time
@time psi = ITensorTDVP.tdvp(H,-0.1im,psi0;nsweeps,maxdim,cutoff,outputlevel=1)

i.e. run that line twice instead of once? Julia compiles code the first time it is run, so you may be measuring both compile time and run time, and perhaps it is the compile time, not the run time, that got slower.
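As a minimal sketch of the warm-up idea described above, the snippet below times the same call twice. The function `work` is a hypothetical stand-in for an expensive call such as `tdvp`; it is not part of ITensorTDVP. The first measurement includes compilation, the second mostly does not.

```julia
# Hypothetical stand-in for an expensive call such as `tdvp`.
work(n) = sum(sqrt(i) for i in 1:n)

# First call: the measured time includes compiling `work` for this argument type.
t_first = @elapsed work(10^6)

# Second call: the compiled code is reused, so this is closer to pure run time.
t_second = @elapsed work(10^6)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

`@time` also reports the share of compilation time directly when it is nonzero, as in the "35.25% compilation time" line in the logs in this thread.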

Note that the tdvp API changed between the two versions, so the same time argument does not evolve to the same current_time. After adjusting for that, I still see the speed difference (with Julia 1.11.0).

Latest version

pkgversion(ITensorTDVP) = v"0.4.10"
pkgversion(ITensors) = v"0.6.21"
After sweep 1: maxlinkdim=33 maxerr=1.00E-10 current_time=0.0 - 0.02im time=26.513
After sweep 2: maxlinkdim=47 maxerr=1.00E-10 current_time=0.0 - 0.04im time=8.961
After sweep 3: maxlinkdim=48 maxerr=9.83E-11 current_time=0.0 - 0.06im time=10.235
After sweep 4: maxlinkdim=49 maxerr=9.91E-11 current_time=0.0 - 0.08im time=10.598
After sweep 5: maxlinkdim=50 maxerr=9.96E-11 current_time=0.0 - 0.1im time=11.026
 68.485207 seconds (191.49 M allocations: 162.395 GiB, 13.04% gc time, 35.25% compilation time)
After sweep 1: maxlinkdim=33 maxerr=1.00E-10 current_time=0.0 - 0.02im time=3.019
After sweep 2: maxlinkdim=47 maxerr=1.00E-10 current_time=0.0 - 0.04im time=8.188
After sweep 3: maxlinkdim=48 maxerr=9.83E-11 current_time=0.0 - 0.06im time=10.44
After sweep 4: maxlinkdim=49 maxerr=9.91E-11 current_time=0.0 - 0.08im time=13.526
After sweep 5: maxlinkdim=50 maxerr=9.96E-11 current_time=0.0 - 0.1im time=10.706
 45.911007 seconds (57.27 M allocations: 155.603 GiB, 17.04% gc time)

Older version
with @time psi = ITensorTDVP.tdvp(H,-0.02im,...

pkgversion(ITensorTDVP) = v"0.3.0"
pkgversion(ITensors) = v"0.5.8"
After sweep 1: maxlinkdim=32 maxerr=9.96E-11 current_time=0.0 - 0.02im time=0.346
After sweep 2: maxlinkdim=47 maxerr=9.97E-11 current_time=0.0 - 0.04im time=1.114
After sweep 3: maxlinkdim=48 maxerr=9.65E-11 current_time=0.0 - 0.06im time=1.337
After sweep 4: maxlinkdim=50 maxerr=1.00E-10 current_time=0.0 - 0.08im time=1.382
After sweep 5: maxlinkdim=50 maxerr=1.08E-10 current_time=0.0 - 0.1im time=1.413
  5.608748 seconds (3.95 M allocations: 23.300 GiB, 19.56% gc time)
After sweep 1: maxlinkdim=32 maxerr=9.96E-11 current_time=0.0 - 0.02im time=0.342
After sweep 2: maxlinkdim=47 maxerr=9.97E-11 current_time=0.0 - 0.04im time=1.09
After sweep 3: maxlinkdim=48 maxerr=9.65E-11 current_time=0.0 - 0.06im time=1.343
After sweep 4: maxlinkdim=50 maxerr=1.00E-10 current_time=0.0 - 0.08im time=1.487
After sweep 5: maxlinkdim=50 maxerr=1.08E-10 current_time=0.0 - 0.1im time=1.448
  5.717802 seconds (3.95 M allocations: 23.300 GiB, 18.74% gc time)

I will reformat the code and create a new ticket. Thanks very much for the swift reply!

Note that the change in the convention for the time input that @ryanlevy pointed out is explained here: GitHub - ITensor/ITensorTDVP.jl: Time dependent variational principle (TDVP) of MPS based on ITensors.jl.
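To make the convention change concrete, here is a small arithmetic sketch based only on the calls and logs in this thread: with 5 sweeps, the v0.3 run passes -0.02im and the v0.4 run passes -0.1im, and both report current_time = 0.0 - 0.02im after the first sweep and 0.0 - 0.1im after the last. The variable names are illustrative, not part of the package.

```julia
nsweeps = 5

# v0.3-style call in this thread: the time argument acts as the step per sweep.
t_step_v03 = -0.02im
total_v03 = nsweeps * t_step_v03

# v0.4-style call in this thread: the time argument acts as the total time,
# which is split evenly across the sweeps.
t_total_v04 = -0.1im
step_v04 = t_total_v04 / nsweeps

@show total_v03
@show step_v04
```

Both calls therefore cover the same total evolution time with the same step per sweep, so the timings can be compared directly.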

Please still edit the formatting of your original post; let’s not split the thread and instead continue the discussion here.

EDIT: I misread the post by @ryanlevy, that’s still a big timing discrepancy, even with the two versions seemingly running with the same time step up to the same total amount of time. Maybe the parameters being passed to the Krylov solver are different between the different versions.

I don’t seem to be able to edit the original post for some reason, but here is a formatted version of the code:

using ITensorTDVP
using ITensors

@show(pkgversion(ITensorTDVP))
@show(pkgversion(ITensors))

N = 50

sites = siteinds("S=1/2",N)

os = OpSum()

for j=1:N-2
    os += "Sz",j,"Sz",j+2
    os += 1/2,"S+",j,"S-",j+2
    os += 1/2,"S-",j,"S+",j+2
end

H = MPO(os,sites)

psi0 = randomMPS(sites;linkdims=10)
nsweeps = 5
maxdim = 50
cutoff = [1E-10]

if pkgversion(ITensorTDVP) == v"0.3.0"
    t = 0.02
else
    t = 0.1
end

ITensorTDVP.tdvp(H,-t*im,psi0;nsweeps,maxdim,cutoff,outputlevel=0)

@time psi = ITensorTDVP.tdvp(H,-t*im,psi0;nsweeps,maxdim,cutoff,outputlevel=1)

When using v"0.3.0", the output is

pkgversion(ITensorTDVP) = v"0.3.0"
pkgversion(ITensors) = v"0.5.8"
After sweep 1: maxlinkdim=33 maxerr=9.98E-11 current_time=0.0 - 0.02im time=3.346
After sweep 2: maxlinkdim=48 maxerr=9.88E-11 current_time=0.0 - 0.04im time=6.234
After sweep 3: maxlinkdim=49 maxerr=9.99E-11 current_time=0.0 - 0.06im time=6.667
After sweep 4: maxlinkdim=50 maxerr=9.97E-11 current_time=0.0 - 0.08im time=7.068
After sweep 5: maxlinkdim=50 maxerr=1.21E-10 current_time=0.0 - 0.1im time=7.084
 30.581937 seconds (4.28 M allocations: 23.469 GiB, 8.43% gc time)

When using v"0.4.9", the output is

pkgversion(ITensorTDVP) = v"0.4.9"
pkgversion(ITensors) = v"0.6.19"

After sweep 1: maxlinkdim=33 maxerr=9.84E-11 current_time=0.0 - 0.02im time=16.549
After sweep 2: maxlinkdim=46 maxerr=9.96E-11 current_time=0.0 - 0.04im time=33.129
After sweep 3: maxlinkdim=48 maxerr=9.76E-11 current_time=0.0 - 0.06im time=37.469
After sweep 4: maxlinkdim=50 maxerr=1.00E-10 current_time=0.0 - 0.08im time=40.801
After sweep 5: maxlinkdim=50 maxerr=1.06E-10 current_time=0.0 - 0.1im time=43.797
172.581258 seconds (60.85 M allocations: 152.748 GiB, 11.50% gc time, 0.41% compilation time: 21% of which was recompilation)

This is using Julia 1.9.2, but the discrepancy also arises with later versions.

@DavidStrachan1 could you try:

@time psi = tdvp(H, -t*im, psi0; nsweeps, maxdim, cutoff, updater_kwargs=(; eager=true), outputlevel=1)

when using ITensorTDVP.jl v0.4 and see if that fixes the performance discrepancy between v0.3 and v0.4?

I think the performance discrepancy comes from how we call the local solver/updater KrylovKit.exponentiate. In ITensorTDVP.jl v0.3 we overrode its default keyword argument eager=false (as seen here: Functions of matrices and linear maps · KrylovKit.jl) with eager=true, which eagerly checks for convergence of the solver and stops early, avoiding unnecessary applications of the local reduced operator. In ITensorTDVP.jl v0.4 we are just using the default value eager=false, while we should override it to eager=true.
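The effect of the eager keyword can be explored outside of TDVP with a rough sketch like the one below. It calls KrylovKit.exponentiate on a random Hermitian matrix, which is only an assumed stand-in for the local reduced operator; the matrix size, tolerance, and time step are arbitrary choices for illustration, and the operator counts will depend on them.

```julia
using KrylovKit
using LinearAlgebra

n = 200
M = randn(ComplexF64, n, n)
A = (M + M') / 2                      # Hermitian test matrix (illustrative stand-in)
x = normalize(randn(ComplexF64, n))   # random normalized start vector
t = -0.02im                           # same step per sweep as in this thread

# Default behaviour: build the Krylov subspace up to krylovdim before testing convergence.
y_lazy, info_lazy = exponentiate(A, t, x;
    ishermitian=true, tol=1e-10, krylovdim=30, eager=false)

# Eager behaviour: test for convergence after every expansion of the subspace.
y_eager, info_eager = exponentiate(A, t, x;
    ishermitian=true, tol=1e-10, krylovdim=30, eager=true)

println("operator applications, eager=false: ", info_lazy.numops)
println("operator applications, eager=true:  ", info_eager.numops)
println("difference between results:         ", norm(y_eager - y_lazy))
```

Comparing `numops` between the two calls gives a feel for how many applications of the operator the eager check can save when the exponential converges in a small subspace.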

For reference, the PR to ITensorTDVP.jl that made that change is here: ITensorTDVP.jl v0.4 by mtfishman · Pull Request #74 · ITensor/ITensorTDVP.jl · GitHub, where you can see eager was being set explicitly in ITensorTDVP.jl v0.3 but no longer is set in ITensorTDVP.jl v0.4.

I raised an issue about this in ITensorTDVP.jl here: Set `eager=true` in `exponentiate` backend · Issue #93 · ITensor/ITensorTDVP.jl · GitHub

Great! This completely fixes the discrepancy for me. Thanks for your help!
