Can MPS be stored to h5 in the midst of a sweep?

Hi there,
I have a quick question: is it possible to store an MPS in the middle of a sweep, i.e. before the sweep finishes? The cluster I’m using has a 24-hour time limit, and the system I’m studying is complicated enough that even one sweep takes roughly 30 hours. Is there any checkpoint-like method in ITensor to store the intermediate MPS so that I can continue from it in another job? Thank you so much!

See the discussion here for one possible solution

Thanks for your reply! I’ve read the suggestion there and want to double-check my understanding. By overloading the measure! function for an observer, I can print psi, bond, energy, etc. during each sweep. Would the idea then be to save the wavefunction to h5 after each bond optimization? Say that when time runs out, DMRG has just optimized the 20th bond, between sites (20,21), during the first half sweep. (But what if it hasn’t finished that 20th bond optimization and is still in the middle of it, e.g. partway through the iterations?) I would then have this psi saved in h5 and could load it in my next calculation, but how could I start DMRG from the 20th bond in that next calculation? (I’m a little confused…)

Thank you!!

A few comments:

  • A word of caution: you can decide how often to save the MPS. Depending on the size of the HDF5 file and how fast it can be written, it may be faster to redo the lost calculations than to save constantly.
  • If the DMRG optimization hasn’t finished the 20th bond optimization, then it won’t have reached measure! to be saved, so that partial calculation will be lost.
  • At the moment, there’s no way I know of to restart from a specific bond; you’d just load the last saved MPS and start a new DMRG calculation.
    • Maybe Miles/Matt can comment on whether there are plans to add this.
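
The restart described above might look like the following minimal sketch. It assumes the ITensors version used in this thread (where MPS, siteinds and dmrg are exported by ITensors) and that the checkpoint was written under the name "PGS" as in the question. load_checkpoint is a hypothetical helper name, not an ITensors function.

```julia
using ITensors
using HDF5

# Hypothetical helper: read a checkpointed MPS and return it together
# with its site indices. The Hamiltonian of the new job has to be built
# on these same site indices, otherwise it will not match the loaded MPS.
function load_checkpoint(path, name)
  psi0 = h5open(path, "r") do f
    read(f, name, MPS)
  end
  return psi0, siteinds(psi0)
end
```

In the new job one would then do something like psi0, sites = load_checkpoint("wavefunction.h5", "PGS"), rebuild H on sites, and call dmrg(H, psi0, sweeps) as before. This starts a fresh DMRG run from a better initial state; it does not resume at a particular bond.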

Oh I see! So the intermediate psi is used as a better initial MPS for the next DMRG calculation; the idea is not to continue the previous one from exactly where it was interrupted. Is this right?

And thanks for reminding me that I can change the frequency of saving! I tried to write a short function that saves psi every 10 bonds (or some other number). Does this look right? (Let me know if I’ve misunderstood how to overload the measure! function.) Thank you so much :)

function ITensors.measure!(o::SizeObserver; bond, sweep, half_sweep, psi, projected_operator, kwargs...)
  if bond%10==0 
    bond = kwargs[:bond]
    energy = kwargs[:energy]
    f = h5open("wavefunction.h5","w")
    write(f,"PGS",psi)
    close(f)
    println("Now at bond $bond, current energy is $energy")
  end
end

Looks good to me! And yes, the idea is to save whatever has been partially “swept” as a better starting point for your restart.

Let me also recommend the do-block syntax for h5open, since it’s safer (the file gets closed even if an error occurs while writing):

h5open("wavefunction.h5","w") do f
   write(f,"PGS",psi)
end
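
Since the scheduler may kill the job while a checkpoint is being written, one further option is to write to a temporary file and rename it afterwards, so that the previous checkpoint survives an interrupted write. A minimal sketch, assuming both files live on the same filesystem; save_checkpoint is a hypothetical helper, not part of ITensors or HDF5.jl:

```julia
using HDF5

# Hypothetical helper: write `obj` under `name` into a temporary file,
# then move it over the old checkpoint only after the write has finished.
function save_checkpoint(path, name, obj)
  tmp = path * ".tmp"
  h5open(tmp, "w") do f
    write(f, name, obj)
  end
  mv(tmp, path; force=true)  # replaces the previous checkpoint
  return path
end
```

Inside measure! this would be called as save_checkpoint("wavefunction.h5", "PGS", psi), with ITensors loaded so that writing an MPS to HDF5 is defined.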

Thank you for confirming and letting me know this safer way for saving!

I ran into one more problem… I tried the following (attached), which is basically what we’ve discussed, but I get an error message saying: ERROR: LoadError: type NamedTuple has no field bond. Do you happen to know what this means? (I suspect it means bond isn’t in kwargs, but I don’t know why…) The complete error message is shown below.
Thank you!

function ITensors.measure!(o::SizeObserver; bond, sweep, half_sweep, psi, projected_operator, kwargs...)
  if bond%8==0 
    bond = kwargs[:bond]
    energy = kwargs[:energy]
    h5open("tem_wavefunction.h5","w") do f
      write(f,"PGS",psi)
    end
    println("Now at bond $bond, current energy is $energy")
  end
  if bond==1 && half_sweep==2
    psi_size =  Base.format_bytes(Base.summarysize(psi))
    PH_size =  Base.format_bytes(Base.summarysize(projected_operator))
    println("After sweep $sweep, |psi| = $psi_size, |PH| = $PH_size")
  end
end

ERROR: LoadError: type NamedTuple has no field bond
Stacktrace:
[1] getindex
@ ./namedtuple.jl:137 [inlined]
[2] getindex(v::Base.Pairs{Symbol, Any, NTuple{4, Symbol}, NamedTuple{(:energy, :spec, :outputlevel, :sweep_is_done), Tuple{Float64, Spectrum{Vector{Float64}, Float64}, Int64, Bool}}}, key::Symbol)
@ Base.Iterators ./iterators.jl:282
[3] measure!(o::SizeObserver; bond::Int64, sweep::Int64, half_sweep::Int64, psi::MPS, projected_operator::ITensors.DiskProjMPO, kwargs::Base.Pairs{Symbol, Any, NTuple{4, Symbol}, NamedTuple{(:energy, :spec, :outputlevel, :sweep_is_done), Tuple{Float64, Spectrum{Vector{Float64}, Float64}, Int64, Bool}}})
@ Main kks_dmrg.jl:9
[4] macro expansion
@ ~/.julia/packages/ITensors/HjjU3/src/mps/dmrg.jl:328 [inlined]
[5] macro expansion
@ ./timing.jl:382 [inlined]
[6] dmrg(PH::ProjMPO, psi0::MPS, sweeps::Sweeps; kwargs::Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:observer, :eigsolve_tol, :write_when_maxdim_exceeds), Tuple{SizeObserver, Float64, Int64}}})
@ ITensors ~/.julia/packages/ITensors/HjjU3/src/mps/dmrg.jl:228
[7] dmrg#1035
@ ~/.julia/packages/ITensors/HjjU3/src/mps/dmrg.jl:27 [inlined]
[8] top-level scope
@ kks_dmrg.jl:241
in expression starting at kks_dmrg.jl:24

You explicitly listed bond as a keyword argument of ITensors.measure!, so it won’t show up in the remaining splatted kwargs... In fact, the if statement already uses it, so you can just remove the line bond = kwargs[:bond] and things should work. I suspect that if you had written it as ITensors.measure!(o::SizeObserver; kwargs...), then you would have had to define bond before your if statement.
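
With that line removed, the observer might look like the sketch below. It assumes SizeObserver is an empty subtype of AbstractObserver (its definition is not shown in this thread) and the same ITensors version as in the stack trace. bond comes from the explicit keyword, while energy is still read from kwargs, where the stack trace shows it is present.

```julia
using ITensors
using HDF5

# Assumed definition; the thread does not show how SizeObserver is declared.
mutable struct SizeObserver <: AbstractObserver end

function ITensors.measure!(o::SizeObserver; bond, sweep, half_sweep, psi, projected_operator, kwargs...)
  if bond % 8 == 0
    # `bond` is already a local variable; only `energy` lives in kwargs.
    energy = kwargs[:energy]
    h5open("tem_wavefunction.h5", "w") do f
      write(f, "PGS", psi)
    end
    println("Now at bond $bond, current energy is $energy")
  end
  if bond == 1 && half_sweep == 2
    psi_size = Base.format_bytes(Base.summarysize(psi))
    PH_size = Base.format_bytes(Base.summarysize(projected_operator))
    println("After sweep $sweep, |psi| = $psi_size, |PH| = $PH_size")
  end
  return nothing
end
```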

See also Keyword Arguments
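
The keyword-argument behaviour behind the error can be reproduced in plain Julia without ITensors. In this small illustration the function names explicit_bond and only_kwargs are made up for the example:

```julia
# A keyword named explicitly in the signature is bound to a local variable
# and is NOT included in the trailing kwargs...
function explicit_bond(; bond, kwargs...)
  return bond, haskey(kwargs, :bond), haskey(kwargs, :energy)
end

# With only kwargs..., every keyword has to be looked up by name.
function only_kwargs(; kwargs...)
  return kwargs[:bond], haskey(kwargs, :bond)
end

explicit_bond(; bond=20, energy=-1.5)  # (20, false, true)
only_kwargs(; bond=20, energy=-1.5)    # (20, true)
```

In the first form, kwargs[:bond] fails because bond is no longer among the collected keywords, which is the same situation as in the measure! overload above.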

Thank you!