I’m looking to seed an initial MPS after using the randomMPS function.
I found this thread: Initialize the random seed of MPS(sites) - ITensor Support Q&A
(I guess for cpp) but I can’t find the function in the Julia docs.
Could someone point me in the right direction?
Best wishes!

Hi Ryan,
I’m finding that for sweeping DMRG on larger cyclinders (16*4), even seeded as you have suggested, that the system starts to differ in energy, and this becomes servere, making reproduciblity hard. Do you have any idea what is going on?

E.g.

output after run 1:

After sweep 1 energy=-24.548721454478176 maxlinkdim=500 maxerr=1.27E-05 time=54.388

output after run 2:

After sweep 1 energy=-24.548721431052165 maxlinkdim=500 maxerr=1.27E-05 time=54.907

Can you please comment more on what you’re doing? Are you running the exact same thing on the same hardware? If for example there are degeneracies in some of the linear algebra operations that could be machine dependent.

I’m running the exact same thing on my laptop, the issue seems to be when I turn on threaded block sparse and use qn conservation.

Taking your example and adding conserve_qns aswell threaded_block_sparse:

using Random
using ITensors
using LinearAlgebra
ITensors.Strided.set_num_threads(1)
BLAS.set_num_threads(1)
ITensors.enable_threaded_blocksparse()
n_sites = 10
Random.seed!(42)
s = siteinds("S=1/2",n_sites, conserve_qns=true)
state = [isodd(i) ? "Up" : "Dn" for i in 1:n_sites]
psi = randomMPS(s, state, 20)
@show expect(psi,"Sz")[1]

output:

$ for i in {0..4}; do
$ julia_itensors -t 4 seed_threaded.jl
$ done
(expect(psi, "Sz"))[1] = -0.04727285332442094
(expect(psi, "Sz"))[1] = -0.04727285332442144
(expect(psi, "Sz"))[1] = -0.04727285332442113
(expect(psi, "Sz"))[1] = -0.047272853324421216
(expect(psi, "Sz"))[1] = -0.047272853324422354

Again these errors accumalate significantly once starting the DMRG.

Threads can change the order of operations (work can be assigned to different threads in different runs or complete in different orders, etc.), which can lead to floating point round-off differences between runs.

I don’t think so, but the calculation should be reproducible up to the number of significant digits of the calculation. What is your goal for having reproducible digits beyond that?

It may be possible to get more reproducibility using the :static mode of @threads (Multi-Threading · The Julia Language), though that would require modifying the source code where the block sparse multithreading is implemented, though I still would wonder what the motivation is.

I see your results are different to 1e-8, which is single precision, which is about as accurate as one might expect DMRG to be. More specifically, you can look at the singular values of the MPS, the smallest singular value gives you an idea for the order of magnitude of accuracy you might expect from expectation values of that MPS.

Luke, I was wondering if any of your example outputs above were meant to show cases where the results were completely different? Because as Matt said the results look rather consistent to many leading digits. Did you also mean you have some cases where the relative differences are much larger, like on the > 1E-3 level or similar?

HI Miles,
I think I do have some cases where the relative differences are higher: run 1:

After sweep 17 energy=27987.492667159288 maxlinkdim=2000 maxerr=6.22E-08 time=347.034
After sweep 18 energy=27987.41267875716 maxlinkdim=2000 maxerr=4.77E-07 time=350.437
After sweep 19 energy=27987.304568541054 maxlinkdim=2000 maxerr=2.39E-06 time=371.835
sweep 20: n_edge=0.8920142027758831, n_bulk=0.8885262476093074

run 2:

After sweep 17 energy=27987.441380926404 maxlinkdim=2000 maxerr=3.71E-08 time=307.634
After sweep 18 energy=27987.397756963106 maxlinkdim=2000 maxerr=1.20E-07 time=300.288
After sweep 19 energy=27987.35044974371 maxlinkdim=2000 maxerr=2.06E-07 time=299.732
sweep 20: n_edge=0.8928772387378088, n_bulk=0.8876536908169983

I realise these are not yet converged results, and this might just be a bad example. What Matt has said makes a lot of sense and seems to apply in the most cases, but here I was just surprised to see such a different value in a local density calculation when the two runs had the same seed. (Here n_edge and n_bulk correpsond to electron density).

I’m not accustomed to seeing energies that big. Is your system very large and/or does it have very large terms in the Hamiltonian or a wide separation in energy scales? It may be that your initial state is just very far from the ground state and it’s taking DMRG a long time to converge, so you may need a better initial state and to do a lot of sweeps.

I once did a set of projects studying systems with a small lattice scale (approaching the continuum) and for that project would routinely do hundreds of sweeps and needed to do a lot of work to prepare good initial states.

This is exactly the case miles. I’m apply a large term to one half of a cyclinder geometry, which clearly is making things challenging. I am currently doing hundreds of sweeps to try and get a converged ground state. I just still find it surprising that I cannot seed the calculation and get the same results, but I will focus on convergence for now! Thanks for all the help

That’s helpful context. So a few things back to you:

as Matt explained above, it’s not likely that DMRG will give you the same results to all digits even with the same random seed, due to complexities of floating point and multi-threading. It’s conceptually possible to make a kind of DMRG that would do this, but we would have to very carefully design every step and (probably) use our own custom linear algebra routines and avoid multi-threading etc. In practice the reproducability is quite good to many digits for systems that are converged.

Your results are just not very well converged, which is a different thing from reproducability. So it will be helpful for you to think of what you’re experiencing as “DMRG isn’t converged yet and is converging slowly” versus thinking about random number seeds and so on. Those considerations only matter in a very high-precision regime you haven’t reached yet.

It’s probably better to think of DMRG, especially in the regime you are in, as more of a “fast optimizer” similar to gradient descent (but faster) versus thinking of it as a deterministic algorithm like sorting or computing the QR factorization of a matrix or something like that. For those deterministic algorithms it makes more sense to imagine repeating exactly the same steps and getting the same answer each time. For DMRG it’s possible to think of all the steps but again, in the regime you’re in (not yet converged) it’s probably better to focus for now on how you can speed up convergence.

One tip for you beyond thinking of better initial states you could choose would be to do very many initial sweeps at a quite low maxdim, so like one hundred or even two hundred sweeps at maxdim=10 then only after that start raising the maxdim parameter.