truncate(...; site_range=j:j+1) and then on j+1:j+2 is not equivalent to truncate(...; site_range=j:j+2)?

Hi all,
I am confused by the following behaviour of the truncate function with respect to the site_range parameter.
Say I have an MPS v, and I want to truncate its 3rd and 4th bonds by setting a maximum bond dimension.
I would expect that the following two ways yield the same result:

  1. truncate both bonds at once, with a single call to truncate,
  2. truncate first the 3rd bond, then the 4th one, separately.

From looking at the source code of truncate I’d say that they should be equivalent, however the result does not seem to be the same with randomly drawn MPSs:

julia> N = 10; s = siteinds("Boson", N; dim=8);

julia> v = random_mps(ComplexF64, s; linkdims=20);

julia> d = 4;

julia> v1 = truncate(truncate(v; maxdim=d, site_range=3:4); maxdim=d, site_range=4:5);

julia> v2 = truncate(v; maxdim=d, site_range=3:5);

julia> norm(v1 - v2)
0.4065128260105488

I don’t know whether there is something fundamental about MPSs that I’m not understanding, or if the truncate function does not do what I expect it to (or maybe the error is somewhere else…). Can some help me find the issue here?

Truncation occurs in reverse order. If I switch the order to truncate(truncate(v; maxdim=d, site_range=4:5); maxdim=d, site_range=3:4); then I get norm(v1 - v2) = 0

I saw that ITensor performs the truncation in reverse order, but I didn’t think it would make a difference. Sure, if I reverse the order of the bonds I might get a technically different MPS, but I would expect that all these MPS be equivalent from a physical point of view, i.e. the same physical state, once contracted. Maybe my expectations are incorrect… I’ll think about it.

The order is important. I’m having trouble coming up with a great explanation why, so I’ll rely on analogy.

Consider the goal is to maximize a function f(x, y), starting from (x_0, y_0). However, in a single optimization step you are constrained to only update either x or y at a time, never both. Say you start with x, keeping y_0 fixed, to find the global maximum x_1 = \argmax_x f(x, y_0). Then repeat to get y_1 = \argmax_y f(x_1, y). This new point (x_1, y_1) and value f(x_1, y_1) are almost certain to be different from the result if you stepped along y first and then x.

In this analogy, x and y are the MPS tensors, and f is the fidelity with the initial state you are truncating. In the truncation you only do a single optimization pass, which corresponds to reaching (x_1, y_1), further iteration is not standard.

In fact, the MPS optimization is even more order dependent than this simple analogy would suggest. This is because after you update the first tensor to get \ket{\psi'}, when you go to truncate the second tensor you are no longer maximizing the overlap with the initial state \ket{\psi}, but with the truncated state \ket{\psi'}. This is analogous to the function f changing after every single step.

Some of these limitations can be overcome by using a method akin to DMRG to do the truncation instead. With this method you can do multiple optimization steps (sweeps), and are always optimizing the same target (overlap with the initial state). You can use this with ITensorMPS.apply(MPO("I", sites), psi; alg="fit", maxdim, nsweeps).

Ah yes, of course, you’re right. I forgot that the SVD compression is only locally optimal. Then everything makes sense. Also I’m using random tensors with small bond dimensions so I understand why taking a different path for the truncation leads to a wildly different result.

Thank you also for pointing me to the “fit” algorithm for apply! I wasn’t aware of it.