Slicing of MPS / MPO and in-place contract / apply

Hello,

I have been leveraging ITensors quite extensively for applying tensor networks to path integral simulations. It has been a pleasure to use the library.

I was trying to optimize the performance of my code and realized that a huge part of the bottleneck was allocations. As an outline, a naive implementation of the TEMPO method ("Efficient non-Markovian quantum dynamics using time-evolving matrix product operators", Nature Communications) proceeds as follows:

  1. At every time point, the amplitudes of the paths are stored as an MPS; for the n-th time point, the MPS has n sites. The influence functional that captures the system-environment interaction is encoded in an MPO. The final result at a time point t is obtained by contracting over the MPO-MPS application.
  2. At the next time point, we allocate a new MPS for the amplitudes with an extra site (n+1 sites), copy over the old MPS, and fill in the (n+1)-th site tensor with an SVD decomposition of the propagator. The MPOs are extended in a similar manner. Rinse and repeat.
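The growth step in point 2 looks roughly like the following. This is only a minimal sketch: it assumes the `MPS(::Vector{ITensor})` constructor from ITensorMPS, `new_site_tensor` is a placeholder standing in for the SVD factor of the propagator, and the link index connecting the old last site to the new one is elided.

```julia
using ITensors, ITensorMPS

# Naive growth step: copy the old n-site MPS into a freshly allocated
# (n + 1)-site MPS. `new_site_tensor` is a placeholder for the propagator's
# SVD factor; wiring up its link index to psi[n] is omitted here.
function grow_mps(psi::MPS, new_site_tensor::ITensor)
    n = length(psi)
    tensors = Vector{ITensor}(undef, n + 1)
    for i in 1:n
        tensors[i] = copy(psi[i])   # fresh allocation + copy at every step
    end
    tensors[n + 1] = new_site_tensor
    return MPS(tensors)             # another allocation for the wrapper
end
```

Every time step pays for n tensor copies plus the new wrapper, which is exactly the allocation churn described below.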

Now, as is obvious, this naive implementation suffers from a huge number of allocations, deallocations, and copies. I was wondering whether there are idiomatic ways of reducing this with ITensors. A simple optimization might be to preallocate a much larger MPS and MPO and use only the correct number of sites at each time step. This would require slicing into MPSs and MPOs to create non-allocating views, and the apply / contract methods would need to understand such view-based calls as well. Additionally, one could provide mutating versions of apply / contract. I don't think ITensors / ITensorMPS has this feature at this point in time. I am thinking of something along the lines of:

sites = siteinds("S=1/2", 100)
psi = randomMPS(sites)
oper = randomMPO(sites)

psi1 = apply(oper, psi, sites[1:20]) # use only the sites mentioned for contracting...
@assert length(psi1) == 20
@assert siteinds(psi1) == sites[1:20]

# In-place apply! / contract!:
psiout = MPS(sites)
apply!(oper, psi, psiout) # no return value. `psiout` overwritten.
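Until something like this exists in the library, the preallocate-and-slice idea can at least be prototyped with plain Julia arrays, where `view` gives a non-allocating window into a preallocated buffer. Here the vector `store` is just a stand-in for the site tensors of a maximally sized MPS:

```julia
# Prototype of the preallocate-and-view pattern with a plain Julia vector.
# `store` stands in for the site tensors of a preallocated, maximally sized
# MPS; `view` returns a SubArray, i.e. a window that copies no data.
store = zeros(Float64, 100)        # preallocated once, up front
for n in 10:10:30                  # "time steps" using more and more sites
    active = view(store, 1:n)      # non-allocating slice of the first n sites
    active .= n                    # mutate in place through the view
end
```

After the loop, `store[1:30]` holds the values written through the last view and `store[31:end]` is untouched, with no intermediate copies of the buffer.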

The topic "Exact application of an array of ITensors to an MPS" dealt with a similar set of issues. I wonder whether it has been incorporated into the ITensorNetworks.jl package. I haven't played around with ITensorNetworks enough to know the internals, but I can try to implement this. However, I do think it would be good to have this functionality in ITensorMPS as well. What do you think? Having in-place versions of these "primitive" operations would really help me improve the performance of my code. Any suggestions?

Thanks!
Amartya

We could do that in principle, but the devil is in the details. Depending on the algorithm backend, it may be easier or harder to make use of the input data. For example, some backends determine the bond dimension of the output dynamically as they go, based on the singular value decomposition, so you would need to account for that when writing the result into a preallocated input.

I think our strategy would be to try to make allocations faster (say with packages like Bumper.jl, which we could use for internal/temporary data) or to decrease allocations within functions like apply. I think we could do that with various strategies and at various levels of the code, but it is complicated, and sometimes you just can't avoid allocating.
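For reference, the kind of usage Bumper.jl enables looks like the sketch below: `@alloc` takes memory from a bump allocator instead of the GC heap, and everything allocated inside `@no_escape` is reclaimed wholesale when the block exits. The function here is just an illustration of the pattern, not ITensor code.

```julia
using Bumper

# Hedged sketch of Bumper.jl for short-lived temporaries: `@alloc` draws
# from a bump allocator; the memory is reclaimed when `@no_escape` exits,
# so `tmp` must not escape the block (the scalar result may).
function scale_sum(x::Vector{Float64}, a::Float64)
    @no_escape begin
        tmp = @alloc(Float64, length(x))  # no GC heap allocation
        tmp .= a .* x                     # temporary lives only in this block
        sum(tmp)                          # plain Float64, safe to return
    end
end
```

Applied inside apply / contract, temporaries like partially contracted tensors could live in such a buffer instead of being individually heap-allocated and garbage-collected.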

Happy new year, Matt! Yeah, it seemed extremely fiddly. I will take a look at Bumper and see if I can leverage it on my end directly. Thanks!