ITensors Development Question

Hello

I’d appreciate some feedback/opinions on the following.

For a few years now I have been developing a package built on top of ITensors. But I am always rethinking and brainstorming ways I could implement my code better, frequently refactoring code when I find better ways of doing things in other peoples code (like on ITensors github) or the Julia discourse forums.

To date I have basically used ITensors types directly. What I mean is that, I have my own types but they are built up using ITensors. For example, I am solving partial differential equations for fluid flow so I might have something like this:

struct FlowSimulation
  u::MPS
  v::MPS
  ddx::MPO
  ddy::MPO
  etc...
end

I have some methods I have defined that only make sense for MPS representing flow fields. They make some assumptions about the ordering of indices and how MPS/MPO are created. So far I have what I think is a pretty naive implementation that looks like this.

function do_something_to_1D_MPS(mps::MPS)
  # code here...
end

function do_something_to_2D_MPS(mps::MPS)
  # code here...
end

function do_something_to_3D_MPS(mps::MPS)
  # code here...
end

I don’t feel this is very idiomatic though. I’m considering the following two options but I am not sure which is better and for what reasons.

Here is approach one.

abstract type Abstract_MPS_ND end
mutable struct MPS_ND{dim} <: Abstract_MPS_ND 
  mps::MPS
  indexing_scheme::String
end

Then I could define constructors for these that determine dim (not shown below) and then I can dispatch my special methods on these types.

const MPS_1D = MPS_ND{1}
const MPS_2D = MPS_ND{2}
const MPS_3D = MPS_ND{3}

function do_something_to_MPS(mps::MPS_1D)
  # code here...
end

function do_something_to_MPS(mps::MPS_2D)
  # code here...
end

function do_something_to_MPS(mps::MPS_3D)
  # code here...
end

And then I’d need to overload the ITensor methods to know how to use my type. So something along these lines. I would have to do this for every ITensor method that I needed to use.

ITensors.contract(a::T, b::V; kwargs...) where {T<:Abstract_MPS_ND, V<:Abstract_MPO_ND}
  return MPS_ND{ndims(a)}(
                  contract(a.mps, b.mpo; kwargs...),
                  a.indexing_scheme
            )
end

# in-place example
ITensors.truncate!(a::T; kwargs...) where {T<:Abstract_MPS_ND} = truncate!(a.mps; kwargs...)

I feel this is more elegant than what I have now. The only downside is I have to overload all the ITensor methods but that isn’t that much work in the grand scheme of things.

The other idea I had was to subtype AbstractMPS from ITensors. I think this would require defining MPS_ND to have the same fields as what is expected by AbstractMPS. I would just attach one extra field (indexing_scheme) that my own methods would use. This way I would not have to overload all the ITensor methods except where I needed special behavior. I think I’d also need to overload certain methods (like contract) so they returned my type instead of an MPS.

Does that make sense? If so, I think it would look something like this.

mutable struct MPS_ND{dim} <: AbstractMPS
   data::Vector{ITensor}
   llim::Int
   rlim::Int
   indexing_scheme::String
end

Of these two approaches, which do you think is better? Or is there a third I haven’t even thought of that is even better?

What does the dimension represent?

The dimensionality of the field the MPS represents. For example, 1D, 2D, or 3D velocity field (or pressure field, or temperature field, etc.).

It gets set when I construct the MPS the first time and I use various functions whose behavior has to change depending on the dimensionality. Encoding it as a type parameter seemed better than do the func_1D func_2D etc.

There’s definitely many way one could go about designing that, and I don’t think it is obvious.

Is the idea that the MPS has some set of site indices per dimension? How are the site indices grouped?

I’m asking partially because you may want to have some flexibility in the ordering of the site indices, which would indicate that you might want the number of dimensions to be a runtime value instead of a compile time value. For example, one design could be to store an object:

struct MPSFunction{SiteMap} <: AbstractMPS
  state::MPS
  site_map::SiteMap
end

where site_map stores some information about which sites of the MPS corresponds with which dimension of the function, I’m being purposefully vague about the type of that field, but it could be a dictionary mapping the MPS site to the dimension of that site and/or the opposite of that mapping. That can be runtime information.

One big factor about whether to make the dimensionality of the function runtime or compile time information is if there is really a need to have entirely separate functions for each dimension, or if you can often manage to write a single function that is generic to any number of dimensions (which would be ideal, since having just one function is better than many, if you can design the function in a smart way where it isn’t too complicated). In Julia, you can always hoist runtime information to compile time information with an object like Val, at the expense of a call to runtime dispatch, so if you sometimes need functions that are dispatched on the dimension you could do it that way as needed.

Regarding whether or not you should directly store an MPS as a field or recreate the structure of the MPS type yourself, that’s up to you, but a design I would go with is storing an MPS as I show in the MPSFunction type above, and then forward function calls to that field as needed, i.e.:

f(x::MPSFunction) = f(x.state)

You can even do that with fields by overloading Julia’s Base.getproperty function.

Hey @mtfishman, thanks for taking a look at this.

Yes, that is the idea. The x, y, and z dimensions all each have their own sets of indices. The indices can be either sequential (x1, x2, x2, …, y1, y2, y3) or interleaved (x1, y1, z1, x2, y2, z2). Hence why I was wanting to attach the indexing scheme to the type since I need certain behavior depending on the indexing scheme (for example when collapsing the MPS back into a matrix in the case of 2D simulations). The scheme is fixed though, once the MPS is created I don’t need to change it afterward in a given simulation.

I like the idea of a ::SiteMap but I’m not sure how that would work (I will play around with the Dict idea). Right now I have functions that require the scheme be given when called, like func(mps; scheme="sequential"). With the system from my original post, I could just extend my existing methods easy enough (I think) by doing func(mps) = func(mps; scheme=mps.scheme).

I think this is required. The operations that are performed are different depending on whether the MPS represents a 1D, 2D, or 3D field.

I did consider this when prototyping the code in my original post but I am not familiar enough with Val to know precisely how to take advantage of it how you describe. I understand it can make bits type information dispatchable but I am not sure about the implementation.

Maybe something like this? Let’s say dim is a field of the struct instead of a parameter. Then I could do

func(m::my_type, ::Val{m.dim}) where {Val==1}? (I just wrote that out, may not even run).

Yes, I like that idea. I’ll definitely consider it as I flesh this out.

I would at least try to think through if that is really required. It might be, but my suggestion would be to start from the premise that you can write generic functions for any number of dimensions and see how far you get, you may be surprised (or I may be wrong!).

Here is an actual working example, assuming I am using Val correctly. Though it isn’t obvious to me this is better than embedding the dimension as a type parameter?

mutable struct MPS_ND <: Abstract_MPS_ND 
  mps::MPS
  dim::Int64
  indexing_scheme::String
end

Val(mps_nd::MPS_ND) = Val(mps_nd.dim)

function func(mps_nd, ::Val{1})
  # 1D behavior
end

func(mps_nd) = func(mps_nd, Val(mps_nd))

That’s essentially what I was suggesting. Again, the main reason for designing it that way is if you can manage to design your code where most of the time you don’t need the dimension as a type parameter, say if most functionality can be written generically for any dimension (which is true of all of the operations that just forward to the MPS as general MPS functionality, for example). If most of your functions are dependent on the dimensionality, it may be better to make it a type parameter.

More broadly, the question of whether to make certain information of a type compile time or runtime is a very nuanced one, and something we discuss a lot and also experiment with different designs over time. It can have subtle implications for performance of various operations as well as compilation times. The primary questions I find helpful when making that choice is how often that information is needed by the compiler: does it help with optimization? Is it commonly needed for dispatch? Is the information you want to expose at the type level actually static in general? Sometimes the choice is driven by the data structure itself, but sometimes it isn’t. It can involve making a guess at first, seeing how a certain design choice goes, and trying to redesign the code another way and seeing if it has implications for performance, simplifying the code design, etc.

1 Like

That is good to hear and sounds very familiar.

I think for now I am going the route of using Val. I don’t have that many functions where I would need the parametrized types to dispatch on. So then I think Val makes sense, at least for now.

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.