Multithreading in ITensor

Hi there,

I have a question about the performance of ITensor when it comes to multithreading.

Basically, I have a number of different configurations (in this case, let’s assume they are based on the same physical system, each with a different amount of site disorder), and I’m running a DMRG calculation for each, solving for the lowest ~20 excited states. My naive approach to speeding up the calculation was to run the script with

julia -t NUM mydmrg.jl

where NUM is the number of threads that I have. Then, within the script, I simply have the loop

Threads.@threads for case = 1:numcases
    run_dmrg(args, case)
end

where each call to run_dmrg() is a standalone DMRG calculation for one particular configuration. What I noticed, however, is that this is much slower than running the cases sequentially: the script with numcases=1 runs for about an hour, whereas a calculation with numcases=40 takes about 15 hours.

I assume these are separate instances of the function with no internal communication between them whatsoever, so ITensor must have some multithreading built into the evaluation of DMRG? On the other hand, for the single calculation (numcases=1), memory usage is only at 2.7%, so there seems to be something else going on.

Julia threads and BLAS/LAPACK threads (used internally in ITensor for matrix multiplications and factorizations) can clash with each other.
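A quick way to see whether the two thread pools might be competing is to print both counts next to the number of logical CPUs. This is only a rough diagnostic sketch: the variable names are illustrative, and the product of the two counts is a worst-case heuristic, not an exact measure of contention.

```julia
using LinearAlgebra

# Number of Julia threads (set by `julia -t NUM`).
julia_threads = Threads.nthreads()

# Number of threads the BLAS library may use inside each call.
blas_threads = BLAS.get_num_threads()

# Logical CPUs visible to Julia.
cpus = Sys.CPU_THREADS

println("Julia threads: ", julia_threads)
println("BLAS threads:  ", blas_threads)
println("Logical CPUs:  ", cpus)

# Worst case: every Julia thread is inside a BLAS call at the same time.
if julia_threads * blas_threads > cpus
    println("Possible oversubscription: Julia and BLAS threads may compete for cores.")
end
```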

Please try disabling BLAS/LAPACK threading with:

using LinearAlgebra
BLAS.set_num_threads(1)

at the beginning of your script and see if that helps.
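Put together with the loop from the question, the layout could look like the sketch below. Note that run_case is a hypothetical stand-in for run_dmrg(args, case): it just diagonalizes a small random symmetric matrix so that the example runs without ITensor, and the matrix size and number of cases are arbitrary.

```julia
using LinearAlgebra

# Keep BLAS/LAPACK single-threaded so it does not compete with Julia threads.
BLAS.set_num_threads(1)

# Hypothetical stand-in for run_dmrg(args, case): build a random symmetric
# matrix and return its lowest eigenvalue.
function run_case(case::Int; n::Int = 200)
    A = randn(n, n)
    H = Symmetric(A + A')
    return eigvals(H)[1]  # eigenvalues of a Symmetric matrix come back sorted ascending
end

numcases = 8
results = Vector{Float64}(undef, numcases)

# Each iteration writes only to its own slot, so no locking is needed.
Threads.@threads for case in 1:numcases
    results[case] = run_case(case)
end

println("Lowest eigenvalue per case: ", results)
```

Run it as before with julia -t NUM. With this setup the parallelism comes only from the Julia threads, one independent case per thread at a time.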