run each iTensor code on a specific number of CPUs

javahedi · September 26, 2022, 11:13am

Dear ITensor team,

I am running several independent C++ ITensor (v3) programs on my Linux (Ubuntu) machine with 64 CPUs.
Is there a way to run each program on a specific number of CPUs?
For example, if I have 8 jobs, each should occupy 8 CPUs without interfering with the others.
I am asking because I noticed that, without control over how the programs spread across the CPUs, their speed drops drastically.

Below is the output from compiling the code on my machine.

g++ -m64 -std=c++17 -fconcepts -fPIC -c -I. -I'/home/pgi3/itensor_new'  -O2 -DNDEBUG -Wall -Wno-unknown-pragmas -Wno-unused-variable -o main.o main.cc
g++ -m64 -std=c++17 -fconcepts -fPIC -I. -I'/home/pgi3/itensor_new'  -O2 -DNDEBUG -Wall -Wno-unknown-pragmas -Wno-unused-variable main.o -o main -L'/home/pgi3/itensor_new/lib' -litensor -L/opt/intel/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_rt -lmkl_core -liomp5 -lpthread

Regards
Javad

miles · September 26, 2022, 1:30pm

Hi Javad,
When you say you are running 8 jobs, do you mean on a shared computer cluster? Or on a single machine (your machine, say)?

If on a cluster, then you will need to configure your job submission script to request a certain number of CPUs. For example, if your cluster uses the “slurm” system for managing jobs, there are certain flags or options you can pass to control this.

If on your own machine, then you are asking about multithreading and using a certain number of CPU cores. This is controlled by environment variables. The main ones to know about are those controlling the multithreading behavior of BLAS, such as MKL_NUM_THREADS (since I see you’re using MKL). The other kind of multithreading present in ITensor is done using OpenMP and is multithreading over separate non-zero blocks (if present) due to conserved quantum numbers and symmetries. You can control the amount of this multithreading by setting the variable OMP_NUM_THREADS. A best practice for this situation is to set MKL_NUM_THREADS=1 if you are setting OMP_NUM_THREADS to something greater than 1.
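
As a rough illustration of the single-machine case, here is a minimal shell sketch that starts several copies of a program, gives each one its own thread count through the variables above, and confines each copy to a separate block of cores with the Linux `taskset` utility (part of util-linux). The helper names `cpu_range` and `launch_jobs` and the log file names are made up for this example, and the sketch assumes that logical CPUs are numbered 0, 1, 2, ... so that consecutive blocks do not overlap. It sets MKL_NUM_THREADS to the per-job core count and OMP_NUM_THREADS=1; for block-sparse calculations you may want the opposite choice, following the advice above.

```shell
# Print the core range for job number $1 (0-based) with $2 cores per job,
# e.g. job 1 with 8 cores gives "8-15".
cpu_range() {
    first=$(( $1 * $2 ))
    last=$(( $first + $2 - 1 ))
    echo "${first}-${last}"
}

# launch_jobs NJOBS THREADS COMMAND [ARGS...]
# Start NJOBS background copies of COMMAND, each limited to THREADS cores.
launch_jobs() {
    njobs=$1
    threads=$2
    shift 2
    job=0
    while [ "$job" -lt "$njobs" ]; do
        # The variable assignments apply only to this one command.
        # taskset -c restricts the process (and its threads) to the listed cores.
        MKL_NUM_THREADS=$threads OMP_NUM_THREADS=1 \
            taskset -c "$(cpu_range "$job" "$threads")" "$@" \
            > "job_${job}.log" 2>&1 &
        job=$(( job + 1 ))
    done
    wait   # return only after every background job has finished
}
```

With these definitions, `launch_jobs 8 8 ./main` would start eight copies of `./main` on cores 0-7, 8-15, and so on up to 56-63. On machines with hyperthreading, consecutive logical CPU numbers do not always correspond to distinct physical cores, so it is worth checking the layout with `lscpu -e` before choosing the ranges.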

Lastly, if you are talking about a single machine, you just might not be able to run 8 separate processes eight times faster than a single process. CPUs share a lot of cache memory and other resources and multithreading often does not offer ideal speedups beyond a certain point.

javahedi · September 26, 2022, 2:18pm

Dear Miles

Many thanks for your fast and detailed reply.
