Speeding up epwwrite (if possible)

Post here questions related to issues encountered while running the EPW code

Moderator: stiwari

andreyl
Posts: 24
Joined: Sun Mar 26, 2017 12:22 pm
Affiliation:

Speeding up epwwrite (if possible)

Post by andreyl »

Dear EPW developers and users,

I have a question regarding the speed of the epwwrite step of EPW.

My workflow, starting from the point where the Wannierisation is finished, is the following:

-calculate kmaps separately
-calculate the epmatw* files
-reuse the epmatw* files for all further (self-energies, nesting function, ...) calculations.

My inputs are the following:
https://paper.dropbox.com/doc/EPW-epwwrite-minimal-setup-I0KKTtMQhuQuGtz5GMuhw?_tk=share_copylink
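
Roughly, such an epwwrite-step input has the shape sketched below; this is an illustration only (the prefix, number of Wannier bands, and grid sizes are placeholders, not the actual values, which are in the linked document):

Code: Select all

&inputepw
  prefix      = 'mysystem'       ! placeholder
  outdir      = './'
  dvscf_dir   = './save'         ! location of the dvscf files from ph.x
  elph        = .true.
  kmaps       = .false.          ! recompute the k-point folding maps
  epwwrite    = .true.           ! write the epmatwp / epmatw* files to disk
  epwread     = .false.
  wannierize  = .false.          ! Wannierisation already done in a previous run
  nbndsub     = 8                ! placeholder
  etf_mem     = 1
  nk1 = 4, nk2 = 4, nk3 = 4      ! coarse grids, placeholders
  nq1 = 4, nq2 = 4, nq3 = 4
 /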

I run EPW like this (through SLURM):

Code: Select all

mpirun.sh -np 16 $QE/bin/epw.x -npool 16 < epw_init.in > epw_init.out


On a single node with 16 cores.

So:
epwwrite is a major bottleneck for me; depending on the parameters, this step lasts for days, if not weeks. I would be quite happy to speed it up.
What can be done for that?
I can think of a couple of options:

-set etf_mem = 0
will it work at this step?

-use more nodes
I doubt this will work, since we need npool = ncores, but maybe something has changed?

Maybe something else is of use here?

Best wishes,
Andrei

sponce
Site Admin
Posts: 616
Joined: Wed Jan 13, 2016 7:25 pm
Affiliation: EPFL

Re: Speeding up epwwrite (if possible)

Post by sponce »

Dear Andrei,

The calculation of the epmatwp can indeed be a bottleneck.

Possible solutions to speed things up:
- use etf_mem = 0: This should speed things up a little, but it is usually very memory demanding. In most cases you will not have enough memory on your nodes.
- use more nodes: Use the maximum number of nodes possible. The limit is set by your total number of k-points in the coarse grid.
If you have a 4x4x4 coarse k-point grid, then you can use 64 cores:

Code: Select all

mpirun.sh -np 64 $QE/bin/epw.x -npool 64 < epw_init.in > epw_init.out
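
Assuming 16-core nodes like the one used in the original post, and keeping the mpirun.sh wrapper from that setup, the corresponding SLURM request might look along these lines (the 4-node layout and wall time are assumptions, not something prescribed by EPW):

Code: Select all

#!/bin/bash
#SBATCH --nodes=4              # 4 nodes x 16 cores = 64 MPI tasks
#SBATCH --ntasks-per-node=16
#SBATCH --time=24:00:00        # placeholder wall time

mpirun.sh -np 64 $QE/bin/epw.x -npool 64 < epw_init.in > epw_init.out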


If you have a lot of nodes, there is another option: you can use parallelization over bands via the "image parallelization". To see how this is done, you can look at EPW/tests/Inputs/t01, where the 4 MPI tasks given to epw.x are split into 2 images of 2 pools each:

Code: Select all

mpirun -np 4 ../../../../bin/pw.x < scf_epw.in > scf_epw.out
mpirun -np 2 ../../../../bin/pw.x -npool 2 < nscf_epw.in > nscf_epw.out
mpirun -np 4 ../../../src/epw.x -npool 2 -nimage 2 < epw5.in > epw5.out
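
Scaling that pattern up (total MPI tasks = number of images x pools per image), a 64-core run for a 4x4x4 coarse grid might look like the line below; whether this particular split is optimal, or even accepted by a given EPW version, is not guaranteed, so treat it purely as a sketch:

Code: Select all

mpirun.sh -np 64 $QE/bin/epw.x -npool 16 -nimage 4 < epw_init.in > epw_init.out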


Best,
Samuel
Prof. Samuel Poncé
F.R.S.-FNRS Research Associate / Professor at UCLouvain
Institute of Condensed Matter and Nanosciences
UCLouvain, Belgium
Web: https://www.samuelponce.com

andreyl
Posts: 24
Joined: Sun Mar 26, 2017 12:22 pm
Affiliation:

Re: Speeding up epwwrite (if possible)

Post by andreyl »

sponce wrote: [...]


Dear Samuel,

thank you for the advice!

I had the impression that

Code: Select all

kmaps       = .false.
epwwrite    =  .true.

are pool dependent (np and npool should be strictly the same as in the scf and nscf runs). Well, if that is not the case,
I am quite happy :)


etf_mem = 0

allowed me to advance a few steps further (with the same time limit) compared to etf_mem = 1 (the default). I have some high-memory nodes
among the available resources, which I use anyway, so I will apply it.
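
For the later reuse runs (self-energies, nesting function, ...), the restart input would then switch the read/write flags around; a rough sketch, again with placeholder values and assuming the standard epwread/epwwrite switches of this EPW version, is:

Code: Select all

&inputepw
  prefix      = 'mysystem'       ! placeholder
  outdir      = './'
  elph        = .true.
  kmaps       = .true.           ! reuse the previously computed k-maps
  epwwrite    = .false.
  epwread     = .true.           ! read the stored epmatwp instead of recomputing it
  wannierize  = .false.
  etf_mem     = 0                ! everything in memory; needs the high-memory nodes
  nbndsub     = 8                ! placeholder
  nk1 = 4, nk2 = 4, nk3 = 4      ! coarse grids, placeholders
  nq1 = 4, nq2 = 4, nq3 = 4
  nkf1 = 40, nkf2 = 40, nkf3 = 40    ! fine interpolation grids, placeholders
  nqf1 = 40, nqf2 = 40, nqf3 = 40
  elecselfen  = .true.           ! or phonselfen / the nesting-function flag, as needed
 /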

Best wishes,
Andrei
