GaAs runtime reduction
Posted: Thu Jun 30, 2016 10:32 am
Dear EPW Users,
I have tested the EPW-5.4.0 output for the diamond example against the reference files provided in the software package, and I get comparable runtimes, so the issue below is likely not installation related.
I have attempted to calculate the electron self-energy of GaAs using the following input:
Code:
--
&inputepw
prefix = 'gaas'
amass(1) = 69.723
amass(2) = 74.92160
outdir = './'
iverbosity = 0
elph = .true.
epbwrite = .true.
epbread = .false.
lpolar = .true.
!etf_mem = .false.
epwwrite = .true.
epwread = .false.
nbndsub = 8
nbndskip = 0
wannierize = .true.
num_iter = 500
iprint = 2
dis_win_max = 15
dis_froz_max= 4.7
proj(1) = 'Ga:s;px;py;pz'
proj(2) = 'As:s;px;py;pz'
elinterp = .true.
phinterp = .true.
tshuffle2 = .true.
tphases = .false.
elecselfen = .true.
phonselfen = .false.
a2f = .false.
parallel_k = .true.
parallel_q = .false.
fsthick = 2 ! eV
eptemp = 300 ! K
degaussw = 0.01 ! eV
dvscf_dir = '../phonons/save'
filukk = './gaas.ukk'
nkf1 = 40
nkf2 = 40
nkf3 = 40
nqf1 = 80
nqf2 = 80
nqf3 = 80
nk1 = 8
nk2 = 8
nk3 = 8
nq1 = 8
nq2 = 8
nq3 = 8
/
29 cartesian
0.000000000 0.000000000 0.000000000 0.0039062
-0.125000000 0.125000000 -0.125000000 0.0312500
-0.250000000 0.250000000 -0.250000000 0.0312500
-0.375000000 0.375000000 -0.375000000 0.0312500
0.500000000 -0.500000000 0.500000000 0.0156250
0.000000000 0.250000000 0.000000000 0.0234375
-0.125000000 0.375000000 -0.125000000 0.0937500
-0.250000000 0.500000000 -0.250000000 0.0937500
0.625000000 -0.375000000 0.625000000 0.0937500
0.500000000 -0.250000000 0.500000000 0.0937500
0.375000000 -0.125000000 0.375000000 0.0937500
0.250000000 0.000000000 0.250000000 0.0468750
0.000000000 0.500000000 0.000000000 0.0234375
-0.125000000 0.625000000 -0.125000000 0.0937500
0.750000000 -0.250000000 0.750000000 0.0937500
0.625000000 -0.125000000 0.625000000 0.0937500
0.500000000 0.000000000 0.500000000 0.0468750
0.000000000 0.750000000 0.000000000 0.0234375
0.875000000 -0.125000000 0.875000000 0.0937500
0.750000000 0.000000000 0.750000000 0.0468750
0.000000000 -1.000000000 0.000000000 0.0117188
-0.250000000 0.500000000 0.000000000 0.0937500
0.625000000 -0.375000000 0.875000000 0.1875000
0.500000000 -0.250000000 0.750000000 0.0937500
0.750000000 -0.250000000 1.000000000 0.0937500
0.625000000 -0.125000000 0.875000000 0.1875000
0.500000000 0.000000000 0.750000000 0.0937500
-0.250000000 -1.000000000 0.000000000 0.0468750
-0.500000000 -1.000000000 0.000000000 0.0234375
The grids were taken from the GaAs paper in PNAS 112, 5291 (2015). These fine grids are even less stringent than the 100x100x100 q- and k-grids reported recently (arXiv:1606.07074).
After 240 hours (the maximum wall time allowed on our cluster) on 64 CPUs, there was no result, as EPW was still calculating. I also tried 30x30x30 and 60x60x60 k- and q-grids, respectively; that run finished in 83 hours on 64 CPUs, but those grids are insufficient for convergence. I have tried the following to reduce the runtime (the corresponding input changes are summarized after this list):
1. Use fsthick = 2 eV, as suggested by Professor Giustino (viewtopic.php?f=3&t=18), to speed up the calculation.
2. Set etf_mem = .false., since nothing seemed to happen for over a week. According to the manual, etf_mem = .true. is faster, but with etf_mem = .false. the user can see the progress in the output in terms of the q-points done. For the 240-hour run, I had set etf_mem = .true.
3. Parallelize over q instead of k. The recent EPW paper suggests q-parallelization is possible, but even the diamond example segfaults on 8 CPUs with 24 GB of memory; it runs fine with k-parallelization.
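For clarity, the combined changes from points 1-3 amount to the namelist fragment below. This is only a sketch of the lines I changed relative to the full input above (not a complete input file), and whether parallel_q can simply be switched on like this in EPW-5.4.0 is exactly what I am unsure about:
Code:
   fsthick    = 2        ! eV, as suggested in viewtopic.php?f=3&t=18 (point 1)
   etf_mem    = .false.  ! point 2: progress should then appear in the output per q-point
   parallel_k = .false.
   parallel_q = .true.   ! point 3: q-parallelization; this is the mode that segfaults for me even on the diamond example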
I was wondering if there is something else I can do to reduce the runtime.
Thank you,
Vahid
Vahid Askarpour
Department of Physics and Atmospheric Science
Dalhousie University,
Halifax, NS, Canada