Dear Professors and developers,
While practicing with the SiC example, I found that epw.x runs much slower on AMD Zen4 CPUs than on Intel CPUs when calculating the self-energy. On both machines, epw.x was compiled with Intel icc and ifort with MKL.
Do you have a solution?
epw.x runs much slower on AMD Zen4 CPUs than on Intel CPUs
Moderator: stiwari
Re: epw.x runs much slower on AMD Zen4 CPUs than on Intel CPUs
Hi,
Can you kindly post the system size (k-grid, q-grid, and number of bands (nbndsub)) along with the parallelization settings used for running epw.x? Also, can you post the output files?
A possible reason for such a drop in performance is how epw.x was compiled. It may be useful to try compiling EPW with the gcc compiler. Also, if you find a large difference between wall time and CPU time, try using etf_mem = 1.
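To check whether wall time and CPU time differ by a lot, something like the following Python sketch could be run over the timing summary at the end of the output. It assumes timing lines of the form `<label> : <time> CPU <time> WALL`; the helper names `to_seconds` and `wall_cpu_ratios` are hypothetical, and the sample numbers are made up for illustration, not taken from a real run.

```python
import re

# Assumed line shape: "<label> : <time> CPU <time> WALL", where <time> is
# built from optional d/h/m/s parts such as "12.34s", "1m23.45s" or "1h 2m".
_LINE = re.compile(r"^\s*(\S+)\s*:\s*(.+?)\s*CPU\s+(.+?)\s*WALL")
_PART = re.compile(r"(\d+(?:\.\d+)?)\s*([dhms])")
_SECONDS = {"d": 86400.0, "h": 3600.0, "m": 60.0, "s": 1.0}


def to_seconds(text):
    """Convert a string such as '1h 2m' or '1m23.45s' to seconds."""
    return sum(float(v) * _SECONDS[u] for v, u in _PART.findall(text))


def wall_cpu_ratios(report):
    """Return {label: (cpu_s, wall_s, wall/cpu)} for every timing line found."""
    out = {}
    for line in report.splitlines():
        m = _LINE.match(line)
        if not m:
            continue  # not a timing line in the assumed format
        cpu, wall = to_seconds(m.group(2)), to_seconds(m.group(3))
        if cpu > 0:
            out[m.group(1)] = (cpu, wall, wall / cpu)
    return out


# Made-up sample lines, only to show the expected shape of the input.
sample = """
     ephwann      :     10.00s CPU     45.00s WALL (       1 calls)
     EPW          :   1m 0.00s CPU   1m30.00s WALL
"""
for label, (cpu, wall, ratio) in wall_cpu_ratios(sample).items():
    print(f"{label:12s} cpu={cpu:8.1f}s wall={wall:8.1f}s wall/cpu={ratio:.2f}")
```

A wall/cpu ratio well above 1 for a given routine suggests that the time is going into waiting (I/O or communication) rather than computation.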
Best regards,
Sabya.
Re: epw.x runs much slower on AMD Zen4 CPUs than on Intel CPUs
Dear stiwari,
Sorry for my very late reply.
I did not receive any email notifications.
Since this work and the training plan were paused for several months, I did not log in to this forum during that time.
The k-grid and q-grid are the same: 8,8,1. I don't know where to find the number of bands (nbndsub). Could you specify?
I prefer icx and the Intel Fortran compiler from the Intel oneAPI toolkit.
Yes, I just read the input documentation on the EPW website. etf_mem = 1 looks like a good choice, and I will try it later.
Yours sincerely,
Last edited by zcy on Fri Aug 15, 2025 9:18 am, edited 1 time in total.
Re: epw.x runs much slower on AMD Zen4 CPUs than on Intel CPUs
Hi,
nbndsub corresponds to the total number of Wannier projections used in your calculations.
You may try setting either etf_mem = 1, or etf_mem = 0 together with epw_memdist = .true. This might help.
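To benchmark the two settings side by side, the input variants could be generated with a small script. This is only a sketch: it reads the suggestion as two alternatives (etf_mem = 1, or etf_mem = 0 with epw_memdist = .true.), the helper `set_key` is hypothetical, and the base input is a made-up placeholder rather than a complete EPW input.

```python
import re


def set_key(text, key, value):
    """Set `key = value` inside the &inputepw namelist of an EPW input string.

    Replaces an existing assignment; otherwise adds one right after the
    '&inputepw' line. Only handles simple one-assignment-per-line inputs.
    """
    assign = re.compile(
        rf"^([ \t]*){re.escape(key)}[ \t]*=.*$", re.MULTILINE | re.IGNORECASE
    )
    if assign.search(text):
        return assign.sub(lambda m: f"{m.group(1)}{key} = {value}", text, count=1)
    header = re.compile(r"^([ \t]*&inputepw[ \t]*)$", re.MULTILINE | re.IGNORECASE)
    if not header.search(text):
        raise ValueError("no &inputepw namelist found")
    return header.sub(lambda m: f"{m.group(1)}\n  {key} = {value}", text, count=1)


# Placeholder input: the prefix and the starting etf_mem value are made up.
base = """&inputepw
  prefix  = 'example'
  etf_mem = 2
/
"""

variant_a = set_key(base, "etf_mem", "1")
variant_b = set_key(set_key(base, "etf_mem", "0"), "epw_memdist", ".true.")

print(variant_a)
print(variant_b)
```

Running epw.x once with each variant and comparing the timing summaries should show which setting suits the machine better.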
If you can share your input and output files, we will be able to suggest a more optimized setup.
Best regards,
Shashi