
Weird problem in running gan example

Posted: Sun Jun 19, 2016 3:04 pm
by amosyang
Dear all EPW users,

Recently I have been running the gan example. However, the EPW calculation fails with the following error: "forrtl: severe (36): attempt to access non-existent record, unit 24, file /homea/jhpc37/jhpc371/QE/espresso-5.4.0/EPW/examples/gan/epw/./gan.igk34"

I used 24 cores and 24 pools to calculate the phonons. For the EPW part, I first used 48 cores and 24 pools for the scf calculation, then 48 cores and 48 pools for the nscf calculation, and finally 48 cores and 48 pools for EPW itself. At first everything goes well, until the phonon files are read; the run then stops after reading the first q point:
" ===================================================================
irreducible q point # 1
===================================================================

Symmetries of small group of q: 12
in addition sym. q -> -q+G:

Number of q in the star = 1
List of q in the star:
1 0.000000000 0.000000000 0.000000000
Imposing acoustic sum rule on the dynamical matrix
Read dielectric tensor and effective charges

q( 1 ) = ( 0.0000000 0.0000000 0.0000000 )
BMN calculated

"
Then the error quoted above appears. On another cluster, using the same number of cores and pools, no error appears apart from a memory problem.
So what causes this? Please help me.

Best wishes,

Jiayue Yang
RWTH Aachen University

Re: Weird problem in running gan example

Posted: Mon Jun 20, 2016 10:20 am
by sponce
Dear Jiayue,

It seems like it cannot access the igk files.

If EPW finishes correctly, those files are deleted. After a crash like this one, however, you should still see files named "gan.igkXX", where XX depends on your number of pools (so you should have 48 of them here).

Can you see them?
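A quick way to check this is a small script along the following lines. It is only a sketch: it assumes the files are named <prefix>.igk<N> with N running from 1 to the number of pools, as the names in this thread suggest, and the function names are made up for illustration and are not part of EPW.

```python
from pathlib import Path
import re


def find_igk_files(outdir, prefix="gan"):
    """Return the sorted pool indices N of files named <prefix>.igk<N> in outdir."""
    # Assumption: one igk file per pool, numbered with a plain integer suffix.
    pattern = re.compile(rf"^{re.escape(prefix)}\.igk(\d+)$")
    indices = []
    for path in Path(outdir).iterdir():
        match = pattern.match(path.name)
        if match and path.is_file():
            indices.append(int(match.group(1)))
    return sorted(indices)


def missing_igk_files(outdir, npool, prefix="gan"):
    """Return the indices in 1..npool for which no igk file was found."""
    found = set(find_igk_files(outdir, prefix))
    return [i for i in range(1, npool + 1) if i not in found]
```

For the run described here one would call missing_igk_files(outdir, 48), with outdir set to the directory given by the "outdir" variable; an empty list means all 48 files are present.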

Did you modify the example input files (scf, nscf and epw) in any way? Specifically the "outdir" variable?

Otherwise I do not really know. I suggest trying a lower number of cores.
You do not need to redo the phonons. Maybe try scf, nscf and epw with 12 cores and 12 pools.

PS: If you later have memory problems with this example, I suggest adding the etf_mem = .false. variable to the epw.in input file.

Best,

Samuel

Re: Weird problem in running gan example

Posted: Mon Jun 20, 2016 3:10 pm
by amosyang
Dear Samuel,

Thanks for your reply. I have checked the igk files and there are 48 of them, named up to gan.igk48 (since I use an npool of 48).
Besides, I changed nothing in the input files scf.in, nscf.in and epw.in.
Having compared the phonon output files for gan produced on the two clusters, I find that the sizes of the fildvscf files differ: one is about 135 MB and the other about 34 MB. The input files are the same but were run on different clusters, so why is there such a large difference? With the small fildvscf there are no errors except the memory problem, but with the large fildvscf the error appears upon reading the first q point. So what causes that?

I cannot figure it out. Please help me.

Best regards,
Jiayue Yang
RWTH Aachen University

Re: Weird problem in running gan example

Posted: Mon Jun 20, 2016 3:42 pm
by sponce
Dear Jiayue,

Happy to hear that it works on one of the two clusters.

The dvscf files should indeed be the exact same size if you used the same input files.
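To see which files differ, the two sets can be copied side by side and their sizes compared with something like the sketch below. The glob pattern "*dvscf*" and the function names are assumptions for illustration; adjust the pattern to whatever your phonon run actually wrote.

```python
from pathlib import Path


def sizes_by_name(directory, pattern="*dvscf*"):
    """Map file name -> size in bytes for files in directory matching pattern."""
    # Assumption: the dvscf files can be selected with a simple glob pattern.
    return {
        p.name: p.stat().st_size
        for p in Path(directory).glob(pattern)
        if p.is_file()
    }


def size_mismatches(dir_a, dir_b, pattern="*dvscf*"):
    """Return {name: (size_in_a, size_in_b)} for files whose sizes differ.

    A size of None means the file is missing on that side.
    """
    a = sizes_by_name(dir_a, pattern)
    b = sizes_by_name(dir_b, pattern)
    report = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            report[name] = (a.get(name), b.get(name))
    return report
```

An empty result from size_mismatches means every matching file has the same size in both directories. Equal sizes do not prove identical contents, but a mismatch already points to a difference in how the phonon run was set up or compiled.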

You might consider posting this issue on the Quantum ESPRESSO mailing list, since it is an issue at the level of the PH code.

Best,

Samuel