Crash without error

identaton
Posts: 11
Joined: Wed Aug 01, 2018 7:00 am
Affiliation:

Crash without error

Post by identaton »

Dear admin and users,
I tried running calculations with epw.x. Things are fine at the beginning, but the run then crashes without giving any error message. The bottom of the output is shown under Below1. I think the Wannier step is fine, since it gives the output shown under Below2. Could you please tell me what I should do to continue the calculations? Thanks a lot.


Below1: Bottom of the output:
===================================================================
irreducible q point # 10
===================================================================

Symmetries of small group of q: 48
in addition sym. q -> -q+G:

Number of q in the star = 1
List of q in the star:
1 -0.500000000 -0.500000000 -0.500000000

q( 64 ) = ( -0.5000000 -0.5000000 -0.5000000 )

Writing epmatq on .epb files


The .epb files have been correctly written


band disentanglement is used: nbndsub = 12


Below2: Wannier spreads (the lattice parameter is more than 3.5 Ang)
Wannier Function centers (cartesian, alat) and spreads (ang):

( 0.49996 0.50000 0.50000) : 1.33937
( 0.49998 0.50001 0.50000) : 1.33824
( 0.49996 0.50002 0.50000) : 1.33675
( 0.49971 0.50002 -0.00001) : 1.62680
( 0.49939 0.49995 0.00000) : 1.38263
( 0.49943 0.49983 0.00000) : 1.37303
( 0.49948 0.00003 0.50000) : 1.26842
( 0.49942 0.00001 0.50000) : 1.27835
( 0.49967 -0.00017 0.50000) : 1.55813
( 0.00000 0.50002 0.50000) : 1.00613
( -0.00016 0.49999 0.50000) : 1.32065
( 0.00000 0.49990 0.50000) : 1.00748
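
The spreads listed above can also be checked programmatically instead of by eye. The following is a minimal Python sketch, not part of EPW or Wannier90. The regular expression assumes the `( x y z) : spread` line layout shown above, and the names `parse_spreads` and `largest_spread` are made up for this illustration.

```python
import re

# A few lines copied from the "Wannier Function centers" block above.
SAMPLE = """\
( 0.49996 0.50000 0.50000) : 1.33937
( 0.49971 0.50002 -0.00001) : 1.62680
( 0.49967 -0.00017 0.50000) : 1.55813
( 0.00000 0.50002 0.50000) : 1.00613
"""

# Assumed layout: "( x y z) : spread", with plain decimal numbers.
LINE = re.compile(
    r"\(\s*(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*\)\s*:\s*(-?\d+\.\d+)"
)


def parse_spreads(text):
    """Return a list of ((x, y, z), spread) tuples found in `text`."""
    entries = []
    for match in LINE.finditer(text):
        x, y, z, spread = map(float, match.groups())
        entries.append(((x, y, z), spread))
    return entries


def largest_spread(text):
    """Return the (center, spread) entry with the largest spread, or None."""
    entries = parse_spreads(text)
    return max(entries, key=lambda entry: entry[1]) if entries else None


print(largest_spread(SAMPLE))
```

Comparing the largest spread with the lattice parameter is only a rough plausibility check; it does not prove that the Wannierization is good.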

carla.verdi
Posts: 155
Joined: Thu Jan 14, 2016 10:52 am
Affiliation:

Re: Crash without error

Post by carla.verdi »

Hi,

It is difficult to identify the possible cause without any additional information, such as the input file. Is there no error file too?
Also, have you been able to run some of the examples / tutorial successfully?

Best
Carla

identaton
Posts: 11
Joined: Wed Aug 01, 2018 7:00 am
Affiliation:

Re: Crash without error

Post by identaton »

carla.verdi wrote:Hi,

It is difficult to identify the possible cause without any additional information, such as the input file. Is there no error file too?
Also, have you been able to run some of the examples / tutorial successfully?

Best
Carla


Dear Carla,
Thanks a lot for your reply. Below is my epw.in. I tested the mgb2 example: it succeeds with etf_mem=0, but crashes without error with etf_mem=1. In my case I use etf_mem=0, and the Wannier step is fine (relatively small spreads). However, after finishing the loop over the coarse q-points from the ph.x output, it just crashes, without any error message related to the EPW input. Another thing that confuses me is the dependence on the number of nodes: with 1 node * 16 cores, it gets stuck at the very beginning (before the Wannier step starts), although the PBS job keeps running; with 3 nodes * 16 cores, it reaches the Wannier step but soon crashes; with 2 nodes * 16 cores, it runs well until the coarse q-points are finished (as mentioned above) and then crashes.
I am wondering how sensitive the EPW code is to the compiler. For example, do you have a recommended compiler for the code? My choice is intel/18.0.2, mkl/18.0.2, impi/5.1.2 (mpiifort). I also tried other options such as openmpi and lower versions of intel, mkl, etc. All options work fine with pw.x and ph.x, but none of them really works for EPW. (The mgb2 example finishes fine, but my case does not.) Could you please give me some suggestions? Thanks a lot for your help.
_____
&inputepw
prefix = 'pwscf',
amass(1)=XXX,
amass(2)=XXX,
amass(3)=XXX,
outdir = './'

ep_coupling = .true.
elph = .true.
kmaps = .false.
epbwrite = .true.
epbread = .false.

epwwrite = .true.
epwread = .false.

etf_mem = 0

nbndsub = 12,
nbndskip = 11,

wannierize = .true.
num_iter = 500
dis_froz_min = 8
dis_froz_max = 13
dis_win_min=0
dis_win_max=16

proj(1) = 'XXX: dxy, dyz, dxz'
proj(2) = 'XXX: p'

iverbosity = 2


eps_acustic = 15.0 ! Lowest boundary for the
ephwrite = .true. ! Writes .ephmat files used when eliashberg = .true.

fsthick = 0.4 ! eV
eptemp = 300 ! K
degaussw = 0.10 ! eV
nsmear = 1
delta_smear = 0.04 ! eV

degaussq = 0.5 ! meV
nqstep = 500

eliashberg = .true.

laniso = .true.
limag = .true.
lpade = .true.

conv_thr_iaxis = 1.0d-4

wscut = 1.0 ! eV Upper limit over frequency integration/summation in the Eliashberg eq

nstemp = 1
tempsmin = 0.05
tempsmax = 0.1

nsiter = 500

muc = 0.16

dvscf_dir = '../phonon/save'

nk1 = 6
nk2 = 6
nk3 = 6

nq1 = 3
nq2 = 3
nq3 = 3

mp_mesh_k = .true.
nkf1 = 10
nkf2 = 10
nkf3 = 10

nqf1 = 10
nqf2 = 10
nqf3 = 10
/
4 cartesian
0.000000000 0.000000000 0.000000000 0.03703703703
0.000000000 0.000000000 0.333333333 0.22222222222
0.000000000 0.333333333 0.333333333 0.44444444444
0.333333333 0.333333333 0.333333333 0.29629629629
_______
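
One quick sanity check on the q-point list at the end of the input above: assuming the fourth column is the weight of each irreducible q-point, the weights should sum to 1, and each weight multiplied by nq1*nq2*nq3 should be close to an integer (the number of points in the star). Below is a minimal Python sketch of that check. The function name is hypothetical and this is not an EPW utility.

```python
def check_qpoint_weights(weights, nq1, nq2, nq3, tol=1e-6):
    """Check irreducible q-point weights against a coarse nq1 x nq2 x nq3 grid.

    Returns (sum_ok, integer_ok, multiplicities), where multiplicities are
    the weights scaled by the total number of grid points and rounded.
    """
    ngrid = nq1 * nq2 * nq3
    scaled = [w * ngrid for w in weights]
    sum_ok = abs(sum(weights) - 1.0) < tol
    # Looser tolerance here because the weights are printed with ~11 digits
    # and are then multiplied by the grid size.
    integer_ok = all(abs(s - round(s)) < 1e-4 for s in scaled)
    return sum_ok, integer_ok, [round(s) for s in scaled]


# Weights copied from the four q-points listed in the input above.
weights = [0.03703703703, 0.22222222222, 0.44444444444, 0.29629629629]
print(check_qpoint_weights(weights, 3, 3, 3))
```

For the list above the multiplicities come out as 1, 6, 12 and 8, which add up to 27 = 3*3*3, so the weights are consistent with the nq1 = nq2 = nq3 = 3 setting.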

identaton
Posts: 11
Joined: Wed Aug 01, 2018 7:00 am
Affiliation:

Re: Crash without error

Post by identaton »

Dear Carla,
By changing the way I compile the code, I now manage to get through the following steps:
____
Finished writing .ikmap file


Finished mapping k+sign*q onto the fine irreducibe k-mesh

Nr irreducible k-points within the Fermi shell = 16 out of 56
Progression iq (fine) = 50/ 1000
Progression iq (fine) = 100/ 1000
Progression iq (fine) = 150/ 1000
Progression iq (fine) = 200/ 1000
Progression iq (fine) = 250/ 1000
Progression iq (fine) = 300/ 1000
Progression iq (fine) = 350/ 1000
Progression iq (fine) = 400/ 1000
Progression iq (fine) = 450/ 1000
Progression iq (fine) = 500/ 1000
Progression iq (fine) = 550/ 1000
Progression iq (fine) = 600/ 1000
Progression iq (fine) = 650/ 1000
Progression iq (fine) = 700/ 1000
Progression iq (fine) = 750/ 1000
Progression iq (fine) = 800/ 1000
Progression iq (fine) = 850/ 1000
Progression iq (fine) = 900/ 1000
Progression iq (fine) = 950/ 1000
Progression iq (fine) = 1000/ 1000
Fermi level (eV) = 0.101238743405294D+02
DOS(states/spin/eV/Unit Cell) = 0.759032970711409D+00
Electron smearing (eV) = 0.100000000000000D+00
Fermi window (eV) = 0.400000000000000D+00

Finished writing .ephmat files
===================================================================
Memory usage: VmHWM = 1385Mb
VmPeak = 1803Mb
===================================================================
_____

Then, after the above steps finish in the EPW.out file, PBS gives a message like "epw.x:10844 terminated with signal 6 at PC=2b9cd90b0495 SP=7ffdea89a1c8. Backtrace:
/lib64/libc.so.6(gsignal+0x35)[0x2b9cd90b0495]
/lib64/libc.so.6(abort+0x175)[0x2b9cd90b1c75]
/lib64/libc.so.6(+0x703a7)[0x2b9cd90ee3a7]
/lib64/libc.so.6(+0x75dee)[0x2b9cd90f3dee]
/lib64/libc.so.6(+0x78c80)[0x2b9cd90f6c80]
~/qe-6.3/bin/epw.x[0xeef16d]
~/qe-6.3/bin/epw.x[0x40dcbe]
~/qe-6.3/bin/epw.x[0x409c6b]
~/qe-6.3/bin/epw.x[0x408fde]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2b9cd909cd1d]
~/qe-6.3/bin/epw.x[0x408ee9]"
Although there is an error message like that, the calculation does not end. In other words, the PBS job is still listed as running, but nothing is updated. Could you please tell me what is wrong? Thanks a lot.

simran_kumari
Posts: 5
Joined: Mon Aug 06, 2018 8:13 am
Affiliation:

Re: Crash without error

Post by simran_kumari »

Hello,

I am having similar issues. Can you please tell me what changes you made, which compilers you used, the number of processors, etc.?

Thanks a lot in advance.

Simran
THEOS, EPFL

identaton
Posts: 11
Joined: Wed Aug 01, 2018 7:00 am
Affiliation:

Re: Crash without error

Post by identaton »

simran_kumari wrote:Hello,

I am having similar issues. Can you please tell me what changes you made, which compilers you used, the number of processors, etc.?

Thanks a lot in advance.

Simran
THEOS, EPFL


Hello, my compiler choice is "intel/18.0.2 mkl/18.0.2 impi/5.1.2", which works for QE. However, epw still crashes, with only the message "epw.x:9678 terminated with signal 11 at PC=. Backtrace:"
I am quite confused by this situation. Can anyone suggest a way to fix this issue? Thanks a lot.
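
For readers unsure what the signal numbers in these messages mean, the numbers can be mapped to names with Python's standard `signal` module. This is only a small sketch for decoding the messages and does not diagnose the crash itself. The helper name `describe_signal` is made up here, and the numbering is platform dependent (the values below are the usual Linux ones).

```python
import signal


def describe_signal(number):
    """Return the symbolic name of a signal number, or "unknown"."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "unknown"


# Signal numbers taken from the two PBS messages quoted in this thread.
for number in (6, 11):
    print(number, describe_signal(number))
```

On Linux, signal 6 is SIGABRT, which fits the `abort` frame in the earlier backtrace, and signal 11 is SIGSEGV, a segmentation fault.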

sponce
Site Admin
Posts: 616
Joined: Wed Jan 13, 2016 7:25 pm
Affiliation: EPFL

Re: Crash without error

Post by sponce »

Hello,

From the different reports, it seems there might be an issue with intel18/mkl18 and the associated impi.

Can you try compiling with an older version of intel/mkl/impi? For example, version 17 is tested by the automatic test farm and should work.

Best wishes,
Samuel
Prof. Samuel Poncé
Chercheur qualifié F.R.S.-FNRS / Professeur UCLouvain
Institute of Condensed Matter and Nanosciences
UCLouvain, Belgium
Web: https://www.samuelponce.com

identaton
Posts: 11
Joined: Wed Aug 01, 2018 7:00 am
Affiliation:

Re: Crash without error

Post by identaton »

sponce wrote:Hello,

From the different reports, it seems there might be an issue with intel18/mkl18 and the associated impi.

Can you try compiling with an older version of intel/mkl/impi? For example, version 17 is tested by the automatic test farm and should work.

Best wishes,
Samuel


Dear Admin,
Thanks a lot for your kind reply. I have tried various combinations of intel, mkl, openmpi, impi, etc. None of them (including version 17 and older compiler versions) works. This leads me to wonder whether I neglected something during compilation.
I simply run ./configure to generate the make.inc file, and then type make all and make epw to compile the code. The calculations with QE and Wannier are fine, while the EPW calculations show the issues mentioned above.
The QE guide lists various options for the configure step, e.g. those below. Do I need to specify some of the options below? I would also appreciate any other suggestions (e.g. the recommended compilers, parameters, and MPIF90, F90, F77, and CC options to compile the code). Thanks a lot.
______
--enable-parallel       compile for parallel (MPI) execution if possible (default: yes)
--enable-openmp         compile for OpenMP execution if possible (default: no)
--enable-shared         use shared libraries if available (default: yes;
                        "no" is implemented, untested, in only a few cases)
--enable-debug          compile with debug flags (only for selected cases; default: no)
--enable-signals        enable signal trapping (default: disabled)
--with-internal-blas    compile with internal BLAS (default: no)
--with-internal-lapack  compile with internal LAPACK (default: no)
--with-scalapack        (yes|no|intel) Use scalapack if available.
                        Set to intel to use Intel MPI and blacs. (default: USE openMPI)
--with-elpa-include     Specify full path ELPA include and modules headers (default: no)
--with-elpa-lib         Specify full path ELPA static or dynamic library (default: no)
--with-elpa-version     Specify ELPA version, only year (2015 or 2016, default: 2016)
--with-hdf5             (no | <path>) Use HDF5, a valid <path> must be specified (default: no)

MPIF90 = mpiifort
F90 = ifort
CC = gcc
F77 = ifort
_______

identaton
Posts: 11
Joined: Wed Aug 01, 2018 7:00 am
Affiliation:

Re: Crash without error

Post by identaton »

Dear admin,
Again, thanks for your quick and kind reply. I have now found the cause of my problem. I was using PAW potentials rather than norm-conserving potentials, and I think this is probably the reason for the issues above (by searching this forum, I just realized that EPW currently supports only norm-conserving potentials). After changing my potentials to the pw-mt_fhi type, the problem is solved.
I will be very happy if my experience is also helpful to others.
Thanks.
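
Since the fix reported here was switching from PAW to norm-conserving pseudopotentials, it may help to check the type of each UPF file before running EPW. The following Python sketch is a best-effort guess and not an official QE or EPW tool. It assumes that UPF v2 files carry a `pseudo_type` attribute in the `PP_HEADER` tag and that UPF v1 headers contain a line starting with `NC`, `US` or `PAW`; the function names are hypothetical.

```python
import re


def upf_pseudo_type(text):
    """Best-effort guess of the pseudopotential type from UPF file contents."""
    # Assumed UPF v2 layout: the type is an attribute of the PP_HEADER tag.
    match = re.search(r'pseudo_type\s*=\s*"\s*(\w+)\s*"', text)
    if match:
        return match.group(1).upper()
    # Assumed UPF v1 layout: the header is a list of "value  description" lines.
    header = re.search(r"<PP_HEADER>(.*?)</PP_HEADER>", text, re.S)
    if header:
        for line in header.group(1).splitlines():
            tokens = line.split()
            if tokens and tokens[0].upper() in {"NC", "US", "PAW"}:
                return tokens[0].upper()
    return None


def is_norm_conserving(text):
    """True if the guessed type is norm-conserving ("NC", or semilocal "SL")."""
    return upf_pseudo_type(text) in {"NC", "SL"}


# Usage with a real file (the path is a placeholder, replace with your own):
# with open("XX.pw-mt_fhi.UPF") as handle:
#     print(is_norm_conserving(handle.read()))
```

If the function returns `None`, the header did not match either assumed layout and the file should be inspected by hand.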

identaton
Posts: 11
Joined: Wed Aug 01, 2018 7:00 am
Affiliation:

Re: Crash without error

Post by identaton »

Hello, I solved my problems by changing to norm-conserving potentials (XX.pw-mt_fhi.UPF).
