Phonon interpolation problem with EPW

hlee · Post by **hlee** » Thu Oct 29, 2020 6:49 pm

Dear Mehmet:

In addition to my previous message posted today:

You can also skip writing the epb files by setting both "epbwrite=.false." and "epbread=.false.".
In most cases the epb files are not necessary, since we usually restart from the electron-phonon vertex on the coarse grids in the Wannier representation.

However, it would still be good to resolve your overflow error eventually.

Sincerely,

H. Lee

mdogan · Post by **mdogan** » Fri Oct 30, 2020 4:45 am

Dear H. Lee,

Thank you once again for your suggestions. I recompiled QE-6.6 after making sure the make.inc file for the compilation was as similar to that of QE-6.4.1 as possible. I added both to the folder if you'd like to take a look. I then repeated the test, which completed successfully. However, when I run scf, nscf and then epw1, I get exactly the same error.

I investigated the issue a bit further. There are 32 epb files that should be created (1 for each processor), each of which is 3.4 GB in size. In fact, they should all be exactly the same size (3626047868 bytes). These files can be created without a problem with EPW-5.1. With EPW-5.3, however, 20 of them seem to fail each time, and not the same 20. I have tried this calculation 3 times now, and each time a different set of 20 failed to write, while the remaining 12 were written to disk properly, with the correct size. So the operation does not fail reliably for every file of this size; instead, a certain number of them fail and a certain number of them don't. Do you have any ideas about what might be going on?
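A check like the one described above (which files reached the expected size and which did not) can be scripted rather than done by hand. The following is only a minimal sketch in Python, not part of EPW: the function name `check_file_sizes` and the glob pattern `*.epb*` are assumptions made for illustration, so adjust the pattern to the actual file names in your output directory.

```python
from pathlib import Path


def check_file_sizes(directory, pattern, expected_size=None):
    """Split files matching `pattern` into (ok, bad) lists of (name, size).

    If expected_size is None, the largest size found is used as the
    reference, on the assumption that a failed write leaves a smaller file.
    """
    sizes = sorted(
        (p.name, p.stat().st_size) for p in Path(directory).glob(pattern)
    )
    if not sizes:
        return [], []
    if expected_size is None:
        reference = max(size for _, size in sizes)
    else:
        reference = expected_size
    ok = [(name, size) for name, size in sizes if size == reference]
    bad = [(name, size) for name, size in sizes if size != reference]
    return ok, bad
```

For the run described in this post, a call such as `check_file_sizes(".", "*.epb*", expected_size=3626047868)` would list the complete files in the first returned list and the incomplete ones in the second, which makes it easy to compare the failing set between repeated runs.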

Thank you for your suggestion about skipping writing the epb files altogether. I'll try that as well and see if I can complete all the calculations I need that way.

Best,
Mehmet

hlee · Post by **hlee** » Fri Oct 30, 2020 3:32 pm

Dear Mehmet:

Could you make the following parts of your make.inc in QE 6.6 the same as those in QE 6.4.1?
I just would like to know whether or not this issue originates from the building environments.

53c56
< IFLAGS         = -I$(TOPDIR)/include -I$(TOPDIR)/FoX/finclude -I$(TOPDIR)/S3DE/iotk/include/ -I/opt/intel/compilers_and_libraries_2019.5.281/linux/mkl/include
---
> IFLAGS         = -I$(TOPDIR)/include -I$(TOPDIR)/FoX/finclude -I$(TOPDIR)/S3DE/iotk/include/ -I/opt/intel/compilers_and_libraries_2020.1.217/linux/mkl/include
79c83
< MPIF90         = mpif90
---
> MPIF90         = mpiifort
98c102
< CPPFLAGS       = -P -traditional $(DFLAGS) $(IFLAGS)
---
> CPPFLAGS       = -P -traditional -Uvector $(DFLAGS) $(IFLAGS)
122c127
< LD             = mpif90
---
> LD             = mpiifort

Sincerely,

H. Lee

mdogan · Post by **mdogan** » Fri Oct 30, 2020 5:10 pm

Dear H. Lee,

Actually, I had added the wrong make.inc for QE-6.6. That was the first one I compiled, but I then compiled a second version, which should be built in essentially the same way as my QE-6.4.1. I have now uploaded the correct make.inc for QE-6.6. As I said, with both versions of QE-6.6, I get the same error with epw. Sorry for the inconvenience!

Best,
Mehmet

mdogan · Post by **mdogan** » Fri Oct 30, 2020 7:16 pm

Dear H. Lee,

In the meantime, in a separate folder, I'm going through the steps without writing the epb files to the disk. In the second epw calculation, I get the error:

     Using q-mesh file: scf_band.kpt

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     Error in routine loadqmesh_serial (1):
     ERROR: Specify either crystal or cartesian coordinates in the filqf file
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

     stopping ...

It looks like I need to define the coordinate type for this file, starting at EPW 5.2. However, I couldn't find where to make that definition in my input file. I added the "filkf" file to the folder for your reference. Thank you!

Best,
Mehmet

hlee · Post by **hlee** » Fri Oct 30, 2020 7:33 pm

Dear Mehmet:

You can find the format of the q-point list file by looking at the source code as below:

From grid.f90 in EPW/src directory:

    IF (mpime == ionode_id) THEN
      IF (filqf /= '') THEN ! load from file
        !
        ! Each pool gets its own copy from the action=read statement
        !
        WRITE(stdout, *) '    Using q-mesh file: ', TRIM(filqf)
        IF (lscreen) WRITE(stdout, *) '     WARNING: if lscreen=.TRUE., q-mesh needs to be [-0.5:0.5] (crystal)'
        OPEN(UNIT = iunqf, FILE = filqf, STATUS = 'old', FORM = 'formatted', IOSTAT = ios)
        IF (ios /= 0) CALL errore('loadqmesh_serial', 'opening file ' // filqf, ABS(ios))
        READ(iunqf, *) nqtotf, coordinate_type
        IF (TRIM(coordinate_type) .EQ. ' ') coordinate_type = 'crystal'
        IF (.NOT. imatches("crystal", coordinate_type) .AND. .NOT. imatches("cartesian", coordinate_type)) &
          CALL errore('loadqmesh_serial', 'ERROR: Specify either crystal or cartesian coordinates in the filqf file', 1)
        !
        ALLOCATE(xqf(3, nqtotf), STAT = ierr)
        IF (ierr /= 0) CALL errore('loadqmesh_serial', 'Error allocating xqf', 1)
        ALLOCATE(wqf(nqtotf), STAT = ierr)
        IF (ierr /= 0) CALL errore('loadqmesh_serial', 'Error allocating wqf', 1)
        !
        DO iq = 1, nqtotf
          !
          READ (iunqf, *) xqf(:, iq), wqf(iq)
          !
        ENDDO
        CLOSE(iunqf)
        IF (imatches("cartesian", coordinate_type)) THEN
          CALL cryst_to_cart(nqtotf, xqf, at, -1)
        ENDIF
...

It seems that the part below might cause the problem in some cases.

        READ(iunqf, *) nqtotf, coordinate_type
        IF (TRIM(coordinate_type) .EQ. ' ') coordinate_type = 'crystal'
        IF (.NOT. imatches("crystal", coordinate_type) .AND. .NOT. imatches("cartesian", coordinate_type)) &
          CALL errore('loadqmesh_serial', 'ERROR: Specify either crystal or cartesian coordinates in the filqf file', 1)

To be safe, please explicitly specify the coordinate type on the first line of the file, right after the number of q-points, as below:
297 crystal
or
297 cartesian
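
Judging from the READ statements quoted above, the file consists of a header line with the number of points and the coordinate type, followed by one line per point holding three coordinates and a weight. A likely reason the check fails when the type is omitted is that a Fortran list-directed READ continues onto the next record until its input list is satisfied, so coordinate_type would pick up the first number of the first q-point line instead of staying blank. The Python sketch below (not part of EPW) writes a file in that layout. The function names, the number formatting, and the default of equal weights are assumptions for illustration only.

```python
def format_qmesh(points, weights=None, coordinate_type="crystal"):
    """Return the text of a q-point list with an explicit coordinate type.

    points  : sequence of (x, y, z) tuples
    weights : one weight per point; equal weights are assumed if omitted
    """
    if coordinate_type not in ("crystal", "cartesian"):
        raise ValueError("coordinate_type must be 'crystal' or 'cartesian'")
    if not points:
        raise ValueError("at least one point is required")
    if weights is None:
        # Assumption: uniform weights that sum to 1.
        weights = [1.0 / len(points)] * len(points)
    if len(weights) != len(points):
        raise ValueError("points and weights must have the same length")
    # Header line: number of points followed by the coordinate type.
    lines = [f"{len(points)} {coordinate_type}"]
    for (x, y, z), w in zip(points, weights):
        lines.append(f"{x:.8f} {y:.8f} {z:.8f} {w:.8f}")
    return "\n".join(lines) + "\n"


def write_qmesh_file(path, points, weights=None, coordinate_type="crystal"):
    """Write the formatted q-point list to `path`."""
    with open(path, "w") as fh:
        fh.write(format_qmesh(points, weights, coordinate_type))
```

An existing file only needs its first line changed, so a script like this is mainly useful when the path is generated programmatically anyway.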

Regarding your overflow error:
To be honest, I still have no idea. It is not easy to identify the origin of the error unless I reproduce your calculations from scratch.

Sincerely,

H. Lee

mdogan · Post by **mdogan** » Fri Oct 30, 2020 9:43 pm

Dear H. Lee,

Thank you! Adding "crystal" next to the number of k-points fixed the error.

For the overflow error, I'm retrying the calculation with 64 processors instead of 32 to see if that makes a difference. I will post an update when I get the results.

Best,
Mehmet

mdogan · Post by **mdogan** » Sun Nov 01, 2020 10:57 pm

Dear H. Lee,

When I increased the number of processors from 32 to 64, the epb files ended up being written correctly. So somehow the file sizes were too big for EPW 5.3 even though EPW 5.1 had no problem writing them to the disk. Thanks for all your help!
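
A back-of-the-envelope check is consistent with this observation, although it is not a confirmed diagnosis: the per-file size reported earlier in the thread (3626047868 bytes) exceeds the largest signed 32-bit integer, while half of that does not. The Python sketch below assumes that the size of each epb file halves when the number of processors doubles, and that a 32-bit quantity is involved somewhere in the write path. Neither assumption is established in this thread.

```python
INT32_MAX = 2**31 - 1  # largest signed 32-bit integer: 2147483647


def fits_in_int32(n_bytes):
    """True if a byte count can be held in a signed 32-bit integer."""
    return n_bytes <= INT32_MAX


# Per-file size with 32 processors, as reported in the thread.
size_with_32 = 3626047868

# Assumption: the same total data is split evenly over 64 processors.
size_with_64 = size_with_32 * 32 // 64

print(size_with_32, fits_in_int32(size_with_32))
print(size_with_64, fits_in_int32(size_with_64))
```

Under these assumptions, the 32-processor files are above the 32-bit limit and the 64-processor files are below it.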

Best,
Mehmet
