Why gap vs temperature is not the same as tutorial of MgB2?
Moderator: stiwari
Why gap vs temperature is not the same as tutorial of MgB2?
Dear developers,
I ran EPW/examples/mgb2 following the tutorial at http://epw.org.uk/Documentation/MgB2
I then wanted to reproduce the graph in Fig. 4. However, after changing "nstemp = 1" to "nstemp = 6", this is what I got: https://pasteboard.co/GKreH4s.png . You can see that the peak shape is quite different. How should I set the calculation parameters so that the plot matches the one in the tutorial?
best regards
Re: Why gap vs temperature is not the same as tutorial of Mg
Dear balabi,
You need to push the convergence further. Converged parameters are given in the EPW paper: http://www.sciencedirect.com/science/ar ... 5516302260
See Fig. 19 for example.
Best,
Samuel
Prof. Samuel Poncé
Chercheur qualifié F.R.S.-FNRS / Professeur UCLouvain
Institute of Condensed Matter and Nanosciences
UCLouvain, Belgium
Web: https://www.samuelponce.com
Re: Why gap vs temperature is not the same as tutorial of Mg
Dear Samuel,
Thank you so much for your reply.
I am not sure I understand "You need to push the convergence further". Do you mean that I should increase nkf1, nkf2, nkf3 and nqf1, nqf2, nqf3? Or do I need to increase the phonon q mesh or the scf k mesh used before the EPW run? Or the Wannier nscf mesh? Which one matters most for a correct result?
According to the paper, I think you probably mean increasing nkf and nqf.
First, I increased them to nkf1=30, nkf2=30, nkf3=30 and nqf1=30, nqf2=30, nqf3=30, and the plot looks a little better (see https://pasteboard.co/GKXlzH1.png ). Note that I actually set T from 15 K to 60 K, but T = 60 K had a convergence problem when solving the anisotropic Eliashberg equations on the imaginary axis. The last iteration is
iter = 500 relerr = 9.3846423209E-02 abserr = 3.6523260131E-10 Znormi(1) = 1.8329993694E+00 Deltai(1) = 1.2461454162E-08
This confused me, and I have several questions:
1. Why is abserr already small while relerr is still large? What is relerr relative to?
2. Does this mean that T = 60 K is too high for convergence, or could it still converge if we increase the number of iterations?
3. If I only want to refine details around the transition temperature, how do I set EPW to restart without repeating work?
Second, I increased them to nkf1=60, nkf2=60, nkf3=60 and nqf1=30, nqf2=30, nqf3=30, as the paper says. It ran for hours and did not seem to complete, which is much slower than the previous run. I checked epw.out and found that it was stuck at "Solve anisotropic Eliashberg equations on imaginary-axis" for the first temperature, T = 15 K. In 6 hours it completed only 12 iterations. I noticed these lines before the iterations:
Size of required memory per pool : ~= 8.4274 Gb
AKeri is calculated on the fly since its size exceedes max_memlt
I only have 64 GB on one node, and I am running 16 MPI processes, so the total could be as large as 16 x 8 = 128 GB. So:
1. What is AKeri?
2. What does "calculated on the fly" mean compared to "not on the fly"? Is "on the fly" the cause of the slowness?
3. How can I estimate the memory per pool before the calculation?
4. Any suggestions for speeding things up for users with little memory? Do I have to parallelize across nodes? How? Does EPW perform well across nodes?
I really appreciate your help.
best regards
Re: Why gap vs temperature is not the same as tutorial of Mg
Dear Samuel,
I just tried to restart the calculation after aborting it at "Solve anisotropic Eliashberg equations on imaginary-axis".
I set the parameters below:
epwread=.true.
kmaps=.true.
elph=.false.
wannierize=.false.
However, I got this error
Solve anisotropic Eliashberg equations
===================================================================
Finish reading .freq file
Fermi level (eV) = 7.4669054777E+00
DOS(states/spin/eV/Unit Cell) = 3.5407008461E-01
Electron smearing (eV) = 1.0000000000E-01
Fermi window (eV) = 4.0000000000E-01
Nr irreducible k-points within the Fermi shell = 816 out of 3234
2 bands within the Fermi window
Finish reading .egnv file
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Error in routine invmat (1):
error in DGETRF
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
stopping ...
What is wrong with it?
best regards
Re: Why gap vs temperature is not the same as tutorial of Mg
Dear balabi,
Roxana is the expert in the superconducting part of EPW. I will try to answer what I can:
- Yes, you need 60x60x60 k and 30x30x30 q, and yes, it is going to be quite expensive.
- For the absolute and relative error: the difference between two very small numbers can be sizeable relative to the numbers themselves while its absolute value is still very small.
- T=60K is above T_c, therefore the code cannot solve the Eliashberg equations. The superconducting gaps are 0 at that temperature.
- Note that T_c depends on the fine grids you are using. With a 30x30x30 k-grid it might not be fully converged.
- AKeri is the kernel used in the Eliashberg equations. You can try raising max_memlt to make it faster (everything in memory). "On the fly" means it is reading/writing data instead of having everything in memory. Of course, you need to have enough memory on your cluster.
- You should definitely parallelize across nodes using MPI. EPW performs well; see the scaling tests: http://epw.org.uk/Main/Benchmarks
- For the restart, I think you need to set elph to true.
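To make the point about absolute versus relative error concrete, here is a minimal Python sketch. It is only an illustration with made-up gap values, not EPW's actual convergence test; the function name `errors` and the definition of the relative error (change divided by the new value) are assumptions.

```python
def errors(old, new):
    """Return (absolute, relative) change between two iterates.

    Assumption: relative error = |new - old| / |new|. EPW's own
    definition of relerr may differ.
    """
    abs_err = abs(new - old)
    rel_err = abs_err / abs(new)
    return abs_err, rel_err


# Hypothetical gap values close to zero, as expected near or above T_c.
delta_old = 1.0e-8
delta_new = 1.1e-8

abs_err, rel_err = errors(delta_old, delta_new)
print(f"abserr = {abs_err:.3e}")  # tiny in absolute terms
print(f"relerr = {rel_err:.3e}")  # still about 9% in relative terms
```

When the gap itself is essentially zero, a relative criterion like this one may never be satisfied, even though the absolute change is negligible.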
Best,
Samuel
Prof. Samuel Poncé
Chercheur qualifié F.R.S.-FNRS / Professeur UCLouvain
Institute of Condensed Matter and Nanosciences
UCLouvain, Belgium
Web: https://www.samuelponce.com
Re: Why gap vs temperature is not the same as tutorial of Mg
Dear Samuel,
Thank you so much for the patient explanation.
About the restart, I am quite confused. I think EPW could make restarting a little easier; at present there are many switches to keep track of.
I have now tried several combinations of the parameters below:
ep_coupling = .true.
elph = .true.
kmaps = .false.
epbwrite = .true.
epbread = .false.
epwwrite = .true.
epwread = .false.
ephwrite = .true.
wannierize = .true.
The settings above are the non-restart settings.
Then I tried to modify the switches. First, I set
epwwrite = .false. epwread = .true. kmaps=.true.
This follows the input documentation. But it is not enough: I get "must use same w90 rotation matrix for entire run" until I also set
wannierize=.false.
At this stage, epw.out prints
------------------------------------------------------------------------
RESTART - RESTART - RESTART - RESTART
Restart is done without reading PWSCF save file.
Be aware that some consistency checks are therefore not done.
------------------------------------------------------------------------
It seems that I am on the right track. And yes, it works, but with some repeated work.
First, it recalculates the epb files. So I tried to set
epbwrite = .false. epbread = .true.
in order to read the epb files instead of recalculating them. Surprisingly, this does not work. It gives the error below:
forrtl: severe (67): input statement requires too much data, unit 105, file /fs10/home/qhw_wang/HPC-nj/quantum_espresso/qe-dev-20170913/q-e/EPW/examples/mgb2-new/coarse_epw_grid_10/./MgB2.epb15
Image PC Routine Line Source
epw.x 0000000000F03356 Unknown Unknown Unknown
epw.x 0000000000F36A1E Unknown Unknown Unknown
epw.x 0000000000447B91 elphon_shuffle_wr 698 elphon_shuffle_wrap.f90
epw.x 0000000000407914 MAIN__ 150 epw.f90
epw.x 0000000000406C5E Unknown Unknown Unknown
libc-2.17.so 00002B2DA9553B35 __libc_start_main Unknown Unknown
epw.x 0000000000406B69 Unknown Unknown Unknown
forrtl: severe (67): input statement requires too much data, unit 105, file /fs10/home/qhw_wang/HPC-nj/quantum_espresso/qe-dev-20170913/q-e/EPW/examples/mgb2-new/coarse_epw_grid_10/./MgB2.epb1
Image PC Routine Line Source
epw.x 0000000000F03356 Unknown Unknown Unknown
.....
....
It turns out I have to set both options to false:
epbwrite = .false. epbread = .false.
But why?
Second, it recalculates ephmat. So I set
ephwrite = .false.
OK, now it does not recalculate ephmat. But I notice there is a step that is still quite time-consuming for the fine grid:
Number of ep-matrix elements per pool : 16443 ~= 128.46 Kb (@ 8 bytes/ DP)
Progression iq (fine) = 50/ 64000
Progression iq (fine) = 100/ 64000
Progression iq (fine) = 150/ 64000
Progression iq (fine) = 200/ 64000
Progression iq (fine) = 250/ 64000
...
...
I want to know what this step is. The output above is from a 40x40x40 fine grid. The counter increases in steps of 50 until it reaches 64000, but after an hour on a 16-core machine it had only reached 49000. Is it possible to skip this step? It feels like repeated work as well. Am I right?
Finally, in the input documentation, under the item "eliashberg", there is this sentence:
Note: To reuse .ephmat, .freq, .egnv, .ikmap files obtained in a previous run, one needs to set ep_coupling=.false., elph=.false., and ephwrite=.false. in the input file.
But you said to set elph=.true., so I don't understand what this sentence means.
Anyway, I also tried setting
ep_coupling=.false., elph=.false.
But I got errors
Error in routine invmat (1): error in DGETRF
Why does this error occur?
Finally, I would still like to know how to estimate the memory per pool before the calculation, and why the memory usage scales so badly with the grid.
best regards
Re: Why gap vs temperature is not the same as tutorial of Mg
Hello,
A typical restart calculation (restarting the interpolation part, not within Eliashberg) is done with:
elph = .true.
kmaps = .true.
epbwrite = .false.
epbread = .false.
epwwrite = .false.
epwread = .true.
wannierize = .false.
This means you restart from the epmatwp1 file (el-ph matrix elements in the Wannier representation) and just redo the interpolation.
If you have the latest EPW, you can also use the following (this does not work with Eliashberg):
restart = .true.
restart_freq = 500
This allows restarting in the middle of the interpolation (useful if you have a lot of q-points).
Now, if the Eliashberg files have been written to disk, you should also be able to restart. As written under the eliashberg input variable on the website:
"To reuse .ephmat, .freq, .egnv, .ikmap files obtained in a previous run, one needs to set ep_coupling=.false., elph=.false., and ephwrite=.false. in the input file."
So I think it should be:
elph = .false.
kmaps = .true.
epbwrite = .false.
epbread = .false.
epwwrite = .false.
epwread = .true.
wannierize = .false.
ephwrite = .false.
ep_coupling=.false.
Let me know if that works. Be sure to have the 4 files .ephmat, .freq, .egnv, .ikmap in the directory.
Regarding the scaling: the memory scales with the number of k-points per CPU. If you keep the number of CPUs constant but go from a 30x30x30 to a 60x60x60 k-grid, the memory will increase by a factor of 2^3 = 8.
Therefore, if you are using etf_mem and increase your number of pools, the memory should decrease.
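The scaling argument can be checked with a few lines of Python. This is a rough sketch, assuming the memory is simply proportional to the number of fine k-points handled by each pool; the helper names are hypothetical, and the per-pool figure is the one printed in the epw.out excerpt earlier in this thread.

```python
def kpoints_per_pool(nk1, nk2, nk3, n_pools):
    """Fine-grid k-points handled by each pool (assumes an even split)."""
    return nk1 * nk2 * nk3 / n_pools


def memory_ratio(grid_a, grid_b, pools_a, pools_b):
    """Memory of run B relative to run A, assuming memory is
    proportional to the number of k-points per pool."""
    return kpoints_per_pool(*grid_b, pools_b) / kpoints_per_pool(*grid_a, pools_a)


# Same number of pools, 30x30x30 -> 60x60x60: factor 2^3 = 8.
same_pools = memory_ratio((30, 30, 30), (60, 60, 60), 16, 16)

# Doubling the number of pools halves the k-points per pool.
more_pools = memory_ratio((30, 30, 30), (60, 60, 60), 16, 32)

# Total for 16 pools at the ~8.4274 Gb per pool reported in epw.out,
# compared with a 64 GB node (assumes one pool per MPI process).
total_gb = 16 * 8.4274

print(same_pools, more_pools, round(total_gb, 1))
```

Under these assumptions, the 60x60x60 run on a single 64 GB node cannot hold everything in memory, which is consistent with the kernel being computed on the fly.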
Best,
Samuel
Prof. Samuel Poncé
Chercheur qualifié F.R.S.-FNRS / Professeur UCLouvain
Institute of Condensed Matter and Nanosciences
UCLouvain, Belgium
Web: https://www.samuelponce.com
Re: Why gap vs temperature is not the same as tutorial of Mg
Dear Samuel:
I have a few minor issues and questions related to your answers above:
(1) For restarting at the Eliashberg step, you indicate that we need the following settings:
elph = .false.
kmaps = .true.
epbwrite = .false.
epbread = .false.
epwwrite = .false.
epwread = .true.
wannierize = .false.
ephwrite = .false.
ep_coupling=.false.
But I am wondering whether we have all the necessary data in this case.
For example, when writing the ".lambda_FS" and ".lambda_*.cube" files in the subroutine evaluate_a2f_lambda, bg is imported from cell_base; I think that if we restart in this way, we do not have the bg information.
(2) In the subroutines mem_size_eliashberg and mem_integer_size_eliashberg, the program stops with an error message when memlt_pool is larger than max_memlt.
However, I think this is not necessary.
For example, the subroutine eliashberg_memlt_aniso_iaxis is implemented for the case memlt_pool > max_memlt; when memlt_pool > max_memlt, as you said, the calculation proceeds in the "on the fly" mode.
(3) You indicated that we need to set kmaps=.true. in the restart mode.
Also, EPW stops when (epwread .AND. .not. kmaps .AND. .not. epbread) is true in the subroutine eps_readin.
However, to the best of my knowledge, the *.kmap and *.kgmap files are only needed up to the calculation of the e-ph matrix elements in the Wannier basis on the coarse grid.
I think that once we have *.fmt, *.epmatwp1, *.epmatwe1, etc., we do not need *.kmap and *.kgmap any more.
Indeed, the createkmap, createkmap_pw2 and readgmap subroutines are not called in the interpolation stage from the coarse to the fine grid.
Sincerely,
Hyungjun Lee
Re: Why gap vs temperature is not the same as tutorial of Mg
After some digging into the code, I realised that item (2) in my previous post is not correct: the check is necessary. The motivation for it seems to be broader.
Sorry for the inconvenience.
Sincerely,
Hyungjun Lee
Re: Why gap vs temperature is not the same as tutorial of Mg
Hi,
To restart an Eliashberg calculation you will need to:
1) have .ephmat, .freq, .egnv, and .ikmap files from a previous run
2) set the following parameters in the input file
ep_coupling=.false.
elph=.false.
kmaps = .true.
epbwrite = .false.
epbread = .true.
epwwrite = .false.
epwread = .true.
wannierize = .false.
ephwrite=.false.
eliashberg = .true.
To answer Hyungjun's question: it is correct that we do not need *.kmap and *.kgmap for Eliashberg calculations once the e-ph matrix elements on the fine grids are calculated. Some of the EPW stop conditions, such as (epwread .AND. .not. kmaps .AND. .not. epbread), should not be triggered for a restarted Eliashberg calculation and will be updated in the near future.
Best,
Roxana
Roxana Margine
Associate Professor
Department of Physics, Applied Physics and Astronomy
Binghamton University, State University of New York