
How to evaluate the memory needed for a computation?

Posted: Tue Dec 19, 2023 2:35 pm
by since2012
Dear all,

(To be honest, I posted the same topic two days ago, but I cannot find it now.)

When I perform a calculation for a 3D system (38 Wannier functions), the process always reports a memory error (specifically, that not enough memory is available).

My job is split into three steps. The first step is the Wannier interpolation. The second step calculates the EPC matrix elements. The third step calculates the gap-related properties.

When I run the third step with the default value of max_memlt, the code reports the following error:

Size of required memory per pool: ~= 22.8741 Gb

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Error in routine mem_size_eliashberg (1):
Size of required memory exceeds max_memlt
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


So I increased max_memlt to 32 GB and resubmitted the job with 256 cores (32 nodes, 8 cores and 256 GB per node). This seems to solve the problem above, but unfortunately the process then terminates with the following error:
Size of allocated memory per pool: ~= 22.8741 Gb

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 30 PID 14207 RUNNING AT cn022
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================
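For what it is worth, here is my rough per-node tally (the assumption that every MPI rank keeps its own full "per pool" allocation is mine, not something I found in the EPW documentation):

# Hypothetical per-node tally; the assumption that every MPI rank holds
# its own full "per pool" allocation is my guess, not from the EPW docs.
ranks_per_node = 8          # 32 nodes x 8 cores = 256 MPI ranks in total
mem_per_pool_gb = 22.8741   # value reported by mem_size_eliashberg
node_memory_gb = 256.0

total_gb = ranks_per_node * mem_per_pool_gb
print(f"Eliashberg arrays per node: {total_gb:.1f} GB of {node_memory_gb:.0f} GB")
# -> about 183 GB, which nominally fits, so I suspect the remaining working
#    arrays (e-ph matrix elements, overlaps, MPI buffers) push the true
#    footprint past the node limit and trigger the SIGKILL.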


Why is this happening? Can anyone give me some advice and help?

How can I estimate the memory a job needs?

For my job, setting nkf and nqf to 40*40*40 and 20*20*20 gives an excellent lambda compared with the result calculated via QE. In terms of computing resources, I have already used almost all the nodes available to me, so I cannot increase the number of nodes for this job. Which parameters can I change to solve my problem?
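To make the question concrete, this is the kind of back-of-the-envelope I would like to be able to do; the array inventory here is only my guess at what dominates, not EPW's actual bookkeeping in mem_size_eliashberg:

# Rough, hypothetical estimate of the memory needed per pool for the
# anisotropic Eliashberg step. I am only guessing that the dominant object
# is the set of e-ph matrix elements kept for k-points and bands inside the
# fsthick window; window_fraction and nbnd_fs are guesses to be tuned until
# the printed number matches what mem_size_eliashberg actually reports.
nkf = 40 * 40 * 40          # fine k-grid (nkf1 x nkf2 x nkf3)
nqf = 20 * 20 * 20          # fine q-grid (nqf1 x nqf2 x nqf3)
window_fraction = 0.10      # guessed fraction of k-points inside fsthick
nbnd_fs = 6                 # guessed number of bands inside the window
bytes_per_element = 8       # double-precision real

nkfs = int(nkf * window_fraction)
n_elements = nkfs * nqf * nbnd_fs * nbnd_fs
print(f"~{n_elements * bytes_per_element / 1024**3:.1f} GB for this one array")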

Any advice and comments are welcome.

Thanks and Best Regards!

Jianguo Si
-------------------------------------------------
Songshan Lake Materials Laboratory
Building A1, University Innovation Park, Songshan Lake, Dongguan, Guangdong, CHINA

Re: How to evaluate the memory needed for a computation?

Posted: Thu Dec 21, 2023 9:58 am
by LiuHD
Dear Si,
Thanks for raising this question! It often puzzles us as well.

I sincerely hope the problem can be resolved!

Best wishes,
Liu.

Re: How to evaluate the memory needed for a computation?

Posted: Fri Dec 22, 2023 11:14 pm
by hmori
Hi Jianguo,

I suppose you are working on the anisotropic Eliashberg calculation.

Please check the following points:
(1) Did you use mp_mesh_k? If not, please use "mp_mesh_k = .true."
(2) Which value of iverbosity did you set? You do not need "iverbosity = 2" to solve the ME equations, and it sometimes requires a large amount of memory; it is only used for writing additional files such as .cube or .frmsf files. First, try the default value of iverbosity.
(3) Can you make fsthick smaller? If you can complete the calculation on small grids, perform convergence tests for fsthick on those grids to find the bare minimum of fsthick; a rough scaling sketch follows after this list. (It may be almost impossible to perform the calculation in this case even if you take smaller grids.)
(4) We recently improved some of the anisotropic Eliashberg subroutines to reduce memory use in the latest version. If you are still struggling with the memory problem, try the latest version. Note that the prefix.ephmat files written by older versions are no longer compatible with the latest version: you can reuse prefix.epb and prefix.epmatwp, but you need to regenerate prefix.ephmat. If you have any questions about using the latest version, please feel free to ask.
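To give a rough feeling for how much points (1) and (3) can buy, here is a simplified scaling sketch; it is not the exact bookkeeping done in mem_size_eliashberg, and the factors are examples only:

# Very simplified illustration of how (1) and (3) change the memory estimate:
# the Fermi-window-restricted arrays scale roughly with the number of
# k-points kept, which shrinks with the IBZ reduction and a narrower fsthick.
# The reduction factors below are examples, not values for your crystal.
mem_now_gb = 22.8741        # what EPW currently reports per pool

ibz_factor = 1.0 / 8.0      # example gain from mp_mesh_k = .true.
fsthick_now_ev = 1.0
fsthick_test_ev = 0.3       # must be validated by a convergence test

estimate_gb = mem_now_gb * ibz_factor * (fsthick_test_ev / fsthick_now_ev)
print(f"very rough new estimate: {estimate_gb:.1f} GB per pool")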

Your system is quite large, so even trying all of these may not solve the memory problem.

Best regards,
Hitoshi

Re: How to evaluate the memory needed for a computation?

Posted: Wed Dec 27, 2023 2:33 am
by since2012
Dear Hitoshi,

Thanks for your kind reply.

(1) I did not use "mp_mesh_k = .true." before; I will resubmit the job with "mp_mesh_k = .true."

(2) Indeed, I used "iverbosity = 2" in my jobs because I want to obtain the lambda or gap distribution on the Fermi surface. I will first set it back to the default value and rerun the job.

(3) I set fsthick to 1 eV because there is a partly flat band around 1 eV that crosses the Fermi level. Since this partly flat band may make a non-negligible contribution to the EPC as well as to Tc, I would like to keep the value I used before. I also submitted a job with a smaller k-mesh for the nscf calculation, but the lambda after interpolation is rather large.

(4) The version I used is EPW v5.7. I will resubmit the job according to your suggestions above. If that does not work, I will try v5.8.

I will modify the parameters and perform corresponding calculations based on your suggestions as soon as possible.

Thanks again and best regards!
Jianguo