
the compilation shut down at Calculating kgmap

Posted: Thu Mar 31, 2022 1:16 pm
by zhaoyi zhu
The error message in the terminal is as follows:

Abort(70327566) on node 1 (rank 1 in comm 0): Fatal error in PMPI_Bcast: Message truncated, error stack:
PMPI_Bcast(431).............................: MPI_Bcast(buf=0x2b38710a36c0, count=45375, MPI_INTEGER, root=0, comm=MPI_COMM_WORLD) failed
PMPI_Bcast(417).............................: 
MPIDI_Bcast_intra_composition_alpha(305)....: 
MPIDI_POSIX_mpi_bcast(130)..................: 
MPIR_Bcast_intra_scatter_ring_allgather(135): 
(unknown)(): Message truncated

There is no error information in the EPW output or in the solo file in the EPW calculation directory.

The input and the stopped output files are attached.

Re: the compilation shut down at Calculating kgmap

Posted: Fri Apr 01, 2022 5:02 pm
by hlee
Dear zhaoyi zhu:

The header of your output file reports:
Parallel version (MPI & OpenMP), running on 128 processor cores
Number of MPI processes: 4
Threads/MPI process: 32

MPI processes distributed on 1 nodes
R & G space division: proc/nbgrp/npool/nimage = 4
Fft bands division: nmany = 1

It seems to me that you did the following:
export OMP_NUM_THREADS=32
mpirun (or srun or ibrun) -np 4 .../bin/epw.x -in epw.in ...
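
The header lines quoted above can also be read from an output file with a short script. The following is only a sketch: the function name `check_epw_header` is hypothetical, and it assumes the output contains the "Number of MPI processes:" and "Threads/MPI process:" lines in the form shown above.

```shell
# Sketch (hypothetical helper): read the parallel-run header of an output file
# and flag runs that use more than one OpenMP thread per MPI process.
check_epw_header() {
  out="$1"   # path to the output file (assumed to contain the header lines)
  # The value is the last whitespace-separated field of each header line.
  nmpi=$(awk '/Number of MPI processes:/ {print $NF; exit}' "$out")
  nthr=$(awk '/Threads\/MPI process:/ {print $NF; exit}' "$out")
  echo "MPI processes: ${nmpi}, threads per process: ${nthr}"
  # If the threads line is missing, treat the run as single-threaded.
  if [ "${nthr:-1}" -gt 1 ]; then
    echo "warning: more than one OpenMP thread per MPI process"
    return 1
  fi
  return 0
}
```

Calling `check_epw_header epw.out` (with `epw.out` standing in for your own output file name) prints the two values and returns a non-zero status when the thread count is larger than 1.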

The current and previous official versions of EPW do not support OpenMP parallelization; they support only k-pool MPI parallelization.

So using more than one OpenMP thread does not give a noticeable speedup and, more importantly, the number of k pools must equal the number of MPI tasks.

Thus, if you use 128 physical cores, you have to use the following MPI run command:
mpirun (or srun or ibrun) -np 128 .../bin/epw.x -nk 128 -in epw.in ...
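
The rule above (one OpenMP thread per task, and as many k pools as MPI tasks) can be sketched as a small shell helper. This is only an illustration: `epw_cmd` and its arguments are hypothetical names, `./epw.x` is a placeholder for the real path to your executable, and `mpirun` may have to be replaced by `srun` or `ibrun` on your machine. The helper only prints the command; it does not launch anything.

```shell
# Sketch (hypothetical helper): build a launch line in which the number of
# MPI tasks (-np) and the number of k pools (-nk) are always identical.
epw_cmd() {
  ncores="$1"    # number of physical cores = number of MPI tasks
  epw_bin="$2"   # path to epw.x (placeholder, set it to your installation)
  input="$3"     # EPW input file
  # OMP_NUM_THREADS=1 because OpenMP threads give no noticeable speedup here.
  echo "OMP_NUM_THREADS=1 mpirun -np ${ncores} ${epw_bin} -nk ${ncores} -in ${input}"
}

# Example: print the command for 128 cores.
epw_cmd 128 ./epw.x epw.in
```

Using a single variable for both `-np` and `-nk` makes it impossible for the two numbers to drift apart when the core count changes.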

Sincerely,

H. Lee