Hi Samuel,
Apologies for not responding to your message earlier. I just wanted to confirm that, following your advice, Chathu has started running a coarser grid, which has significantly reduced the load on the file system. Thanks once again for your advice.
Search found 5 matches
- Wed Jul 31, 2019 3:21 pm
- Forum: Running the code
- Topic: How to reduce load on parallel file system
- Replies: 8
- Views: 9520
- Wed Jun 05, 2019 1:00 pm
- Forum: Running the code
- Topic: How to reduce load on parallel file system
- Replies: 8
- Views: 9520
Re: How to reduce load on parallel file system
Hi Samuel, Just to add to Chathu's comment, I've managed to get a little detail from the storage system regarding the load on the file system for a job that's currently active. As far as I can tell, the high IO (for this particular stage in the job) appears to be due to the MPI tasks (128 of them ...
- Mon Jun 03, 2019 9:43 am
- Forum: Running the code
- Topic: How to reduce load on parallel file system
- Replies: 8
- Views: 9520
Re: How to reduce load on parallel file system
Hi Samuel, "Those files are produced locally (i.e. each core within a node should be writing on its own scratch with no communication between nodes)." While the files are produced locally, they seem to be written to the same directory as "outdir", which in our case is a cluster-wide file system. It ...
- Wed May 29, 2019 1:39 pm
- Forum: Running the code
- Topic: How to reduce load on parallel file system
- Replies: 8
- Views: 9520
Re: How to reduce load on parallel file system
Hi Samuel, Thanks for your response. I suspect the issue is really the epb files, but given our walltime limit (48 hours) the user said they wanted to keep these in case the jobs don't reach the epw phase in time. I had suggested that they set etf_mem, but this causes the jobs to run out of RAM. I've ...
- Tue May 21, 2019 2:56 pm
- Forum: Running the code
- Topic: How to reduce load on parallel file system
- Replies: 8
- Views: 9520
How to reduce load on parallel file system
Hi, I'm supporting a user on our HPC facility running epw from QE 6.3. Unfortunately, the jobs the user is running are generating a very high load on our parallel file system (GPFS), to the extent that several (2-3) concurrent multi-node (3-10 nodes) jobs are causing the file system to become ...