[Openshmem-list] shmem_tool_sync

Mon Mar 20 21:07:46 UTC 2017

The discussion of shmem_finalize made me wonder if it would not be helpful
to have an explicit call to tell any active tools to dump their logs.

For example, if I have a code like NWChem that goes through a series of
phases, and the odds of phase i+1 crashing are nontrivial, but I want to
know how phase i did with some tool, then I can call shmem_tool_sync
between the phases to ensure that my tool state is (1) persistent and (2)
in an analyzable state.
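
To make the usage pattern concrete, here is a rough sketch in C.  Note that
shmem_tool_sync is only the call being proposed here; it does not exist in
any OpenSHMEM specification.  Everything else in the sketch (tool_open,
tool_record, run_phases, and the single log file standing in for "tool
state") is a hypothetical stand-in so that the example runs without an
OpenSHMEM library.  A real version would presumably be collective across
PEs and would be implemented by the tool, not by the application.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a tool's buffered state: one stdio log file. */
static FILE *tool_log = NULL;

/* Hypothetical tool hook: start logging to `path`. Returns 0 on success. */
int tool_open(const char *path)
{
    tool_log = fopen(path, "w");
    return tool_log != NULL ? 0 : -1;
}

/* Hypothetical tool hook: append one event line to the in-memory buffer. */
void tool_record(const char *event)
{
    if (tool_log != NULL)
        fprintf(tool_log, "%s\n", event);
}

/*
 * PROPOSED call, not part of any OpenSHMEM specification.
 * Assumed contract: when it returns 0, everything the tool has recorded
 * so far has left the process.  This stub only calls fflush, which hands
 * the data to the OS; that survives a crash of the application, but a
 * real tool might also need fsync or a coordinated, aggregated write.
 */
int shmem_tool_sync(void)
{
    if (tool_log == NULL)
        return 0;                      /* no active tool: nothing to do */
    return fflush(tool_log) == 0 ? 0 : -1;
}

/* Run `nphases` phases, syncing tool state after each one. */
int run_phases(int nphases)
{
    assert(nphases >= 0);
    for (int i = 0; i < nphases; i++) {
        char event[64];
        /* ... the real work of phase i would go here ... */
        snprintf(event, sizeof event, "phase %d done", i);
        tool_record(event);
        /* If phase i+1 crashes, the record of phase i is already out. */
        if (shmem_tool_sync() != 0)
            return -1;
    }
    return 0;
}
```

The point of the pattern is that the application, which knows where the
phase boundaries are, decides when the sync happens, so the tool does not
have to guess.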

It really sucks if an app crashes in such a way that tool state is not
useable, particularly when said tool state would help figure out why the
app crashed.

One advantage of this call vs some hidden implementation is that the hidden
implementation is either asynchronous or probably runs more often than it
needs to.  For example, if I automatically dump logs every Nth collective,
then I might slow down the app noticeably.  On the other hand, if I have to
do the dump asynchronously, I cannot take advantage of I/O aggregation or
any other coordinated analysis.

It would be ideal if somebody from the TAU team could comment on this.

Jeff

-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/