[Openshmem-list] Effects of removing implicit finalize from the specification

Jeff Hammond jeff.science at gmail.com
Wed Mar 22 15:05:39 UTC 2017


Yeah, re-init or only finalize atexit(), like we did before, meaning
shmem_finalize is really just shmem_tool_sync :-)
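
For what it's worth, here is a minimal sketch of reference counting that permits re-initialization, replaying the FOO_init, FOO_finalize, BAR_init, BAR_finalize sequence quoted below. The names rc_init, rc_finalize, backend_starts and backend_stops are made up for illustration; the counters only stand in for whatever the real start-up and tear-down would do, and nothing here is taken from an actual implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real runtime start-up and tear-down. */
static int backend_starts = 0;  /* times the runtime was really started */
static int backend_stops = 0;   /* times the runtime was really stopped */
static int ref_count = 0;       /* init calls not yet matched by finalize */

/* The first outstanding call starts the runtime; later calls only count. */
void rc_init(void)
{
    if (ref_count++ == 0) {
        backend_starts++;
    }
}

/* The last matching call stops the runtime; earlier calls only count. */
void rc_finalize(void)
{
    assert(ref_count > 0);  /* an unmatched finalize is a caller bug here */
    if (--ref_count == 0) {
        backend_stops++;
    }
}

/* Replays FOO_init, FOO_finalize, BAR_init, BAR_finalize. */
void run_sequential_libraries(void)
{
    rc_init();      /* FOO_init: count 0 -> 1, real start */
    rc_finalize();  /* FOO_finalize: count 1 -> 0, real stop */
    rc_init();      /* BAR_init: count 0 -> 1, must start again */
    rc_finalize();  /* BAR_finalize: count 1 -> 0, real stop */
    printf("starts=%d stops=%d\n", backend_starts, backend_stops);
}
```

The point of the sketch is only that the count returning to zero cannot be a permanent "finalized" state if sequential libraries are to work.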

Jeff

On Wed, Mar 22, 2017 at 6:43 AM, Dinan, James <james.dinan at intel.com> wrote:

> Also, if you use reference counting, you also need re-initialization,
> because:
>
> FOO_init()
> FOO_finalize()
> BAR_init()
> BAR_finalize()
>
> This is likely, if FOO is an I/O library and BAR is some compute library.
>
>  ~Jim.
>
> From: Openshmem-list <openshmem-list-bounces at openshmem.org> on behalf of
> Jeff Hammond <jeff.science at gmail.com>
> Date: Tuesday, March 21, 2017 at 7:25 PM
> To: Bob Cernohous <bcernohous at cray.com>
> Cc: "openshmem-list at openshmem.org" <openshmem-list at openshmem.org>
>
> Subject: Re: [Openshmem-list] Effects of removing implicit finalize from
> the specification
>
> I agree wholeheartedly with Nick that first-call is wrong for the reason
> he gives.  If we expect one library to be able to init-final, then any
> number need to be allowed to do this, and this means we need last-call
> finalization.
>
> The other option is, of course, to copy MPI and force every library to
> check the state and potentially init and thus be on the hook to finalize as
> well, but this means that libraries must be popped in the order they are
> pushed, or else this happens:
>
>   FOO_init() --> includes call to shmem_init() --> actually initializes,
> knows it
>   BAR_init() --> includes call to shmem_init() --> no-op, knows it did not
> initialize
>   FOO_finalize() --> includes call to shmem_finalize() --> knows it
> initialized, so might assume it should finalize
>   BAR_finalize() --> includes call to shmem_finalize() --> if FOO
> finalized, cannot call SHMEM; if FOO didn't finalize, should BAR finalize?
>
> Obviously, we can generalize...
>
>   FOO_init() --> includes call to shmem_init() --> actually initializes,
> knows it
>   BAR_init() --> includes call to shmem_init() --> no-op, knows it did not
> initialize
>   GOO_init() --> includes call to shmem_init() --> no-op, knows it did
> not initialize
>   BAZ_init() --> includes call to shmem_init() --> no-op, knows it did not
> initialize
>   BAR_finalize() --> includes call to shmem_finalize() --> should I
> finalize?  I did not initialize...
>   BAZ_finalize() --> includes call to shmem_finalize() --> should I
> finalize?  I did not initialize...
>   FOO_finalize() --> includes call to shmem_finalize() --> should I
> finalize?  I initialized...
>   GOO_finalize() --> includes call to shmem_finalize() --> should I
> finalize?  I did not initialize...
>   // uh oh, nobody finalized
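
A rough sketch of the check-then-init ownership pattern traced above, modelling the branch where the library that initialized also finalizes. Everything here is hypothetical: runtime_up stands in for an is-initialized query, and lib_state, lib_init and lib_finalize are illustrative names, not calls from MPI or OpenSHMEM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime state (not a real SHMEM query). */
static bool runtime_up = false;

typedef struct {
    bool active;  /* library is between its init and finalize */
    bool owns;    /* this library performed the real initialization */
} lib_state;

/* Check-then-init: only the library that finds the runtime down owns it. */
void lib_init(lib_state *lib)
{
    lib->owns = !runtime_up;
    if (lib->owns) {
        runtime_up = true;
    }
    lib->active = true;
}

/* The owner tears the runtime down; everyone else leaves it alone. */
void lib_finalize(lib_state *lib)
{
    if (lib->owns) {
        runtime_up = false;
    }
    lib->active = false;
    lib->owns = false;
}

/* Returns true if BAR is still active after the runtime has gone away. */
bool bar_is_stranded_after_foo_finalize(void)
{
    lib_state foo = { false, false };
    lib_state bar = { false, false };

    lib_init(&foo);      /* really initializes, so foo.owns is true */
    lib_init(&bar);      /* no-op, so bar.owns is false */
    lib_finalize(&foo);  /* owner finalizes first: runtime goes down */

    return bar.active && !runtime_up;
}
```

With strictly nested (pushed/popped) ordering the owner would finalize last and BAR would never be stranded, which is exactly the ordering constraint described above.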
>
> The problem situation already occurs with Elemental (
> http://libelemental.org/).  If Elemental is called before MPI is
> initialized, Elemental takes ownership of MPI, which means that it
> initializes MPI and sets a flag that it should also finalize it.  However,
> that means that you have ugly termination because MPI_Finalize is called
> inside of a C++ dtor when the program exits, and this does not always go
> well (perhaps because Elemental needs to do more refcounting than it does
> already...).
>
> Jeff
>
> On Tue, Mar 21, 2017 at 2:41 PM, Bob Cernohous <bcernohous at cray.com>
> wrote:
>
>> In general, I think libraries with side effects like
>> initializing/finalizing SHMEM are problematic.  [It’s especially bad for
>> something like dmapp_init which takes a config parameter so neither FOO nor
>> BAR can count on controlling the initialized configuration.  Same thing, I
>> guess, if FOO or BAR were trying to set env’s internally before init.]
>>
>>
>>
>> But everything you say is true and if those are all the goals then
>> reference counting is fine.
>>
>>
>>
>> But there was, I believe, a customer request to support exit w/finalize.
>> I.e., if a process took an early exit, not a shmem_global_exit, wait for
>> other PEs to finalize/exit normally.
>>
>>
>>
>> Or, imagine FOO and BAR didn’t explicitly have finalize functions but
>> relied on atexit to finalize.
>>
>>
>>
>> If you want atexit(shmem_finalize) (or the current implicit finalize) to
>> wait and truly finalize, then it can’t just be decrementing a reference
>> count and executing no-op.
>>
>>
>>
>>
>>
>> *From:* Nicholas Park [mailto:nspark at computer.org]
>> *Sent:* Tuesday, March 21, 2017 4:20 PM
>> *To:* Naveen Ravichandrasekaran <nravi at cray.com>
>> *Cc:* Bob Cernohous <bcernohous at cray.com>; Kuehn, Jeff <jakuehn at lanl.gov>;
>> Jeff Hammond <jeff.science at gmail.com>; openshmem-list at openshmem.org
>>
>> *Subject:* Re: [Openshmem-list] Effects of removing implicit finalize
>> from the specification
>>
>>
>>
>> I'm very concerned about first-call semantics on shmem_finalize(). I
>> think this effectively breaks general-purpose library support for OpenSHMEM.
>>
>> Consider two libraries FOO and BAR that both build on OpenSHMEM.
>>
>> Claim: Library interoperability has two goals:
>>
>>   1) Libraries FOO and BAR can both initialize and finalize the OpenSHMEM
>> library
>>
>>   2) Users of FOO need not make OpenSHMEM calls and can interact solely
>> with FOO
>>
>>
>>
>> Under the proposed first-call semantic, Goal #1 is violated. A call
>> sequence of:
>>
>>   FOO_init() --> includes call to shmem_init() --> actually initializes
>>   BAR_init() --> includes call to shmem_init() --> no-op
>>   BAR_finalize() --> includes call to shmem_finalize() --> actually
>> finalizes
>>   FOO_finalize() --> includes call to shmem_finalize() --> no-op
>>
>> means that FOO_finalize() will fail if it calls *any* OpenSHMEM function
>> other than shmem_finalize(). It could non-trivially require calls to
>> shmem_free(), shmem_my_pe(), shmem_barrier_all().
>>
>> In contrast, the reference-counted semantic leads to valid behavior:
>>   FOO_init() --> includes call to shmem_init() --> actually initializes
>>   BAR_init() --> includes call to shmem_init() --> no-op
>>   BAR_finalize() --> includes call to shmem_finalize() --> no-op
>>   FOO_finalize() --> includes call to shmem_finalize() --> actually
>> finalizes
>>
>>
>>
>> Currently, Goal #2 is violated by the fact that calling shmem_init() more
>> than once is undefined. Each library must make the assumption that
>> OpenSHMEM is either initialized/finalized solely by them or external to
>> them. Only the latter is truly sane, but then means that a user interacting
>> solely with FOO (without loss of generality) has to make OpenSHMEM calls
>> (and may not even be aware that FOO relies on OpenSHMEM). I recognize we're
>> trying to solve that in this discussion, but I wanted to make this goal
>> explicit for library support.
>>
>> Nick
>>
>>
>>
>> On Tue, Mar 21, 2017 at 12:54 PM, Naveen Ravichandrasekaran <
>> nravi at cray.com> wrote:
>>
>> +1 for the first-call semantics – init/finalize on the first-call and
>> then no-op after that.
>>
>>
>>
>> -Naveen N Ravi.
>>
>>
>>
>> *From:* Bob Cernohous
>> *Sent:* Tuesday, March 21, 2017 11:09 AM
>> *To:* Kuehn, Jeff <jakuehn at lanl.gov>; Naveen Ravichandrasekaran <
>> nravi at cray.com>; Jeff Hammond <jeff.science at gmail.com>
>> *Cc:* openshmem-list at openshmem.org
>> *Subject:* RE: [Openshmem-list] Effects of removing implicit finalize
>> from the specification
>>
>>
>>
>> I would vote for first-call semantics.  The email example wasn’t
>> particularly interesting for atexit.  The only reason to want
>> atexit(shmem_finalize) is that there might be a code path that doesn’t
>> explicitly shmem_finalize.  I think you want atexit to finalize, not
>> decrement a reference count.  So init/finalize on first-call and no-op
>> after that?
>>
>>
>>
>>
>>
>> *From:* Openshmem-list [mailto:openshmem-list-bounces at openshmem.org
>> <openshmem-list-bounces at openshmem.org>] *On Behalf Of *Kuehn, Jeff
>> *Sent:* Tuesday, March 21, 2017 10:48 AM
>> *To:* Naveen Ravichandrasekaran <nravi at cray.com>; Jeff Hammond <
>> jeff.science at gmail.com>
>> *Cc:* openshmem-list at openshmem.org
>> *Subject:* Re: [Openshmem-list] Effects of removing implicit finalize
>> from the specification
>>
>>
>>
>> I have a very small concern with nested semantics vs first call semantics
>> in the case of init and finalize.  If one considers the cases of an extra
>> init or finalize and a missing init or finalize, it seems that the
>> observable side effects of those bugs are more local to the problem code,
>> and thus easier to debug in the case of first-call semantics, and more
>> non-local in the case of nested semantics. It’s a small concern, but what
>> do the rest of you think?
>>
>>
>>
>> Regards,
>>
>> Jeff
>>
>>
>>
>>
>>
>>
>>
>> On 3/20/17, 19:51, "Openshmem-list on behalf of Naveen
>> Ravichandrasekaran" <openshmem-list-bounces at openshmem.org on behalf of
>> nravi at cray.com> wrote:
>>
>>
>>
>> Hi Jeff,
>>
>>
>>
>> > We either need to allow multiple calls to initialize/finalize or we
>> > need to have is_initialized/is_finalized queries.  MPI does the latter
>> > but since it is utterly trivial to ref count in this case, we should
>> > just do better than MPI and do the former.
>>
>>
>>
>> My requirement is not to support multiple shmem_finalize calls. I was
>> merely stating the fact that a side effect of implicit finalize in the
>> current specification is that shmem_finalize can be used multiple times.
>>
>>
>>
>> What we really need is a semantic that either successfully finalizes or
>> is a no-op when the user uses atexit(shmem_finalize) in their application.
>>
>>
>>
>> shmem_init(); /* ref_count++ -> actual initialization */
>> shmem_init(); /* ref_count++ -> no-op */
>> shmem_finalize(); /* ref_count-- -> no-op */
>> shmem_finalize(); /* ref_count-- -> actual finalization */
>>
>>
>>
>> FWIU, the reference counter method would be useful for handling
>> multiple shmem_init and shmem_finalize calls, but it probably won’t
>> work for handling atexit(shmem_finalize).
>>
>>
>>
>> In this example, it is unclear how the reference counter works.
>>
>> int main(void) {
>>         shmem_init();           /* ref_count++ -> actual initialization */
>>         atexit(shmem_finalize);
>>         shmem_finalize();       /* ref_count-- -> actual finalization */
>>         return 0;               /* unclear what should happen here */
>> }
>>
>>
>>
>> Irrespective of the number of times shmem_init is used, we need to
>> successfully handle atexit(shmem_finalize). So, I would prefer adding
>> new APIs in SHMEM to query is_initialized/is_finalized and let the
>> users handle the atexit(shmem_finalize) scenario correctly.
>>
>>
>>
>> -Naveen N Ravi.
>>
>>
>>
>> *From:* Jeff Hammond [mailto:jeff.science at gmail.com
>> <jeff.science at gmail.com>]
>> *Sent:* Monday, March 20, 2017 4:01 PM
>> *To:* Naveen Ravichandrasekaran <nravi at cray.com>
>> *Cc:* openshmem-list at openshmem.org
>> *Subject:* Re: [Openshmem-list] Effects of removing implicit finalize
>> from the specification
>>
>>
>>
>> If we permit multiple calls to finalize then we should also permit
>> multiple calls to initialize.  This makes things easy for applications
>> that use libraries when the libraries do not know whether they should
>> call initialize or not.
>>
>>
>>
>> We either need to allow multiple calls to initialize/finalize or we need
>> to have is_initialized/is_finalized queries.  MPI does the latter but since
>> it is utterly trivial to ref count in this case, we should just do better
>> than MPI and do the former.
>>
>>
>>
>> We could add return codes on initialize/finalize to allow users to know
>> whether the calls were no-ops or not.  This does not break any existing
>> code because users do not have to assign return codes to variables.
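
One way the return-code idea might look, as a sketch only: the names counted_init, counted_finalize, CALL_WAS_NOOP and CALL_DID_WORK are invented here, and the counter stands in for the real initialization and finalization work.

```c
#include <assert.h>

/* Hypothetical return codes, invented for this sketch. */
enum { CALL_WAS_NOOP = 0, CALL_DID_WORK = 1 };

static int outstanding = 0;  /* init calls not yet matched by finalize */

/* Reports CALL_DID_WORK only when this call really initializes. */
int counted_init(void)
{
    return (outstanding++ == 0) ? CALL_DID_WORK : CALL_WAS_NOOP;
}

/* Reports CALL_DID_WORK only when this call really finalizes. */
int counted_finalize(void)
{
    assert(outstanding > 0);  /* an unmatched finalize is a caller bug here */
    return (--outstanding == 0) ? CALL_DID_WORK : CALL_WAS_NOOP;
}
```

A caller that does not care can keep writing counted_init(); and ignore the result, which is why existing code would not break.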
>>
>>
>>
>> Jeff
>>
>>
>>
>> On Mon, Mar 20, 2017 at 1:37 PM, Naveen Ravichandrasekaran <
>> nravi at cray.com> wrote:
>>
>> At present, the specification allows shmem_finalize to be called multiple
>> times. This seems to be defined behavior, and subsequent calls to finalize
>> are no-ops.
>>
>> The following example should pass as per the current specification:
>>
>> Example:1
>> int main(void) {
>>         shmem_init();
>>         shmem_finalize();
>>         shmem_finalize();
>>         return 0;
>> }
>>
>> In Example:1, we have three finalize operations (2 explicit + 1 implicit).
>> Even common SHMEM usage has two finalize operations (1 explicit +
>> 1 implicit). Although the specification is not clear about the above
>> usage, this is a side effect of having an implicit finalize at exit() or
>> at return from main.
>>
>> Example:2
>> int main(void) {
>>         shmem_init();
>>         atexit(shmem_finalize);
>>         return 0;
>> }
>>
>> During the F2F meeting, we decided to drop implicit finalize from the
>> specification. The argument was that users can explicitly call "atexit"
>> in the application to achieve the above behavior, as shown in Example:2.
>>
>> Example:3
>> int main(void) {
>>         shmem_init();
>>         atexit(shmem_finalize);
>>         shmem_finalize();
>>         return 0;
>> }
>>
>> FWIU, if we remove implicit finalize from the specification, Example:2 is
>> guaranteed to work but the behavior of Example:3 is undefined.
>>
>> To allow the explicit atexit usage, we need to either
>> 1. specify that multiple calls to finalize are no-ops, or
>> 2. add a new API, "shmem_finalized()", so that users can register a
>> wrapper with atexit that checks shmem_finalized() before calling
>> shmem_finalize().
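
A sketch of what option 2 could look like from the user's side, following the Example:3 pattern. Since shmem_finalized() is only a proposal in this thread, the code uses stubs: stub_finalized() plays the role of the proposed query and stub_finalize() the role of shmem_finalize(); finalize_once and example3_flow are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stubs standing in for the runtime; none of this is a real SHMEM call. */
static bool finalized = false;
static int real_finalizes = 0;  /* how many times finalization really ran */

static bool stub_finalized(void) { return finalized; }

static void stub_finalize(void)
{
    finalized = true;
    real_finalizes++;
}

/* Wrapper intended for atexit: finalize only if nobody has done so yet. */
void finalize_once(void)
{
    if (!stub_finalized()) {
        stub_finalize();
    }
}

/* Example:3 pattern: register the wrapper, then finalize explicitly. */
int example3_flow(void)
{
    atexit(finalize_once);  /* runs at exit and finds nothing left to do */
    stub_finalize();        /* the explicit finalize in the application */
    finalize_once();        /* same check the atexit handler will make */
    return real_finalizes;
}
```

Under these assumptions the explicit call does the work once, and the handler registered with atexit becomes a no-op.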
>>
>> Please let me know if this analysis looks valid.
>>
>> -Naveen N Ravi
>>  Cray Inc.
>>
>> _______________________________________________
>> Openshmem-list mailing list
>> Openshmem-list at openshmem.org
>> http://www.openshmem.org/mailman/listinfo/openshmem-list
>>
>>
>>
>>
>>
>> --
>>
>> Jeff Hammond
>> jeff.science at gmail.com
>> http://jeffhammond.github.io/
>>
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/
>
>


-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/