Mon Nov 23 16:03:13 EST 1992
From owner-mpi-lang@CS.UTK.EDU  Wed Dec 23 06:11:46 1992
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA08579; Wed, 23 Dec 92 06:11:46 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA10302; Wed, 23 Dec 92 06:11:35 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Wed, 23 Dec 1992 11:11:34 GMT
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from marge.meiko.com by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA10294; Wed, 23 Dec 92 06:11:30 -0500
Received: from hub.meiko.co.uk by marge.meiko.com with SMTP id AA03680
  (5.65c/IDA-1.4.4 for <mpi-lang@cs.utk.edu>); Wed, 23 Dec 1992 06:11:26 -0500
Received: from float.co.uk (float.meiko.co.uk) by hub.meiko.co.uk (4.1/SMI-4.1)
	id AA22108; Wed, 23 Dec 92 11:11:20 GMT
Date: Wed, 23 Dec 92 11:11:20 GMT
From: jim@meiko.co.uk (James Cownie)
Message-Id: <9212231111.AA22108@hub.meiko.co.uk>
Received: by float.co.uk (5.0/SMI-SVR4)
	id AA01691; Wed, 23 Dec 92 11:11:09 GMT
To: mpi-lang@cs.utk.edu
Cc: jim@meiko.co.uk
Subject: the mode argument
Content-Length: 4763

People,

Apologies if this issue has already been addressed. I am working from
the Supercomputer draft (ORNL/TM-12231 October 1992). Since there
appears to have been no mail in the language binding sub-group I
assume that that area remains the same.

Use of character strings for mode.
==================================

The mode argument to the communication routines is described
throughout the draft as a character string. I believe this is a BAD
IDEA for the following reasons (in no particular priority order) :-

Arguments against using strings
===============================
1) Passing strings will produce slower code
2) Passing strings will produce larger code
3) Passing strings makes a portable implementation more difficult
   because there is no standard way of passing strings from Fortran to
   C.
   

1) Code is slower.
The library routine must determine what the actual mode is. If the
mode argument is a string this should be implemented as a string
compare. This will normally require a function call (and probably more
than one), since the code will be something like
     char * mode;

     if (strcmp(mode,"blocking") == STRINGEQUAL)
     {}
     else if (strcmp(mode,"nonblocking") == STRINGEQUAL)
     {}
     else if (strcmp(mode,"synchronous") == STRINGEQUAL)
     {}
     else
         error(...)

This is hugely slower than
     int mode;

     switch (mode)
     {
         case MPI_BLOCKING:
             ...
             break;
         case MPI_NONBLOCKING:
             ...
             break;
         case MPI_SYNCHRONOUS:
             ...
             break;
         default:
             error(...)
     }

On a RISCy machine, even unpleasant code like

     char * mode;

     switch(*mode)
     {
         case 'b': /* Assume blocking     */
         case 'n': /* Assume non-blocking */
         case 's': /* Assume synchronous  */
     }

will be slower than the integer case above since loading the address
of a string to pass it normally takes two instructions, while loading
a small integer constant takes one. Also there's an extra store access
to pick up the first byte of the string (almost certainly from an area
which won't be in the cache), whereas the switch on the integer value
will already have the value in a register.
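
The two dispatch styles compared above can be put side by side as a minimal sketch. The token names and values here are hypothetical (the draft defines neither), and the helper function names are invented for illustration only.

```c
#include <string.h>

/* Hypothetical mode tokens; the draft does not define these values. */
enum { MPI_BLOCKING, MPI_NONBLOCKING, MPI_SYNCHRONOUS, MPI_MODE_INVALID };

/* String version: up to three strcmp calls before the mode is known. */
static int mode_from_string(const char *mode)
{
    if (strcmp(mode, "blocking") == 0)
        return MPI_BLOCKING;
    if (strcmp(mode, "nonblocking") == 0)
        return MPI_NONBLOCKING;
    if (strcmp(mode, "synchronous") == 0)
        return MPI_SYNCHRONOUS;
    return MPI_MODE_INVALID;
}

/* Integer version: one switch on a value that is already in a register. */
static int mode_from_int(int mode)
{
    switch (mode) {
    case MPI_BLOCKING:
    case MPI_NONBLOCKING:
    case MPI_SYNCHRONOUS:
        return mode;
    default:
        return MPI_MODE_INVALID;
    }
}
```

How large the difference is in practice depends on the compiler and machine; the sketch only shows where the extra calls and memory accesses come from.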


2) Code is larger.
Just because of all the extra strings which will be placed in the data
segment.

3) A portable implementation is harder.
A natural way to implement the library is to implement it in C, and
then provide a Fortran binding. This is made harder when strings are
passed as arguments, since the Fortran and C string conventions are
entirely different. In the general case this requires that the whole
string be copied when calling a C routine from Fortran and passing a
string as an argument. This will have a bad effect on latency.
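
To illustrate the copy described above, here is a sketch of the glue a C implementation might need behind a Fortran binding. It assumes the Fortran compiler passes a blank-padded CHARACTER argument together with a separate integer length, which is a common but non-standard convention; the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a blank-padded Fortran CHARACTER argument into a NUL-terminated
 * C buffer. Assumes the length arrives as a separate integer argument.
 * Returns the trimmed length, or -1 if the input is invalid or the
 * output buffer is too small.
 */
int fortran_to_c_string(const char *fstr, int flen, char *out, size_t outsz)
{
    if (flen < 0)
        return -1;

    /* Drop the trailing blanks that pad a Fortran string. */
    while (flen > 0 && fstr[flen - 1] == ' ')
        flen--;

    if ((size_t)flen + 1 > outsz)
        return -1;

    memcpy(out, fstr, (size_t)flen);
    out[flen] = '\0';
    return flen;
}
```

Every call that takes a string would pay for this scan and copy before the library proper even starts work.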

Arguments for using strings
===========================

1) User code is easier to understand
2) Additional possibilities are easier to add
3) There is no standard way of including PARAMETER definitions in
   Fortran 77.

1) User code is easier to understand
IMHO 
     MPI_RECV ( MPI_BLOCKING, ...);
is just as easy to understand as
     MPI_RECV ( "blocking", ...);

though I agree that
     MPI_RECV ( 0 , ...);
is extremely opaque.

The solution to this is NOT to define the actual values which are used
to represent the various tokens, but rather to specify only symbolic
names and the include file name. Portable programs must not then make
assumptions about the actual values used.

In ANSI C it would actually be possible to define the modes as an
enumeration type. In conjunction with appropriate prototype
definitions for the library functions this should cause type warnings
on usage like the last example. (Since the 0 is implicitly an int,
rather than an enum mpi_mode)       
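
A sketch of the enumeration approach follows. The type name comes from the text above; the constant names, their order, and the helper function are assumptions made for illustration. Note that C permits implicit conversion from int to an enumeration type, so whether a compiler actually warns about a bare 0 is a quality-of-implementation matter rather than something the language guarantees.

```c
/* Hypothetical declarations; names and values are illustrative only. */
enum mpi_mode { MPI_BLOCKING, MPI_NONBLOCKING, MPI_SYNCHRONOUS };

/* With this prototype in scope, the argument is declared as the enum. */
const char *mpi_mode_name(enum mpi_mode mode)
{
    switch (mode) {
    case MPI_BLOCKING:    return "blocking";
    case MPI_NONBLOCKING: return "nonblocking";
    case MPI_SYNCHRONOUS: return "synchronous";
    }
    return "unknown";   /* value outside the declared constants */
}
```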

2) Additional possibilities are easier to add
It is true that strings allow vendor options to be more securely added
(e.g. "meiko_fastsend") but 
      1) do we want to permit such things ?
	 Any programs which use these features will be neither
	 conformant nor portable. (Though of course we can't stop them).
      2) we have (at least) 2**32 possible modes anyway. If we reserve
         0 <= mode <= 256 for standard-specified use, and allow other
	 people to choose RANDOM other numbers, the chances of
	 collision are extremely low. (Of course we could recommend 
	 the dollar bill solution, which generates large guaranteed 
	 unique numbers at the cost of $1 for each number).
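
The reserved-range idea can be sketched in a few lines. The 0..256 bound is taken from the text; the vendor constant is a made-up value standing in for a randomly chosen number, and both macro names are hypothetical.

```c
/* Range reserved for the standard, as suggested in the text: 0..256. */
#define MPI_MODE_STANDARD_MAX 256u

/* Hypothetical vendor mode: an arbitrary large value picked at random. */
#define MEIKO_FASTSEND 0x5a3c91e7u

/* Nonzero when the mode lies in the range reserved for the standard. */
int mode_is_standard(unsigned int mode)
{
    return mode <= MPI_MODE_STANDARD_MAX;
}
```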

3) There is no standard way of including PARAMETER definitions in
   Fortran 77.
This is true and a pain. However Fortran 90 provides modules which
overcome this. I don't think this is a sufficiently strong argument
for using strings.

Thoughts ?
Feedback ?
Flames   ?

-- Jim
James Cownie 
Meiko Limited
650 Aztec West
Bristol BS12 4SD
England

Phone : +44 454 616171
FAX   : +44 454 618188
E-Mail: jim@meiko.co.uk or jim@meiko.com


From owner-mpi-lang@CS.UTK.EDU  Mon Jan  4 09:32:49 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA19838; Mon, 4 Jan 93 09:32:49 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA09250; Mon, 4 Jan 93 09:32:41 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Mon, 04 Jan 1993 14:32:40 GMT
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from marge.meiko.com by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA09242; Mon, 4 Jan 93 09:32:36 -0500
Received: from hub.meiko.co.uk by marge.meiko.com with SMTP id AA07241
  (5.65c/IDA-1.4.4); Mon, 4 Jan 1993 09:32:32 -0500
Received: from float.co.uk (float.meiko.co.uk) by hub.meiko.co.uk (4.1/SMI-4.1)
	id AA15874; Mon, 4 Jan 93 14:32:28 GMT
Date: Mon, 4 Jan 93 14:32:28 GMT
From: jim@meiko.co.uk (James Cownie)
Message-Id: <9301041432.AA15874@hub.meiko.co.uk>
Received: by float.co.uk (5.0/SMI-SVR4)
	id AA02690; Mon, 4 Jan 93 14:31:52 GMT
To: mpi-pt2pt@cs.utk.edu, mpi-lang@cs.utk.edu
Subject: Profilers etc.
Content-Length: 2528

Gentlepeople,

I have an implementation issue which I would like to raise in the MPI
forum. Since it is unclear where it fits into the sub-committee
structure, I have mailed both mpi-pt2pt and mpi-lang. Apologies to
those of you who receive this message twice.

Issue
=====
The major objective of MPI1 is to achieve portability of applications.
This has major benefits for us all (not least in legitimising and
therefore growing the MPP marketplace). 

One of the benefits which it would also be nice to achieve would be
the wide availability of different tools which support programming in
the MPI1 model. The most immediately obvious such tools (to me at
least !) are

1) HPF to Fortran + MPI1 translators
2) Performance monitoring/tuning tools
3) Debuggers

Support for the first is easy (since it just requires what we're
already doing). 

Portable support for the second is not so trivial, since the
collection of useful performance information is much more intrusive.

Portable support for the third is harder still, and I won't discuss it
further. 

Options
=======
We have various possible options which we can take.

1) Ignore the problem
   Provide no support for portable performance monitoring tools, and
   leave each tool provider with a large porting problem.

   I don't like this solution; it loses some of the benefit of the
   standard, which should be attracting people to build tools.

2) Document specific implementation hooks as part of MPI1. 
   In effect these would be callbacks from the library to profiling
   code which could then do whatever it liked.

3) As 2, but without REQUIRING that a conforming implementation
   provide the functions. They're there as a recommendation, rather
   than being mandatory. 


I think we should be concerned about this, and I'd like us at least to
make some recommendation. Personally I'd probably implement two
separate interfaces to the library, one of which provided the hooks
and the other of which didn't, so that you don't pay the cost of
checking the profiling hooks unless you ask for them.

Of course even if we do nothing it's not too hard to escape (using
horrible macros) in C, but Fortran doesn't always have macros, so a
properly specified internal solution is definitely preferable.
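
One way to picture options 2 and 3 together with the two-interface idea is the sketch below. Every name in it is hypothetical (no such hook interface is defined in the draft), and the transport is a stub so that the example stands on its own.

```c
#include <stddef.h>

/* Hypothetical hook type: called by the library on each profiled send. */
typedef void (*mpi_prof_hook)(const char *routine, size_t nbytes);

static mpi_prof_hook send_hook = NULL;   /* NULL means profiling is off */

/* Tool-facing call to install (or clear) the callback. */
void mpi_prof_set_send_hook(mpi_prof_hook hook)
{
    send_hook = hook;
}

/* Stand-in for the real transport; here it only reports success. */
static int raw_send(const void *buf, size_t nbytes)
{
    (void)buf;
    (void)nbytes;
    return 0;
}

/* Plain entry point: no hook check, so no profiling overhead. */
int mpi_send(const void *buf, size_t nbytes)
{
    return raw_send(buf, nbytes);
}

/* Profiled entry point: same semantics, plus the callback. */
int mpi_send_profiled(const void *buf, size_t nbytes)
{
    if (send_hook != NULL)
        send_hook("mpi_send", nbytes);
    return raw_send(buf, nbytes);
}

/* Example tool callback: accumulates the number of bytes sent. */
static size_t bytes_seen = 0;

static void count_bytes(const char *routine, size_t nbytes)
{
    (void)routine;
    bytes_seen += nbytes;
}
```

A tool would install `count_bytes` once and link the application against the profiled entry points; applications that never ask for profiling keep calling the plain ones.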

Thoughts ???
Flame me at Dallas. I'm travelling tomorrow. When are we having a
meeting in Europe ???

-- Jim
James Cownie 
Meiko Limited
650 Aztec West
Bristol BS12 4SD
England

Phone : +44 454 616171
FAX   : +44 454 618188
E-Mail: jim@meiko.co.uk or jim@meiko.com

From owner-mpi-lang@CS.UTK.EDU  Wed Feb  3 11:48:30 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA16901; Wed, 3 Feb 93 11:48:30 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA07042; Wed, 3 Feb 93 11:48:05 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Wed, 3 Feb 1993 11:48:04 EST
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from NA-GW.CS.YALE.EDU by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA07034; Wed, 3 Feb 93 11:48:03 -0500
Received: from YOGI.NA.CS.YALE.EDU by CASPER.NA.CS.YALE.EDU via SMTP; Wed, 3 Feb 1993 11:47:59 -0500
Received: by YOGI.NA.CS.YALE.EDU (Sendmail-5.65c/res.client.cf-3.5)
	id AA00312; Wed, 3 Feb 1993 11:47:56 -0500
Date: Wed, 3 Feb 1993 11:47:56 -0500
From: berryman-harry@CS.YALE.EDU (Harry Berryman)
Message-Id: <199302031647.AA00312@YOGI.NA.CS.YALE.EDU>
To: mpi-lang@cs.utk.edu
Subject: Stirring up trouble


Ladies and Gentlemen,

This committee has been (appropriately) quiet the last several months.
It is, however, time to make some noise. I submit the following not 
for your approval, but to stimulate the discussion. I will be disappointed 
if much of what I propose lives as long as the committee draft proposal. 

Although the comments pertain to C and F77, we might also consider 
other languages such as C++, Fortran 90, Ada, and ML.

-----------------------------------------------------------------------------

This committee exists to ensure that the design of MPI is consistent with 
the standards of the target languages (Fortran 77 and ANSI-C), and that the 
two interfaces are consistent with each other. I view our role to be 
the "Style Police" of the MPI committee. It is therefore appropriate to 
suggest to the committee a list of pitfalls to avoid.

1) Strings have no place in the parameter lists of MPI calls. 
The internal representation of strings is different in F77 and C. Using 
strings in the interface can only cause trouble. 

2) The availability of macros in Fortran cannot be assumed. 
Although most Fortran compilers allow the use of cpp, the Fortran 77 
standard does not require this. (Some of the functionality of macros
can be had by using "parameters" in F77, but this only allows the binding
of constants to symbols.)

3) As much as possible, the MPI names should be the same in both F77 and C.

4) Identifiers (e.g., function names) must be valid both in C and F77.
Fortran is far more restrictive on this than C is. The F77 standard 
dictates that variables, function names, and subroutine names, be 
distinct in the first six characters. 

5) Using integers in Fortran as handles to keep track of structures
   and pointers to structures is consistent with common F77 programming style
   and the F77 standard. 
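
As an illustration of point 5, the C side of a library might keep a table that maps small integers to its internal pointers, so that Fortran callers only ever hold an INTEGER. This is a sketch under that assumption; the table size and all function names are hypothetical.

```c
#include <stddef.h>

#define MAX_HANDLES 64

/* Table owned by the C side; Fortran only ever sees the index. */
static void *handle_table[MAX_HANDLES];

/* Store a pointer and return a 1-based integer handle, or 0 on failure. */
int handle_alloc(void *object)
{
    int i;

    if (object == NULL)
        return 0;
    for (i = 0; i < MAX_HANDLES; i++) {
        if (handle_table[i] == NULL) {
            handle_table[i] = object;
            return i + 1;
        }
    }
    return 0;   /* table full */
}

/* Translate a handle back to its pointer; NULL for an invalid handle. */
void *handle_lookup(int handle)
{
    if (handle < 1 || handle > MAX_HANDLES)
        return NULL;
    return handle_table[handle - 1];
}

/* Release a handle so its slot can be reused. */
void handle_free(int handle)
{
    if (handle >= 1 && handle <= MAX_HANDLES)
        handle_table[handle - 1] = NULL;
}
```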


Any Other No-Nos for the committee as a whole?


-scott berryman 
Chair, Language Binding 

Yale University Computer Science Department
and
NASA Langley Research Center
From owner-mpi-lang@CS.UTK.EDU  Wed Feb  3 12:00:25 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA17440; Wed, 3 Feb 93 12:00:25 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA07728; Wed, 3 Feb 93 12:00:11 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Wed, 3 Feb 1993 12:00:10 EST
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from marge.meiko.com by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA07710; Wed, 3 Feb 93 12:00:05 -0500
Received: from hub.meiko.co.uk by marge.meiko.com with SMTP id AA18122
  (5.65c/IDA-1.4.4 for <mpi-lang@cs.utk.edu>); Wed, 3 Feb 1993 12:00:00 -0500
Received: from float.co.uk (float.meiko.co.uk) by hub.meiko.co.uk (4.1/SMI-4.1)
	id AA28075; Wed, 3 Feb 93 16:59:49 GMT
Date: Wed, 3 Feb 93 16:59:48 GMT
From: jim@meiko.co.uk (James Cownie)
Message-Id: <9302031659.AA28075@hub.meiko.co.uk>
Received: by float.co.uk (5.0/SMI-SVR4)
	id AA03458; Wed, 3 Feb 93 16:58:10 GMT
To: berryman-harry@CS.YALE.EDU
Cc: mpi-lang@cs.utk.edu
In-Reply-To: Harry Berryman's message of Wed, 3 Feb 1993 11:47:56 -0500 <199302031647.AA00312@YOGI.NA.CS.YALE.EDU>
Subject: Stirring up trouble
Content-Length: 946

> Any Other No-Nos for the committee as a whole?

6) The underscore character (`_') is NOT included in the FORTRAN 77
   character set.

7) In a standard conforming FORTRAN 77 program a subprogram only ever
   has one type signature for its arguments.

We should also concern ourselves somewhat with Fortran 90. A 
FORTRAN 77 binding (though compatible with Fortran 90) is NOT what is
really required for a Fortran 90 binding. In particular Fortran 90
allows various options which may be useful (optional arguments, named
arguments), and provides whole new areas of complexity (POINTERs,
structures etc.) which will not have been addressed by the FORTRAN 77
binding. 

-- Jim
James Cownie 
Meiko Limited			Meiko Inc.
650 Aztec West			Reservoir Place
Bristol BS12 4SD		1601 Trapelo Road
England				Waltham
				MA 02154

Phone : +44 454 616171		+1 617 890 7676
FAX   : +44 454 618188		+1 617 890 5042
E-Mail: jim@meiko.co.uk   or    jim@meiko.com

From owner-mpi-lang@CS.UTK.EDU  Wed Feb  3 14:28:49 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA21025; Wed, 3 Feb 93 14:28:49 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA14904; Wed, 3 Feb 93 14:28:29 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Wed, 3 Feb 1993 14:28:27 EST
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from NA-GW.CS.YALE.EDU by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA14896; Wed, 3 Feb 93 14:28:26 -0500
Received: from YOGI.NA.CS.YALE.EDU by CASPER.NA.CS.YALE.EDU via SMTP; Wed, 3 Feb 1993 14:28:24 -0500
Received: by YOGI.NA.CS.YALE.EDU (Sendmail-5.65c/res.client.cf-3.5)
	id AA00671; Wed, 3 Feb 1993 14:28:23 -0500
Date: Wed, 3 Feb 1993 14:28:23 -0500
From: berryman-harry@CS.YALE.EDU (Harry Berryman)
Message-Id: <199302031928.AA00671@YOGI.NA.CS.YALE.EDU>
To: mpi-lang@cs.utk.edu
Subject: Optional arguments

I'm of the opinion that optional arguments should be discounted because 
F77 doesn't support them, and we want all of the interfaces to be as
much alike as possible. 

Unfortunately, making the interface consistent with F77 and C is pretty 
restrictive. Do we want to keep this as a general principle?

-scott berryman
chairthing
From owner-mpi-lang@CS.UTK.EDU  Wed Feb  3 15:12:38 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA22300; Wed, 3 Feb 93 15:12:38 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA17335; Wed, 3 Feb 93 15:12:21 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Wed, 3 Feb 1993 15:12:20 EST
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from timbuk.cray.com by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA17327; Wed, 3 Feb 93 15:12:18 -0500
Received: from teak18.cray.com by cray.com (4.1/CRI-MX 2.10)
	id AA03650; Wed, 3 Feb 93 14:12:15 CST
Received: by teak18.cray.com
	id AA05975; 4.1/CRI-5.6; Wed, 3 Feb 93 14:12:10 CST
From: par@teak.cray.com (Peter Rigsbee)
Message-Id: <9302032012.AA05975@teak18.cray.com>
Subject: Re:  Optional arguments
To: mpi-lang@cs.utk.edu
Date: Wed, 3 Feb 93 14:12:05 CST
X-Mailer: ELM [version 2.3 PL11b-CRI]

Scott Berryman writes:
> 
> Unfortunately, making the interface consistent with F77 and C is pretty 
> restrictive. Do we want to keep this as a general principle?
> 

It seems to me that most of the restrictions affect the F77
interfaces.  C seems pretty flexible, especially with "void *" in ANSI
C.  (I don't have your other mail messages right at hand, so don't remember 
specifics.)

Adhering to the F77 standard has its advantages, not the least of which
is that it is easy to define and you can make a good argument for it.
Unfortunately, I think this will have us end up with the same kinds of
cryptic function names that are found in other portable libraries.  And
if we do end up with lots of subtle variations on send and receive, trying
to remember the difference between SNPAHP and SNPHAP (choosing two strings
at random ;-) will get pretty difficult.

Pragmatically, though, it is my sense that most of the Fortran 77 compilers
available on the market, especially on newer systems, have gone beyond the
standard in a number of areas (especially names).  Given the competitive 
pressures to compile existing code, I think it would be hard for a hardware 
or software vendor today to offer, for example, a Fortran compiler that 
restricted function names to the 6 character limit.

So I think it would be more useful if we were to propose, at least for 
Fortran 77, interfaces that matched this more realistic "standard".  Restrict 
things where true portability will differ, not just where the formal standard 
is limiting.  Another way to look at this is that if the interfaces must
adhere to the lowest common denominator, then we should define this based
on the set of likely target systems, and not simply on the standard.  Then 
let people (members of the MPI group, people who review the spec, etc.) come 
back with specific arguments about where and why particular aspects are in 
fact non-portable.

Some areas where, as a user, I see restrictions as being unnecessary might 
include:
	- maximum length of name (32? 16?)
	- use of underscores in name
	- an argument can only be of one type

I think the net result would be an interface that would be more easily
understood and more usable.

	- Peter Rigsbee
From owner-mpi-lang@CS.UTK.EDU  Mon Feb  8 14:14:34 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA04248; Mon, 8 Feb 93 14:14:34 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA08480; Mon, 8 Feb 93 14:14:21 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Mon, 8 Feb 1993 14:14:20 EST
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from NA-GW.CS.YALE.EDU by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA08472; Mon, 8 Feb 93 14:14:18 -0500
Received: from YOGI.NA.CS.YALE.EDU by CASPER.NA.CS.YALE.EDU via SMTP; Mon, 8 Feb 1993 14:14:15 -0500
Received: by YOGI.NA.CS.YALE.EDU (Sendmail-5.65c/res.client.cf-3.5)
	id AA11866; Mon, 8 Feb 1993 14:14:12 -0500
Date: Mon, 8 Feb 1993 14:14:12 -0500
From: berryman-harry@CS.YALE.EDU (Harry Berryman)
Message-Id: <199302081914.AA11866@YOGI.NA.CS.YALE.EDU>
To: mpi-lang@CS.UTK.EDU
Subject: Why not both?


I also tend to chafe under the restriction of the F77 variable name
requirements. But, on the other hand, I'd hate to have everyone hate
us for not being consistent with the standard. (BTW the ANSI C standard
is 32 chars, counting an implied leading underscore.) Perhaps a compromise
would be to have the standard written in terms of a 31 char limit, but
supply another interface which was consistent with the F77 standard and
functionally equivalent. Obviously this would double the number of routines 
in the standard.

-scott
From owner-mpi-lang@CS.UTK.EDU  Mon Feb  8 14:40:36 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA04760; Mon, 8 Feb 93 14:40:36 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA09906; Mon, 8 Feb 93 14:40:26 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Mon, 8 Feb 1993 14:40:25 EST
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from ssdintel.ssd.intel.com by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA09887; Mon, 8 Feb 93 14:40:22 -0500
Received: from tualatin.SSD.intel.com by SSD.intel.com (4.1/SMI-4.1)
	id AA05139; Mon, 8 Feb 93 11:40:15 PST
Date: Mon, 8 Feb 93 11:40:15 PST
Message-Id: <9302081940.AA05139@SSD.intel.com>
Received: by tualatin.SSD.intel.com (4.1/SMI-4.0)
	id AA00559; Mon, 8 Feb 93 11:39:51 PST
From: Bob Knighten <knighten@SSD.intel.com>
Sender: knighten@SSD.intel.com
To: berryman-harry@CS.YALE.EDU
Cc: mpi-lang@CS.UTK.EDU
Subject: Re: Why not both?
In-Reply-To: <199302081914.AA11866@YOGI.NA.CS.YALE.EDU>
References: <199302081914.AA11866@YOGI.NA.CS.YALE.EDU>
Reply-To: knighten@SSD.intel.com (Bob Knighten)

The P1003.9 standard (POSIX FORTRAN 77 Language Interfaces - Part 1: Binding
for System API), which specifies an F77 binding to the POSIX system
interfaces, violates the FORTRAN 77 standard in only one place: name length.
There was no significant opposition because of this.

-- Bob

Robert L. Knighten	             | knighten@ssd.intel.com
Intel Supercomputer Systems Division | 
15201 N.W. Greenbrier Pkwy.	     | (503) 629-4315
Beaverton, Oregon  97006	     | (503) 629-9147 [FAX]
From owner-mpi-lang@CS.UTK.EDU  Wed Feb 10 16:46:22 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA04990; Wed, 10 Feb 93 16:46:22 -0500
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA24116; Wed, 10 Feb 93 16:46:04 -0500
X-Resent-To: mpi-lang@CS.UTK.EDU ; Wed, 10 Feb 1993 16:46:03 EST
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from ssdintel.ssd.intel.com by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK)
	id AA24080; Wed, 10 Feb 93 16:45:38 -0500
Received: from tualatin.SSD.intel.com by SSD.intel.com (4.1/SMI-4.1)
	id AA14891; Wed, 10 Feb 93 13:45:36 PST
Date: Wed, 10 Feb 93 13:45:36 PST
Message-Id: <9302102145.AA14891@SSD.intel.com>
Received: by tualatin.SSD.intel.com (4.1/SMI-4.0)
	id AA03909; Wed, 10 Feb 93 13:45:33 PST
From: Bob Knighten <knighten@SSD.intel.com>
Sender: knighten@SSD.intel.com
To: mpi-formal@cs.utk.edu, mpi-lang@cs.utk.edu
Subject: POSIX LIS
Reply-To: knighten@SSD.intel.com (Bob Knighten)

As part of the POSIX work a "Draft TCOS-SSC Technical Report -- Programming
Language Independent Specification Methods" was written.  This has served as
the basis for the language independent specification of the various system
interfaces that have been developed.  This is *NOT* a formal specification
method, but has as its purpose "to assist and coordinate the development of
functional specifications and language bindings by defining an abstract
model, and providing guidelines for the use of that model in the
development of new functional specifications, the derivation of a base
standard from an existing language binding, and the development of new
language bindings to a functional specification."

"The model is primarily intended for use in developing language-independent
specifications for operating system and related services, and language
bindings for procedural programming languages."

[The quotation is from the Scope and Purpose of the report.]

This guide was never completely finished and Paul Rabin (OSF), the
principal author, recommends that it be used in conjunction with the
P1003.1LIS which provides a very large example.

Paul Rabin expects that some extensions to the current guide will be
necessary for MPI, just as extensions will be necessary for the POSIX
Real-Time and Threads work.  He is interested in working with us to develop
common extensions.

I can provide copies of the POSIX LIS and both the P1003.1 LIS and the
P1003.16 C binding to the P1003.1 LIS as well.  [P1003.1LIS is about 380
pages and P1003.16 is about 300 pages so I don't want to drop these books
on people unless they are actually desired.]

A formal specification of MPI is quite desirable, but I am doubtful that we
can achieve it in the time we have available.  A language independent
specification of the sort developed within POSIX is, I believe, essential
to provide the common base for all of the language bindings we wish to
provide.  

-- Bob

Robert L. Knighten	             | knighten@ssd.intel.com
Intel Supercomputer Systems Division | 
15201 N.W. Greenbrier Pkwy.	     | (503) 629-4315
Beaverton, Oregon  97006	     | (503) 629-9147 [FAX]
From owner-mpi-lang@CS.UTK.EDU  Fri Apr  9 13:02:00 1993
Received: from CS.UTK.EDU by surfer.EPM.ORNL.GOV (5.61/1.34)
	id AA00637; Fri, 9 Apr 93 13:02:00 -0400
Received: from localhost by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
	id AA28148; Fri, 9 Apr 93 13:01:47 -0400
X-Resent-To: mpi-lang@CS.UTK.EDU ; Fri, 9 Apr 1993 13:01:45 EDT
Errors-To: owner-mpi-lang@CS.UTK.EDU
Received: from ocfmail.ocf.llnl.gov by CS.UTK.EDU with SMTP (5.61+IDA+UTK-930125/2.8s-UTK)
	id AA28093; Fri, 9 Apr 93 13:00:56 -0400
Received: by ocfmail.ocf.llnl.gov (4.1/SMI-4.0)
	id AA25915; Fri, 9 Apr 93 10:00:54 PDT
Date: Fri, 9 Apr 93 10:00:54 PDT
From: nessett@ocfmail.ocf.llnl.gov (Danny Nessett)
Message-Id: <9304091700.AA25915@ocfmail.ocf.llnl.gov>
To: mpi-lang@cs.utk.edu, mpi-pt2pt@cs.utk.edu
Subject: cross-language support - a proposal (long)
Cc: nessett@ocfmail.ocf.llnl.gov


I am cross posting this message to the point-to-point list, since it contains
a proposed modification to the point-to-point proposal. Those interested in
point-to-point but not language binding issues should skip to section 3
and evaluate whether the changes are acceptable.

1. The Problem
--------------

At the recent MPI meeting in Dallas (31 March - 2 April), the language binding
subcommittee proposed that the MPI standard make no provision for
interoperability between MPI-based programs written in different programming
languages. I objected to that suggestion and volunteered to develop a proposal
that would allow such language interoperability. My objections were and are
based on the following considerations :

  o  MPI is a multi-use standard, being developed for homogeneous multi-
     processors, homogeneous computer clusters and heterogeneous distributed
     systems. The objective is to provide a message passing interface that
     can be used in all of these environments. In addition, there is an
     objective to promote portability of MPI-based codes from one environment
     to another.

     Correct operation of MPI-based codes in a heterogeneous environment
     requires the MPI implementation to accommodate different formats
     for the communicated data. As a side effect, programs written in
     different programming languages will interoperate as long as the data
     types are conceptually the same. For example, REALs in Fortran and
     floats in C are conceptually the same. Unless there is a requirement
     that MPI support cross-language interoperability, some programs will
     work in heterogeneous environments, but not work in homogeneous
     environments. This will give the appearance that MPI is poorly designed.

  o  MPI can be used in at least two different ways. It can be used for
     intercommunication between application executables that are aware
     (from the programmer's point of view) of each other's internal data
     structures and algorithms. It can also be used to provide services,
     much like a library provides services, in which the internal data
     structures and algorithms are not visible to the service client.
     In this second class of use, MPI acts as a generic service interface,
     supporting a wide variety of applications. Commonly, this generic service
     interface is known as client/server.

     Traditionally, libraries define their interface data structures and
     procedure prototypes in a specific programming language. Some libraries
     provide multiple interface definitions, one for each particular
     programming language they support. However, when using a generic
     service interface to access what is the equivalent of library services,
     this approach becomes problematic. The MPI interface is itself specified
     in terms of one or more programming language bindings. There is no
     useful way to specify another language binding for the client/server
     interface.

     Consequently, another approach is used for the client/server interface,
     which is programming language independent. Generally, server writers
     specify an interface in terms of a message parameter that represents
     the function or procedure to execute and in conjunction with a particular
     value of that parameter, specify a set of parameters that are the
     function's or procedure's arguments. Note that with a message passing
     based generic service interface, such as MPI (as opposed to a remote
     procedure call generic service interface), the service call is not
     constrained to obey function or procedure semantics, since the client is
     not obliged to block while the server provides the service. However,
     that is an aside.

     Since the server's interface is specified in terms of MPI data types,
     rather than in terms of the data types of the programming language used
     to write the server, there is a complete decoupling of the programming
     language used to write a server from the programming language used to
     write the client (and vice versa). The server can use one MPI language
     binding to offer service and the client can use another MPI language
     binding to access those services.

     An example may clarify this somewhat. Suppose I wish to write a server
     that provides a matrix manipulation service. I define my interface as
     follows :

       first parameter : operation to be performed (integer) -

			1 = matrix sum

			2 = matrix multiply

			3 = matrix inversion

			4 = matrix transposition

			5 = return result

       second parameter: number of rows (integer) in matrix
                         (or matrices)

       third parameter : number of columns (integer) in matrix
                         (or matrices)

       fourth parameter: first matrix for all operations (single precision
			  real array, row major order)

       fifth parameter : second matrix for operations 1 and 2 (single
			  precision real array, row major order)

     For this interface, the operation and parameters will be specified as
     MPI data types. For a C or C++ language binding of MPI, these data types
     will be "int" and "float". For a Fortran language binding of MPI, these
     data types will be "INTEGER" and "REAL". This means there really is no
     way to prevent cross language interactions, other than to provide
     a "caveat emptor" warning to the programmer. Some programmers will
     not read the manual that closely; will implement a cross language
     service interface; will use it on various machines with no problems;
     and finally will curse the MPI implementors when it doesn't work on
     other machines. This will reduce confidence in MPI as a standard
     message passing interface.
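
The matrix service interface described above might look as follows on the server side. This is a sketch only: the operation codes come from the parameter list, while the struct layout, the names, and the choice to implement just sum and transposition are assumptions for illustration. It shows request dispatch, not message transport.

```c
/* Operation codes from the interface description above. */
enum matrix_op {
    OP_SUM = 1, OP_MULTIPLY = 2, OP_INVERT = 3, OP_TRANSPOSE = 4, OP_RESULT = 5
};

/* Hypothetical in-memory form of one request; matrices are row major. */
struct matrix_request {
    int op;            /* first parameter: operation to perform */
    int rows;          /* second parameter: number of rows */
    int cols;          /* third parameter: number of columns */
    const float *a;    /* fourth parameter: first matrix */
    const float *b;    /* fifth parameter: second matrix (ops 1 and 2) */
};

/* Serve the two simplest operations; returns 0 on success, -1 otherwise. */
int matrix_serve(const struct matrix_request *req, float *out)
{
    int i, j;

    switch (req->op) {
    case OP_SUM:
        for (i = 0; i < req->rows * req->cols; i++)
            out[i] = req->a[i] + req->b[i];
        return 0;
    case OP_TRANSPOSE:
        for (i = 0; i < req->rows; i++)
            for (j = 0; j < req->cols; j++)
                out[j * req->rows + i] = req->a[i * req->cols + j];
        return 0;
    default:
        return -1;   /* multiply, invert and result are not sketched here */
    }
}
```

Nothing in the request says which language produced it, which is the point being made: a Fortran client could fill in the same five fields using INTEGER and REAL.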

     Note the property described in the previous bullet. In a heterogeneous
     environment, even if both client and server are written in the same
     language, the MPI implementation must accommodate conversion between
     different ranges of values for a particular type.  For example, if
     one machine represents integers as 32 bit quantities, while another
     represents them as 64 bit quantities, sending an integer parameter
     from the second to the first requires a check to ensure the 64 bit
     value can be represented as a 32 bit value (e.g., that 100,000
     represented as a 64 bit number fits into a 32 bit representation).
     This check also allows a client written in one language (say, C) to call
     a server written in another language (say, Fortran).
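
The range check described above can be sketched in C as follows. This is
only an illustration under stated assumptions: the helper name
narrow_int64_to_int32 is hypothetical and is not part of any MPI draft.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of any MPI draft): narrow a 64-bit integer
 * to 32 bits. Returns true and stores the narrowed value if it fits;
 * returns false and leaves *narrow untouched if it does not. */
bool narrow_int64_to_int32(int64_t wide, int32_t *narrow)
{
    if (wide < INT32_MIN || wide > INT32_MAX) {
        return false;            /* value cannot be represented in 32 bits */
    }
    *narrow = (int32_t)wide;     /* safe: range was checked above */
    return true;
}
```

With this sketch, 100,000 passes the check, while a value such as
5,000,000,000 is reported as unrepresentable.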

2. My view of the issues
------------------------

Below I summarize my view of the issues with regard to cross-language MPI
service. I am sure that there are other issues that I have not thought of
and welcome others to contribute them to the discussion.

  o  As I have stated above, MPI is a multi-use standard being targeted for
     homogeneous multi-processors, homogeneous computer clusters and
     heterogeneous distributed systems. Vendor and user communities
     representing each of these environments are participating in
     the MPI standardization process. This invariably leads to a conflict
     of objectives. For example, the community most interested in homogeneous
     multi-processors does not want to sacrifice the performance of
     parallel programs running on these machines in order to accommodate
     heterogeneous distributed processing or cross-language MPI support.
     Conversely, those interested in heterogeneous distributed systems
     don't want to constrain the MPI standard in a way that severely
     limits its applicability in those environments. Those interested in
     homogeneous computer clusters are probably somewhere in between
     with their objectives.

     Consequently, if the MPI standard is to provide cross-language support,
     it should do so in a way that doesn't penalize the performance of
     programs running on homogeneous multi-processors, while at the same
     time minimizing the implementation effort for cross-language and
     heterogeneous distributed system support. Also, programmers who intend
     to use only one language should not have to think about cross-language
     issues.

  o  Cross-language support raises the question of translation between data
     values that are of a fundamentally different type (as opposed to being of
     a different "kind" in the Fortran 90 sense). For example, should the MPI
     standard provide support that would allow a Fortran program to send a
     "complex" value to a program written in C? (By this I mean support
     the transmission of the Fortran "complex" type to a C program, not
     the transmission of a "complex" value represented in some user
     specified way, like two real values). A cross-language facility may
     or may not allow this, depending on the degree of automation the
     standard is designed to provide. It is conceivable for a cross-language
     standard to support the communication of only those data types that
     are common to all of the target programming languages.

     There is also the issue of type coercion of transmitted values within
     a single language. For example C allows you to coerce a real value
     into an integer value when you assign a real value to an integer
     variable. Taking this approach would imply a sender could specify a
     real value in an MPI_ADD_BLOCK call, while the receiver specifies
     an integer variable and expects MPI to perform the conversion.
     Allowing such coercion in the MPI interface would significantly increase
     its complexity, since the programmer would now have to specify not
     only the type being sent or received, but also the type being coerced.
     For this reason, I believe we have decided not to support such
     type coercion in MPI.

  o  Translation between values of the same type (but perhaps different
     "kind") across languages requires knowledge of the mapping between
     a particular language type and its machine representation. For example,
     in Fortran 77 a REAL might map to an IEEE 32 and a DOUBLE PRECISION to an
     IEEE 64. In C a float might map to an IEEE 32, a double to an IEEE 64
     and a long double to an IEEE 64. In Fortran 90 a REAL with a given
     kind parameter might map to either an IEEE 32 or an IEEE 64.  Similar
     mapping of int, long, INTEGER (with different kind parameters in
     Fortran 90), char, CHARACTER, LOGICAL and COMPLEX to underlying
     representations is required to properly support cross-language
     interoperability. Some language types that have been suggested as
     MPI types would require definition in certain languages. For example,
     LOGICAL in Fortran has no direct type analog in C. Similarly, there is
     no COMPLEX data type in C. It would be possible to simply declare
     use of these MPI types in a program written in an incompatible
     programming language as erroneous or the standard could specify
     a mapping between these types and non-native types in the appropriate
     languages (e.g., LOGICAL could map to unsigned char and COMPLEX to
     a structure containing two reals).

     In a heterogeneous environment the mapping information can be more
     complex. On a SUN Workstation REALs, DOUBLE PRECISIONS, floats, doubles
     and long doubles may map as specified above. On a CRAY, however, a
     REAL may map to a CRAY 64, a DOUBLE PRECISION to a CRAY 64, a float
     to a CRAY 64, a double to a CRAY 64 and a long double to a CRAY 128.

  o  Cross-language interoperability may require an MPI implementation
     that supports a particular language binding to be aware in some
     way of the existence of other language bindings. This may cause
     a problem when new MPI language bindings are developed. The features
     of MPI that support cross-language interoperability should allow
     the graceful integration of new language bindings into the standard.

3. A Proposal
-------------

In order to support cross-language use of MPI, I propose the following
modification to the point-to-point draft. The proposal is in two completely
independent parts. One can be accepted without accepting the other.

3.1 First Part

Modify the datatype parameter values of the MPI_ADD_... functions so that they
come from the following set :

   MPI_REAL
   MPI_DOUBLEPRECISION
   MPI_COMPLEX
   MPI_INTEGER
   MPI_LOGICAL
   MPI_CHARACTER

   MPI_FLOAT
   MPI_DOUBLE
   MPI_LONGDOUBLE
   MPI_SHORT
   MPI_INT
   MPI_LONG
   MPI_CHAR    (a character array)
   MPI_UCHAR

   MPI_BYTE

These values for the datatype parameter are allowed in any MPI implementation
irrespective of its language binding. They specify the type of the data that
is being sent/received. Notice that the first group (i.e., MPI_REAL,...,
MPI_CHARACTER) are Fortran types; the second group (MPI_FLOAT,...,MPI_UCHAR)
are C (C++) types and MPI_BYTE is a language independent type. The use of a
type from the first group in a Fortran program indicates that the data being
sent/received is in Fortran format and consequently does not require
translation. The use of a type from the first group in a C (C++) program
indicates that the data being sent/received is in Fortran format and so may
require translation. Similarly, the use of a type from the second group in a
Fortran program indicates the data being sent/received is in C (C++) format
and so may require translation. The use of a type from the second group in a C
program indicates the data being sent/received is in C (C++) format and so
need not be translated.

A client/server interface would specify its interface in terms of these MPI
types. When an MPI_ADD_... function is called, the specified type would be
used as the datatype parameter regardless of the MPI language binding.
Thus, if a parameter is specified as MPI_REAL, that datatype would be
specified both in Fortran and C programs using the client/server interface.
To properly pass messages, both the client and server must use the same
datatype for the parameter.

In order for the MPI implementation to take advantage of the provided type
information in a cross-language communication, it must know which Fortran
types map into which C types. Therefore, the standard should specify
this information. Let me make the following strawman proposal : MPI_REAL
maps to MPI_FLOAT; MPI_DOUBLEPRECISION maps to MPI_DOUBLE; MPI_COMPLEX
maps either to a C struct or has no mapping, which would cause an error
in a cross-language communication; MPI_INTEGER maps to MPI_INT; MPI_LOGICAL
maps to MPI_UCHAR or has no mapping, which would cause an error; and
MPI_CHARACTER maps to MPI_CHAR.
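
One way to picture the strawman mapping is as a small lookup function. The
sketch below is an illustration only: the enum type mpi_datatype, its numeric
values and the function fortran_to_c_type are hypothetical. Where the strawman
leaves two options open, the sketch arbitrarily picks "no mapping" for
MPI_COMPLEX and MPI_UCHAR for MPI_LOGICAL.

```c
#include <stdbool.h>

/* Datatype tags named in the proposal. The enum itself and its numeric
 * values are assumptions made for this sketch. */
typedef enum {
    MPI_REAL, MPI_DOUBLEPRECISION, MPI_COMPLEX,
    MPI_INTEGER, MPI_LOGICAL, MPI_CHARACTER,        /* Fortran group */
    MPI_FLOAT, MPI_DOUBLE, MPI_LONGDOUBLE, MPI_SHORT,
    MPI_INT, MPI_LONG, MPI_CHAR, MPI_UCHAR,         /* C (C++) group */
    MPI_BYTE                                        /* language independent */
} mpi_datatype;

/* Hypothetical lookup: map a Fortran-group tag to its C-group analog.
 * Returns false if the tag is not in the Fortran group or has no mapping,
 * which a cross-language communication would treat as an error. */
bool fortran_to_c_type(mpi_datatype fortran_type, mpi_datatype *c_type)
{
    switch (fortran_type) {
    case MPI_REAL:            *c_type = MPI_FLOAT;  return true;
    case MPI_DOUBLEPRECISION: *c_type = MPI_DOUBLE; return true;
    case MPI_INTEGER:         *c_type = MPI_INT;    return true;
    case MPI_LOGICAL:         *c_type = MPI_UCHAR;  return true; /* one option */
    case MPI_CHARACTER:       *c_type = MPI_CHAR;   return true;
    case MPI_COMPLEX:         return false;  /* "no mapping" option chosen */
    default:                  return false;  /* not a Fortran-group tag */
    }
}
```

Choosing the other options would only change the MPI_COMPLEX and MPI_LOGICAL
cases; the rest of the table follows the strawman directly.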

There are three C types that have no natural analogs in Fortran : MPI_SHORT,
MPI_LONG, and MPI_LONGDOUBLE. MPI_SHORT and MPI_LONG probably should map
to MPI_INTEGER, since it is the only integer type available in Fortran 77.
MPI_LONGDOUBLE probably should map to MPI_DOUBLEPRECISION. However, that means
that MPI_DOUBLE and MPI_LONGDOUBLE map to the same Fortran data type. An
alternate mapping would map MPI_DOUBLE to MPI_REAL and MPI_LONGDOUBLE to 
MPI_DOUBLEPRECISION. This requires discussion.

In addition to a standard mapping between programming language types, a
particular MPI implementation must know how the supported language types
map into underlying machine representations. For example, it must know
that REAL maps into IEEE 32, int maps into a 32 bit 2's complement value, etc.

An MPI implementation would operate as follows. I categorize the implementations
by environment in order to demonstrate properties of the previously discussed
issues.

   3.1.1 Homogeneous Multi-processors and Computer Clusters
   --------------------------------------------------------

I describe the behavior of an MPI implementation with a Fortran language
binding. The corresponding actions of an MPI implementation with a C language
binding should be obvious.

   3.1.1.1 Send

If the datatype parameter is in the Fortran set, block copy the data using the
underlying system network. If the datatype parameter is in the C set, use the
type mapping to decide what is the corresponding Fortran type and use the
per implementation machine representation mapping to decide the underlying
representation for both the Fortran and C type. If they are the same, block
copy the data using the underlying system network. If not, convert the data
to the appropriate underlying representation for the C type and send the
converted data.

   3.1.1.2 Receive

If the datatype parameter is in the Fortran set, the buffer in which the data
arrived is in the proper format. Make it available to the MPI caller. If
the datatype parameter is in the C set, use the type mapping to decide what
is the corresponding Fortran type and use the per implementation machine
representation mapping to decide the underlying representation for both
the Fortran and C type. If they are the same, the buffer in which the
data arrived is in the proper format. Make it available to the MPI caller.
Otherwise, convert the buffer to the appropriate underlying representation
and make it available to the MPI caller.

A comment on implementation strategy. It is possible for both a send
and receive operation to know beforehand whether it needs to translate
the buffer data or not (i.e., the above algorithms can be run before the
actual machine transmission is sent or received). Consequently, in those
situations in which translation is necessary, the implementation can supply
an intermediate buffer in which to translate or from which to translate the
data. By knowing both the Fortran and C types as well as their underlying
representations, the implementation can precalculate the size of the
translation buffer. 
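
The decision and the buffer-size precalculation can be sketched as below. All
names here (machine_rep, rep_size, translation_buffer_size) are hypothetical,
and the representation sizes are the nominal ones implied by the names used
earlier (IEEE 32, IEEE 64, CRAY 64, CRAY 128). The sketch assumes the
intermediate buffer always holds data in the representation of the type named
by the datatype parameter, on both the send and the receive side.

```c
#include <stddef.h>

/* Hypothetical tags for underlying machine representations. */
typedef enum {
    REP_IEEE32, REP_IEEE64, REP_CRAY64, REP_CRAY128
} machine_rep;

/* Size in bytes of one element in the given representation. */
size_t rep_size(machine_rep rep)
{
    switch (rep) {
    case REP_IEEE32:  return 4;
    case REP_IEEE64:  return 8;
    case REP_CRAY64:  return 8;
    case REP_CRAY128: return 16;
    }
    return 0; /* unknown tag */
}

/* native_rep: representation of the caller's own language type.
 * param_rep:  representation of the type named by the datatype parameter.
 * Returns 0 when the representations match (block copy, no intermediate
 * buffer needed); otherwise returns the number of bytes needed to hold
 * count elements in param_rep. */
size_t translation_buffer_size(machine_rep native_rep, machine_rep param_rep,
                               size_t count)
{
    if (native_rep == param_rep) {
        return 0;
    }
    return count * rep_size(param_rep);
}
```

Because the result depends only on the two representations and the element
count, it can be computed before any data is transmitted.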

   3.1.2 Heterogeneous Distributed Systems
   ---------------------------------------

Supporting MPI communications in a heterogeneous distributed system is more
complicated than in a homogeneous environment. Not only must differences in
programming language data types be accommodated, differences in underlying
machine representations are also a concern. The exact algorithms to use when
sending and receiving depend on the particular presentation-level protocol
employed. Protocols like XDR and ASN.1 use an intermediate representation
of data for transmission purposes. A protocol like NDR (used in OSF DCE)
transmits the data in the format of the sender, placing the burden on the
receiver to translate it. In the following discussion I finesse the protocol
issue by using the generic description "call the off-machine protocol
translation module" to mean execute the appropriate presentation protocol
algorithm. Some implementations will combine the protocol translation activity
with sending or receiving the data.
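
To make the "intermediate representation" idea concrete, the sketch below
encodes a 32-bit integer into a fixed big-endian byte order, which is how XDR
represents its integer type. It is a minimal illustration only, not a full
protocol translation module; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Encode a 32-bit integer as four bytes, most significant byte first,
 * independent of the byte order of the sending machine. */
void encode_int32_be(int32_t value, unsigned char out[4])
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* reinterpret as unsigned bits */
    out[0] = (unsigned char)(bits >> 24);
    out[1] = (unsigned char)(bits >> 16);
    out[2] = (unsigned char)(bits >> 8);
    out[3] = (unsigned char)bits;
}

/* Decode four big-endian bytes back into the receiver's native integer. */
int32_t decode_int32_be(const unsigned char in[4])
{
    uint32_t bits = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                    ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    int32_t value;
    memcpy(&value, &bits, sizeof value);  /* reinterpret as signed */
    return value;
}
```

A receiver-makes-right protocol such as NDR would instead send the native
bytes plus a format tag, leaving the conversion to the receiving side.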

   3.1.2.1 Send

If the datatype parameter is in the Fortran set, determine the underlying
machine representation and call the off-machine protocol translation module
to put it into the proper format. Send the result to the receiver. If the
datatype parameter is in the C set, use the type mapping to decide what is
the corresponding Fortran type. Determine the Fortran type's underlying
representation and call the off-machine protocol translation module to put
it into the proper format. Send the result to the receiver.

   3.1.2.2 Receive

If the datatype parameter is in the Fortran set, determine the underlying
machine representation and call the off-machine protocol translation module
to put it into the proper format. Return this result to the MPI caller. If the
datatype parameter is in the C set, use the type mapping to decide what is
the corresponding Fortran type. Determine the Fortran type's underlying
representation and call the off-machine protocol translation module to
put it into the proper format. Return this result to the MPI caller.


3.2 Second Part

Cross-language interoperability raises the issue of handling data types in
one programming language that have no exact analog in another programming
language. With the current suggested language bindings for MPI (i.e., Fortran
77, Fortran 90, C and C++), I think the following types fall into this
category, one way or another :

   Fortran 77

   LOGICAL
   COMPLEX

   Fortran 90

   REAL(SELECTED_REAL_KIND(--,--))
   INTEGER(SELECTED_INT_KIND(--))
   LOGICAL
   COMPLEX
   COMPLEX(SELECTED_REAL_KIND(--,--))

C and C++ types included in the MPI standard have natural analogs in Fortran 77
and Fortran 90.

There are at least two ways to handle these type mismatches. The simplest
strategy is to generate an error when these types are referenced in an
inappropriate language. This approach has the advantage of being simple to
implement. It has the disadvantage that MPI-based programs written in one
language without thought of using it from a program written in another language
will likely use inappropriate types.

The second strategy is to define analogs in each language for the non-common
types. For example, LOGICAL in both Fortran 77 and Fortran 90 could map to
unsigned char in C. COMPLEX in Fortran 77 and Fortran 90 could map to a
struct with two float members in C. However, REAL, INTEGER and COMPLEX with
KIND parameter information cannot be handled in this way, since the
"length" of the type is a machine dependent quantity. For example, on
some machines INTEGER(SELECTED_INT_KIND(4)) might be equivalent to an int,
while on other machines it is equivalent to a long (and not an int). Similar
"length" problems exist with the REAL(SELECTED_REAL_KIND(--,--)) type.

With these points in mind I propose the following. Map LOGICAL into unsigned
char, COMPLEX into a C struct with two floats and disallow specification
of the Fortran 90 types that specify a KIND parameter. Since Fortran 90
supports a KIND function call that the programmer can use to determine when
INTEGER and INTEGER(SELECTED_INT_KIND(--)), REAL and
REAL(SELECTED_REAL_KIND(--,--)), and COMPLEX and
COMPLEX(SELECTED_REAL_KIND(--,--)) are equivalent, there is a work around
for most cases.
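
On the C side, the proposed analogs might look like the sketch below. The
type names mpi_logical and mpi_complex and the two helper functions are
hypothetical. The sketch assumes a Fortran COMPLEX is stored as its real part
followed by its imaginary part, and it normalizes any non-zero LOGICAL value
to 1, since the bit pattern for .TRUE. is not specified here.

```c
/* Hypothetical C analog for Fortran LOGICAL: 0 is false, 1 is true. */
typedef unsigned char mpi_logical;

/* Hypothetical C analog for Fortran COMPLEX: two single precision reals. */
typedef struct {
    float re;   /* real part */
    float im;   /* imaginary part */
} mpi_complex;

/* Normalize an integer truth value to the 0/1 convention assumed above. */
mpi_logical logical_from_int(int value)
{
    return (mpi_logical)(value != 0);
}

/* Build the C analog from a (real, imaginary) pair as it would arrive
 * from a Fortran sender under the storage-order assumption above. */
mpi_complex complex_from_pair(const float pair[2])
{
    mpi_complex z;
    z.re = pair[0];
    z.im = pair[1];
    return z;
}
```

Copying member by member, rather than copying the raw bytes, avoids relying
on the struct having no padding between its two members.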

4. Analysis of the Proposal
---------------------------

Following is an analysis of the proposal according to the issues specified
in section 2.

  o  As described in section 3.1.1 the proposal can be implemented in such
     a way that it does not adversely impact the performance of either
     homogeneous multi-processors or homogeneous computer clusters. The 
     programmer who writes programs in one language need never consider cross-
     language issues. Furthermore, the MPI types he/she would use would be
     natural for the programming language in which he/she is working.

  o  The issue of mapping data types not found in one programming language
     into data types found in another is addressed by the second part, which
     covers mapping of data types by defining analogous types in the
     programming language from which a type is missing. Type coercion is
     not supported between types, since it is simple to first coerce the
     type (e.g., float to int) before sending it. It is not simple to support
     this kind of type coercion from within the MPI implementation.

  o  The proposal handles translation of values of the same type
     by requiring the programmer to specify the type of data in the
     datatype parameter of the MPI_ADD_... functions. There is a limited
     amount of translation between types of different kinds due to the
     mapping required to associate a type in one language with a type
     in another language. This has the advantage of automating much
     of the work required to communicate values in a cross-language
     situation. However, there is also a possible disadvantage. In a
     heterogeneous distributed system the mapping could lead to the loss
     of information or to an error. For example, the suggested mapping
     associates a float with a REAL. On a CRAY a REAL is a 64-bit floating
     point number, while on a SUN a float is a 32-bit floating point.
     Transferring a REAL on a CRAY to a float on a SUN causes loss of
     precision and potentially could result in a translation error
     because the value on the CRAY cannot be represented by a 32-bit
     floating point. This problem also occurs on a single machine if
     a REAL maps to, say, an IEEE 64 and a float to an IEEE 32 (which
     doesn't seem likely).


  o  Adding a new language binding to the MPI standard requires the definition
     of new MPI_<type> values and the mapping between these new values and
     the existing MPI_<type> values. Introduction of MPI implementations with
     support for the new language bindings can be accomplished gradually, since
     they will support the old MPI_<type>s, even though the old implementations
     do not support the new MPI_<type>s. As vendors upgrade their MPI
     implementations to conform to the new language binding standards, the
     new MPI_<type>s can be used more and more until all useful implementations
     support them. Thus, the proposal supports the gradual introduction of
     new language bindings without requiring all implementations to immediately
     support them.
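
The loss-of-information case raised in the third point above (a 64-bit REAL
delivered to a 32-bit float) can be sketched in C. This is an illustration
under assumptions: a C double stands in for the 64-bit sender type, although
the CRAY format is not IEEE, and the helper name narrow_double_to_float is
hypothetical.

```c
#include <float.h>
#include <stdbool.h>

/* Hypothetical helper: narrow a 64-bit value to a 32-bit float.
 * Returns false if the magnitude is beyond the finite float range
 * (infinities included), which corresponds to a translation error.
 * Otherwise stores the rounded value and sets *inexact to true when
 * precision was lost in the conversion. */
bool narrow_double_to_float(double wide, float *narrow, bool *inexact)
{
    if (wide > FLT_MAX || wide < -FLT_MAX) {
        return false;                      /* not representable in 32 bits */
    }
    *narrow = (float)wide;                 /* rounds to nearest float */
    *inexact = ((double)*narrow != wide);  /* did the round trip change it? */
    return true;
}
```

For example, 0.5 converts exactly, 0.1 converts with loss of precision, and
1e300 is rejected as out of range.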
