user-defined alignment in gfortran


Hi,

this sounds like an easy question, but I couldn't find an answer in the
docs:

How do I tell gfortran to align a complex*16 array at 16 byte
boundaries? This is necessary on the Cell BE architecture for being able
to do DMA transfer. In C I can write __attribute__((aligned(16))) and
it works. How can I do that with gfortran?

Regards,
Timo
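
For reference, the C behaviour mentioned in the question can be sketched roughly as follows. This is only an illustration, assuming gcc (or a compiler that accepts the same attribute syntax) and static storage; the names dma_buf, is_aligned16 and get_dma_buf are made up for the example and do not come from the original post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical buffer: 10 complex*16 values stored as 20 doubles.
   The gcc attribute requests 16-byte alignment for this static object. */
static double dma_buf[20] __attribute__((aligned(16)));

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical accessor so that other code can obtain the buffer. */
double *get_dma_buf(void)
{
    assert(is_aligned16(dma_buf));
    return dma_buf;
}
```

The attribute only covers objects the compiler lays out itself (static or automatic storage); it says nothing about memory obtained at run time.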

On Tue, 29 May 2007 00:22:30 +0000 (UTC), Timo Schneider
<t@hrz.tu-chemnitz.de>
 wrote in <slrnf5ms51.2q2.ti@m34s24.vlinux.de>:

> How do I tell gfortran to align a complex*16 array at 16 byte
> boundaries? This is necessary on the Cell BE architecture for being able
> to do DMA transfer. In C I can write __attribute__((aligned(16))) and
> it works. How can I do that with gfortran?

        I wouldn't think you'd have to.  I'd expect it to happen
automagically.  ...or are you trying to align things up within a derived
type?  Usual advice in that case is to assign your type/structure from
the largest elements down to the smallest, with character*? at the end.

--
Ivan Reid, School of Engineering & Design, _____________  CMS Collaboration,
Brunel University.    Ivan.Reid@[brunel.ac.uk|cern.ch]    Room 40-1-B12, CERN
        KotPT -- "for stupidity above and beyond the call of duty".

Dr Ivan D. Reid wrote:
> On Tue, 29 May 2007 00:22:30 +0000 (UTC), Timo Schneider
> <t@hrz.tu-chemnitz.de>
>  wrote in <slrnf5ms51.2q2.ti@m34s24.vlinux.de>:

>> How do I tell gfortran to align a complex*16 array at 16 byte
>> boundaries? This is necessary on the Cell BE architecture for being able
>> to do DMA transfer. In C I can write __attribute__((aligned(16))) and
>> it works. How can I do that with gfortran?

>    I wouldn't think you'd have to.  I'd expect it to happen
> automagically.  ...or are you trying to align things up within a derived
> type?  Usual advice in that case is to assign your type/structure from
> the largest elements down to the smallest, with character*? at the end.

For many gfortran targets, this is controlled in part by parameters
built into binutils.  Typical 32-bit targets didn't even begin to
support 16-byte alignment until SSE came along.  At one time, various
32-bit gcc targets had maximum alignments supported in binutils of 4 or
8 bytes.  Many people claimed that support for more than 4-byte
alignment violated the ABI, as long as the hardware provided automatic
alignment fix-up.

When you have support for 16-byte alignment built into binutils, the
gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4]
should accomplish what you requested.  In current 32- and 64-bit
gfortran implementations, this is the default, unless you set -Os, which
tries to conserve stack space by setting smaller alignments, and
disabling support for vectorization.

Dr Ivan D. Reid wrote:

> On Tue, 29 May 2007 00:22:30 +0000 (UTC), Timo Schneider
> <t@hrz.tu-chemnitz.de>  wrote in <slrnf5ms51.2q2.ti@m34s24.vlinux.de>:
>>How do I tell gfortran to align a complex*16 array at 16 byte
>>boundaries? This is necessary on the Cell BE architecture for being able
>>to do DMA transfer. In C I can write __attribute__((aligned(16))) and
>>it works. How can I do that with gfortran?
>    I wouldn't think you'd have to.  I'd expect it to happen
> automagically.  ...or are you trying to align things up within a derived
> type?  Usual advice in that case is to assign your type/structure from
> the largest elements down to the smallest, with character*? at the end.

Since a COMPLEX*16 variable is operated on, as far as Fortran is
concerned, as two eight byte real values it would seem in that
sense that eight byte alignment would be good enough.  (It would
be interesting to have a processor with complex operations.)

Though I have complained much in the past about the Microsoft
compilers that didn't offer any more than four byte alignment since
that was good enough for the 486.

I presume your statement about C is really about a specific
C compiler.  Also, some linkers may have the ability to
specify alignment for external symbols.

-- glen

Tim Prince <timothypri@sbcglobal.net> wrote:

> When you have support for 16-byte alignment built into binutils, the
> gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4]
> should accomplish what you requested.

I don't think so. I don't want to align the stack but data on the heap.

Regards,
Timo

glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:

Mm, why? If I have a COMPLEX*16 variable placed somewhere with eight-byte
alignment, there are two cases: either it is 16-byte aligned or it is not,
and in the latter case only the second half of the variable falls on a
16-byte boundary.

> I presume your statement about C is really about a specific
> C compiler.

Yep, gcc.

Regards,
Timo

Timo Schneider wrote:
> glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:

(snip)

>>Since a COMPLEX*16 variable is operated on, as far as Fortran is
>>concerned, as two eight byte real values it would seem in that
>>sense that eight byte alignment would be good enough.  (It would
>>be interesting to have a processor with complex operations.)
> Mm, why? If I have a COMPLEX*16 variable placed somewhere with eight-byte
> alignment, there are two cases: either it is 16-byte aligned or it is not,
> and in the latter case only the second half of the variable falls on a
> 16-byte boundary.

Maybe I didn't say it quite right.

No hardware that I know of has instructions that operate on
complex data.  There is no complex add or complex multiply
instruction.  Operations are done on separate real and imaginary
parts using ordinary floating point instructions.

Now, there could be some advantage to not crossing cache
blocks, though it isn't so obvious what one can do about that.

If you ask about I/O, this reminds me of LOCATE mode I/O that
PL/I allows for RECORD (UNFORMATTED) I/O.

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ibm3lr30/1...

Locate mode I/O allows operating on data directly in the I/O
buffer, reducing the number of copies needed.

READ FILE(IN) SET(P);  sets pointer P to point to the data read.

LOCATE FILE(OUT) SET(P);   writes any previous buffer and
sets pointer P to point to the next output buffer.  The
last is written on CLOSE.

-- glen

Timo Schneider wrote:
> Tim Prince <timothypri@sbcglobal.net> wrote:

>> When you have support for 16-byte alignment built into binutils, the
>> gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4]
>> should accomplish what you requested.

> I don't think so. I don't want to align the stack but data on the heap.

I don't see how gfortran will distinguish between stack and heap
alignments; perhaps your question is more complicated than you have told us.

glen herrmannsfeldt wrote:

> No hardware that I know of has instructions that operate on
> complex data.  There is no complex add or complex multiply
> instruction.  Operations are done on separate real and imaginary
> parts using ordinary floating point instructions.

SSE3 comes close enough.  It requires 16-byte alignment to facilitate
optimization of complex operations, and does have instructions which
operate on both parts in parallel.

glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:
> No hardware that I know of has instructions that operate on
> complex data.  There is no complex add or complex multiply
> instruction.  Operations are done on separate real and imaginary
> parts using ordinary floating point instructions.

Reread the original post. He was not talking about Fortran operations on
the complex data. He specifically cited DMA transfer as the reason for
the request and he said it was a specific array. Doesn't sound to me as
though it directly had anything to do with it being complex, but just
that this particular array was the one involved with the DMA transfer.

--
Richard Maine                    | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle           |  -- Mark Twain

glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:

I think this is rather irrelevant to my problem. On the Cell BE
architecture, DMA transfers have to be 16-byte aligned. So I want my
array, or whatever I allocate in Fortran, to be 16-byte aligned so that I
can DMA-transfer it to the Cell's SPUs later.

Regards,
Timo

Richard Maine <nos@see.signature> wrote:

> glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:

>> No hardware that I know of has instructions that operate on
>> complex data.  There is no complex add or complex multiply
>> instruction.  Operations are done on separate real and imaginary
>> parts using ordinary floating point instructions.

> Reread the original post. He was not talking about Fortran operations on
> the complex data. He specifically cited DMA transfer as the reason for
> the request and he said it was a specific array. Doesn't sound to me as
> though it directly had anything to do with it being complex, but just
> that this particular array was the one involved with the DMA transfer.

Yes! I don't care about the type of the array, I just want the array to
be 16-byte aligned.

Regards,
Timo
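
For heap data specifically, one option (not proposed anywhere in this thread, so treat it as an assumption about the setup) is to let a C routine do the allocation with posix_memalign and hand the pointer to Fortran. The sketch below assumes a POSIX system; the helper name alloc_complex16_aligned is hypothetical, and how the Fortran side receives the pointer (for example c_f_pointer from ISO_C_BINDING, on compilers that support Fortran 2003 C interoperability) is not shown.

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign in <stdlib.h> */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for n_complex complex*16 values
   (16 bytes each) on a 16-byte boundary.
   Returns NULL on failure; release the memory with free(). */
double *alloc_complex16_aligned(size_t n_complex)
{
    void *p = NULL;

    /* Reject empty requests and sizes that would overflow size_t. */
    if (n_complex == 0 || n_complex > SIZE_MAX / 16)
        return NULL;

    /* posix_memalign returns 0 on success and stores the block in p. */
    if (posix_memalign(&p, 16, n_complex * 16) != 0)
        return NULL;

    assert(((uintptr_t)p % 16u) == 0);
    return (double *)p;
}
```

This moves the alignment guarantee into the allocator, so no padding or address juggling is needed afterwards, at the cost of doing the allocation outside Fortran's ALLOCATE.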

Tim Prince <timothypri@sbcglobal.net> wrote:

>>> When you have support for 16-byte alignment built into binutils, the
>>> gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4]
>>> should accomplish what you requested.

>> I don't think so. I don't want to align the stack but data on the heap.

> I don't see how gfortran will distinguish between stack and heap
> alignments; perhaps your question is more complicated than you have told us.

If so, I don't know why. gcc supports __attribute__((aligned(16))), so why
shouldn't gfortran have something similar?
Unfortunately, most people try to tell me why I don't need alignment, but
I really do need it for the DMA transfer stuff on Cell...

Regards,
Timo

In article <slrnf5oclv.au0.ti@m34s24.vlinux.de>,
        Timo Schneider <t@perlplexity.org> writes:

> Tim Prince <timothypri@sbcglobal.net> wrote:

>>>> When you have support for 16-byte alignment built into binutils, the
>>>> gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4]
>>>> should accomplish what you requested.

>>> I don't think so. I don't want to align the stack but data on the heap.

>> I don't see how gfortran will distinguish between stack and heap
>> alignments; perhaps your question is more complicated than you have told us.

> If so I don't know why. gcc supports __attribute__((aligned(16))) why
> shouldn't gfortran have something similar?
> Unfortunately most people try to tell me why I don't need alignment, but
> I really do need it for the DMA-Transfer stuff on Cell...

You're asking for this info in the wrong forum.  Try posting to
gcc at gcc.gnu.org  and fortran at gcc.gnu.org.

--
Steve
http://troutmask.apl.washington.edu/~kargl/

Steven G. Kargl <k@troutmask.apl.washington.edu> wrote:

Ah, that's a good idea. Thanks!

Regards,
Timo

I can tell you how I would do it in C;  you could probably do something
similar in Fortran, although you might give up some portability.

Start with the assumption that dynamic allocation will give you an
address that is eight-byte aligned, so figure out how many real*8
elements you need, and add one.  Do the allocation.  Pass the returned
address to a C routine that will tell you if it's on a 16-byte boundary.
   If it's on a 16-byte boundary, use that address.  If it's not, check
to make sure that it *is* on an 8-byte boundary, and use the address of
the next element, which *will* be on a 16-byte boundary.

I wouldn't be surprised if there were better, easier ways to do it.

Louis
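
The over-allocation idea above might look roughly like this in C. It is a sketch under stated assumptions: sizeof(double) is 8, malloc returns at least 8-byte-aligned memory, and the function names first_aligned16_index and alloc_with_spare are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Index (0 or 1) of the first element of base that sits on a 16-byte
   boundary, or -1 if base is not even 8-byte aligned. */
int first_aligned16_index(const double *base)
{
    uintptr_t rem = (uintptr_t)base % 16u;

    if (rem == 0)
        return 0;   /* already on a 16-byte boundary */
    if (rem == 8)
        return 1;   /* the next real*8 element is on the boundary */
    return -1;      /* assumption violated: not 8-byte aligned */
}

/* Allocate n_doubles + 1 elements and return a 16-byte-aligned start with
   room for n_doubles elements. *raw receives the pointer to pass to free().
   Returns NULL (and sets *raw to NULL) on failure. */
double *alloc_with_spare(size_t n_doubles, double **raw)
{
    double *base;
    int idx;

    *raw = NULL;
    if (n_doubles >= SIZE_MAX / sizeof(double))
        return NULL;                     /* size would overflow */

    base = malloc((n_doubles + 1) * sizeof(double));
    if (base == NULL)
        return NULL;

    idx = first_aligned16_index(base);
    if (idx < 0) {
        free(base);
        return NULL;
    }
    *raw = base;
    return base + idx;
}
```

The caller has to keep the raw pointer around, since the aligned pointer may not be the one that malloc returned.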

Louis Krupp <lkr@pssw.nospam.com.invalid> wrote:

I think I am actually gaining "portability" with the approach described below,
because that way I don't have to impose restrictions on the caller.
(The code I am writing, which reads the array, is a library written in C but
called (mainly) from Fortran.)

> Start with the assumption that dynamic allocation will give you an
> address that is eight-byte aligned, so figure out how many real*8
> elements you need, and add one.  Do the allocation.  Pass the returned
> address to a C routine that will tell you if it's on a 16-byte boundary.
>    If it's on a 16-byte boundary, use that address.  If it's not, check
> to make sure that it *is* on an 8-byte boundary, and use the address of
> the next element, which *will* be on a 16-byte boundary.

Yeah, this is exactly what I do right now as a "workaround". The only
problem I see with this approach is that if I want to get the first
element of an array which is 8-byte but not 16-byte aligned, I have to
read 8 bytes of "junk" before the start of the array.
Now it could (theoretically) happen that those 8 bytes of "junk" do not
belong to my application, and we get a segfault.

Maybe you can think of a workaround for this, or can prove to me that the
case described above can't happen? I *believe* that it can't happen if
the page size is a multiple of 16, but I didn't look into this long
enough to be sure about it or to be able to determine the page size
(it should be somewhere inside the specs or kernel code).

Regards,
Timo
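
The page-size argument above can be checked with a few lines of C. If the page size is a multiple of 16, every page boundary is also a 16-byte boundary, so rounding an address down to a multiple of 16 can never leave the page the address is in. The sketch assumes a POSIX system where sysconf(_SC_PAGESIZE) is available and memory protection is page-granular; the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Round an address down to the previous 16-byte boundary. */
uintptr_t align_down16(uintptr_t addr)
{
    return addr & ~(uintptr_t)15;
}

/* Returns 1 if addr and its rounded-down 16-byte boundary lie in the
   same page. page_size must be a non-zero multiple of 16. */
int same_page_after_align_down(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && page_size % 16u == 0);
    return (addr / page_size) == (align_down16(addr) / page_size);
}

/* Ask the system for its page size; returns 0 if it is unavailable. */
uintptr_t query_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);

    return ps > 0 ? (uintptr_t)ps : 0;
}
```

This only covers the segfault question. The 8 bytes in front of the array may still belong to something else, such as allocator bookkeeping, so they should be treated as read-only junk, and memory checkers may still flag the access.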

Timo Schneider <t@hrz.tu-chemnitz.de> wrote:
> How do I tell gfortran to align a complex*16 array at 16 byte
> boundaries? This is necessary on the Cell BE architecture for being able
> to do DMA transfer. In C I can write __attribute__((aligned(16))) and
> it works. How can I do that with gfortran?

If a static buffer will do the job, you can do the layout in C and
get to it from Fortran by a common block.  The old g77 method
of matching a common to a C struct still seems to work with gfortran.

This is not portable because it relies on gcc extensions and
on the gcc/gfortran ABI, but that may be okay too, given that
you are targeting a specific platform.

    $ cat test.f90
    program test
            implicit none
            complex(kind(0.0d0)), dimension(10):: x
            common /coucou/ x
            print *, x
    end

    $ cat test2.c
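    /* gfortran names the common block /coucou/ "coucou_" (lower case plus
       a trailing underscore), so this definition becomes its storage. */
    struct {
            double c[20];   /* 10 complex values as (re, im) pairs */
    } __attribute__((aligned(16))) coucou_= {
            1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
            11,12,13,14,15,16,17,18,19,20,
    };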

    $ gfortran test.f90 test2.c && ./a.out
     (  1.00000000000000     ,  2.00000000000000     )
     <snip>
     (  19.0000000000000     ,  20.0000000000000     )

--
pa at panix dot com
