Fortran Programming Language
user-defined alignment in gfortran
Hi, this sounds like an easy question, but I couldn't find an answer in the docs: How do I tell gfortran to align a complex*16 array at 16-byte boundaries? This is necessary on the Cell BE architecture to be able to do DMA transfers. In C I can write __attribute__((aligned(16))) and it works. How can I do that with gfortran? Regards, Timo
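For readers who have not used it, the gcc attribute mentioned in the question looks roughly like this on the C side. This is only a sketch of the C usage the poster refers to, not a gfortran answer; the names complex16, dma_buffer, is_aligned16 and get_dma_buffer are made up for illustration, and the layout assumes a complex*16 is two 8-byte doubles.

```c
#include <assert.h>
#include <stdint.h>

/* A complex*16 element as seen from C: two 8-byte doubles (assumption). */
typedef struct { double re, im; } complex16;

/* GCC extension: request 16-byte alignment for this static array. */
static complex16 dma_buffer[64] __attribute__((aligned(16)));

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical accessor so other code can obtain the aligned buffer. */
complex16 *get_dma_buffer(void)
{
    assert(is_aligned16(dma_buffer));
    return dma_buffer;
}
```

The attribute applies to objects with static or automatic storage; it does not by itself change what an allocator returns for heap memory.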
On Tue, 29 May 2007 00:22:30 +0000 (UTC), Timo Schneider <t@hrz.tu-chemnitz.de> wrote in <slrnf5ms51.2q2.ti@m34s24.vlinux.de>: > How do I tell gfortran to allign a complex*16 array at 16 byte > boundaries? This is necessary on the Cell BE architecture for being able > to do DMA transfer. In C I can write __attribute__((aligned(16))) and > it works. How can I do that with gfortran?
I wouldn't think you'd have to. I'd expect it to happen automagically. ...or are you trying to align things up within a derived type? Usual advice in that case is to assign your type/structure from the largest elements down to the smallest, with character*? at the end. -- Ivan Reid, School of Engineering & Design, _____________ CMS Collaboration, Brunel University. Ivan.Reid@[brunel.ac.uk|cern.ch] Room 40-1-B12, CERN KotPT -- "for stupidity above and beyond the call of duty".
Dr Ivan D. Reid wrote: > On Tue, 29 May 2007 00:22:30 +0000 (UTC), Timo Schneider > <t @hrz.tu-chemnitz.de> > wrote in <slrnf5ms51.2q2.ti @m34s24.vlinux.de>: >> How do I tell gfortran to allign a complex*16 array at 16 byte >> boundaries? This is necessary on the Cell BE architecture for being able >> to do DMA transfer. In C I can write __attribute__((aligned(16))) and >> it works. How can I do that with gfortran? > I wouldn't think you'd have to. I'd expect it to happen > automagically. ...or are you trying to align things up within a derived > type? Usual advice in that case is to assign your type/structure from > the largest elements down to the smallest, with character*? at the end.
For many gfortran targets, this is controlled in part by parameters built into binutils. Typical 32-bit targets didn't even begin to support 16-byte alignment until SSE came along. At one time, various 32-bit gcc targets had maximum alignments supported in binutils of 4 or 8 bytes. Many people claimed that support for more than 4-byte alignment violated the ABI, as long as the hardware provided automatic alignment fix-up. When you have support for 16-byte alignment built into binutils, the gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4] should accomplish what you requested. In current 32- and 64-bit gfortran implementations, this is the default, unless you set -Os, which tries to conserve stack space by setting smaller alignments, and disabling support for vectorization.
Dr Ivan D. Reid wrote: > On Tue, 29 May 2007 00:22:30 +0000 (UTC), Timo Schneider > <t @hrz.tu-chemnitz.de> wrote in <slrnf5ms51.2q2.ti @m34s24.vlinux.de>: >>How do I tell gfortran to align a complex*16 array at 16 byte >>boundaries? This is necessary on the Cell BE architecture for being able >>to do DMA transfer. In C I can write __attribute__((aligned(16))) and >>it works. How can I do that with gfortran? > I wouldn't think you'd have to. I'd expect it to happen > automagically. ...or are you trying to align things up within a derived > type? Usual advice in that case is to assign your type/structure from > the largest elements down to the smallest, with character*? at the end.

Since a COMPLEX*16 variable is operated on, as far as Fortran is concerned, as two eight-byte real values, it would seem in that sense that eight-byte alignment would be good enough. (It would be interesting to have a processor with complex operations.) Though I have complained much in the past about the Microsoft compilers that didn't offer any more than four-byte alignment, since that was good enough for the 486.

I presume your statement about C is really about a specific C compiler. Also, some linkers may have the ability to specify alignment for external symbols.

-- glen
Tim Prince <timothypri@sbcglobal.net> wrote: > When you have support for 16-byte alignment built into binutils, the > gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4] > should accomplish what you requested.
I don't think so. I don't want to align the stack but data on the heap. Regards, Timo
glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:
> Dr Ivan D. Reid wrote: >> On Tue, 29 May 2007 00:22:30 +0000 (UTC), Timo Schneider >> <t @hrz.tu-chemnitz.de> wrote in <slrnf5ms51.2q2.ti @m34s24.vlinux.de>: >>>How do I tell gfortran to allign a complex*16 array at 16 byte >>>boundaries? This is necessary on the Cell BE architecture for being able >>>to do DMA transfer. In C I can write __attribute__((aligned(16))) and >>>it works. How can I do that with gfortran? >> I wouldn't think you'd have to. I'd expect it to happen >> automagically. ...or are you trying to align things up within a derived >> type? Usual advice in that case is to assign your type/structure from >> the largest elements down to the smallest, with character*? at the end. > Since a COMPLEX*16 variable is operated on, as far as Fortran is > concerned, as two eight byte real values it would seem in that > sense that eight byte alignment would be good enough. (It would > be interesting to have a processor with complex operations.)
Mm, why? If I have a COMPLEX*16 variable placed somewhere with eight-byte alignment, there are two cases: either it is 16-byte aligned, or it is not, in which case only the second half of the variable starts on a 16-byte boundary.

> I presume your statement about C is really about a specific > C compiler.
Yep, gcc. Regards, Timo
Timo Schneider wrote: > glen herrmannsfeldt <g @ugcs.caltech.edu> schrieb: (snip) >>Since a COMPLEX*16 variable is operated on, as far as Fortran is >>concerned, as two eight byte real values it would seem in that >>sense that eight byte alignment would be good enough. (It would >>be interesting to have a processor with complex operations.) > Mm, why? If I have a COMPLEX*16 variable placed somwhere with eight byte > alignment, there can be two cases: It is 16 Byte aligned or not, in > that case only the second part of the variable falls together wih a 16 > bit boundary.
Maybe I didn't say it quite right. No hardware that I know of has instructions that operate on complex data. There is no complex add or complex multiply instruction. Operations are done on separate real and imaginary parts using ordinary floating point instructions. Now, there could be some advantage to not crossing cache blocks, though it isn't so obvious what one can do about that.

If you ask about I/O, this reminds me of the LOCATE mode I/O that PL/I allows for RECORD (UNFORMATTED) I/O. http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ibm3lr30/1... Locate mode I/O allows operating on data directly in the I/O buffer, reducing the number of copies needed. READ FILE(IN) SET(P); sets pointer P to point to the data read. LOCATE FILE(OUT) SET(P); writes any previous buffer and sets pointer P to point to the next output buffer. The last is written on CLOSE.

-- glen
Timo Schneider wrote: > Tim Prince <timothypri @sbcglobal.net> schrieb: >> When you have support for 16-byte alignment built into binutils, the >> gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4] >> should accomplish what you requested. > I don't think so. I don't want to align the stack but data on the heap.
I don't see how gfortran will distinguish between stack and heap alignments; perhaps your question is more complicated than you have told us.
glen herrmannsfeldt wrote: > No hardware that I know of has instruction that operate on > complex data. There is no complex add or complex multiply > instruction. Operations are done on separate real and imaginary > parts using ordinary floating point instructions.
SSE3 comes close enough. It requires 16-byte alignment to facilitate optimization of complex operations, and does have instructions which operate on both parts in parallel.
glen herrmannsfeldt <g @ugcs.caltech.edu> wrote: > No hardware that I know of has instructions that operate on > complex data. There is no complex add or complex multiply > instruction. Operations are done on separate real and imaginary > parts using ordinary floating point instructions.

Reread the original post. He was not talking about Fortran operations on the complex data. He specifically cited DMA transfer as the reason for the request and he said it was a specific array. Doesn't sound to me as though it directly had anything to do with it being complex, but just that this particular array was the one involved with the DMA transfer.

-- 
Richard Maine                    | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle           |  -- Mark Twain
glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:
>>>Since a COMPLEX*16 variable is operated on, as far as Fortran is >>>concerned, as two eight byte real values it would seem in that >>>sense that eight byte alignment would be good enough. (It would >>>be interesting to have a processor with complex operations.) >> Mm, why? If I have a COMPLEX*16 variable placed somwhere with eight byte >> alignment, there can be two cases: It is 16 Byte aligned or not, in >> that case only the second part of the variable falls together wih a 16 >> bit boundary. > Maybe I didn't say it quite right. > No hardware that I know of has instruction that operate on > complex data. There is no complex add or complex multiply > instruction. Operations are done on separate real and imaginary > parts using ordinary floating point instructions.
I think this is rather irrelevant to my problem. On the Cell BE architecture, DMA transfers have to be 16-byte aligned. So I want my array, or whatever I allocate in Fortran, to be 16-byte aligned so that I can DMA-transfer it to the Cell's SPUs later. Regards, Timo
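If the buffer could be allocated on the C side instead of in Fortran, POSIX offers a direct way to get 16-byte aligned heap memory. The sketch below is not something proposed in the thread; it assumes a POSIX system, the helper name alloc_aligned16 is hypothetical, and how the Fortran side would obtain and use the returned pointer is not shown.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate nbytes of heap memory starting on a 16-byte boundary.
 * Returns NULL on failure. The result is released with free().
 */
void *alloc_aligned16(size_t nbytes)
{
    void *p = NULL;

    /* The alignment must be a power of two and a multiple of sizeof(void *). */
    if (posix_memalign(&p, 16, nbytes) != 0)
        return NULL;

    assert(((uintptr_t)p % 16u) == 0);
    return p;
}
```

For ten complex*16 elements one would request 10 * 16 bytes and hand the pointer to the DMA code.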
Richard Maine <nos@see.signature> wrote: > glen herrmannsfeldt <g @ugcs.caltech.edu> wrote: >> No hardware that I know of has instructions that operate on >> complex data. There is no complex add or complex multiply >> instruction. Operations are done on separate real and imaginary >> parts using ordinary floating point instructions. > Reread the original post. He was not talking about Fortran operations on > the complex data. He specifically cited DMA transfer as the reason for > the request and he said it was a specific array. Doesn't sound to me as > though it directly had anything to do with it being complex, but just > that this particular array was the one involved with the DMA transfer.
Yes! I don't care about the type of the array; I just want the array to be 16-byte aligned. Regards, Timo
Tim Prince <timothypri@sbcglobal.net> wrote: >>> When you have support for 16-byte alignment built into binutils, the >>> gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4] >>> should accomplish what you requested. >> I don't think so. I don't want to align the stack but data on the heap. > I don't see how gfortran will distinguish between stack and heap > alignments; perhaps your question is more complicated than you have told us.
If so, I don't know why. gcc supports __attribute__((aligned(16))), so why shouldn't gfortran have something similar? Unfortunately most people try to tell me why I don't need alignment, but I really do need it for the DMA transfer stuff on Cell... Regards, Timo
In article <slrnf5oclv.au0.ti@m34s24.vlinux.de>, Timo Schneider <t@perlplexity.org> writes: > Tim Prince <timothypri @sbcglobal.net> schrieb: >>>> When you have support for 16-byte alignment built into binutils, the >>>> gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4] >>>> should accomplish what you requested. >>> I don't think so. I don't want to align the stack but data on the heap. >> I don't see how gfortran will distinguish between stack and heap >> alignments; perhaps your question is more complicated than you have told us. > If so I don't know why. gcc supports __attribute__((aligned(16))) why > shouldn't gfortran have something similar? > Unfortunately most people try to tell me why I don't need alignment, but > I really do need it for the DMA-Transfer stuff on Cell...
You're asking for info in the wrong forum. Try posting to gcc at gcc.gnu.org and fortran at gcc.gnu.org. -- Steve http://troutmask.apl.washington.edu/~kargl/
Steven G. Kargl <k@troutmask.apl.washington.edu> wrote:
> In article <slrnf5oclv.au0.ti @m34s24.vlinux.de>, > Timo Schneider <t @perlplexity.org> writes: >> Tim Prince <timothypri @sbcglobal.net> schrieb: >>>>> When you have support for 16-byte alignment built into binutils, the >>>>> gcc/gfortran option -mpreferred-stack-boundary=4 [this means 2**4] >>>>> should accomplish what you requested. >>>> I don't think so. I don't want to align the stack but data on the heap. >>> I don't see how gfortran will distinguish between stack and heap >>> alignments; perhaps your question is more complicated than you have told us. >> If so I don't know why. gcc supports __attribute__((aligned(16))) why >> shouldn't gfortran have something similar? >> Unfortunately most people try to tell me why I don't need alignment, but >> I really do need it for the DMA-Transfer stuff on Cell... > You're asking for info of in the wrong forum. Try posting to > gcc at gcc.gnu.org and fortran at gcc.gnu.org.
Ah, that's a good idea. Thanks! Regards, Timo
Timo Schneider wrote: > Richard Maine <nos @see.signature> schrieb: >> glen herrmannsfeldt <g @ugcs.caltech.edu> wrote: >>> No hardware that I know of has instruction that operate on >>> complex data. There is no complex add or complex multiply >>> instruction. Operations are done on separate real and imaginary >>> parts using ordinary floating point instructions. >> Reread the original post. He was not talking about Fortran operations on >> the complex data. He specifically cited DMA transfer as the reason for >> the request and he said it was a specific array. Doesn't sound to me as >> though it directly had anything to do with it being complex, but just >> that this particular array was the one involved withthe DMA transfer. > Yes! I dont care about the type of the array, I just want the array to > be 16-byte aligned.
I can tell you how I would do it in C; you could probably do something similar in Fortran, although you might give up some portability. Start with the assumption that dynamic allocation will give you an address that is eight-byte aligned, so figure out how many real*8 elements you need, and add one. Do the allocation. Pass the returned address to a C routine that will tell you if it's on a 16-byte boundary. If it's on a 16-byte boundary, use that address. If it's not, check to make sure that it *is* on an 8-byte boundary, and use the address of the next element, which *will* be on a 16-byte boundary. I wouldn't be surprised if there were better, easier ways to do it. Louis
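The steps described above can be sketched in C roughly as follows. This is one possible reading of the suggestion, not tested on Cell; it assumes sizeof(double) is 8, and the function name alloc_doubles_aligned16 and its raw out-parameter are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n doubles plus one spare element and return a pointer
 * to the first element that sits on a 16-byte boundary.
 * *raw receives the pointer that must later be passed to free().
 * Returns NULL if the allocation fails or is not even 8-byte aligned.
 */
double *alloc_doubles_aligned16(size_t n, void **raw)
{
    double *base = malloc((n + 1) * sizeof(double));
    uintptr_t addr = (uintptr_t)base;

    *raw = base;
    if (base == NULL)
        return NULL;

    /* Already on a 16-byte boundary: use the address as it is. */
    if (addr % 16u == 0)
        return base;

    /* Otherwise it must at least be on an 8-byte boundary. */
    if (addr % 8u != 0) {
        free(base);
        *raw = NULL;
        return NULL;
    }

    /* addr % 16 == 8 here, so the next 8-byte element is 16-byte aligned. */
    return base + 1;
}
```

The spare element means the aligned region always holds the full n doubles, whichever branch is taken.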
Louis Krupp <lkr@pssw.nospam.com.invalid> wrote:
> Timo Schneider wrote: >> Richard Maine <nos @see.signature> schrieb: >>> glen herrmannsfeldt <g @ugcs.caltech.edu> wrote: >>>> No hardware that I know of has instruction that operate on >>>> complex data. There is no complex add or complex multiply >>>> instruction. Operations are done on separate real and imaginary >>>> parts using ordinary floating point instructions. >>> Reread the original post. He was not talking about Fortran operations on >>> the complex data. He specifically cited DMA transfer as the reason for >>> the request and he said it was a specific array. Doesn't sound to me as >>> though it directly had anything to do with it being complex, but just >>> that this particular array was the one involved withthe DMA transfer. >> Yes! I dont care about the type of the array, I just want the array to >> be 16-byte aligned. > I can tell you how I would do it in C; you could probably do something > similar in Fortran, although you might give up some portability.
I think I am actually gaining "portability" with the approach described below, because that way I don't have to impose restrictions on the caller. (The code I write that reads the array is a library written in C but called (mainly) from Fortran.)

> Start with the assumption that dynamic allocation will give you an > address that is eight-byte aligned, so figure out how many real*8 > elements you need, and add one. Do the allocation. Pass the returned > address to a C routine that will tell you if it's on a 16-byte boundary. > If it's on a 16-byte boundary, use that address. If it's not, check > to make sure that it *is* on an 8-byte boundary, and use the address of > the next element, which *will* be on a 16-byte boundary.
Yeah, this is exactly what I do right now as a "workaround". The only problem I see with this approach right now is that if I want to get the first element of an array which is 8-byte but not 16-byte aligned, I have to read 8 bytes of "junk" before the start of the array. Now it could (theoretically) happen that the memory holding these 8 bytes of "junk" does not belong to my application, and we get a segfault. Maybe you can think of a workaround for this, or can prove to me that the case described above can't happen? I *believe* that it can't happen if the page size is a multiple of 16, but I didn't look into this long enough to be sure about it or to be able to determine the page size (it should be somewhere in the specs or the kernel code). Regards, Timo
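The page-size argument can be checked at run time. A page boundary is itself 16-byte aligned whenever the page size is a multiple of 16, so rounding an address down to a 16-byte boundary cannot move it onto an earlier page. The sketch below assumes a POSIX system where sysconf(_SC_PAGESIZE) reports the page size; the function names are made up for illustration. It only shows that the rounded-down address is on the same (mapped) page, and says nothing about whether it is safe to write to those extra bytes.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Round an address down to the nearest 16-byte boundary. */
uintptr_t align_down16(uintptr_t addr)
{
    return addr & ~(uintptr_t)15;
}

/*
 * Returns 1 if the 16-byte aligned address at or below p lies on the
 * same page as p, 0 if that cannot be established.
 */
int aligned_start_on_same_page(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);
    uintptr_t addr = (uintptr_t)p;

    /* The argument only holds when the page size is a multiple of 16. */
    if (page <= 0 || page % 16 != 0)
        return 0;

    return addr / (uintptr_t)page == align_down16(addr) / (uintptr_t)page;
}
```

On systems with the usual page sizes (4096 bytes and larger powers of two) the check is expected to return 1 for any valid pointer.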
Timo Schneider <t @hrz.tu-chemnitz.de> wrote: > How do I tell gfortran to align a complex*16 array at 16 byte > boundaries? This is necessary on the Cell BE architecture for being able > to do DMA transfer. In C I can write __attribute__((aligned(16))) and > it works. How can I do that with gfortran?

If a static buffer will do the job, you can do the layout in C and get to it from Fortran via a common block. The old g77 method of matching a common block to a C struct still seems to work with gfortran. This is not portable because it relies on gcc extensions and on the gcc/gfortran ABI, but that may be okay too, given that you are targeting a specific platform.

$ cat test.f90
program test
  implicit none
  complex(kind(0.0d0)), dimension(10):: x
  common /coucou/ x
  print *, x
end
$ cat test2.c
struct { double c[20]; } __attribute__((aligned(16))) coucou_= {
  1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
  11,12,13,14,15,16,17,18,19,20,
};
$ gfortran test.f90 test2.c && ./a.out
 ( 1.00000000000000 , 2.00000000000000 )
<snip>
 ( 19.0000000000000 , 20.0000000000000 )

-- 
pa at panix dot com