
Fortran Programming Language

OpenMP problem


Hi, my configuration is

Red Hat Linux 9.0
Intel Fortran Compiler (and debugger) 9.1
Intel 2 x 3.6 GHz
3.6 GB of RAM

I have a program which manipulates large arrays (speed vectors for
direct numerical simulation, CFD), and I am trying to modify and debug
it with ifort and idb. The code contains OpenMP sections. When
compiled without the openmp option, everything looks fine, but when I
compile it with openmp and run it, it crashes early in the run at
trivial function calls. After some research, I thought it might be
linked to the stack size. I set "ulimit -s 2000000" and "export
KMP_STACKSIZE=2g" and the program gets further when it runs, but I get
a segmentation fault at a later point (I run it with 2 threads).
Step-by-step debugging shows me that, inside an omp do loop, the
values of some SHARED variables were modified, but no line inside the
do loop makes this modification. Since those variables store array
bounds, once they are modified the do loop runs outside the
corresponding array and I get a segfault.

If you have any advice on this problem, I would really appreciate it,
since I have no clue why the OpenMP version modifies those
values.

Thanks,

J.D.

John Deas wrote:

(snip on OpenMP problem)

> (I run it with 2 threads).
> Step-by-step debugging shows me that, inside an omp do loop, the
> values of some SHARED variables were modified, but no line inside the
> do loop makes this modification. Since those variables store array
> bounds, once they are modified the do loop runs outside the
> corresponding array and I get a segfault.

Not knowing OpenMP well, or anything about your program, it would
seem that there is nothing to keep the two threads together.  If
there is nothing in the loop modifying variables, then it would seem
that the other thread is doing it.  You need some control between
the threads so that variables modified by one are changed before
the other thread needs them.

-- glen

On May 18, 12:35 am, glen herrmannsfeldt <g@ugcs.caltech.edu>
wrote:

My problem looks like this:

!$OMP PARALLEL DEFAULT(SHARED)
!$OMP DO SCHEDULE(STATIC) PRIVATE(j,i)
      do j=j1,j2
        do i=i1,i2
          sol(i,j,k1) = sol(i,j,k1)/ack(k1)
        enddo
      enddo
!$OMP END DO

...
!$OMP END PARALLEL

Nothing in any loop inside the omp section modifies i1 or i2, but
before the first omp directive i2 = 32, and inside this section i2
takes the value 0, for example.

J.D.

Try running with bounds checking on.  It looks as if you may be
overrunning an array bound somewhere.

On May 18, 7:10 am, Gib Bogle <g.bo@auckland.no.spam.ac.nz> wrote:

I will try this as soon as I can. But it seems to me that I already
know where I overrun an array (I know exactly where the segmentation
fault occurs, inside the omp do loop).

No, you said i2 changes its value.  The question is how does this
happen.  Generally bounds overruns produce effects in unpredictable places.

On May 19, 12:18 am, Gib Bogle <b@ihug.too.much.spam.co.nz> wrote:

OK, excuse me if I was not clear on this issue. What I meant is that
since i2 changes its value for no reason when execution enters the omp
section, i=i2 in the loop is out of bounds, and evaluating sol(i,j,k1)
then produces the segfault. That is why the real question is why i2's
value changes.

Sorry about the confusion.

On May 19, 12:57 am, John Deas <john.d@gmail.com> wrote:

I see what you mean. Another array access might go out of bounds and
reach the i2 value stored somewhere in memory.

John Deas wrote:
> I see what you mean. Another array access might go out of bounds and
> reach the i2 value stored somewhere in memory.

Right.

John Deas wrote:

(snip)

>>>>>My problem looks like this:
>>>>>!$OMP PARALLEL DEFAULT(SHARED)
>>>>>!$OMP DO SCHEDULE(STATIC) PRIVATE(j,i)
>>>>>      do j=j1,j2
>>>>>        do i=i1,i2
>>>>>          sol(i,j,k1) = sol(i,j,k1)/ack(k1)
>>>>>        enddo
>>>>>      enddo
>>>>>!$OMP END DO
>>>>>...
>>>>>!$OMP END PARALLEL

(snip)

> OK, excuse me if I was not clear on this issue. What I meant is that
> since i2 changes its value for no reason when execution enters the omp
> section, i=i2 in the loop is out of bounds, and evaluating sol(i,j,k1)
> then produces the segfault. That is why the real question is why i2's
> value changes.

Since you didn't show all the code, specifically where i2 gets
its value, it is hard to say.  Is there only one value ever assigned
to i2?  Are all threads supposed to have the same i2?  Is i2
ever supposed to be greater than 32?
(Making it zero should not cause a segfault.)

I don't know if it is important here, but note that the Fortran
standard requires the loop count to be determined at the beginning
of the loop.  I believe, though, that compilers are allowed to
notice that i2 doesn't change inside the loop and use i2 in
the loop termination test.  That fails if i2 can change from
other threads.  I would expect OpenMP compilers to consider this.

But in your case, with nested loops, the change in i2 will be seen
in the next iteration of the outer (j) loop, anyway.

-- glen
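
The point about the loop count can be checked with a small standalone
program. This is only a sketch of the serial rule, not code from the
program under discussion; the names trip_count, inner_first and
inner_second are made up for the illustration. The bounds of a do loop
are evaluated once on entry, so a change to the upper bound is only
picked up the next time the inner loop is started.

```fortran
      program trip_count
      implicit none
      integer :: i, j, n, inner_first, inner_second

      n = 3
      inner_first = 0
      inner_second = 0
      do j = 1, 2
!       the inner bound n is evaluated afresh for each value of j
        do i = 1, n
!         changing n here does not alter the current trip count
          n = 1
          if (j == 1) inner_first = inner_first + 1
          if (j == 2) inner_second = inner_second + 1
        enddo
      enddo
      print *, 'inner trips for j=1:', inner_first
      print *, 'inner trips for j=2:', inner_second
      end program trip_count
```

Built serially, this should report 3 trips for j=1 and 1 trip for j=2.
It says nothing about what happens when another thread, or a stray
memory write, changes the bound.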

John Deas wrote:

(snip)

>>>>>>My problem looks like this:
>>>>>>!$OMP PARALLEL DEFAULT(SHARED)
>>>>>>!$OMP DO SCHEDULE(STATIC) PRIVATE(j,i)
>>>>>>      do j=j1,j2
>>>>>>        do i=i1,i2
>>>>>>          sol(i,j,k1) = sol(i,j,k1)/ack(k1)
>>>>>>        enddo
>>>>>>      enddo
>>>>>>!$OMP END DO
>>>>>>...
>>>>>>!$OMP END PARALLEL

What happens if you try:

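!      copy the bound before the parallel region, because
!      !$OMP DO has to be followed directly by the do statement
       i2x=i2
!$OMP PARALLEL DEFAULT(SHARED)
!$OMP DO SCHEDULE(STATIC) PRIVATE(j,i)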
       do j=j1,j2
         do i=i1,i2x
           sol(i,j,k1) = sol(i,j,k1)/ack(k1)
         enddo
       enddo
!$OMP END DO

does i2 ever change here?  I only just noticed that
there was something here.

!$OMP END PARALLEL

-- glen
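
A variation on the same idea, using only standard OpenMP clauses, is to
let each thread take its own copy of the bounds with FIRSTPRIVATE. The
program below is a minimal sketch under stated assumptions: the name
bounds_copy, the sizes n1 and n2, and the two-dimensional sol are all
hypothetical and stand in for the real arrays. It is meant as a
diagnostic, not as a fix.

```fortran
      program bounds_copy
      implicit none
!     hypothetical sizes, chosen only for the illustration
      integer, parameter :: n1 = 32, n2 = 16
      real(kind=8) :: sol(n1,n2)
      integer :: i1, i2, j1, j2, i, j

      i1 = 1
      i2 = n1
      j1 = 1
      j2 = n2
      sol = 2.0d0
!     FIRSTPRIVATE gives each thread its own copy of the bounds,
!     initialised from the values held before the region starts
!$OMP PARALLEL DEFAULT(NONE) SHARED(sol) FIRSTPRIVATE(i1,i2,j1,j2)
!$OMP DO SCHEDULE(STATIC) PRIVATE(j,i)
      do j = j1, j2
        do i = i1, i2
          sol(i,j) = sol(i,j)/2.0d0
        enddo
      enddo
!$OMP END DO
!$OMP END PARALLEL
      print *, 'min and max of sol:', minval(sol), maxval(sol)
      end program bounds_copy
```

Every element should end up equal to 1, with or without OpenMP enabled.
If the private copies stayed intact in the real code while the shared
originals were still being altered, that would hint at a stray write to
memory rather than at the loop itself, but that reading is an
assumption.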

On May 19, 2:42 am, glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:

Thank you, I will take a look at all this monday morning and write
more about my code as you requested if necessary.

On May 19, 2:26 am, glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:

Hi, thank you for your reply. I will post more of the code because I
have not been very precise with you from the beginning, sorry.

So, here is what the code looks like. It is a code aimed at inverting
a tridiagonal matrix (1). As you can see, the value of i2, for example,
comes in as an argument of the subroutine trvpjk3d. When debugging, I
can check that the value has been correctly assigned and does not vary
during the single-threaded part of the subroutine. Then, for no
apparent reason, when it executes the first multithreaded instruction
after the !$OMP DO SCHEDULE(STATIC), the values of i1, i2, j1, j2, k1
and k2, which are supposed to stay fixed, are changed. In the code I
run, the dimension of sol is sol(1:i2,1:j2,1:k2) and i1=j1=k1=1. Since
i1...k2 contain the bounds of the sol array, they are never supposed
to change, and the values are supposed to be the same in all the
threads.

      SUBROUTINE trvpjk3d(i1,i2,j1,j2,k1,k2,sol)

      use param

      IMPLICIT NONE

      REAL(KIND=8),dimension (m1,m2,m3) :: sol,fek

      integer i1,i2,i
      integer j1,j2,j
      integer k1,k2,k,ka,ke

      ka = k1 + 1
      ke = k2 - 1
! ***************init***********
!
!     forward elimination sweep
!
      qqk(k1) = - apk(k1)/ack(k1)
      ssk(k1) = - amk(k1)/ack(k1)
      do k=ka,ke
        ppk(k) =1.d0/( ack(k) + amk(k)*qqk(k-1))
        qqk(k) = - apk(k)*ppk(k)
        ssk(k) = - amk(k)*ssk(k-1)*ppk(k)
      enddo
!
!     backward pass
!
      ssk(k2) = 1.d0
      do k=ke,k1,-1
        ssk(k) = ssk(k) + qqk(k)*ssk(k+1)
      enddo
      ppk(k2)=1.d0/(apk(k2)*ssk(k1) + amk(k2)*ssk(ke)+ack(k2))
!
! **************calcul***********
!
!     FORWARD ELIMINATION SWEEP
!
!$OMP PARALLEL DEFAULT(NONE) &
!$OMP SHARED(k,ka,ke,k1,k2,j1,j2,i1,i2,sol,ack,amk,apk,ppk,qqk,ssk,fek)&
!$OMP PRIVATE(j,i)
!$OMP DO SCHEDULE(STATIC)
      do j=j1,j2
        do i=i1,i2
          sol(i,j,k1) = sol(i,j,k1)/ack(k1)
        enddo
      enddo
!$OMP END DO
      do k=ka,ke
!$OMP DO SCHEDULE(STATIC)
        do j=j1,j2
          do i=i1,i2
            sol(i,j,k) = ( sol(i,j,k) - amk(k)*sol(i,j,k-1))*ppk(k)
          enddo
        enddo
!$OMP END DO
      enddo
!
!     BACKWARD PASS

...

(for the clarity of the post I did not include the rest of the source,
but no line explicitly reassigns a new value to i1...k2)

I hope that this helps describe my problem. Glen, since I am not at
work, I cannot try to compile with

       i2x=i2
       do j=j1,j2
         do i=i1,i2x
           sol(i,j,k1) = sol(i,j,k1)/ack(k1)
         enddo
       enddo

to see what happens to i2 then, but I will try it as soon as possible
and post the result.

I really appreciate all the help you have kindly provided so far, many
thanks.

J.D.

(1) This code is a section from the CFD code to be found in the book
"Fluid Flow Phenomena - A Numerical Toolkit", by Paolo Orlandi,
Springer

> !$OMP PARALLEL DEFAULT(NONE) &
> !$OMP SHARED(k,ka,ke,k1,k2,j1,j2,i1,i2,sol,ack,amk,apk,ppk,qqk,ssk,fek)&
> !$OMP PRIVATE(j,i)
> !$OMP DO SCHEDULE(STATIC)
>       do j=j1,j2
>         do i=i1,i2
>           sol(i,j,k1) = sol(i,j,k1)/ack(k1)
>         enddo
>       enddo
> !$OMP END DO
>       do k=ka,ke
> !$OMP DO SCHEDULE(STATIC)
>         do j=j1,j2
>       ...

Here I don't understand: the statement "do k=ka,ke" is directly
inside a parallel region, so I suppose that it will be executed
nthread times. Do you really want this behavior? I cannot imagine the
result. This is why I generally use the compact form "!$OMP PARALLEL
DO ...", which avoids such situations.

Regards

F. Jacq

On May 23, 10:15 am, fj <francois.j@irsn.fr> wrote:

Me too. Moreover, k, ka and ke are shared.
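
The concern about the shared k can be sketched as follows. When
"do k=ka,ke" sits directly inside the parallel region, every thread
runs the k loop, so a shared k is written by all threads at once, which
is a data race. One common arrangement, shown here as an illustration
under assumptions rather than as the confirmed cure for the posted
code, is to make k PRIVATE so that each thread keeps its own counter
while the inner !$OMP DO shares out the j iterations. The program name
k_private, the sizes n1, n2, n3, the array ref and the update formula
are all hypothetical.

```fortran
      program k_private
      implicit none
!     hypothetical sizes, chosen only for the illustration
      integer, parameter :: n1 = 8, n2 = 6, n3 = 5
      real(kind=8) :: sol(n1,n2,n3), ref(n1,n2,n3)
      integer :: i, j, k

      sol = 1.0d0
      ref = 1.0d0
!     serial reference: each plane depends on the previous one
      do k = 2, n3
        do j = 1, n2
          do i = 1, n1
            ref(i,j,k) = ref(i,j,k) + 0.5d0*ref(i,j,k-1)
          enddo
        enddo
      enddo
!     k is PRIVATE: every thread runs the whole k loop with its own
!     counter; the implicit barrier at END DO keeps planes in order
!$OMP PARALLEL DEFAULT(NONE) SHARED(sol) PRIVATE(i,j,k)
      do k = 2, n3
!$OMP DO SCHEDULE(STATIC)
        do j = 1, n2
          do i = 1, n1
            sol(i,j,k) = sol(i,j,k) + 0.5d0*sol(i,j,k-1)
          enddo
        enddo
!$OMP END DO
      enddo
!$OMP END PARALLEL
      print *, 'max difference:', maxval(abs(sol - ref))
      end program k_private
```

The reported difference should be zero, since each element is updated
by exactly one thread with the same arithmetic as the serial loop. The
alternative mentioned above, a separate "!$OMP PARALLEL DO" inside a
serial k loop, avoids the question altogether at the cost of starting a
parallel region for every k.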