Fortran Programming Language
Intel Fortran on AMD CPU's with SSE3?
I know that this has been discussed in the past when SSE3 was not
supported by Opterons (we actually have a system with 4 Opterons without
support for SSE3, on which Intel Fortran does not perform well compared
to Intel CPUs with support for SSE3).
Now I am on the verge of buying a desktop computer, and I would like to
know your experience of Intel Fortran on AMD processors (particularly
Athlon 64 X2s that support SSE3) compared to that on Core Duos. All else
being equal, I am more inclined to AMD processors because of their
lower cost. On the other hand, if there is a major speed difference like
in the case of Opterons without SSE3 support compared against Intel
Pentiums with SSE3 support, I would be more inclined to buy Intel Core Duo.
Your time and effort are much appreciated.
Maybe I'm missing your point. Are you claiming there is a known Fortran
example where SSE3 would help Opteron or Core Duo?
An Opteron (2 dual core CPUs) ought to out-perform even a Core 2 Duo
(single dual core CPU). I certainly wouldn't recommend an obsolete Core
Duo (no 64-bit support, possibly still available in discounted laptops);
you'd probably prefer an obsolescent Opteron. If you can get the latest
Opteron 2 socket dual core for less money than a Core Duo (and equal
noise level), your choice is clear.
More to the point, compare your Opteron 2 socket with a Core 2 Quad
"Kentsfield," which may have similar performance on many (certainly not
all) Fortran applications which use the 4 cores. You may never get
around to checking out the SSE3 question.
Are you saying that bare CPU cost is all you consider, not the cost of a
full working system? If so, you probably don't want the top priced
version of either brand. You clearly excluded the cost of large
complements of RAM, which could make a real difference, beyond 4GB.
Maybe you have a point, if you are implying that the brand names are
In spite of obvious bias, I don't speak for any vendor involved....
Tim Prince articulated on 04/24/07 08:45:
Thanks for the answer. I might have incorrectly remembered the reason
for longer run times on AMD Opteron system (which was bought in December
2005) using -O3 -ip -ipo -xW but I do remember that the code was running
faster on my laptop (which has an intel Pentium M) using -O3 -ip -ipo
-xB. Then, somehow I started to think that the optimization in intel
fortran compiler is sort of biased for intel processors. My code
typically runs 24 hours on a Core Duo 2 1.66 GHz. Maybe this is
obvious, and maybe it is also OK, I am not questioning that; I was just
wondering if people in general noted this difference and if people
thought that in general one should choose intel processors over AMD's if
one must use intel fortran compiler (which is provided for us by our
institution, so we don't have to pay for it, but any other compiler we
have to pay for, so we naturally don't want to try another compiler).
A point. We use/used 12 machines with AMD cpus for animation work. One
by one, 10 burnt out.
Four other machines (and also this one) were Intel. They all still
Sets of AMD replacement boards could not even be found to replace two
very different machines where the cpus seemed to work still.
Although all our machines (both kinds) were custom built by the same
(well known) vendor we just couldn't get replacements - they don't
make them any more.
There's a conclusion there and a warning. I hear AMD got bought out?
On Apr 24, 6:33 am, FCC <fcca@REMOVEMEgmail.com> wrote:
> I was just
> wondering if people in general noted this difference and if people
> thought that in general one should choose intel processors over AMD's if
> one must use intel fortran compiler (which is provided for us by our
> institution, so we don't have to pay for it, but any other compiler we
> have to pay for, so we naturally don't want to try another compiler).
If you look at the compiler performance comparison at www.polyhedron.com
on their Opteron test system, you'll see that the Intel compiler was
within 10% of the fastest tested compiler. Our goal is that you will
get the best performance with the combination of an Intel processor
and the Intel compiler, but that performance on AMD processors will be
at least as good with the Intel compiler as with any other compiler
(within some margin of error.)
There are too many variables in the comparison you made and I am
dubious that SSE3 has any real relevance. Given that your application
runs a long time, I'd think you'd want the fastest system you can
afford and that is likely to be based on an Intel Core 2 processor
(especially with the Intel price drop announced yesterday.) But if you
do end up with an AMD system, the Intel compiler is a fine choice for
Developer Products Division
FCC wrote:
> I might have incorrectly remembered the reason
> for longer run times on AMD Opteron system (which was bought in December
> 2005) using -O3 -ip -ipo -xW but I do remember that the code was running
> faster on my laptop (which has an intel Pentium M) using -O3 -ip -ipo
> -xB. Then, somehow I started to think that the optimization in intel
> fortran compiler is sort of biased for intel processors.
Pentium M was never available in anything but single core single socket.
Its optimization requirements are somewhat strange, and there's no
reason why any compiler would go the last mile to match all of them.
gfortran -march=pentium-m -ftree-vectorize -funroll-loops does fairly
well also. The optimization requirement to use the combination of x87
scalar, plus a few specific SSE2 instructions, along with SSE2 vector
instructions, went away with Intel Core architecture. pentium-m code is
excellent for mixed single and double precision. You could prove almost
anything by selective choice of examples.
Pentium-m code ought to run OK on an Opteron, but not as well as P4
code, in normal cases. It's a big waste, if you have 3 idle cores.
Maybe you forgot to set affinity, so as to avoid spreading your single
thread data across all the memory banks and caches. Opteron was the
first popular machine where this made a big difference.
Affinity is still not much of an issue on Core 2 Duo, even under
Windows, since the L2 cache is shared.
FCC articulated on 04/23/07 19:39:
I realized that the whole SSE3 thing I mention above has nothing to do
with the problem I had back then (probably it was not even available
commercially then). Sorry for the superfluous aspects of my message.
Anyway, thanks for everyone for their answers. I ordered a system with
Core Duo 2 at 2.13 GHz with 4GB Ram, at almost the same price as the AMD
system, by downgrading the harddrive and monitor. I expect ifort and
this new computer to reduce my 24 hours of run time significantly, let
us see...
FCC wrote:
> Anyway, thanks for everyone for their answers. I ordered a system with
> Core Duo 2 at 2.13 GHz with 4GB Ram, at almost the same price as the AMD
> system, by downgrading the harddrive and monitor. I expect ifort and
> this new computer to reduce my 24 hours of run time significantly, let
> us see...
Use the -xT option (/QxT on Windows) along with -O3 to generate the best
code for the Core 2 processor. You'll probably want to experiment with
-parallel or even OpenMP to take advantage of the second core. Also
consider installing a 64-bit OS and using the 64-bit Intel "EM64T"
compiler. If you need help with the Intel compiler, visit our user forum
(http://softwareforums.intel.com/) or contact us at Intel Premier Support.
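For anyone unsure what "experimenting with OpenMP" might look like in
practice, here is a minimal sketch. It is not taken from the poster's code:
the program name, the array-free loop, and the summed expression are
placeholders chosen purely for illustration. It only shows the general shape
of a loop whose iterations are shared between the two cores, with a
reduction to combine the per-thread partial sums.

```fortran
! Illustrative sketch only: the loop body and problem size are placeholders,
! not anything from the application discussed in this thread.
program omp_sum_sketch
  ! Lines starting with "!$ " are compiled only when OpenMP is enabled.
  !$ use omp_lib, only: omp_get_max_threads
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 10000000
  integer :: i
  real(dp) :: total

  total = 0.0_dp

  ! Iterations are divided among the threads. reduction(+:total) gives each
  ! thread a private partial sum and adds them together after the loop.
  !$omp parallel do reduction(+:total)
  do i = 1, n
     total = total + 1.0_dp / (real(i, dp) * real(i, dp))
  end do
  !$omp end parallel do

  !$ print *, 'max threads available:', omp_get_max_threads()
  print *, 'partial sum of 1/i**2 :', total
end program omp_sum_sketch
```

With ifort of this vintage the directives are enabled by adding -openmp to
the command line (newer releases spell it -qopenmp), and gfortran uses
-fopenmp. Without that flag the "!$" lines are treated as ordinary comments
and the loop simply runs serially, which makes it easy to time both
variants of the same source. Whether a real 24-hour run benefits depends on
how much of the time is spent in loops whose iterations are independent.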