
Euler3d Benchmark Data

This page presents data for the STARS Euler3d CFD benchmark. The benchmark is intended to provide information about the relative speed of different processor, operating system, and compiler combinations for a multi-threaded, floating point, computationally intensive CFD code.

The older single-CPU STARS benchmark is presented on a separate page.

Description of the Benchmark Testcase

The benchmark testcase is the AGARD 445.6 aeroelastic test wing. The wing uses a NACA 65A004 airfoil section and has a panel aspect ratio of 1.65, a taper ratio of 0.66, and a 45 degree quarter-chord sweep angle. This AGARD wing was tested at the NASA Langley Research Center in the 16-foot Transonic Dynamics Tunnel and is a standard aeroelastic test case used for validation of unsteady, compressible CFD codes. Figure 1 shows the CFD predicted Mach contours for a freestream Mach number of 0.960.

Figure 1: AGARD 445.6 Mach Contours at Mach 0.960.

The benchmark CFD grid contains 1.23 million tetrahedral elements and 223 thousand nodes. The benchmark executable advances the Mach 0.50 AGARD flow solution. Our benchmark score is reported as a CFD cycle frequency in Hertz.
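As a minimal sketch of how a cycle frequency can be computed, assume that one "cycle" is one solver step and that the time is wall-clock seconds. The source does not spell out either detail, so the function name and the sample numbers below are hypothetical.

```python
def cycle_frequency_hz(steps: int, elapsed_seconds: float) -> float:
    """Return completed solver cycles per second of wall-clock time."""
    if steps <= 0 or elapsed_seconds <= 0.0:
        raise ValueError("steps and elapsed_seconds must be positive")
    return steps / elapsed_seconds


# Hypothetical run (not a measured result): 20 steps in 16.0 seconds.
score = cycle_frequency_hz(20, 16.0)
print(f"{score:.2f} Hz")
```

A higher score means more solver cycles per second, so scores from runs with different step counts remain comparable.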

The euler3d benchmark source code is not publicly available. However, the benchmark code is derived directly from the production CFD code, with hard-coded I/O and calculation routines. We use the Intel Fortran compiler (ifort 10.0). All floating point variables are Fortran double precision (8 bytes). Parallelization is through OpenMP. Tim Cowan's dissertation provides an in-depth development and technical discussion of his euler3d CFD code. For comparison, source code for a 1D finite element CFD solver that preceded the euler3d code is available.

An XML file (eulerbenchmark.xml) is available to view the raw results, which are plotted in Figure 2. Also, please send us an e-mail to report your benchmark result. We appreciate all submissions!

Figure 2: Euler3d cycle frequency versus Total CPU frequency.

Discussion of Results

The Euler3d benchmark is sensitive to chipset, memory, and CPU performance. For example, an E6700 Core 2 Duo on a certain nForce4 motherboard gives a benchmark score of about 1.08. A reference Intel motherboard with the same chip gives a score of about 1.38. For our lab, that's a major difference because our simulations can run for a week, a month, or more!
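To put the two scores in perspective, here is a small arithmetic sketch. It assumes that the run time of a fixed-size simulation scales inversely with the benchmark score, which is a simplification; the function names are hypothetical.

```python
def speed_gain(score_slow: float, score_fast: float) -> float:
    """Fractional speed increase of the faster system over the slower one."""
    return score_fast / score_slow - 1.0


def scaled_runtime(runtime: float, score_old: float, score_new: float) -> float:
    """Estimate the run time on a new system, assuming time ~ 1 / score."""
    return runtime * score_old / score_new


gain = speed_gain(1.08, 1.38)            # the two scores quoted above
days = scaled_runtime(7.0, 1.08, 1.38)   # a hypothetical one-week run

print(f"Speed gain: {gain:.1%}")
print(f"A 7-day run would take about {days:.1f} days")
```

Under that assumption the better motherboard is roughly 28% faster, and a one-week simulation finishes about a day and a half sooner.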

The multi-core results appear to roughly match Amdahl's law. For this memory-intensive CFD program, a linear speedup from adding more processors would require a matching increase in inter-processor data transfer rate ("bandwidth").
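For reference, Amdahl's law bounds the speedup on n processors by 1 / ((1 - p) + p / n), where p is the fraction of the run time that can be parallelized. The sketch below evaluates that formula; the value p = 0.9 is purely illustrative and is not a measured property of the euler3d benchmark.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Ideal speedup from Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# Illustrative only: assume 90% of the run time parallelizes.
for n in (1, 2, 4, 8):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

Even with 90% of the work parallelized, the speedup falls well short of linear as the processor count grows, which is the general shape described above.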

A fundamental characteristic of unstructured grid computations is the requirement to scatter calculated variables across a non-contiguous list of memory locations. Thus, we should conceptually expect the benchmark to be rate-limited by both raw floating point calculation speed and memory access speed. As these two are (mostly) sequential operations, both influence the final benchmark speed.
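The scatter pattern can be illustrated with a generic sketch. Since the euler3d source is not public, this is not its actual data structure: the edge list, flux values, and function name are hypothetical and only show the indirect addressing typical of unstructured grids.

```python
def scatter_edge_fluxes(num_nodes, edges, fluxes):
    """Accumulate per-edge fluxes into a per-node residual array.

    Each edge (i, j) adds its flux to node i and subtracts it from node j.
    The node indices come from the connectivity list, so consecutive
    iterations touch unrelated, non-contiguous memory locations.
    """
    residual = [0.0] * num_nodes
    for (i, j), flux in zip(edges, fluxes):
        residual[i] += flux   # indirect write: address depends on the mesh
        residual[j] -= flux
    return residual


# Tiny hypothetical mesh: 4 nodes connected by 4 edges in arbitrary order.
edges = [(0, 3), (1, 2), (3, 1), (2, 0)]
fluxes = [1.0, 0.5, 0.25, 2.0]
print(scatter_edge_fluxes(4, edges, fluxes))
```

The arithmetic per edge is trivial, so on a large mesh the cost of such a loop is often dominated by the irregular memory accesses rather than by the floating point work.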

Score differences between Windows XP, Vista, and Linux are typically small. Moving to a 64-bit executable does not automatically improve performance (32-bit compilers appear well tuned!).

Intel's Fortran Compiler generates different code paths for different processor capabilities. Non-Intel processors may not trigger the fast code path. Yes, we are aware of this issue. Yes, it is quite annoying that Intel's compilers behave this way. According to an anonymous contributor, the difference on a modern AMD processor is about 10%. We actually expected the difference to be significantly greater! We are also aware of several fixes but have not applied them as of yet. Raw performance is king in the CFD world. That usually means using Intel's chips and Intel's compiler. For what it is worth, we don't consider Intel's code path scheduler a bug: the behavior is intentional, well-publicized, and likely a purely economic decision by Intel. We firmly believe in letting the market decide if Intel's strategy is worth the lost goodwill. Expecting or desiring non-voluntary fiat to enforce an arbitrary standard is nonsense (e.g. the FTC's "decision" regarding this issue) and anti-liberty. This issue is just one of many criteria we use when periodically selecting our lab's compiler. There are strong competitors to Intel's compilers; some are tantalizingly enticing on some benchmarks.


If you have an x86 compatible PC and want to see how your machine compares to those on this list, download the files provided below to run the benchmark testcase. The benchmark executable is a hard-wired version of STARS Euler3d that will solve one and only one problem. You'll also need the bm2.g3d file, which is a large binary data file defining the agard2 geometry. Since this is a hard coded benchmark, visualization and aerodynamics output files are not generated.

Benchmark data, in XML format, is available to view the raw numbers. Also, please send us an e-mail to report your benchmark result. We appreciate all submissions!

From a command prompt window (cmd.exe), the command line format is: e3dbm <#steps> <#threads>. For quick experiments, we typically use between 1 and 5 steps (#steps). The default is 20 steps. By default, the program automatically detects the total number of processors in your system and uses that as the number of threads (#threads). When possible on your N-processor machine, send us N results, one for each thread count from 1 through N.
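The thread sweep described above can be scripted. The sketch below only builds and prints the command lines in the documented e3dbm <#steps> <#threads> format; it does not run the benchmark. The helper name is hypothetical, and it assumes that Python's os.cpu_count() reports the same processor count the benchmark would detect.

```python
import os


def thread_sweep_commands(max_threads, steps=20, exe="e3dbm"):
    """Build one command line per thread count from 1 through max_threads."""
    if max_threads < 1:
        raise ValueError("max_threads must be at least 1")
    return [[exe, str(steps), str(n)] for n in range(1, max_threads + 1)]


# os.cpu_count() can return None, so fall back to a single thread.
n_procs = os.cpu_count() or 1
for cmd in thread_sweep_commands(n_procs, steps=5):
    print(" ".join(cmd))
```

Each printed line can be pasted into the command prompt, or each list can be passed to subprocess.run from the directory that holds the executable and the bm2.g3d file.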

The benchmark outputs time (seconds) and score (Hz) information in a command-line window.

Benchmark Links

If you want to find more benchmark data regarding the performance of various processors, we recommend that you check out the SPEC benchmark suite and its related performance data. Also, TechReport provides reports and analysis for a wide range of up-to-date PC hardware.

Oklahoma State University