Amber 6 (Sander) Compiler Comparison (1 CPU)

Job = LADH Dimer (76723 Atoms)

PME = On

Input Card = imin=0, ntpr = 100, ntwr = 1000, ntwx = 250, cut = 12.0, nstlim = 500
dt=0.001, ntt = 1, scee = 1.2, temp0 = 300.0, tempi = 300.0, npscal = 1, ntc = 1, nmropt=0
ntf = 1, ntp = 1, ntb = 2, taup = 2.0, ntxo = 1, ioutfm = 0, ntx = 7, irest = 1
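For anyone wanting to rerun the job, the input card above can be written out as a sander control file. The sketch below is only an illustration: the title line, the `render_mdin` helper and the ` &cntrl` ... ` &end` framing are assumptions about how the card was laid out on disk; only the parameter names and values come from the listing above.

```python
# Parameters exactly as listed in the input card above.
CNTRL = {
    "imin": 0, "ntpr": 100, "ntwr": 1000, "ntwx": 250, "cut": 12.0, "nstlim": 500,
    "dt": 0.001, "ntt": 1, "scee": 1.2, "temp0": 300.0, "tempi": 300.0,
    "npscal": 1, "ntc": 1, "nmropt": 0,
    "ntf": 1, "ntp": 1, "ntb": 2, "taup": 2.0, "ntxo": 1, "ioutfm": 0,
    "ntx": 7, "irest": 1,
}


def render_mdin(params, title="LADH dimer benchmark", per_line=6):
    """Render params as a namelist-style block.

    The title text and the &cntrl/&end framing are assumptions,
    not taken from the original job files.
    """
    items = [f"{name} = {value}" for name, value in params.items()]
    lines = [title, " &cntrl"]
    # Emit a few "name = value" pairs per line, comma separated.
    for start in range(0, len(items), per_line):
        lines.append("   " + ", ".join(items[start:start + per_line]) + ",")
    lines.append(" &end")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_mdin(CNTRL), end="")
```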

Compiler options used:

G77 (2.96-69) Defaults (LFS Support)
PGF77 (3.2-4a) Defaults (-tp p6)
PGF77 (3.2-4a) Portland Blas Library (-tp p6)
PGF77 (3.2-4a) Portland Blas Library -Mvect=sse (-tp p6)
PGF77 (3.2-4a) Portland Blas Library -Mvect=sse -r8 (-tp p6)
PGF77 (3.2-4a) Portland Blas Library -Mvect=prefetch (-tp athlon)

sizes.h

MAXINT = 10000000
MAXPR = 50000000
MAXREA = 15000000
MAXHOL = 800000
MAXDUP = 50000
MAX_RSTACK = 15000000
MAX_ISTACK = 200000
MAX_STACK_PR = 100
MAX_HEAP_PR = 100

System Specs

Dual Intel PIII 800EB
512MB PC133 Cas3 Ram
IBM 7200RPM Deskstar 46GB HD
MSI Dual PIII Motherboard 694D Via Chipset
GeForce2 GTS 32MB Graphics Card
Redhat 7.1 (Kernel 2.4.1)

Dual AMD Athlon 1.2GHz MP
1GB PC2100 DDR Ram
10,000RPM LVD160 SCSI Drive 36GB
Tyan Thunder K7 Board
Redhat Linux 7.1 (Kernel 2.4.7)

Timings (seconds)

Machine  Compiler Options                Total Time  Scalar Sum  Grad Sum     FFT    Recip   Direct

P3/800   G77 Defaults                      10392.53      302.99    638.08  513.85   2005.4  7300.11
P3/800   PGF77 Defaults                    10592.53      238.65    568.17  559.38  1883.04  7733.12
P3/800   PGF77 Portland Blas               10580.44      239.98    569.16  571.57  1899.49  7702.26
P3/800   PGF77 Portland Blas SSE           10669.41      236.35    571.36  621.82  1928.98  7780.81
P3/800   PGF77 Portland Blas SSE -r8       10480.85      235.77     574.5   615.2  1940.49  7586.69

AMD/1.2  G77 Defaults                       6315.12      178.06    382.59  333.45  1212.08  4414.84
AMD/1.2  PGF77 Portland Blas -tp p6          6299.3      115.53    367.31  357.64  1153.41  4519.02
AMD/1.2  PGF77 Portland Blas -tp athlon     6306.12      114.58     367.9  364.22  1154.14  4552.91

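Conclusions = It makes very little difference which compiler, BLAS library or optimisation flags are used, because what is gained in one part of the calculation is lost in another. On the P3/800, for example, PGF77 is faster than G77 in the scalar sum but slower in the direct sum, and the total times on either machine differ by less than 3%. I therefore suggest we stick with the GNU compilers, as they do a pretty good job.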
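The trade-off can be checked by expressing each build as a percentage change against the G77 run on the same machine. This is a minimal sketch: the timings are copied from the table above, while the column keys and the `percent_vs_baseline` helper are names made up for this illustration.

```python
# Timings in seconds, copied from the timings table.
COLUMNS = ("total", "scalar_sum", "grad_sum", "fft", "recip", "direct")

TIMINGS = {
    "P3/800": {
        "G77 Defaults": (10392.53, 302.99, 638.08, 513.85, 2005.4, 7300.11),
        "PGF77 Defaults": (10592.53, 238.65, 568.17, 559.38, 1883.04, 7733.12),
        "PGF77 Portland Blas": (10580.44, 239.98, 569.16, 571.57, 1899.49, 7702.26),
        "PGF77 Portland Blas SSE": (10669.41, 236.35, 571.36, 621.82, 1928.98, 7780.81),
        "PGF77 Portland Blas SSE -r8": (10480.85, 235.77, 574.5, 615.2, 1940.49, 7586.69),
    },
    "AMD/1.2": {
        "G77 Defaults": (6315.12, 178.06, 382.59, 333.45, 1212.08, 4414.84),
        "PGF77 Portland Blas -tp p6": (6299.3, 115.53, 367.31, 357.64, 1153.41, 4519.02),
        "PGF77 Portland Blas -tp athlon": (6306.12, 114.58, 367.9, 364.22, 1154.14, 4552.91),
    },
}


def percent_vs_baseline(machine, build, baseline="G77 Defaults"):
    """Percent change per column relative to the baseline build.

    Negative values mean the build was faster than the baseline.
    """
    base = TIMINGS[machine][baseline]
    row = TIMINGS[machine][build]
    return {col: 100.0 * (t - b) / b for col, t, b in zip(COLUMNS, row, base)}


if __name__ == "__main__":
    for machine, builds in TIMINGS.items():
        for build in builds:
            change = percent_vs_baseline(machine, build)
            cells = "  ".join(f"{col}={change[col]:+.1f}%" for col in COLUMNS)
            print(f"{machine:8s} {build:32s} {cells}")
```

Run as a script, this prints one line per build. The PGF77 builds come out roughly a fifth faster in the scalar sum on the P3/800 but several percent slower in the direct sum, which dominates the total.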

ALTERNATE DUAL AMD MOTHERBOARD

Comparison between Tyan K7 and Alternate AMD Chipset

1 CPU Sander 6 job as listed above

Tyan K7 = 6315.12 s

Alternate = 6269.36 s
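To put the two board timings in proportion, the difference can be expressed as a percentage of the Tyan K7 run. A small sketch, where the `percent_faster` helper is a name introduced only for this illustration:

```python
# Total times in seconds for the same 1 CPU Sander 6 job, from the figures above.
TYAN_K7_S = 6315.12
ALTERNATE_S = 6269.36


def percent_faster(reference_s, candidate_s):
    """Percent reduction in run time of candidate relative to reference."""
    return 100.0 * (reference_s - candidate_s) / reference_s


if __name__ == "__main__":
    saved = TYAN_K7_S - ALTERNATE_S
    gain = percent_faster(TYAN_K7_S, ALTERNATE_S)
    print(f"Alternate board saves {saved:.2f} s ({gain:.2f}% of the Tyan K7 time)")
```

The gap is under 1% of the run time, from a single run on each board.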