Please report problems or results back to the Gelato mailing list, or directly to the maintainer (cernekee at crhc dot uiuc dot edu).
| zlib-1.1.4.tar.gz | Original zlib 1.1.4 source code (official release). |
| gzio.patch | Patch for the buffer overflow in gzio.c. |
| libz-nocspec.a | zlib without control speculation - will work on all kernels, with or without the general speculation patch. |
| libz-cspec.a | zlib with control speculation. This library requires a kernel built with the general speculation patch. |
Benchmarking and Compilation:
gcc (baseline) - Compiled with gcc-2.96 -O3
eccprof - Compiled with Intel's ecc-7.0 -O3 and profile-guided optimization
oicc_nocspec - Compiled with oicc -O3 --no-control-speculation and profile-guided optimization
oicc_cspec - Compiled with oicc -O3 and profile-guided optimization
The CVS version of OpenIMPACT (oicc) from 03/01/2003 was used to build
these files.
The machine used was an unloaded 900 MHz zx2000 (Itanium 2, 4 GB RAM) running kernel 2.4.20 with the control speculation patch.
The inputs used for benchmarks and regression tests were 3-6 megabyte files containing code, ASCII text, zeroes, and random numbers.
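The synthetic inputs can be approximated with a short script. This is a minimal sketch, not the procedure used for the published numbers: the helper names `make_input` and `roundtrip_ok`, the `BENCH_DIR` variable, and the 4 MB size are all assumptions chosen for illustration. It relies on minigzip reading stdin, writing stdout, and accepting `-d` to decompress.

```shell
# Directory for generated inputs (hypothetical; override by setting BENCH_DIR).
BENCH_DIR=${BENCH_DIR:-$(mktemp -d)}

# make_input SOURCE DEST SIZE_MB
# Copy SIZE_MB megabytes from SOURCE into DEST, in 64 KB blocks.
make_input() {
    dd if="$1" of="$2" bs=65536 count=$(( $3 * 16 )) 2>/dev/null
}

# roundtrip_ok COMPRESSOR INPUT
# Succeed only if compressing and then decompressing INPUT reproduces it.
# COMPRESSOR must read stdin, write stdout, and accept -d to decompress.
roundtrip_ok() {
    "$1" < "$2" | "$1" -d | cmp -s - "$2"
}

# Generate a file of zeroes and a file of random bytes (4 MB each, assumed size).
make_input /dev/zero    "$BENCH_DIR/zero"    4
make_input /dev/urandom "$BENCH_DIR/urandom" 4
```

Once the binaries are built, a regression check along the lines of `roundtrip_ok ./minigzip-cspec "$BENCH_DIR/urandom"` confirms that an optimized build still produces output that decompresses to the original input.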
Performance:
| minigzip benchmark results (graph) | |||||
| | gcc (baseline) | eccprof | oicc (spec) | oicc (nospec) | Debian testing stock |
| ia32binaries | 1.00x | 1.39x | 1.73x | 1.59x | 1.00x |
| ia32libs | 1.00x | 1.42x | 1.80x | 1.63x | 1.00x |
| ia64binaries | 1.00x | 1.37x | 1.69x | 1.54x | 1.00x |
| ia64libs | 1.00x | 1.38x | 1.66x | 1.54x | 1.00x |
| shakespeare | 1.00x | 1.38x | 1.69x | 1.60x | 1.00x |
| urandom | 1.00x | 1.30x | 1.47x | 1.37x | 1.00x |
| zero | 1.00x | 1.13x | 1.13x | 1.04x | 1.00x |
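The table entries read as speedups relative to the gcc baseline. Assuming each figure is the baseline run time divided by the run time of the build under test (the page does not state the formula), a ratio in the same format can be computed as below. The helper name `speedup` and the sample times are hypothetical.

```shell
# speedup BASELINE_SECONDS TEST_SECONDS
# Print BASELINE/TEST in the "1.23x" format used in the results table.
# Assumption: speedup is defined as baseline time divided by test time.
speedup() {
    awk -v base="$1" -v opt="$2" 'BEGIN {
        if (opt <= 0) exit 1          # refuse to divide by zero
        printf "%.2fx\n", base / opt
    }'
}

# Hypothetical timings: baseline took 12.0 s, optimized build took 8.0 s.
speedup 12.0 8.0   # prints 1.50x
```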
Notes:
ia32libs and ia32binaries: oicc's performance advantage on
these tests is largely due to fewer NOPs per cycle. oicc's ability to
distinguish between sequential and independent operations allows it to
schedule independent operations in parallel more often than compilers that
perform less intensive data-flow analysis.
zero: the performance gain was mostly due to reduced NOPs and slightly
better branch prediction. There is some room for improvement here,
though, as gcc's binary produced fewer data-cache-related stalls.
# untar files
tar zxvf zlib-1.1.4.tar.gz
tar zxvf gel-chatr-0.0.tgz
cp libz-*.a zlib-1.1.4/
cd zlib-1.1.4
# build gcc -O3 version
make CFLAGS="-DHAVE_UNISTD_H -DUSE_MMAP -O3" minigzip
mv minigzip minigzip-gcc
# build optimized versions
gcc minigzip.o -o minigzip-cspec -L. -lz-cspec
gcc minigzip.o -o minigzip-nocspec -L. -lz-nocspec
../chatr/chatr -r minigzip-cspec
# dry run to copy the benchmark data into the buffer cache
./minigzip-gcc < /tmp/linux-2.4.20/vmlinux > /dev/null
# measure execution time
time ./minigzip-gcc < /tmp/linux-2.4.20/vmlinux > /dev/null
time ./minigzip-nocspec < /tmp/linux-2.4.20/vmlinux > /dev/null
time ./minigzip-cspec < /tmp/linux-2.4.20/vmlinux > /dev/null
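A single `time` measurement can be noisy. One way to reduce the noise is to repeat each run and keep the fastest result, which also makes the separate warm-up run unnecessary. The following is a sketch under stated assumptions: the helper `best_of` is hypothetical, it measures wall-clock time rather than the user/system split that `time` reports, and it requires GNU `date` for the `%N` nanosecond format.

```shell
# best_of RUNS INPUT COMMAND [ARGS...]
# Run COMMAND RUNS times with INPUT on stdin and print the fastest
# wall-clock time in seconds. Assumes GNU date (%N gives nanoseconds).
best_of() {
    runs=$1
    input=$2
    shift 2
    best=""
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s%N)
        "$@" < "$input" > /dev/null 2>&1
        end=$(date +%s%N)
        elapsed=$(( end - start ))
        # Keep the smallest elapsed time seen so far.
        if [ -z "$best" ] || [ "$elapsed" -lt "$best" ]; then
            best=$elapsed
        fi
        i=$(( i + 1 ))
    done
    awk -v ns="$best" 'BEGIN { printf "%.3f\n", ns / 1e9 }'
}

# Time each build on the same input, skipping any binary that is missing.
INPUT=/tmp/linux-2.4.20/vmlinux
for bin in ./minigzip-gcc ./minigzip-nocspec ./minigzip-cspec; do
    if [ -x "$bin" ] && [ -r "$INPUT" ]; then
        printf '%s\t%s s\n' "$bin" "$(best_of 3 "$INPUT" "$bin")"
    fi
done
```

Taking the minimum rather than the mean is a design choice: on an otherwise unloaded machine, the fastest run is the one least disturbed by background activity.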
References