C is our favorite language because it provides reasonably high-level constructs while maintaining very good performance. The majority of the code we write for both vintage computers and microcontrollers is in C. In this article we compare several C compilers for the 6502 to understand the performance and limitations of each.
Compiler Lineup
Our default 6502 C compiler is cc65, and we use it as the baseline for all our testing. The contenders are gcc-6502, llvm-mos, sdcc-6502, kickc, vbcc and the wdc compiler. For easy comparison our testing focuses on the C64, but most of the results apply to all supported targets.
cc65
This is the default choice for all our 6502 code. The compiler provides an excellent standard library and a cross-platform API for many system resources. Its compliance with the C standard is also excellent, and it has very good support for mixing C and assembly. The major drawbacks are the lack of floating point and the mediocre quality of the generated code.
gcc-6502
GCC is our favorite compiler on our Linux workstations. We use it to compile native code and to cross compile for several MCUs. While the compiler is heavily biased towards 32-bit machines, its mature gcc-avr port has demonstrated that it can generate excellent code even for tiny MCUs. The 6502 port is unfortunately unmaintained.
kickc
This is the only compiler that is not written in C/C++ (it's in Java). Another feature that sets it apart from the other compilers is that one of its goals is to output human-readable assembly code. Unfortunately, the compiler imposes several limitations on the code (missing unions, no recursion, no floating point, no runtime multiplication and division, ...) with the promise of high performance. Working with KickC feels more like assembly++ than C. KickC failed to compile 7 out of the 11 benchmarks. Its compile times are also significantly worse than those of any of the competitors.
vbcc
VBCC seems to produce the fastest code of all the tested compilers. It was the fastest in 5 out of 11 tests and in most other cases it was a close second. Its implementation is fairly complete and its documentation is excellent. Licensing, however, is a weak point: the source code is provided, but under unclear terms. The documentation clearly states that redistribution is only allowed without modifications and that vbcc cannot be used for commercial purposes without a license.
sdcc-6502
This is a new entry in the 6502 C compiler space. Its implementation of the standard is fairly complete. Performance is quite good for such a young implementation, and it fares reasonably well against all the other compilers. Support for mixing C and assembly is average: the assembler uses a non-standard syntax and the linker configuration is far from intuitive.
LLVM-mos
LLVM and clang are the building blocks of many modern high-performance compilers. The 6502 port is currently at an early stage and the compiler is considered pre-beta. The large-scope optimizations that both LLVM and gcc are able to apply offer huge potential for improvement. LLVM-mos was not benchmarked, as requested on the project's web page.
WDC compiler
Western Design Center distributes a compiler to support their 65C02 and 65C816 chips. The compiler was not benchmarked due to lack of a suitable C64 SDK.
Testing Methodology
We wrote a small common routine to call the various benchmarks and measure time intervals using the hardware TOD clock on the Commodore 64. Originally we wanted to use the classic BYTE nbench. Unfortunately, that code does not compile with most of the compilers in the lineup, so we ended up looking for simpler, common algorithms. The current benchmarks are: CRC8, CRC16, CRC32, Sieve, Sieve_bit, PI, factorial, FP exponential, puff (zlib decompress), dhrystone and aes256. Coremark is planned for a future revision of the benchmarks.
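As a rough illustration of the timing side of such a harness, the sketch below converts two TOD readings into an elapsed time in tenths of a second. The TOD registers hold packed BCD values. The `tod_stamp` structure and all function names are hypothetical and not taken from the benchmark sources, and the code that actually reads the CIA registers on the C64 is left out so the sketch stays portable. It also assumes that a single run never lasts longer than one hour.

```c
#include <stdint.h>

/* Hypothetical snapshot of the TOD registers, stored as raw packed BCD. */
struct tod_stamp {
    uint8_t minutes;  /* 0x00..0x59 */
    uint8_t seconds;  /* 0x00..0x59 */
    uint8_t tenths;   /* 0x00..0x09 */
};

/* Convert one packed-BCD byte (e.g. 0x59) to its binary value (59). */
uint8_t bcd_to_bin(uint8_t bcd)
{
    return (uint8_t)((bcd >> 4) * 10u + (bcd & 0x0Fu));
}

/* Tenths of a second since the start of the current hour. */
uint32_t tod_to_tenths(const struct tod_stamp *t)
{
    uint32_t seconds = (uint32_t)bcd_to_bin(t->minutes) * 60u
                     + bcd_to_bin(t->seconds);
    return seconds * 10u + bcd_to_bin(t->tenths);
}

/* Elapsed tenths between two snapshots, allowing one wrap past the hour. */
uint32_t tod_elapsed_tenths(const struct tod_stamp *start,
                            const struct tod_stamp *end)
{
    const uint32_t tenths_per_hour = 36000u;
    return (tod_to_tenths(end) + tenths_per_hour - tod_to_tenths(start))
           % tenths_per_hour;
}
```

A measurement would then take one snapshot before calling the benchmark and one after it, and divide the result of `tod_elapsed_tenths` by 10 to report seconds with one decimal place.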
The CRCx tests compute the corresponding x-bit CRC of the C64 kernel ROM. The test stresses logical operations for the char, int and long types.
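To show why these tests exercise the char, int and long types, here is a minimal bitwise sketch of the three CRC widths. The polynomials and initial values (CRC-8 with 0x07, CRC-16/XMODEM with 0x1021, and the reflected CRC-32 with 0xEDB88320) are assumptions chosen because they are common variants; the benchmark sources may use different ones, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-8: polynomial 0x07, initial value 0x00, MSB first (assumed variant). */
uint8_t crc8_msb(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    while (len--) {
        uint8_t bit;
        crc ^= *data++;
        for (bit = 0; bit < 8; bit++) {
            if (crc & 0x80u)
                crc = (uint8_t)((crc << 1) ^ 0x07u);
            else
                crc = (uint8_t)(crc << 1);
        }
    }
    return crc;
}

/* CRC-16/XMODEM: polynomial 0x1021, initial value 0x0000, MSB first. */
uint16_t crc16_xmodem(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    while (len--) {
        uint8_t bit;
        crc ^= (uint16_t)((uint16_t)*data++ << 8);
        for (bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* CRC-32: reflected polynomial 0xEDB88320, initial value and final XOR
   of 0xFFFFFFFF, LSB first. */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFul;
    while (len--) {
        uint8_t bit;
        crc ^= *data++;
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320ul;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFul;
}
```

The inner loop is the same in all three functions; only the width of the shifted value changes, which is what makes the trio a useful probe of how well a compiler handles 8-, 16- and 32-bit shifts and XORs on an 8-bit CPU.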
Sieve and Sieve_bit compute the prime numbers using the sieve of Eratosthenes algorithm. Sieve uses an array of bytes to mark the non-primes, while sieve_bit uses a bitfield to conserve memory. Both stress loops and array accesses; sieve_bit adds additional logical operations.
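The difference between the two variants can be sketched as follows. This is a minimal illustration, not the benchmark code: the function names are hypothetical, the caller supplies the buffer, and on a target with a 16-bit `unsigned` the limit is assumed to stay below 32768 so that the inner loop index cannot overflow.

```c
#include <stdint.h>
#include <string.h>

/* Count the primes below `limit` using one byte per candidate.
   `composite` must provide at least `limit` bytes. */
unsigned sieve_bytes(unsigned limit, uint8_t *composite)
{
    unsigned i, j, count = 0;
    memset(composite, 0, limit);
    for (i = 2; i < limit; i++) {
        if (composite[i])
            continue;                 /* already crossed out */
        count++;
        for (j = i + i; j < limit; j += i)
            composite[j] = 1;
    }
    return count;
}

/* Same algorithm using one bit per candidate.
   `bits` must provide at least (limit + 7) / 8 bytes. */
unsigned sieve_bits(unsigned limit, uint8_t *bits)
{
    unsigned i, j, count = 0;
    memset(bits, 0, (limit + 7u) / 8u);
    for (i = 2; i < limit; i++) {
        if (bits[i >> 3] & (1u << (i & 7u)))
            continue;                 /* already crossed out */
        count++;
        for (j = i + i; j < limit; j += i)
            bits[j >> 3] |= (uint8_t)(1u << (j & 7u));
    }
    return count;
}
```

The bit version needs one eighth of the memory, but every access costs an extra shift, mask and read-modify-write, which is where the additional logical operations come from.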
PI computes 160 digits of PI. It stresses primarily integer multiplication and division with a bit of loops and array access.
The factorial test is a naive implementation using recursive function calls. It stresses recursion and the use of local variables on the stack.
The FP exponential is also a naive implementation using recursive function calls. It stresses recursion, the use of local variables on the stack and some FP operations.
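The two recursive tests can be pictured with the sketch below. It assumes that the FP test raises a base to a non-negative integer power, as the "FP pow" label in the results table suggests; the function names and argument types are hypothetical and not taken from the benchmark sources.

```c
#include <stdint.h>

/* Naive recursive factorial. 12! is the largest result that fits in 32 bits. */
uint32_t fact(uint8_t n)
{
    return (n <= 1) ? (uint32_t)1 : (uint32_t)n * fact((uint8_t)(n - 1));
}

/* Naive recursive power: `base` raised to a non-negative integer exponent. */
float fp_pow(float base, uint8_t exponent)
{
    return (exponent == 0)
        ? 1.0f
        : base * fp_pow(base, (uint8_t)(exponent - 1));
}
```

Each call level keeps its own copy of the argument alive until the callee returns, so a compiler that places locals in fixed memory locations instead of on a stack cannot compile functions like these without extra support.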
Puff decompresses zlib-compressed data. It stresses array accesses and bit operations. Most of the tested compilers had trouble generating a working executable.
Dhrystone is a classic computer benchmark. Its performance largely depends on the quality of some standard library routines (memcpy, strcpy, strcmp).
Aes256 encrypts the C64 kernel ROM. It stresses array accesses and bit operations.
Results
Time to execute the benchmark in seconds (lower is better). FC indicates failed to compile due to missing features. FE means the generated executable did not complete the test or produced incorrect results.
Benchmark | cc65 | gcc | KickC | vbcc | sdcc |
---|---|---|---|---|---|
Version | 2.18 | 8.4.1 | 0.8.5 | 0.9h/0.3 | 4.2.0 |
CRC8 | 3.3 | 2.0 | 1.8 | 2.1 | 1.8 |
CRC16 | 4.6 | 3.3 | 3.6 | 3.1 | 2.7 |
CRC32 | 38.9 | 13.3 | FC | 7.7 | 4.5 |
Sieve | 23.1 | 12.6 | 16.5 | 13.3 | 21.8 |
Sieve bit | 70.7 | FE | 28.2 | 24.1 | 28.9 |
PI | 104.7 | 120.2 | FC | 96.4 | 96.3 |
Fact | 238.2 | 176.0 | FC | 187.2 | 171.9 |
FP pow | FC | 8.7 | FC | 23.6 | 30.7 |
puff | 53.4 | FE | FC | 20.8 | FE |
dhrystone | 10.0 | 7.8 | FC | 2.6 | 6.6 |
aes256 | 195.4 | 46.0 | FC | 34.8 | 92.9 |
The graph below shows the performance of each compiler relative to CC65 (a higher number means better performance).
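One plausible way to derive such a relative figure from the table is to divide the cc65 time by the time of the compiler under test. This is an assumption about how the graph was produced, and the function name is hypothetical.

```c
/* Speed of a compiler relative to the baseline.
   Values above 1.0 mean the compiler under test is faster. */
double relative_speed(double baseline_seconds, double compiler_seconds)
{
    return baseline_seconds / compiler_seconds;
}
```

With the CRC32 row, for example, `relative_speed(38.9, 4.5)` gives roughly 8.6 for sdcc. Entries marked FC or FE have no time and would have to be skipped by the caller.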
Standard Headers
std | header | cc65 | gcc | KickC | vbcc | sdcc |
---|---|---|---|---|---|---|
C89 | assert.h | YES | YES | - | YES | YES |
C89 | ctype.h | YES | - | YES | YES | YES |
C89 | errno.h | YES | YES | YES | YES | YES |
C89 | float.h | - | YES | - | YES | YES |
C89 | limits.h | YES | YES | - | YES | YES |
C89 | locale.h | YES | - | - | YES | - |
C89 | math.h | - | YES | YES*1 | YES | YES |
C89 | setjmp.h | YES | - | - | YES | YES |
C89 | signal.h | YES | - | - | YES | YES |
C89 | stdarg.h | YES | YES | - | YES | YES |
C89 | stddef.h | YES | YES | - | YES | YES |
C89 | stdio.h | YES | YES | YES*1 | YES | YES |
C89 | stdlib.h | YES | YES | YES | YES | YES |
C89 | string.h | YES | YES | YES | YES | YES |
C89 | time.h | YES | YES | YES | YES | YES |
C95 | iso646.h | YES | YES | - | - | YES |
C95 | wchar.h | - | - | - | - | YES |
C95 | wctype.h | - | - | - | - | - |
C99*2 | complex.h | - | - | - | - | - |
C99 | fenv.h | - | - | - | - | - |
C99 | inttypes.h | YES | - | - | YES | - |
C99 | stdbool.h | YES | YES | - | YES | YES |
C99 | stdint.h | YES | YES | YES | YES | YES |
C99 | tgmath.h | - | - | - | - | - |
C11 | stdalign.h | - | YES | - | - | YES |
C11*2 | stdatomic.h | - | - | - | - | YES |
C11 | stdnoreturn.h | - | YES | - | - | YES |
C11*2 | threads.h | - | - | - | - | - |
C11 | uchar.h | - | - | - | - | YES |
C23 | stdckdint.h | - | - | - | - | YES |
POSIX | unistd.h | YES | - | - | - | - |
*1 These files are present in the distribution but they are severely incomplete and/or just placeholders.
*2 Header is optional in the standard
Features
Feature | cc65 | gcc | KickC | vbcc | sdcc |
---|---|---|---|---|---|
float | NO | YES | NO | YES | YES |
file I/O | YES | NO | NO | YES | NO |
c64 printf | YES | Broken | Broken | YES | YES |
recursion | YES | YES | NO | YES | YES*1 |
license | ZLIB | GPL | MIT | Non-Free | GPL |
maintained | YES | NO | YES | YES | YES |
*1 SDCC recursion is not enabled by default. Functions that need reentrancy must be marked __reentrant.
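A minimal sketch of what this looks like in source code follows. The `REENTRANT` macro and the `sum_to` function are hypothetical; the sketch assumes that SDCC defines the `__SDCC` preprocessor symbol, so the same file still builds with the other compilers, where the macro expands to nothing.

```c
/* Expand to SDCC's keyword only when building with SDCC (assumed to
   define __SDCC); other compilers see an empty macro. */
#if defined(__SDCC)
#define REENTRANT __reentrant
#else
#define REENTRANT
#endif

/* Recursive sum 0 + 1 + ... + n, marked reentrant so that each call
   level gets its own copy of `n`. */
unsigned int sum_to(unsigned char n) REENTRANT
{
    return (n == 0) ? 0u : (unsigned int)n + sum_to((unsigned char)(n - 1));
}
```

Wrapping the keyword in a macro keeps the benchmark sources identical across compilers, at the cost of having to annotate every recursive function by hand.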
Final Remarks
Three out of the five compilers tested have major weaknesses that disqualify them from being considered for our use. VBCC, despite its excellent performance, has horrible licensing terms: it is not open source and it specifically prohibits commercial use. GCC has been unmaintained for too long and hopes of reviving it are pretty slim. KickC is promising from the performance standpoint, but its current lack of major language features rules it out.
This leaves just CC65 and SDCC. CC65 is considered the standard for the 6502. It has a long history, is well maintained and has a large, helpful community. The compiler has a hand-optimized assembly library and an extensive collection of drivers and runtimes for many 6502 platforms. The assembler and linker are excellent as well. Its major weaknesses are low performance and the lack of floating point.
SDCC also has a long history but the 6502 port is quite new. Its library is fairly complete but not well optimized. Runtime support is currently limited to the Commodore 64 and only the conio library is implemented.
Despite its lower performance, CC65 is still our recommended choice thanks to its extensive library and support. We look forward to seeing improvements in KickC, LLVM-mos and SDCC.
Attachment | Size |
---|---|
bench8_v1.1.tar.gz | 71.07 KB |