This document is slightly outdated! See cc65.txt and library.txt for a more
up-to-date discussion.



Discussion of some of the features/non features of the current cc65 version
---------------------------------------------------------------------------

1. Copyright
2. Differences to the original version
3. Known problems
4. Library
5. Bugs



1. Copyright
-----------

This is the original compiler copyright:

--------------------------------------------------------------------------
  -*- Mode: Text -*-

     This is the copyright notice for RA65, LINK65, LIBR65, and other
  Atari 8-bit programs.  Said programs are Copyright 1989, by John R.
  Dunning.  All rights reserved, with the following exceptions:

      Anyone may copy or redistribute these programs, provided that:

  1:  You don't charge anything for the copy.  It is permissable to
      charge a nominal fee for media, etc.

  2:  All source code and documentation for the programs is made
      available as part of the distribution.

  3:  This copyright notice is preserved verbatim, and included in
      the distribution.

      You are allowed to modify these programs, and redistribute the
  modified versions, provided that the modifications are clearly noted.

      There is NO WARRANTY with this software, it comes as is, and is
  distributed in the hope that it may be useful.

      This copyright notice applies to any program which contains
  this text, or the refers to this file.

      This copyright notice is based on the one published by the Free
  Software Foundation, sometimes known as the GNU project.  The idea
  is the same as theirs, ie the software is free, and is intended to
  stay that way.  Everybody has the right to copy, modify, and re-
  distribute this software.  Nobody has the right to prevent anyone
  else from copying, modifying or redistributing it.

--------------------------------------------------------------------------

In acknowledgment of this copyright, I will place my own changes to the
compiler under the same copyright.
However, since the library and all binutils (assembler, archiver, linker)
are a complete rewrite, they are covered by another copyright:

--------------------------------------------------------------------------

                CC65 C Library and Binutils

           (C) Copyright 1998 Ullrich von Bassewitz

                    COPYING CONDITIONS

  This software is provided 'as-is', without any expressed or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not
     be misrepresented as being the original software.
  3. This notice may not be removed or altered from any source
     distribution

--------------------------------------------------------------------------

I will try to contact John, maybe he is also willing to place his sources
under a less restrictive copyright, after all these years:-)



2. Differences to the original version
--------------------------------------

This is a list of changes against the cc65 archives. I got the originals
from:

  http://www.umich.edu/~archive/atari/8bit/Languages/Cc65/


* Removed all assembler code from the compiler. It was unportable because
  it made assumptions about the character set (ATASCII) and made the
  sources hard to read and to debug.

* All programs now return an error code, so they may be used with make.
  All programs try to remove the target file if there were errors.

* The assembler now checks several error conditions (others still go
  undetected - see "Known problems").

* Removed many bugs from the compiler.
  One error was invalid code produced by the compiler that went through
  the assembler since the assembler did not check for ranges itself.

* Removed many non-portable constructs from the compiler. Code cleanups,
  rewrite of the function headers and more.

* New style function prototypes are supported instead of the old K&R
  syntax. The new syntax is a must, that is, the old style syntax is no
  longer understood. As an extension, unnamed parameters may be used to
  avoid warnings about unused parameters.

* New void type. May also be used as a function return type.

* Changed the memory management in the compiler. Use malloc/free instead
  of the old homebrew (and unportable) stuff.

* Default character type is unsigned. This is much closer to what you
  want in small systems environments, since a char is often used to
  represent a small numerical value, and the integer promotion does the
  wrong thing in those cases.

  Look at the following piece of code:

        char c = read_char ();
        switch (c) {
            case 0x80: printf ("c is 0x80\n"); break;
            default:   printf ("c is something else\n"); break;
        }

  With signed chars, the code above will *always* run into the default
  selector. c is promoted to int, and since it is signed, a value of
  0x80 is sign-extended to 0xFF80, which does not match the 0x80 case
  label and so selects the default label.

  With unsigned chars, the code works as intended (but note: the code
  works for cc65 but it is non portable anyway, since many other
  compilers have signed chars by default, so be careful! Having unsigned
  chars is just a convenience thing).

* Shorter code when using the builtin operators and the lhs of an expr is
  a constant (e.g. expressions like "c == 0x80" are encoded two bytes
  shorter).

* Some optimizations when pushing constants.

* Character set translation by the compiler. A new -t option was added to
  set the target system type. Use

        -t0     For no specific target system (default)
        -t1     For the atari (does not work completely, since I did not
                have an ATASCII translation table).
        -t2     Target system is C64.
        -t3     Target system is C128.
        -t4     Target system is ACE.
        -t5     Target system is Plus/4.

* Ditto for the linker: it has an option to set the target system, and
  code was added to produce different headers and set the correct start
  address.

* Complete rewrite of the C library. See extra chapter.

* Many changes in the runtime library. Split it into more than one file
  to allow for smaller executables if not all of the code is needed.

* Allow longer names. Now the first 12 characters are significant at the
  expense of some more memory used at runtime.

* String constants are now concatenated in all places. This allows
  things like:

        fputs ("Options:\n"
               " -b bomb computer\n"
               " -f format hard disk\n"
               " -k kill init\n",
               stderr);

  which saves the code for more than one call to the function.

* Several new macros are defined:

        M6502           This one is old - don't use!
        __CC65__        Use this instead. Defined when compiling with cc65.
        __ATARI__       Defined when the target system is atari.
        __CBM__         Defined when compiling for a CBM system as target.
        __C64__         Defined when the C64 is the target system.
        __C128__        Defined when compiling for the 128.
        __ACE__         Defined when compiling for ACE.
        __PLUS4__       Defined when compiling for the Plus/4.

  The __CC65__ macro has the compiler version as its value, version 1.0
  of the compiler will define this macro as 0x100.

* The -a option is gone.

* The compiler will generate external references (via .globl) only if a
  function is defined as extern in a module, or not defined but called
  from a module. The old behaviour was to generate a reference for every
  function prototype ever seen, which meant that using a header file
  like stdio.h got most of the C library linked in, even if it was never
  used.

* Many new warnings added (about unused parameters, unused variables,
  compares of unsigneds against zero, function call without prototype
  and much more).

* Added a new compiler option (-W) to suppress all warnings.

* New internal variable __fixargs__ that gives the size of the fixed
  arguments a function takes. This makes it possible to work (somehow)
  around the problem that cc65 has the "wrong" (that is, Pascal) calling
  order. See below ("Known problems") for a discussion.

* The "empty" preprocessor directive ("#" on a line) is now ignored.

* Added a "#error" directive to force user errors.

* Optimization of the code generation. Constant parts of expressions are
  now detected in many places where the old compiler evaluated the
  constants at runtime.

* Allow local static variables (there was code in the original compiler
  for that, but it did not work). Allow also initialization in this case
  (no code for that in the original). Local static variables in the top
  level function block have no penalty. For static variables in nested
  blocks, the compiler generates a jump around the variable space. To
  eliminate this, an assembler/linker with support for segments is
  needed.

* You cannot return a value from a void function, and must return a value
  in a non-void function. Violations are flagged as an error.

* Typedefs added.

* The nonstandard evaluation of the NOARGC and FIXARGC macros has been
  replaced by a smart algorithm that does the same thing automagically
  and without user help (provided there are function prototypes).

* Function pointers may now be used to call a function without
  dereferencing. Given a function

        void f1 (void (*f2) ())

  the following was valid before:

        (*f2) ();

  The ANSI standard allows a second form (because there's no ambiguity)
  which is now also allowed:

        f2 ();

* Pointer subtraction was completely messed up and did not work (that is,
  subtraction of a pointer from a pointer produced wrong results). This
  is fixed now.

* Local struct definitions are allowed.

* Check types in assignments, parameters for function calls and more.

* A new long type (32 bit) is available. The integer promotion rules are
  applied if needed.
  This includes much more type checking and a better handling of chars
  (they are handled as chars, not as ints, in all places where this is
  possible).

* Integer constants now have an associated type, 'U' and 'L' modifiers
  may be used.

* The old #asm statement is gone. Instead, there's now an asm ("xxx")
  statement that has the syntax that is defined by the C++ standard (the
  C standard does not define an ASM statement). The string literal in
  parentheses is inserted in the assembler output. You may also use
  __asm__ instead of asm (see below).

* Allow // comments.

* New compiler option -A (ANSI) that disables several extensions (asm
  directive, // comments, unnamed function parameters) and also defines
  a macro named __STRICT_ANSI__. The header files will exclude some
  non-ANSI functions if __STRICT_ANSI__ is defined (that is, -A is given
  on the command line).
  -A will not disable the __asm__ directive (identifiers starting with
  __ are in the namespace of the implementation).

* Create optimized code if the address of a variable is a constant. This
  may be achieved by constructs like "*(char*)0x200 = 0x01" and is used
  to access absolute memory locations. The compiler detects this case
  also if structs or arrays are involved and generates direct stores and
  fetches.



3. Known problems
-----------------

* No floats.

* Only simple automatic variables may be initialized (no arrays).

* "Wrong" order of arguments on the stack. The arguments are pushed in
  the order in which they are parsed. That means that the va_xxx macros
  in stdarg.h are ok (they work as expected), but the fixed parameters of
  a function with a variable argument list do not match and must be
  determined with the (non-standard) va_fix macro.

  Using the __fixargs__ kludge, it is possible to write
  standard-conforming va_xxx macros to work with variable sized argument
  lists.
  However, the fixed parameters in the function itself usually have the
  wrong values, because the order of the arguments on the stack is
  reversed compared to a stock C compiler. Pushing the args the other way
  round requires much work and a more elaborate intermediate code than
  cc65 has.

  To understand the problem, have a look at this (non working!) sprintf
  function:

        int sprintf (char* buf, char* format, ...)
        /* Non working version */
        {
            int count;
            va_list ap;
            va_start (ap, format);
            count = vsprintf (buf, format, ap);
            va_end (ap);
            return count;
        }

  The problem here is in the "format" and "buf" parameters. In most cases
  they do not contain what the caller gave us as arguments. To access the
  "real" arguments, use the va_fix macro. It is only valid before the
  first call to va_arg, and takes the va_list and the number of the fixed
  argument as parameters. So the right way would be

        int sprintf (char* buf, char* format, ...)
        /* Working version */
        {
            int count;
            va_list ap;
            va_start (ap, format);
            count = vsprintf (va_fix (ap, 1), va_fix (ap, 2), ap);
            va_end (ap);
            return count;
        }

  The fixed parameters are obtained by using the va_fix macro with the
  number of the parameter given as second argument. Beware: Since the
  fixed arguments as declared usually occupy the place of some of the
  additional parameters, the following code, which tries to be somewhat
  portable, does *not* work. The assignment will overwrite the other
  parameters instead, causing unexpected results:

        int sprintf (char* buf, char* format, ...)
        /* Non working version */
        {
            int count;
            va_list ap;
            va_start (ap, format);
        #ifdef __CC65__
            buf    = va_fix (ap, 1);
            format = va_fix (ap, 2);
        #endif
            count = vsprintf (buf, format, ap);
            va_end (ap);
            return count;
        }

  To write a portable version of sprintf, use code like this instead:

        int sprintf (char* buf, char* format, ...)
        /* Working version */
        {
            int count;
            va_list ap;
            va_start (ap, format);
        #ifdef __CC65__
            count = vsprintf (va_fix (ap, 1), va_fix (ap, 2), ap);
        #else
            count = vsprintf (buf, format, ap);
        #endif
            va_end (ap);
            return count;
        }

  I know, va_fix is a kludge, but at least it *is* possible to write
  functions with variable sized argument lists in a comfortable manner.

* The assembler still accepts lots of illegal stuff without an error (and
  creates wrong code). Be careful!

* When starting a compiled program twice on the C64 (or 128), you may get
  other results or the program may even crash. This is because static
  variables do not have their startup values, they were changed in the
  first run.

* There's only *one* symbol table level. It is - via a flag - used for
  both local and global symbols. However, if you have variables in
  nested blocks, the names may collide with the ones in the upper block.
  I will probably add real symbol tables some time to remove this
  problem.

* Variables in nested blocks are handled inefficiently, especially in
  loops. The frame on the stack is allocated and deallocated for each
  loop iteration. There's no way around this, since the compiler does
  not have enough memory to hold a complete function body (if it had, it
  would be able to backpatch the frame generating code on function
  entry).



4. Library
----------

The C library is a complete rewrite and has nothing in common with the old
Atari stuff. When rewriting the library, I was guided by the following
rules:

* Use standard-conforming functions as far as possible. In addition, if
  there's an ANSI-C compatible function, it should act as defined in the
  ANSI standard. If it does not act as defined, this is an error.

* Do not use non-standard functions if the functionality of those
  functions is covered by a standard function. Use exceptions only if
  there is a non-ANSI function that is very popular (example: itoa).

* Use new style prototypes and header files.

* Make the library portable.
  For example, the complete stdio stuff is based on only four system
  dependent functions:

        open, read, write, close

  So, if you rewrite these functions for a new system, all others
  (printf, fprintf, fgets, fputc ...) will work, too.

* Do not expect a common character set. Unfortunately, I was not able to
  be completely consistent in this respect. C sources are no problem
  since the compiler does character translation, but the assembler
  sources make assumptions about the following characters:

        0       -->     code $30
        +       -->     code $2B
        -       -->     code $2D

  All other functions (especially the isxxx ones) are table driven, so
  only the classification table is system dependent.

The first port was for the ACE operating system. The current version has
also support for the C64, the C128 and the Plus/4 in native mode. The ACE
port has disk support but no conio module; all others have direct console
I/O but no disk support.

Currently the following limitations are known:

* getwd (ace) does not work. I get an error (carry flag) with an error
  code of zero (aceErrStopped). Maybe my code is wrong...

* The error codes are currently system error codes. They should be
  translated to something system independent. The ace codes are a good
  starting point. However, I don't like the idea that zero is a valid
  error code, and some other codes are missing ("invalid parameter" and
  more). As soon as this is done, it is also possible to write a
  strerror() function to give more descriptive error messages to the
  user.

* Many functions are not very well tested.

* The printf and heap functions are way too big. Rewriting _printf and
  malloc/free in assembler will probably squeeze 2K out of the code.

* The isxxx functions do not handle EOF correctly. This is probably a
  permanent restriction, even if it is non-standard. It would require
  extra code in each of the isxxx functions, since EOF is defined as -1
  and cannot be handled effectively with the table approach and 8 bit
  index registers.

* The strcspn, strpbrk and strspn functions have a string length
  limitation of 256 for the second argument. This is usually not a
  problem since the second argument gives a character set, and a
  character set cannot be larger than 256 chars for all known 6502
  systems.



5. Bugs
-------

Please note that the compiler and the libraries are beta! Send bug reports
to uz@cc65.org.