changeset 629:befc3fed07e5
Chapter 1 is finished
author | roug |
---|---|
date | Wed, 04 Dec 2002 21:04:16 +0000 |
parents | 661c14ca83e8 |
children | 73c67b086b74 |
files | docs/ccguide/chap1.chapter |
diffstat | 1 files changed, 492 insertions(+), 2 deletions(-) |
--- a/docs/ccguide/chap1.chapter Wed Dec 04 16:09:42 2002 +0000
+++ b/docs/ccguide/chap1.chapter Wed Dec 04 21:04:16 2002 +0000
@@ -60,8 +60,35 @@
 <section>
 <title>Differences from the K & R Specification</title>
-<para>
-</para>
+<itemizedlist spacing="compact">
+<listitem><para>
+Bit fields are not supported.
+</para></listitem>
+<listitem><para>
+Constant expressions for initializers may include arithmetic
+operators only if all the operands are of type INT or CHAR.
+</para></listitem>
+<listitem><para>
+The older forms of assignment operators, '=+' or '=*', which
+are recognized by some C compilers, are not supported. You
+must use the newer forms '+=', '*=', etc.
+</para></listitem>
+<listitem><para>
+"#ifdef (or #ifndef) ...[#else...] #endif" is supported but
+"#if <constant expression>" is not.
+</para></listitem>
+<listitem><para>
+It is not possible to extend macro definitions or strings
+over more than one line of source code.
+</para></listitem>
+<listitem><para>
+The escape sequence for new-line '\n' refers to the ASCII
+carriage return character (used by OS-9 for end-of-line), not
+linefeed (hex 0A). Programs which use '\n' for end-of-line
+(which includes all programs in K & R) will still work
+properly.
+</para></listitem>
+</itemizedlist>
 </section>
 <section>
@@ -70,18 +97,137 @@
 <section>
 <title>The <quote>Direct</quote> Storage Class</title>
 <para>
+The 6809 microprocessor instructions for accessing memory via
+an index register or the stack pointer can be relatively short and
+fast when they are used in C programs to access "auto" (function
+local) variables or function arguments. The instructions for
+accessing global variables are normally less compact: they must be four
+bytes long and are correspondingly slow. However, the 6809 has a
+feature which helps considerably. Memory anywhere in a single page
+(256-byte block) may be accessed with fast, two-byte instructions.
+This is called the "direct page", and at any time its location is
+specified by the contents of the "direct page register" within the
+processor. The linkage editor decides where this should be, and
+it need not concern the programmer, who only needs to specify for
+the compiler which variables should be in the direct page to give
+the maximum benefit in code size and execution speed.
+</para>
+<para>
+To this end, a new storage class specifier is recognized by the
+compiler. In the manner of K & R page 192, the sc-specifier list
+is extended as follows:
+<informaltable frame="none">
+<tgroup cols="3">
+<colspec colwidth="1.0in">
+<colspec colwidth="1.0in">
+<colspec colwidth="1.0in">
+<tbody>
+  <row>
+    <entry>Sc-specifier:</entry>
+    <entry>auto</entry>
+    <entry></entry>
+  </row>
+  <row>
+    <entry></entry>
+    <entry>static</entry>
+    <entry></entry>
+  </row>
+  <row>
+    <entry></entry>
+    <entry>extern</entry>
+    <entry></entry>
+  </row>
+  <row>
+    <entry></entry>
+    <entry>register</entry>
+    <entry></entry>
+  </row>
+  <row>
+    <entry></entry>
+    <entry>typedef</entry>
+    <entry></entry>
+  </row>
+  <row>
+    <entry></entry>
+    <entry>direct</entry>
+    <entry>(extension)</entry>
+  </row>
+  <row>
+    <entry></entry>
+    <entry>extern direct</entry>
+    <entry>(extension)</entry>
+  </row>
+  <row>
+    <entry></entry>
+    <entry>static direct</entry>
+    <entry>(extension)</entry>
+  </row>
+</tbody>
+</tgroup>
+</informaltable>
+The new key word may be used in place of one of the other sc-specifiers,
+and its effect is that the variable will be placed in
+the direct page. "DIRECT" creates a global direct page variable;
+"EXTERN DIRECT" references an EXTERNAL-type direct page variable;
+and "STATIC DIRECT" creates a local direct page variable. These new
+classes may not be used to declare function arguments. "Direct"
+variables can be initialized; as with other variables, those not
+explicitly initialized have the value zero at the start of program
+execution. 255 bytes are available in the direct page (the linker
+requires one byte). If all the direct variables occupy less than the
+full 255 bytes, the remaining global variables will occupy the
+balance and memory above if necessary. If too many bytes of storage
+are requested in the direct page, the linkage editor will report an
+error, and the programmer will have to reduce the use of DIRECT-type
+variables to fit the 256 bytes addressable by the 6809.
+</para>
+<para>
+It should be kept in mind that "direct" is unique to this
+compiler, and it may not be possible to transport programs written
+using "direct" to other environments without modification.
 </para>
 </section>
 <section>
 <title>Embedded Assembly Language</title>
 <para>
+As versatile as C is, occasionally there are some things that
+can only be done (or done at maximum speed) in assembly language.
+The OS-9 C compiler permits user-supplied assembly-language
+statements to be directly embedded in C source programs.
+</para>
+<para>
+A line beginning with "#asm" switches the compiler into a mode
+which passes all subsequent lines directly to the assembly-language
+output, until a line beginning with "#endasm" is encountered.
+"#endasm" switches the mode back to normal. Care should be
+exercised when using this directive so that the correct code section
+is adhered to. Normal code from the compiler is in the PSECT (code)
+section. If your assembly code uses the VSECT (variable) section,
+be sure to put an ENDSECT directive at the end to leave the state
+correct for the compiler-generated code that follows.
 </para>
 </section>
 <section>
 <title>Control Character Escape Sequences</title>
 <para>
+The escape sequences for non-printing characters in character
+constants and strings (see K & R page 181) are extended as follows:
+<programlisting>
+    linefeed (LF):   \l     (lower case 'ell')
+</programlisting>
+This is to distinguish LF (hex 0A) from \n, which on OS-9 is the same
+as \r (hex 0D).
+<programlisting>
+    bit patterns:    \NNN   (octal constant)
+                     \dNNN  (decimal constant)
+                     \xNN   (hexadecimal constant)
+</programlisting>
+For example, the following all have a value of 255 (decimal):
+<programlisting>
+    \377    \xff    \d255
+</programlisting>
 </para>
 </section>
 </section>
@@ -98,12 +244,94 @@
 <section>
 <title>Data Representation and Storage Requirements</title>
 <para>
+Each variable type requires a specific amount of memory for
+storage. The sizes of the basic types in bytes are as follows:
+</para>
+<informaltable frame="none">
+<tgroup cols="3">
+<colspec colwidth="0.8in">
+<colspec colwidth="0.4in">
+<colspec colwidth="3.0in">
+<thead>
+<row>
+<entry>Data Type</entry>
+<entry>Size</entry>
+<entry>Internal Representation</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry>CHAR</entry>
+<entry>1</entry>
+<entry>two's complement binary</entry>
+</row>
+<row>
+<entry>INT</entry>
+<entry>2</entry>
+<entry>two's complement binary</entry>
+</row>
+<row>
+<entry>UNSIGNED</entry>
+<entry>2</entry>
+<entry>unsigned binary</entry>
+</row>
+<row>
+<entry>LONG</entry>
+<entry>4</entry>
+<entry>two's complement binary</entry>
+</row>
+<row>
+<entry>FLOAT</entry>
+<entry>4</entry>
+<entry>binary floating point (see below)</entry>
+</row>
+<row>
+<entry>DOUBLE</entry>
+<entry>8</entry>
+<entry>binary floating point (see below)</entry>
+</row>
+</tbody>
+</tgroup>
+</informaltable>
+<para>
+This compiler follows the PDP-11 implementation and format in
+that CHARs are converted to INTs by sign extension, "SHORT" or
+"SHORT INT" means INT, "LONG INT" means LONG and "LONG FLOAT" means
+DOUBLE. The format for DOUBLE values is as follows:
+</para>
+<screen>
+(low byte)                                 (high byte)
++-+---------------------------------------+----------+
+! !              seven byte               !          !
+! !               mantissa                !          !
++-+---------------------------------------+----------+
+ ^ sign bit
+</screen>
+<para>
+The form of the mantissa is sign and magnitude with an implied
+"1" bit at the sign bit position. The exponent is biased by 128.
+The format of a FLOAT is identical, except that the mantissa is only
+three bytes long. Conversion from DOUBLE to FLOAT is carried out by
+truncating the least significant (right-most) four bytes of the
+mantissa. The reverse conversion is done by padding the least
+significant four mantissa bytes with zeros.
 </para>
 </section>
 <section>
 <title>Register Variables</title>
 <para>
+One register variable may be declared in each function. The
+only types permitted for register variables are int, unsigned and
+pointer. Invalid register variable declarations are ignored; i.e.,
+the storage class is made auto. For further details see K & R page 81.
+</para>
+<para>
+A considerable saving in code size and speed can be made by
+judicious use of a register variable. The most efficient use is
+made of it for a pointer or a counter for a loop. However, if a
+register variable is used for a complex arithmetic expression, there
+is no saving. The "U" register is assigned to register variables.
 </para>
 </section>
@@ -128,12 +356,39 @@
 <section>
 <title>Operating System Calls</title>
 <para>
+The system interface supports almost all the system calls of
+both OS-9 and UNIX. In order to facilitate the portability of
+programs from UNIX, some of the calls use UNIX names rather than
+OS-9 names for the same function. There are a few UNIX calls that
+do not have exactly equivalent OS-9 calls. In these cases, the
+library function simulates the function of the corresponding UNIX
+call. In cases where there are OS-9 calls that do not have UNIX
+equivalents, the OS-9 names are used. Details of the calls and a
+name cross-reference are provided in the "C System Calls" section of
+this manual.
 </para>
 </section>
 <section>
 <title>The Standard Library</title>
 <para>
+The C compiler includes a very complete library of standard
+functions. Any program which uses functions
+from the standard library must contain the statement:
+<programlisting>
+    #include <stdio.h>
+</programlisting>
+See the "C Standard Library" section of this manual for details on
+the standard library functions provided.
+</para>
+<para>
+IMPORTANT NOTE: If output via printf(), fprintf() or sprintf() of
+long integers is required, the program MUST call "pflinit()" at some
+point; this is necessary so that programs not involving LONGs do not
+have the extra LONGs output code appended. Similarly, if FLOATs or
+DOUBLEs are to be printed, "pffinit()" MUST be called. These functions
+do nothing; the existence of calls to them in a program informs
+the linker that the relevant routines are also needed.
 </para>
 </section>
 </section>
@@ -141,6 +396,29 @@
 <section>
 <title>Run-time Arithmetic Error Handling</title>
 <para>
+K & R leave the treatment of various arithmetic errors open,
+merely saying that it is machine dependent. This implementation
+deals with a limited number of error conditions in a special way; it
+should be assumed that the results of other possible errors are
+undefined.
+</para>
+<para>
+Three new system error numbers are defined in <errno.h>:
+<programlisting>
+  #define EFPOVR   40  /* floating point overflow or underflow */
+  #define EDIVERR  41  /* division by zero */
+  #define EINTERR  42  /* overflow on conversion of floating point
+                          to long integer */
+</programlisting>
+</para>
+<para>
+If one of these conditions occurs, the program will send a
+signal to itself with the value of one of these errors. If not
+caught or ignored, this will cause termination of the program with
+an error return to the parent process. However, the program can
+catch the interrupt using "signal()" or "intercept()" (see C System
+Calls), and in this case the service routine has the error number as
+its argument.
 </para>
 </section>
@@ -171,12 +449,35 @@
 <section>
 <title>The Optimizer Pass</title>
 <para>
+The optimizer pass automatically occurs after the compilation
+passes. It reads the assembler source code text and removes
+redundant code and searches for code sequences that can be replaced
+by shorter and faster equivalents. The optimizer will shorten object
+code by about 11% with a significant increase in program execution
+speed. The optimizer is recommended for production versions of
+debugged programs. Because this pass takes additional time, the "-O"
+compiler option can be used to inhibit it during error-checking-only
+compilations.
 </para>
 </section>
 <section>
 <title>The Profiler</title>
 <para>
+The profiler is an optional method used to determine the
+frequency of execution of each function in a C program. It allows
+you to identify the most-frequently used functions where algorithmic
+or C source code programming improvements will yield the greatest
+gains.
+</para>
+<para>
+When the "-P" compiler option is selected, code is generated at
+the beginning of each function to call the profiler module (called
+"_prof"), which counts invocations of each function during program
+execution. When the program has terminated, the profiler
+automatically prints a list of all functions and the number of times
+each was called. The profiler slightly reduces program execution
+speed. See "prof.c" source for more information.
 </para>
 </section>
 </section>
@@ -184,11 +485,112 @@
 <section>
 <title>C Compiler Component Files and File Usage</title>
 <para>
+Compilation of a C program by cc requires that the following
+files be present in the current execution directory (CMDS).
+</para>
+
+<table frame="none">
+<title>OS-9 Level I Systems</title>
+<tgroup cols="2">
+<colspec colwidth="1.0in">
+<colspec colwidth="3.0in">
+<tbody>
+  <row>
+    <entry>cc1</entry>
+    <entry>compiler executive program</entry>
+  </row>
+  <row>
+    <entry>c.prep</entry>
+    <entry>macro pre-processor</entry>
+  </row>
+  <row>
+    <entry>c.pass1</entry>
+    <entry>compiler pass 1</entry>
+  </row>
+  <row>
+    <entry>c.pass2</entry>
+    <entry>compiler pass 2</entry>
+  </row>
+  <row>
+    <entry>c.opt</entry>
+    <entry>assembly code optimizer</entry>
+  </row>
+  <row>
+    <entry>c.asm</entry>
+    <entry>relocating assembler</entry>
+  </row>
+  <row>
+    <entry>c.link</entry>
+    <entry>linkage editor</entry>
+  </row>
+</tbody>
+</tgroup>
+</table>
+
+
+<table frame="none">
+<title>OS-9 Level II Systems</title>
+<tgroup cols="2">
+<colspec colwidth="1.0in">
+<colspec colwidth="3.0in">
+<tbody>
+  <row>
+    <entry>cc2</entry>
+    <entry>compiler executive program</entry>
+  </row>
+  <row>
+    <entry>c.prep</entry>
+    <entry>macro pre-processor</entry>
+  </row>
+  <row>
+    <entry>c.comp</entry>
+    <entry>compiler proper</entry>
+  </row>
+  <row>
+    <entry>c.opt</entry>
+    <entry>assembly code optimizer</entry>
+  </row>
+  <row>
+    <entry>c.asm</entry>
+    <entry>relocating assembler</entry>
+  </row>
+  <row>
+    <entry>c.link</entry>
+    <entry>linkage editor</entry>
+  </row>
+</tbody>
+</tgroup>
+</table>
+<para>
+In addition a file called "clib.l" contains the standard library,
+math functions, and systems library. The file "cstart.r" is
+the setup code for compiled programs. Both of these files must be
+located in a directory named "LIB" on the system's default mass
+storage device, which is specified in the OS-9 "INIT" module and is
+usually the disk drive the system is booted from.
+</para>
+<para>
+If, when specifying "#include" files for the pre-processor to
+read in, the programmer uses angle brackets, "<" and ">", instead of
+double quotes, the file will be sought starting at the "DEFS"
+directory on whichever drive is the default system drive for the
+running system.
 </para>
 <section>
 <title>Temporary Files</title>
 <para>
+A number of temporary files are created in the current data
+directory during compilation, and it is important to ensure that
+enough space is available on the disk drive. As a rough guide, at
+least three times the number of blocks in the largest source file
+(and its included files) should be free.
+</para>
+<para>
+The identifiers "etext", "edata", and "end" are predefined in the
+linkage editor and may be used to establish the addresses of the end
+of executable text, initialized data, and uninitialized data
+respectively.
 </para>
 </section>
 </section>
@@ -196,6 +598,94 @@
 <section>
 <title>Running the Compiler</title>
 <para>
+There are two commands which invoke distinct versions of the
+compiler. "cc1" is for OS-9 Level I, which uses a two-pass compiler,
+and "cc2" is for Level II, which uses a single-pass version. Both
+versions of the compiler work identically; the main difference is
+that cc1 has been divided into two passes to fit the smaller memory
+size of OS-9 Level I systems. In the following text, "cc" refers to
+either "cc1" or "cc2" as appropriate for your system. The syntax of
+the command line which calls the compiler is:
+</para>
+<cmdsynopsis>
+  <command>cc</command>
+  <arg>option-flags</arg>
+  <arg rep="repeat" choice="plain"><replaceable>file</replaceable></arg>
+</cmdsynopsis>
+<para>
+One file at a time can be compiled, or a number of files may be
+compiled together. The compiler manages the compilation through up
+to four stages: pre-processor, compilation to assembler code,
+assembly to relocatable code, and linking to binary executable
+code (in OS-9 memory module format).
+</para>
+<para>
+The compiler accepts three types of source files, provided each
+name on the command line has the relevant postfix as shown below.
+Any of these file types may be mixed on the command line.
+</para>
+<table frame="none">
+<title>File Name Suffix Conventions</title>
+<tgroup cols="3">
+<colspec colwidth="0.5in">
+<colspec colwidth="3.0in">
+<thead>
+<row>
+  <entry>Suffix</entry>
+  <entry>Usage</entry>
+</row>
+</thead>
+<tbody>
+<row>
+  <entry>.c</entry>
+  <entry>C source file</entry>
+</row>
+<row>
+  <entry>.a</entry>
+  <entry>assembly language source file</entry>
+</row>
+<row>
+  <entry>.r</entry>
+  <entry>relocatable module</entry>
+</row>
+<row>
+  <entry>none</entry>
+  <entry>executable binary (OS-9 memory module)</entry>
+</row>
+</tbody>
+</tgroup>
+</table>
+<para>
+There are two modes of operation: multiple source file and
+single source file. The compiler selects the mode by inspecting
+the command line. The usual mode is single source and is specified
+by having only one source file name on the command line. Of
+course, more than one source file may be compiled together by using
+the "#include" facility in the source code. In this mode, the
+compiler will use the name obtained by removing the postfix from the
+name supplied on the command line, and the output file (and the
+memory module produced) will have this name. For example:
+<screen>
+    cc prg.c
+</screen>
+will leave an executable file called "prg" in the current execution
+directory.
+</para>
+<para>
+The multiple source mode is specified by having more than one
+source file name on the command line. In this mode, the object code
+output file will have the name "output" in the current execution
+directory, unless a name is given using the "-f=" option (see
+below). Also, in multiple source mode, the relocatable modules
+generated as intermediate files will be left in the same directories
+as their corresponding source files with the postfixes changed to
+".r". For example:
+<screen>
+    cc prg1.c /d0/fred/prg2.c
+</screen>
+will leave an executable file called "output" in the current
+execution directory, one file called "prg1.r" in the current data
+directory, and "prg2.r" in "/d0/fred".
 </para>
 </section>
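
Reviewer's illustration (not part of the changeset): the output-naming rules in the last two added paragraphs can be sketched in standard C roughly as follows. This is only a sketch for a modern hosted compiler, not code from cc itself. The helper names `swap_suffix` and `executable_name` are hypothetical, and since the text does not say how "-f=" behaves in single source mode, the sketch simply ignores it there.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the compiler: copy `name` into `out`
   (capacity `outsz`), replacing a trailing ".c", ".a" or ".r" suffix with
   `newsuffix` (pass "" to simply remove it). */
void swap_suffix(const char *name, const char *newsuffix,
                 char *out, size_t outsz)
{
    size_t len = strlen(name);

    assert(out != NULL && outsz > 0);
    if (len > 2 && name[len - 2] == '.' && strchr("car", name[len - 1]) != NULL)
        len -= 2;                       /* drop the two-character suffix */
    snprintf(out, outsz, "%.*s%s", (int)len, name, newsuffix);
}

/* Name of the executable under the rules described in the text:
   one source file  -> its name with the suffix removed;
   several files    -> "output", unless a "-f=" name was given.
   `f_name` is NULL when no "-f=" option is present (assumption: the
   option is only honoured in multiple source mode). */
void executable_name(int nfiles, const char *files[],
                     const char *f_name, char *out, size_t outsz)
{
    assert(nfiles >= 1);
    if (nfiles == 1)
        swap_suffix(files[0], "", out, outsz);
    else
        snprintf(out, outsz, "%s", f_name != NULL ? f_name : "output");
}
```

Calling `swap_suffix(path, ".r", buf, sizeof buf)` on each source path gives the location of the intermediate relocatable module in multiple source mode, since the text says it stays in the same directory as its source file.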