view docs/ccguide/chap1.chapter @ 617:fd2cb29abee5

Chapter 2 finished
author roug
date Sat, 30 Nov 2002 09:44:49 +0000
parents b8ed2006640e
children befc3fed07e5

<chapter>
<title>The C Compiler System</title>

<section>
<title>Introduction</title>
<para>
The "C" programming language is rapidly growing in popularity
and seems destined to become one of the most popular programming
languages used for microcomputers. The rapid rise in the use of C
is not surprising. C is an incredibly versatile and efficient
language that can handle tasks that previously would have required
complex assembly language programming.
</para>
<para>
C was originally developed at Bell Telephone Laboratories as an
implementation language for the UNIX operating system by Brian
Kernighan and Dennis Ritchie. They also wrote a book titled <quote>The
C Programming Language</quote> which is universally accepted as the standard
for the language. It is an interesting reflection on the language
that although no formal industry-wide <quote>standard</quote> was ever developed
for C, programs written in C tend to be far more portable between
radically different computer systems as compared to so-called
<quote>standardized</quote> languages such as BASIC, COBOL, and PASCAL. The
reason C is so portable is that the language is so inherently
expandable that if some special function is required, the user can
create a portable extension to the language, as opposed to the
common practice of adding additional statements to the language.
For example, the number of special-purpose BASIC dialects defies all
reason. A lesser factor is the underlying UNIX operating system,
which is also sufficiently versatile to discourage bastardization of
the language. Indeed, standard C compilers and UNIX are intimately
related.
</para>
<para>
Fortunately, the 6809 microprocessor, the OS-9 operating
system, and the C language form an outstanding combination. The
6809 was specifically designed to efficiently run high-level
languages, and its stack-oriented instruction set and versatile
repertoire of addressing modes handle the C language very well. As
mentioned previously, UNIX and C are closely related, and because
OS-9 is derived from UNIX, it also supports C to the degree that
almost any application written in C can be transported from a UNIX
system to an OS-9 system, recompiled, and correctly executed.
</para>
</section>
<section>
<title>The Language Implementation</title>
<para>
OS-9 C is implemented almost exactly as described in <quote>The C
Programming Language</quote> by Kernighan and Ritchie (hereafter referred
to as K&amp;R).
</para>
<para>
Although this version of C follows the specification faithfully,
there are some differences. The differences mostly reflect
parts of C that are obsolete or the constraints imposed by memory
size limitations.
</para>
</section>

<section>
<title>Differences from the K &amp; R Specification</title>
<para>
</para>
</section>

<section>
<title>Enhancements and Extensions</title>

<section>
<title>The <quote>Direct</quote> Storage Class</title>
<para>
</para>
</section>

<section>
<title>Embedded Assembly Language</title>
<para>
</para>
</section>

<section>
<title>Control Character Escape Sequences</title>
<para>
</para>
</section>
</section>

<section>
<title>Implementation-dependent Characteristics</title>
<para>
K &amp; R frequently refer to characteristics of the C language
whose exact operations depend on the architecture and instruction
set of the computer actually used. This section contains specific
information regarding this version of C for the 6809 processor.
</para>

<section>
<title>Data Representation and Storage Requirements</title>
<para>
</para>
</section>

<section>
<title>Register Variables</title>
<para>
</para>
</section>

<section>
<title>Access To Command Line Parameters</title>
<para>
The standard C arguments "argc" and "argv" are available to
"main" as described in K &amp; R page 110. The start-up routine for C
programs ensures that the parameter string passed to it by the
parent process is converted into null-terminated strings as expected
by the program. In addition, it will run together as a single
argument any strings enclosed between single or double quotes ("'" or '"').
If either is part of the string required, then the other
should be used as a delimiter.
</para>
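<para>
As a rough illustration of the argument handling described above, the
sketch below prints each argument it is handed. The helper name
"show_args" is hypothetical and is not an OS-9 library routine. The
code uses modern function declarations for readability; because this
compiler follows K &amp; R, the parameter declarations would need the
older form there.
</para>
<programlisting><![CDATA[
```c
#include <stdio.h>

/* show_args is a hypothetical helper, not an OS-9 library routine.
   It prints every argument on its own line and returns the count. */
int show_args(int argc, char *argv[])
{
    int i;

    for (i = 0; i < argc; i++)
        printf("argv[%d] = %s\n", i, argv[i]);
    return argc;
}
```
]]></programlisting>
<para>
If "main" simply passes its "argc" and "argv" on to this helper, then,
going by the quoting rule above, a command line such as
prog "one two" 'say "hi"' would be expected to show two arguments
after argv[0]: one two, and say "hi".
</para>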
</section>
</section>

<section>
<title>System Calls and the Standard Library</title>

<section>
<title>Operating System Calls</title>
<para>
</para>
</section>

<section>
<title>The Standard Library</title>
<para>
</para>
</section>
</section>

<section>
<title>Run-time Arithmetic Error Handling</title>
<para>
</para>
</section>

<section>
<title>Achieving Maximum Program Performance</title>

<section>
<title>Programming Considerations</title>
<para>
Because the 6809 is an 8/16 bit microprocessor, the compiler
can generate efficient code for 8 and 16 bit objects (CHARs, INTs,
etc.). However, code for 32 and 64 bit values (LONGs, FLOATs,
DOUBLEs) can be at least four times longer and slower. Therefore,
don't use LONG, FLOAT, or DOUBLE where INT or UNSIGNED will do.
</para>
<para>
The compiler can perform extensive evaluation of constant
expressions provided they involve only constants of type CHAR, INT,
and UNSIGNED. There is no constant expression evaluation at
compile-time (except single constants and "casts" of them) where
there are constants of type LONG, FLOAT, or DOUBLE. Therefore,
complex constant expressions involving these types are evaluated at
run time by the compiled program. You should manually compute the
value of constant expressions of these types if speed is essential.
</para>
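<para>
A minimal sketch of both points, assuming a program that converts days
to seconds. The names "SECS_PER_DAY" and "days_to_seconds" are
illustrative only. The LONG constant is worked out by hand instead of
being written as a constant expression, and the day count is kept as an
INT. The function declaration uses the modern form for readability;
under this compiler it would need the K &amp; R form.
</para>
<programlisting><![CDATA[
```c
/* 24L * 60L * 60L worked out by hand, so no LONG arithmetic on
   constants is left for the compiled program to do at run time. */
#define SECS_PER_DAY 86400L

/* days_to_seconds is an illustrative name. The day count stays an INT;
   only the result, which can exceed 16 bits, is a LONG. */
long days_to_seconds(int days)
{
    return days * SECS_PER_DAY;
}
```
]]></programlisting>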
</section>

<section>
<title>The Optimizer Pass</title>
<para>
</para>
</section>

<section>
<title>The Profiler</title>
<para>
</para>
</section>
</section>

<section>
<title>C Compiler Component Files and File Usage</title>
<para>
</para>

<section>
<title>Temporary Files</title>
<para>
</para>
</section>
</section>

<section>
<title>Running the Compiler</title>
<para>
</para>
</section>

<section>
<title>Compiler Option Flags</title>
<para>
The compiler recognizes several command-line option flags which
modify the compilation process where needed. All flags are
recognized before compilation commences so the flags may be placed
anywhere on the command line. Flags may be run together as in "-ro",
except where a flag is followed by something else; see "-f=" and
"-d" for examples.
</para>
<para>
-A
suppresses assembly, leaving the output as assembler code in a
file whose name is postfixed ".a".
</para>
<para>
-E=&lt;number&gt;
Set the edition number constant byte to the number given. This is
an OS-9 convention for memory modules.
</para>
<para>
-O
inhibits the assembly code optimizer pass. The optimizer will
shorten object code by about 11% with a comparable increase in speed
and is recommended for production versions of debugged programs.
</para>
<para>
-P
invokes the profiler to generate function frequency
statistics after program execution.
</para>
<para>
-R
suppresses linking library modules into an executable program.
Outputs are left in files with postfixes ".r".
</para>
<para>
-M=&lt;memory size&gt;
will instruct the linker to allocate &lt;memory size&gt;
for data, stack, and parameter area. Memory size may be expressed
in pages (an integer) or in kilobytes by appending "k" to an 
integer. For more details of the use of this option, see the
"Memory Management" section of this manual.
</para>
<para>
-L=&lt;filename&gt;
specifies a library to be searched by the linker
before the Standard Library and system interface.
</para>
<para>
-F=&lt;path&gt;
overrides the above output file naming. The output file
will be left with &lt;path&gt; as its name. This flag does not make
sense in multiple source mode when either the -a or -r flag is also
present. The module will be called the last name in &lt;path&gt;.
</para>
<para>
-C
will output the source code as comments with the assembler code.
</para>
<para>
-S
stops the generation of stack-checking code. -S should only be
used with great care when the application is extremely time-critical
and when the use of the stack by compiler generated code is fully
understood.
</para>
<para>
-D&lt;identifier&gt;
is equivalent to "#define &lt;identifier&gt;" written in
the source file. -D is useful where different versions of a program
are maintained in one source file and differentiated by means of the
"#ifdef" or "#ifndef" pre-processor directives. If the &lt;identifier&gt;
is used as a macro for expansion by the pre-processor, "1" (one) will
be the expanded "value" unless the form "-d&lt;identifier&gt;=&lt;string&gt;" is
used in which case the expansion will be &lt;string&gt;.
</para>
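<para>
The sketch below shows how a source file might respond to -D. The
identifiers "TRACE" and "BUF_K" and both function names are
hypothetical, chosen only for this illustration. The function
declarations use the modern form for readability; under this compiler
they would need the K &amp; R form.
</para>
<programlisting><![CDATA[
```c
/* TRACE and BUF_K are hypothetical identifiers chosen for this sketch. */

/* With "-dTRACE" on the command line TRACE is defined, so the first
   branch is compiled; without the flag the second branch is. */
int tracing_enabled(void)
{
#ifdef TRACE
    return 1;
#else
    return 0;
#endif
}

/* With "-dBUF_K=8" the macro expands to 8; this default applies
   only when no such flag was given. */
#ifndef BUF_K
#define BUF_K 4
#endif

int buffer_bytes(void)
{
    return BUF_K * 1024;
}
```
]]></programlisting>
<para>
Compiled with no -D flags, this source takes the second branch of the
"#ifdef" and uses the default of 4 for "BUF_K".
</para>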
<table frame="none">
<title>Command Line and Option Flag Examples</title>
<tgroup cols="3">
<colspec colwidth="1.5in" colname="c1">
<colspec colwidth="1.5in" colname="c2">
<colspec colwidth="1.5in" colname="c3">
<thead>
    <row>
	<entry>command line</entry>
	<entry>action</entry>
	<entry>output file(s)</entry>
    </row>
</thead>
<tbody>
    <row>
	<entry>cc prg.c</entry>
	<entry>compile to an executable program</entry>
	<entry>prg</entry>
    </row>
    <row>
	<entry>cc prg.c -a</entry>
	<entry>compile to assembly language source code</entry>
	<entry>prg.a</entry>
    </row>
    <row>
	<entry>cc prg.c -r</entry>
	<entry>compile to relocatable module</entry>
	<entry>prg.r</entry>
    </row>
    <row>
	<entry>cc prg1.c prg2.c prg3.c</entry>
	<entry>compile to executable program</entry>
	<entry>prg1.r, prg2.r, prg3.r, output</entry>
    </row>
    <row>
	<entry>cc prg1.c prg2.a prg3.r</entry>
	<entry>compile prg1.c, assemble prg2.a and combine all into
an executable program</entry>
	<entry>prg1.r, prg2.r</entry>
    </row>
    <row>
	<entry>cc prg1.c prg2.c -a</entry>
	<entry>compile to assembly language source code</entry>
	<entry>prg1.a, prg2.a</entry>
    </row>
    <row>
	<entry>cc prg1.c prg2.c -f=prg</entry>
	<entry>compile to executable program</entry>
	<entry>prg</entry>
    </row>
    </tbody>
</tgroup>
</table>

</section>
</chapter>