The Essentials of C.
(Under Construction)



Online and Hard Copy References


The C Programming Language may have been one of the most elegant and beautiful computer languages ever devised; a personal opinion. I'm not sure how fond I am of the new ANSI standard for it. Let me describe the original C as designed by Ken Thompson and Dennis Ritchie.

See more of the history at Dennis Ritchie's http://cm.bell-labs.com/cm/cs/who/dmr/chist.html.

C is a small language, with about 30 reserved words. Technically, C itself knows nothing of I/O; this is contained in the standard library of routines. C is usually augmented by a preprocessor and a preprocessing language that provides for definitions of constants, macros, includes, and conditional compilation. C is a structured language; its central object is that of 'function'. A C program is a function, which may call other functions. Functions take data type arguments and return data type values. Functions are organized or grouped into modules (files).

C is often called a 'low level' language. It also happens to be, in principle, a high level language. Like Assembler languages, C incorporates data reference 'by indirection', or by 'pointers'. Because of this, it becomes a 90% replacement for assembly language and useful in the writing of operating systems, unix in particular. Memory is addressable by pointers; only registers may not be so accessed as these are machine specific. It allows conversions of data types freely. It allows the definitions of new compound data types of unbounded complexity. It is the opposite of Cobol (yuch!), requiring a minimal number of keystrokes to write a statement; it can be laconic to the point of impenetrable terseness. It is a free form language, ignoring white space. Because of its great freedom and minimal vocabulary, genius can write poetry in it while ....

It has been said, because of its small core structure, and short list of reserved words, that C is simple. To learn the structure and the reserved words and their meanings is easy; to learn what to do with them and how to use them to create programs that actually do something interesting is anything but simple. Among human languages, most programming languages are like English with a simple usage and an enormous vocabulary; C is more like many Amerind languages, Cherokee, in particular, where the "root vocabulary" is relatively small and literacy or fluency is not so much a matter of memorizing a huge vocabulary, but is more a matter of inventive usage of the syntactical combining forms that give precision, color, style and complex expressiveness. It is very easy to make a fool of yourself in C. There is a fine line in writing good C code that avoids brutish and plodding obviousness resulting in too much code to do a simple thing, and that also avoids using every arcane compression of statements available so that six months later, even the coder is scratching his or her head about what the code does. If you learn C well enough to do that and can't resist your own cleverness, then document internally what's going on, and spare yourself grief later. The formatting style is free though there are established conventions. Develop whatever style suits you and stick with it. The ease of reading your own code will be greatly increased if the style of format remains the same.

The most common source of confusing error in newbie code is mistaking an address (pointer) for a variable. Classic is the "scanf gotcha". Standard library functions of the scanf family always take pointers as arguments. While the name of an array is understood as a pointer to the first element of the first dimension of the array (its value is an address), the name of an int, float, or double variable is not a pointer, but just a variable (its value is a number). In the writing of operating systems, this variable just might be an address and so, syntactically, not using a pointer where one might be called for is allowed by the compiler. It is up to the coder to know what is proper. If an impropriety of pointers is committed, the program will typically compile, perhaps with a warning, but bomb on execution with a segmentation violation.

The second most common error in C coding will often also give a segmentation violation. It is "simply" allowing an index of an array to exceed the declared bounds of the array. Other types of errors are either too simple or too subtle and specific to discuss here. Code and you will err; I think that means learning.

What follows is a mini tutorial in getting started with C, assuming that you are working under a unix system. I'm not going to define the language; there are full length books for that. Nor will I tell all; that needs a few more books. This is just to get started making and explaining a few supersimple programs.

The simplest and classic 'first program' in C looks like this:


   /* first.c
    * 
    * Our first C program.
    * 
    */
   #include <stdio.h>
   #include <stdlib.h>   /* declares exit() on ANSI systems */

   int main( argc, argv )
   int argc;
   char *argv[];
   {
        puts( "Hello World!" );
        exit( 0 );
   }

There are a number of variations that I could have used; there are things I could have left out. But, I tend to be explicit and write things in, even when I know they would be understood by the compiler as 'default'. It's my style, and who's writing this anyhow?

This code is presumably kept in a file 'first.c' (C code should always have the '.c' extension; in unix it's always lower case.) The compilation line is then

          cc first.c

     which produces a compiled program file called 'a.out', or

          cc -o first first.c

     which produces a compiled program file called 'first'.

A /* Comment */ is delimited by the two digraphs indicated. My extra vertically aligned stars are just a matter of style and mean nothing to the compiler. They are just part of that which is being ignored in the first place. COMMENTS DO NOT NEST.

The line


#include <stdio.h>

is a directive to the C preprocessor 'cpp'. All directives to cpp are lines whose first character is '#', and conversely. The include statement says to read into the code file the contents of a file 'stdio.h' before going to the compiler 'cc'. The enclosing angle brackets indicate that the file is to be found in a standard directory that cpp knows about. On a unix system this is 'always' the directory /usr/include. It is interesting to look at this file to see what it contains.

The function which is the program is always named 'main'. Notice that main takes two arguments that are separated by a comma. Comma is the separator for function arguments; it has another function also, but I'll get to that later. The function main() always returns an 'int' data type, and in the original C a function with no declared return type is taken to return int, so I didn't have to say so explicitly. Also, main() can have a third argument that contains the environment variables of the shell. I'm ignoring that. Notice that I've also ignored the values of the command line variables argc and argv. For this program, I could have written main() as

   main()
   {
        puts( "Hello World!" );
   }

The int argc is the number of arguments on the command line, INCLUDING THE PROGRAM NAME ITSELF. The argv variable is an array of pointers to char, each one pointing to an argument string. For the time being think of it as a two dimensional array or matrix of characters, one row per argument, where
          argv[0]  =  program name (as it is actually called!)
          argv[1]  =  the first argument
          argv[2]  =  the second argument
                 ...
          argv[argc-1]  =  the last argument
Our little program doesn't need any arguments, so in this case it doesn't matter.

Many useful programs, in keeping with the unix philosophy are small and are contained in a single module (file). The following is a list of links to program and header files of such utilities:

  1. addays.c
  2. ark.c
    A text file archiver, formatted manual, nroff manual
  3. ascii.h
  4. chmog.c
  5. chmogdir.c
  6. chog.c
  7. datediff.c
  8. ftotal.c
    column totaling, with manual
  9. grafit.c
    A function, not a program
  10. gtfield.c
  11. gtln.c
  12. rmi.c
  13. sdate.c
  14. stdfun.h
  15. suffx.c
  16. tree.c
  17. uid.c
  18. A Simple Involutional encryption program (Alan Filipski)
  19. daemon.c
  20. A Module of miscellaneous C functions
  21. A Module of some I/O C functions

Because of the different flavors of unix type OSs, these are not guaranteed to work as they stand, but with a few adjustments, there is no reason why they can't. The code is pretty much documented internally, which is good practice. It is free for the taking. If you take it and have difficulties making it work ON A UNIX TYPE SYSTEM, email me, and I will try to help. Forget them for DOS or anything like it; why waste your time? Most of the programs can be particularly useful from within shell scripts. The functions in the module are improvements and additions to the standard libraries. I'll probably be adding some other general utilities to the list in the future.

The function puts() is a standard function from the standard I/O library. It writes the string it is given (a literal string is delimited by double quotes) to "standard out" (for the time being, your screen), appending a newline character. Note: a literal string, in C, is treated as a pointer to char; which is to say, puts() expects as an argument a data type pointer to char. If one added the line


     puts( argv[0] );

the calling name of the program would be printed out on a separate line.

NOTE: for security reasons most nonprivate unix systems are set so that the shell does not look in your current directory for an executable program. If "command not found" or something similar results from your trying to run the program by its file name, try

           ./first
  or
           ./a.out
depending on which compiler command you used. This tells the shell that the file to be executed is found in ".", unix shorthand for "the current directory".

The included file stdio.h for the standard function library contains I/O definitions, macros, and declarations of returns of standard functions. The standard functions are mostly I/O functions. For string manipulating functions like strlen(), strcmp(), strncmp(), strcpy() and others, be sure to have the header file string.h included also. These functions are also in the standard library so you do not have to link with another library in the compile command. To use mathematical functions (mostly returning data type 'double', i.e., double precision floating point) you will need to link the math library with '-lm' in the compile command line and include the header file math.h. There are other function libraries and other header files.

Let's write a slightly more complicated program to illustrate a few more locutions of C. The following is a cheap version of the unix utility 'cat'. In real life, cat takes multiple file arguments reading them from left to right in sequence and copies each file to standard out, so that


               cat foo1 foo2 foo3 foo4 > FOO
will copy the contents of all the foo files into FOO, thus con(cat)enating the files. To start, write a program that reads a file character by character, and copies it to standard out.

   /* cat1.c
    * 
    * Simplified 'cat' program
    * 
    */
   #include <stdio.h>
   #include <stdlib.h>   /* declares exit() on ANSI systems */

   int main( argc, argv )
   int argc;
   char *argv[];
   {
        FILE *rptr;
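        int c;

        /* Without a file name argument, argv[1] is a null pointer
         * and must not be handed to fopen().
         */
        if( argc < 2 )    {
            fprintf( stderr, "usage: %s file\n", argv[0] );
            exit( 1 );
        }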

        /* Open the named file with contingency
         * fopen() requests a file pointer to read the file whose
         * name is the command line first argument.  Its return value
         * is assigned to the declared variable rptr.  If that value
         * tests to be equal to zero, fopen has failed.  The type
         * coercion of 0 to that of pointer to data type FILE is
         * a matter of taste.  Commonly NULL is on the RHS of the
         * test ==, and this is usually defined as pointer to type
         * char.  Notice that the error message is printed to stderr,
         * the standard error output stream, and not to stdout.  Were
         * it printed to stdout, the error message which one would
         * want to see would disappear into FOO with usage:
         * cat1 foo > FOO.
         * fprintf() is the general standard function for creating
         * strings from variable values and writing them to some
         * data stream.  Normal unix exit (from the shell, exit status)
         * is zero.  Exiting with error value is program dependent,
         * but it should always be nonzero.
         */
        if( (rptr = fopen( argv[1], "r" )) == (FILE *)0 )    {
            fprintf( stderr, "%s: cannot read %s\n", argv[0], argv[1] );
            exit( 1 );
        }

        /* If we're here, the file has been successfully opened
         * for reading.  Read and write chars in a while loop.
         */

        while( (c = getc( rptr )) != EOF )    {
              putchar( c );
        }

        /* The exiting of the process will close rptr, so an explicit
         * fclose is not really necessary.  Using fclose explicitly
         * I consider to be good form and healthy practice.  If this
         * were a function called repeatedly, and rptr was not closed
         * before the function returned, there would be an accumulation
         * of open files, soon exceeding the system limit.  The fact
         * that rptr would be local to the function would not save you.
         */
        fclose( rptr );

        exit( 0 );
   }


---- TO BE CONTINUED ----




                                  ONLINE REFERENCES
  1. Using and Porting GNU CC - Table of Contents
  2. C Programming FAQs Errata
  3. Annotations on K&R II (J. Blustein)
  4. comp.lang.c Frequently Asked Questions
  5. Infrequently asked Questions in comp.lang.c
    This is an in-joke!
                                 HARD COPY REFERENCES

        Kernighan, Brian W.; Dennis M. Ritchie, The C Programming Language,
        Prentice-Hall (1978)

             The one and only Bible of pre ANSI C, with examples of
             code, but very little on how to get your code compiled.
             Thanks to Michael Somos for pointing out that the second
             edition does, in fact, cover the ANSI standard.

        Zahn, C. T., C Notes: A Guide to the C Programming Language,
        Yourdon Press (1979)

              Examples of code, and much dwelling on just how complicated
              the concept of data type can become in C.

        Harbison, Samuel P.; Guy L. Steele Jr., C: A Reference Manual,
        (third edition) Prentice-Hall (1991)

             This covers the ANSI standard, is well written, but with
             only a modicum of example code.
          





The URL for this document is:
http://graham.main.nc.us/~bhammel/graham/C.html
Created: 1997
Last Updated: September 8, 2000
Email me, Bill Hammel at
bhammel@graham.main.nc.us READ WARNING BEFORE SENDING E-MAIL