Making Your Coding Life Easier

In many university classes, you rarely deal with large program management. Typically, you write code that resides in a single file, and modify and debug it until it seems to run correctly. Then you turn it in and hope you never see it again.

In reality (i.e., in companies and in university research) most codes are split into many different files, with different people working on them simultaneously. This creates two needs. The first is to coordinate updates and changes to the files with other people; this is called version control. The second is to avoid having to recompile every source file when just one of them changes. More than one project in scientific computing has been bogged down because it ended up taking over 24 hours to recompile the program, when only one function or file had changed.

A compiler must be able to compile the files separately, and then link all of them together into a working program. When a file changes, it is necessary to compile that file, and usually only a few others that depend on it. There is a mechanism called "make" or "makefile" to keep track of this automatically. Next to a text editor, it is probably the most important tool in a coder's toolchest. Make is the Unix version of this; other systems such as Microsoft Visual C++ usually come with similar program build facilities. If you use nmake under Windows and avoid dependencies on the MFC libraries, then usually your project can be shipped to a Unix system without too many headaches.

The name "make" also has the advantage of leading to great puns (making out, making your way, making a mess, etc.)

Basic Makefiles

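You can use a makefile for many tasks other than compiling. Compiling is a complicated way to use it, so we'll start with something else. The UNIX "touch" command does the following. When you type
  touch blurb
then: if a file named "blurb" already exists, its time stamp is updated to the current time; if it does not exist, an empty file with that name is created. Here is a simple makefile: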
blurb:
	touch blurb
\______/          /\
 < TAB>           < RET>
There must be a tab character at the start of the second line before the word "touch". If you download a sample makefile using cut and paste from a browser, the tab almost always gets changed to spaces, and make will then fail with a cryptic message (GNU make typically complains about a "missing separator"). If it does not work, check that first! In general, copying with the mouse in X-windows, the clipboard in Windows, or even some versions of ftp will convert tabs to spaces - so check for that whenever you transfer a makefile from one place to another. For P573, try to transfer tar'd, gzipped, or zipped copies of a makefile to avoid this problem.
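The tab check described above can be automated. The following is a rough Python sketch, not a real makefile parser: the function name and the regular expression are illustrative assumptions, and the heuristic only looks for indented lines that follow a rule line and begin with spaces.

```python
import re

# Heuristic: a rule line starts in column one and has a ":" before any "=".
RULE_LINE = re.compile(r"^[^\s#][^=]*:")

def find_space_indented_commands(makefile_text):
    """Return line numbers of command lines that start with spaces, not a tab."""
    suspects = []
    in_rule = False
    for number, line in enumerate(makefile_text.splitlines(), start=1):
        if RULE_LINE.match(line):
            in_rule = True            # the indented lines that follow are commands
        elif line.startswith("\t"):
            pass                      # correctly tab-indented command
        elif line.startswith(" ") and line.strip() and in_rule:
            suspects.append(number)   # indented with spaces: make will reject it
        elif line.strip():
            in_rule = False           # any other non-blank line ends the rule
    return suspects

good = "blurb:\n\ttouch blurb\n"
bad = "blurb:\n        touch blurb\n"
print(find_space_indented_commands(good))
print(find_space_indented_commands(bad))
```

To check a real file, pass it the file's contents, e.g. find_space_indented_commands(open("makefile").read()).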

The makefile must be called "makefile" or "Makefile" (actually, it can be called anything and then invoked using the -f option for the make command, but for now let's keep things simple). When you type "make" in the directory where the makefile lives, the example above will print

  touch blurb
and actually execute that same command. If "blurb" did not exist in your directory you will see that it now does - although as an empty file. If you now type "make" again, you will see
  make: `blurb' is up to date.
When the file named in the first line exists and is "up-to-date", make recognizes that nothing needs to be done and ... doesn't do anything.

Now add some lines to the file "makefile":

blurb: blob
	touch blurb

blob:
	touch blob
and remove the file "blurb" from your directory. When you type "make" you get
  touch blob
  touch blurb
which shows that "blob" was created first followed by "blurb". The way to read this is that
  1. blurb depends on the existence of blob.
  2. to create blob (which depends on nothing), you need to execute the command "touch blob".
  3. once blob exists, to create blurb you need to execute the command "touch blurb".
If blob exists the above becomes
  1. blurb depends on the existence of blob, and if blob is more recent than blurb (as indicated by the files' time stamps), you need to execute the next step.
  2. to update blurb, you need to execute the command "touch blurb".
Of course, the lines you added must have a tab in front of the word "touch", not just spaces. So if you tried it and it did not work ... reread the previous part of this page.

More generally, the makefile is grouped into "rules". Each rule looks like

< target >: < dependent 1 > < dependent 2 > ....
	< command >
	< command >
When you type "make" with no arguments, a recursive update function is called on the target of the first rule in the file. Make can also be called with an argument, e.g. "make blob", which calls update on a different target. The update function could be written like this in pseudo-code
update(target) {
  for (i in dependents(target)) do
    update(i)
  if ((exists(target) == false) or
      (date(target) < last_date(dependents(target))))
    execute commands
}
In the example above the first call to make causes it to invoke update(blurb). The rule in the makefile for blurb has blob as a dependent, so update(blob) is called. blob has no dependents, and it doesn't exist, so the command "touch blob" is executed. Then update(blob) returns to update(blurb). Since blurb doesn't exist, it is created as well. Yep, that is the usual "explanation of a recursive function call" that is pretty darned weird, but if you parse it carefully ... never mind.

The second time we type "make", update(blurb) calls update(blob) again, but now blob exists and since it has no dependents, the command "touch blob" is not executed. It then returns to update(blurb). Now blurb exists and it has a dependent, but the dependent was created slightly before it. So the date comparison fails and blurb is unchanged.
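For readers who prefer running code to parsing recursion by eye, here is a minimal Python sketch of the update function. It is a simulation under stated assumptions, not how make is implemented: the disk is a dictionary of time stamps, the clock is a counter, only "touch FILE" commands are simulated, and the names make, rules, mtimes and clock are made up for this illustration.

```python
def make(target, rules, mtimes, clock):
    """Bring `target` up to date and return the commands that were run.

    rules:  {target: (list of dependents, list of commands)}
    mtimes: {file name: time stamp} for files that exist (stands in for the disk)
    clock:  one-element list holding the current simulated time
    """
    executed = []

    def update(name):
        dependents, commands = rules.get(name, ([], []))
        for dep in dependents:
            update(dep)                       # bring dependents up to date first
        newest = max((mtimes[d] for d in dependents if d in mtimes), default=None)
        missing = name not in mtimes
        stale = not missing and newest is not None and mtimes[name] < newest
        if missing or stale:
            for command in commands:
                executed.append(command)
                # Simulate only "touch FILE": create it or refresh its time stamp.
                if command.startswith("touch "):
                    clock[0] += 1
                    mtimes[command.split()[1]] = clock[0]

    update(target)
    return executed

rules = {
    "blurb": (["blob"], ["touch blurb"]),
    "blob": ([], ["touch blob"]),
}
mtimes, clock = {}, [0]
print(make("blurb", rules, mtimes, clock))   # first "make"
print(make("blurb", rules, mtimes, clock))   # second "make"
```

The first call prints both touch commands in the order shown earlier and the second prints an empty list. In this simulation, deleting a key from mtimes plays the role of "rm", and bumping the clock and assigning a new time stamp plays the role of "touch".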

Now you figure out what happens if you do:

  1. "rm blob", and then "make"
  2. "rm blurb", and then "make"
  3. "touch blob", and then "make"
  4. "touch blurb", and then "make"
Unless you are really cocky and know makefiles well, you should try the above in some directory to doublecheck your answers.

The command in a rule can be any Unix command. Usually it is a command that creates the target; e.g., in the rules for blurb and blob, the touch command actually creates the files blurb and blob, or updates their dates. A common exception is a "clean" rule, which cleans up the directory of all the debris created by the makefile:

blurb: blob
	touch blurb

blob:
	touch blob

clean:
	rm blurb blob
Since clean isn't in any list of dependents, it doesn't get called in any recursive update call. But if you type "make clean", make looks for the rule for "clean", notices that a file named "clean" does not exist, and executes the command "rm blurb blob". That causes blurb and blob to be deleted, but it doesn't cause a file named "clean" to be created. So the next time you type "make clean" it will still try to delete blurb and blob. "make clean" is the standard way to get rid of the individual binaries after compiling several separate files, since they're usually not needed. The most common clean rule removes object files and archives, as in "rm *.o *.a".

OK, if you are really cocky, what happens with this sequence:

  make
  touch clean
  make clean

Makefile Definitions

You can make definitions in the makefile, for the same reason you would make C preprocessor definitions (#define) in a C++ program. In the sample makefile, which C/C++ compiler to use is defined in the line
CC = g++
which compiler options to use in
OPTS = -O3 -Wall
which additional libraries to link in
LIBS = -lrt
and any required include file paths in
INC = -I/usr/include
When specifying how to compile the codes, the strings
$(CC) $(OPTS) $(INC) -c
are translated into the Unix command
g++ -O3 -Wall -I/usr/include -c 
The -c part says compile and create a .o file, but don't yet try linking everything together in an executable. That is done in the stanza that says how to create the executable file named "time_example". You can define anything you want, but note
  1. Although you do not have to do so, it is best to write the names you define in all capitals, so they can be spotted more easily.
  2. To use a defined quantity, its name must be preceded by a dollar sign and enclosed in parentheses. Using $OPTS above would have failed: make reads it as $(O) followed by the literal text "PTS".
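The expansion of definitions, including why $OPTS fails, can be imitated in a few lines of Python. This is only a sketch of the behavior described above, with an illustrative function name; it ignores other things make understands, such as "$$", "${NAME}" and nested references.

```python
import re

def expand(text, definitions):
    """Expand $(NAME) and single-character $X references the way make reads them."""
    def lookup(match):
        name = match.group(1) or match.group(2)
        return definitions.get(name, "")   # undefined names expand to nothing
    # $(NAME) is tried first; otherwise "$" consumes only the next character.
    return re.sub(r"\$\(([^)]*)\)|\$(.)", lookup, text)

definitions = {"CC": "g++", "OPTS": "-O3 -Wall", "INC": "-I/usr/include"}
print(expand("$(CC) $(OPTS) $(INC) -c", definitions))
print(expand("$(CC) $OPTS -c", definitions))
```

The first line printed is the g++ command shown earlier. In the second, "$O" names an undefined quantity and vanishes, leaving the stray text "PTS" on the command line.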

Even for the small timing example, a makefile is not overkill: it is far easier to just type "make" a year or 10 from now than to try to figure out what mystic invocation will compile the code you wrote. Using definitions is also a big help, even for programs with just a few files. Changing to a different compiler should just involve changing the definition in one place, instead of hunting down dozens of occurrences of "g++" and having to change them to "icc" or "xlc".

Makefile Substitution Rules

OK, you really don't want to know this. Instead of having to specify (with two lines for each file) how to compile all 4000 functions that comprise your MegaBlaster program, you can give a suffix rule. For example,
.C.o:
	$(CC) -c $(CCFLAGS) $<
says that every file with a suffix of ".C" can have a corresponding object file with suffix ".o" created from it, and the way to do that is to apply a command such as
cxx -c -O3 -check foo.C
to each file with suffix ".C" (here a hypothetical file foo.C, assuming this makefile defines CC = cxx and CCFLAGS = -O3 -check). The "$<" expands to the name of the source file the rule is currently being applied to, so the one rule covers every file in the current directory with the corresponding suffix.
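To make the one-file-at-a-time nature of a suffix rule concrete, here is a small Python sketch of which commands a ".C.o" rule would generate for a set of requested object files. As before this is a simulation with a dictionary standing in for the disk; the function name and the file names main.C, solver.C and io.o are hypothetical.

```python
def suffix_rule_commands(objects, mtimes, cc, ccflags):
    """Commands a ".C.o" suffix rule would run for the requested object files.

    mtimes maps the names of existing files to time stamps (a stand-in for the disk).
    """
    commands = []
    for obj in objects:
        source = obj[:-len(".o")] + ".C"      # foo.o is built from foo.C
        if source not in mtimes:
            continue                           # no matching source: rule does not apply
        if obj not in mtimes or mtimes[obj] < mtimes[source]:
            # "$<" in the rule is replaced by this one source file name.
            commands.append(f"{cc} -c {ccflags} {source}")
    return commands

mtimes = {"main.C": 5, "main.o": 9, "solver.C": 7, "solver.o": 3}
print(suffix_rule_commands(["main.o", "solver.o", "io.o"], mtimes, "g++", "-O3 -Wall"))
```

Only solver.o is older than its source, so only solver.C gets a compile command; main.o is up to date and io.o has no ".C" file for the rule to work from.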

How to REALLY Use Makefiles

Nobody in their right mind ever writes a makefile from scratch. Instead, we just copy one over from someone else or from another project, and edit it to handle the current project.

More Makefile Madness

Many people in scientific computing use makefiles for working on a paper using LaTeX - each section is put into a different file, and each is put into a makefile stanza. This is especially helpful if the paper is being jointly authored with multiple people working on it at the same time. But then version control systems like CVS or SVN become vital, another tool anyone doing coding work in any area should learn.