Common Unix Timers


There are several Unix timers: etime, dclock, mclock, gettimeofday, gethrtime, etc. For portability it is best to write a single wrapper function, mytime, and call it throughout your codes. Then you need to change only that one function when moving your code to another machine.
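As a minimal sketch of such a wrapper in C, assuming gettimeofday is the timer available on your machine (the choice of timer and the decision to return seconds as a double are assumptions made here for illustration):

```c
#include <stddef.h>
#include <sys/time.h>

/* Portable timing wrapper: the only function to edit when moving to a
   machine with a different timer. This version assumes gettimeofday()
   is available and returns wall clock seconds as a double. */
double mytime(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + 1.0e-6 * (double)tv.tv_usec;
}
```

The rest of the program only ever calls mytime(), so swapping in gethrtime or another timer means changing the body of this one function.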

There are four major issues when calling a timer:

  1. What is the clock resolution, that is, how small a time interval can be measured?
  2. What is the overhead of calling the timer, that is, how much does it add to the program's execution time?
  3. How much does it perturb the results? A timer that performs a system call may flush the cache, for example, and the computer then pays an extra cost to reload data from main memory into the cache. [The cache is a small, fast memory that holds frequently used data, to cut down on memory access times. Details about caches will come later in the course.]
  4. What does the timer return?
An elapsed time clock and a delta clock differ in what they return. The first is used as
   time_1 = mytime()
       (Stuff to be timed goes here)
   time_for_Stuff = mytime() - time_1
while the delta timer is used as
   time_1 = mytime()
       (Stuff to be timed goes here)
   time_for_Stuff = mytime() 
Usually you want elapsed CPU time rather than wall clock time. This is particularly true for large-scale scientific computing, where your job may spend a large amount of wall clock time swapped out. Also, when available, an elapsed time clock is preferred to a delta time clock. You can always implement one given the other (and it is a useful exercise to figure out how!), but a very common approach to timing parts of a program is to time a few sections that you suspect account for most of the time, and then to subtract the sum of those from the overall time to find the time spent in "everything else". That bookkeeping is easier with an elapsed time clock.
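The "time a few sections and subtract" approach might look like the following sketch in C. It assumes the standard clock() function as the elapsed CPU time clock. The work() routine and the time_breakdown structure are hypothetical stand-ins for your own code, not part of any library.

```c
#include <time.h>

/* Elapsed CPU seconds used by this process so far. */
static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Hypothetical stand-in for a section of real computation. */
static double work(long n)
{
    volatile double s = 0.0;
    for (long i = 0; i < n; i++)
        s += 0.5 * (double)i;
    return s;
}

struct time_breakdown {
    double total, section_a, section_b, everything_else;
};

/* Time two suspected hot sections, then attribute the remainder of the
   overall time to "everything else" by subtraction. */
struct time_breakdown timed_run(void)
{
    struct time_breakdown t;
    double start = cpu_seconds();

    double t0 = cpu_seconds();
    work(2000000);                      /* section A */
    t.section_a = cpu_seconds() - t0;

    work(1000);                         /* untimed code */

    t0 = cpu_seconds();
    work(1000000);                      /* section B */
    t.section_b = cpu_seconds() - t0;

    t.total = cpu_seconds() - start;
    t.everything_else = t.total - (t.section_a + t.section_b);
    return t;
}
```

Note that everything_else also absorbs the cost of the timer calls themselves, which is one reason the timer's overhead matters.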

Clock resolution typically ranges from around a few nanoseconds, which is considered "high resolution", up to 0.01 seconds, which is pretty sloppy. The overhead (function call penalty) for calling a timer on modern (post-2002) machines is typically near negligible, but that changes from year to year, so you should know how to measure it.
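One way to measure both quantities is sketched below in C. It assumes the POSIX clock_gettime function with CLOCK_MONOTONIC as the timer under test; that choice is an assumption made here, and you would substitute whatever timer your own mytime wraps. The function names are hypothetical.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Timer under test: nanoseconds from a monotonic clock (an assumption;
   substitute the timer you actually use). */
static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Resolution estimate: the smallest nonzero difference seen between
   successive readings, over `trials` attempts (trials >= 1). */
long long measure_resolution_ns(int trials)
{
    long long best = -1;
    for (int i = 0; i < trials; i++) {
        long long t0 = now_ns();
        long long t1 = now_ns();
        while (t1 <= t0)            /* spin until the clock advances */
            t1 = now_ns();
        if (best < 0 || t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}

/* Overhead estimate: average cost of one timer call, found by timing
   many back-to-back calls. */
double measure_overhead_ns(long calls)
{
    long long start = now_ns();
    for (long i = 0; i < calls; i++)
        (void)now_ns();
    long long end = now_ns();
    return (double)(end - start) / (double)calls;
}
```

The smallest step you can observe this way is never smaller than the cost of a timer call, so on a very fine-grained clock the "resolution" reported is really the call overhead.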

To avoid problems with resolution and overhead, follow the general rules whenever possible:

Why not use standard statistical techniques for judging the quality of a timing? The brief answer is that you should, but it does require rather sophisticated statistical methods and cannot be done blindly. Timings rarely follow a normal (Gaussian) curve - why? Furthermore, they frequently cluster around discrete quanta corresponding to system events happening or not happening (like swaps). The real question is: why not look at all of the timing data? Plotting 100k data points in Matlab takes less than 0.25 seconds on my three-year-old workstation from 2006, and plotting 1M data points takes 0.29008 seconds. It's dumb and/or lazy not to look at the data for outliers and strange values when it can be done faster than you can read this sentence. You can right-click and download the Matlab script I used to get those timings, and try some timings yourself to see how much data your system can realistically handle.
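To have all of the timing data to look at, you need to record every sample rather than a single average. A rough sketch in C follows, assuming clock() as the CPU timer. The kernel() routine and both function names are hypothetical placeholders, and the one-value-per-line output format is just one convenient choice for loading into a plotting tool.

```c
#include <stdio.h>
#include <time.h>

static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Hypothetical kernel being timed. */
static double kernel(long n)
{
    volatile double s = 0.0;
    for (long i = 0; i < n; i++)
        s += 0.5 * (double)i;
    return s;
}

/* Time `nsamples` repetitions of the kernel, keeping every timing. */
void collect_timings(double *samples, int nsamples, long work)
{
    for (int i = 0; i < nsamples; i++) {
        double t0 = cpu_seconds();
        kernel(work);
        samples[i] = cpu_seconds() - t0;
    }
}

/* Write one timing per line so the whole data set can be plotted.
   Returns 0 on success, -1 on failure. */
int write_timings(const char *path, const double *samples, int nsamples)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    for (int i = 0; i < nsamples; i++)
        fprintf(fp, "%.9e\n", samples[i]);
    return fclose(fp) == 0 ? 0 : -1;
}
```

Plotting the resulting file as a scatter plot or histogram makes clusters and outliers visible at a glance.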

Recently there have been some efforts to create a uniform API for accessing timing and hardware performance information, regardless of what platform you are using. See the PAPI Project for one example. Likely there will be more movement towards standardizing how timing data is obtained, but right now tools like PAPI can require rebuilding a Unix kernel, and root or admin privileges in other OSes, which makes them impractical for students running on shared machines.

C/C++ have several timers, but the resolution is usually only claimed to be 0.01 seconds, the POSIX standard. In practice, many C/C++ systems have much better resolution than that; you have to measure it. Fortran 90 has, as part of the language, intrinsic procedures that return the clock's resolution and other information. These are OS-independent, which makes them practical to use across platforms as well. The routine that provides wall clock time is
    subroutine system_clock(count, count_rate, count_max)
    integer*8 count, count_rate, count_max
where count is the current value of the clock counter, count_rate is the number of clock counts per second, and count_max is the largest value that count can reach before it rolls over to zero.

Why is the count_max important? Because ...
Negative times. Scientific codes often run for a long time, possibly days or even weeks. Even on a smaller scale it often happens that the clock rolls over, and when you take the difference between end time and start time, you get a negative number. When that happens, if the rollover occurred only once, the correct time is given by
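    call system_clock(count_start, count_rate, count_max)
    ...
    call system_clock(count_end, count_rate, count_max)
    ! elapsed clock counts; divide by count_rate to get seconds
    time_used = count_end - count_start
    ! the counter resets to 0 after count_max, so one rollover spans count_max + 1 counts
    if (time_used < 0) time_used = time_used + count_max + 1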
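The same single-rollover correction can be sketched in C for any counter that wraps. This assumes the counter runs from 0 through count_max inclusive, so that one full cycle is count_max + 1 counts. The function names are hypothetical.

```c
#include <stdint.h>

/* Counts elapsed between two readings of a counter that runs from 0 to
   count_max and then resets to 0. Assumes at most one rollover. */
int64_t elapsed_ticks(int64_t count_start, int64_t count_end,
                      int64_t count_max)
{
    int64_t ticks = count_end - count_start;
    if (ticks < 0)
        /* Add count_max first, then 1, so that the sum cannot overflow
           even when count_max is the largest 64-bit value. */
        ticks = (ticks + count_max) + 1;
    return ticks;
}

/* Convert a count difference to seconds. */
double ticks_to_seconds(int64_t ticks, int64_t count_rate)
{
    return (double)ticks / (double)count_rate;
}
```

For example, with count_max = 99, a start reading of 90 and an end reading of 5 give 15 elapsed counts.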
If the timed code fragment ran for a really long time, the clock may have rolled over multiple times. In that case you would need to insert timer calls into the timed section and try to keep track of how many rollovers occurred. My recommendation is to use the Unix epoch time in C (which won't roll over until 2038 on systems where time_t is a signed 32-bit integer), and date_and_time() in Fortran. And arrange for 2038 to be a vacation year for yourself.
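A minimal sketch of the epoch time approach in C, assuming the standard time() and difftime() functions (the helper name wall_seconds_since is hypothetical):

```c
#include <time.h>

/* Wall clock seconds elapsed since `start`, based on the Unix epoch
   clock. time() has one-second resolution, so this suits long runs
   rather than short code fragments. */
double wall_seconds_since(time_t start)
{
    return difftime(time(NULL), start);
}
```

Record time_t t0 = time(NULL) before the long-running section and call wall_seconds_since(t0) afterwards. Because the epoch count does not wrap during a run, no rollover correction is needed.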