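There are four major issues when calling a timer: whether it reports wall
clock time or CPU time, whether it is an elapsed timer or a delta timer,
its resolution, and its overhead. An elapsed timer is used as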
time_1 = mytime()
(Stuff to be timed goes here)
time_for_Stuff = mytime() - time_1
while the delta timer is used as
time_1 = mytime()
(Stuff to be timed goes here)
time_for_Stuff = mytime()
Usually you want to find elapsed CPU time instead of
wall clock time. This is particularly true for large-scale
scientific computing, where your job may spend a large amount of wall clock
time swapped out.
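The difference between the two kinds of time can be seen in a minimal Python sketch. It assumes the standard library clocks time.perf_counter() (wall clock) and time.process_time() (CPU time charged to the current process); the helper name time_both is hypothetical and used only for illustration.

```python
import time


def time_both(func):
    """Run func() once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall clock time
    cpu_start = time.process_time()    # CPU time used by this process
    func()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return wall_used, cpu_used


if __name__ == "__main__":
    # Sleeping uses wall clock time but almost no CPU time, much like
    # a job that is swapped out.
    wall, cpu = time_both(lambda: time.sleep(0.2))
    print(f"sleep: wall = {wall:.3f} s, cpu = {cpu:.3f} s")

    # A compute-bound loop uses both.
    wall, cpu = time_both(lambda: sum(i * i for i in range(200_000)))
    print(f"work:  wall = {wall:.3f} s, cpu = {cpu:.3f} s")
```

On a lightly loaded machine the two numbers should be close for the compute-bound case and far apart for the sleeping case.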
Also, if available, an elapsed time clock is preferred to a delta time clock.
You can always implement one, given the other (and it is a useful exercise
to figure out how!), but a very common approach to timing parts of a program
is to time a few sections that you suspect account for most of the time,
and then to subtract the sum of those from the overall time to find the
time spent in "everything else". That bookkeeping is simpler with an
elapsed time clock.
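One possible answer to the exercise, together with the "everything else" bookkeeping, is sketched below in Python. All of the names here (DeltaTimer, ElapsedFromDelta, time_sections, and the setup/solve/other callables) are hypothetical, and time.perf_counter() stands in for whatever underlying clock is available.

```python
import time


class DeltaTimer:
    """Delta-style clock: each call returns seconds since the previous call."""

    def __init__(self):
        self._last = time.perf_counter()

    def __call__(self):
        now = time.perf_counter()
        delta, self._last = now - self._last, now
        return delta


class ElapsedFromDelta:
    """Elapsed-style clock built from a delta clock by summing the deltas."""

    def __init__(self, delta_clock):
        self._delta_clock = delta_clock
        self._total = 0.0

    def __call__(self):
        self._total += self._delta_clock()
        return self._total


def time_sections(setup, solve, other):
    """Time two suspected hot spots; charge the rest to 'everything else'."""
    mytime = ElapsedFromDelta(DeltaTimer())
    t_begin = mytime()

    t0 = mytime()
    setup()
    t_setup = mytime() - t0

    other()  # untimed work that ends up in "everything else"

    t0 = mytime()
    solve()
    t_solve = mytime() - t0

    t_total = mytime() - t_begin
    return {
        "setup": t_setup,
        "solve": t_solve,
        "everything else": t_total - (t_setup + t_solve),
        "total": t_total,
    }


if __name__ == "__main__":
    report = time_sections(
        setup=lambda: time.sleep(0.05),
        solve=lambda: time.sleep(0.10),
        other=lambda: time.sleep(0.02),
    )
    for name, seconds in report.items():
        print(f"{name:>16}: {seconds:.4f} s")
```

Going the other way (a delta clock from an elapsed clock) only requires remembering the previous reading, which is what DeltaTimer already does.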
Clock resolution typically ranges from a few nanoseconds, which is considered "high resolution", up to 0.01 seconds, which is pretty sloppy. The overhead (function call penalty) for calling a timer on modern (post-2002) machines is typically near negligible, but that changes from year to year, so you should know how to measure it.
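A rough way to measure both quantities is sketched below in Python. The function names measure_resolution and measure_overhead are hypothetical. The resolution estimate is the smallest nonzero step seen between successive readings, and the overhead estimate includes a small amount of loop overhead, so treat both as upper bounds rather than exact figures.

```python
import time


def measure_resolution(clock=time.perf_counter, samples=100):
    """Smallest observed nonzero step between successive clock readings."""
    smallest = float("inf")
    for _ in range(samples):
        t0 = clock()
        t1 = clock()
        while t1 == t0:  # spin until the clock visibly ticks
            t1 = clock()
        smallest = min(smallest, t1 - t0)
    return smallest


def measure_overhead(clock=time.perf_counter, calls=100_000):
    """Average cost in seconds of one call to clock(), loop cost included."""
    start = time.perf_counter()
    for _ in range(calls):
        clock()
    return (time.perf_counter() - start) / calls


if __name__ == "__main__":
    claimed = time.get_clock_info("perf_counter").resolution
    print(f"claimed resolution : {claimed:.3e} s")
    print(f"measured resolution: {measure_resolution():.3e} s")
    print(f"overhead per call  : {measure_overhead():.3e} s")
```

Passing a different clock, for example measure_resolution(time.process_time), shows how much the answer depends on which timer is used.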
To avoid problems with resolution and overhead, follow the general rules whenever possible:
Why not use standard statistical techniques for judging the quality of a timing? The brief answer is that you should, but it requires rather sophisticated statistical methods and cannot be done blindly. Timings rarely follow a normal (Gaussian) curve - why? Furthermore, they frequently cluster around discrete quanta corresponding to system events happening or not happening (like swaps). The real question is: why not look at all of the timing data? Plotting 100k data points in Matlab takes < 0.25 seconds on my three-year-old workstation from 2006, and plotting 1M data points takes 0.29008 seconds. It's dumb and/or lazy not to look at the data for outliers and strange values when it can be done faster than you can read this sentence. You can right-click and download the Matlab script I used to get those timings, and try some timings yourself to see how much data your system can realistically handle.
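As a stand-in for the Matlab plot (this is not the author's script), the following Python sketch collects every individual timing and prints a crude text histogram, which is often enough to reveal clustering and outliers. The names collect_timings and text_histogram are hypothetical, and the bin count and bar width are arbitrary choices.

```python
import time
from collections import Counter


def collect_timings(func, repeats=1000):
    """Time func() repeatedly and keep every individual measurement."""
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        func()
        timings.append(time.perf_counter() - t0)
    return timings


def text_histogram(timings, bins=10, width=50):
    """Return one text line per bin: left edge, a bar of '#', and the count."""
    lo, hi = min(timings), max(timings)
    span = (hi - lo) or 1.0  # avoid dividing by zero if all values are equal
    counts = Counter(
        min(int((t - lo) / span * bins), bins - 1) for t in timings
    )
    peak = max(counts.values())
    lines = []
    for b in range(bins):
        left_edge = lo + span * b / bins
        n = counts.get(b, 0)
        bar = "#" * round(width * n / peak)
        lines.append(f"{left_edge:.3e} s | {bar} {n}")
    return lines


if __name__ == "__main__":
    data = collect_timings(lambda: sum(range(10_000)), repeats=2000)
    print(f"min = {min(data):.3e} s, max = {max(data):.3e} s")
    for line in text_histogram(data):
        print(line)
```

A long, thin tail of bins far from the main cluster is the kind of strange value worth investigating before trusting an average.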
Recently there have been some efforts to create a uniform API for accessing timing and hardware performance information, regardless of what platform you are using. See the PAPI Project for one example. There will likely be more movement towards standardizing how timing data is obtained, but right now tools like PAPI can require rebuilding a Unix kernel, and root or admin privileges on other OSes, which makes them impractical for students running on shared machines.
C/C++ have several timers, but the resolution is usually only claimed to be
0.01 seconds, the POSIX standard. In practice, many C/C++ systems have much
better resolution than that; you have to measure it.
Fortran 90 has, as part of the language, intrinsic routines that return the
clock's resolution and other information. This is OS-independent, making it
practical to use across platforms as well. The routine that provides
wall clock time is
subroutine system_clock(count, count_rate, count_max)
integer*8 count, count_rate, count_max
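where count is the current value of the clock counter, count_rate is the
number of clock counts per second, and count_max is the largest value that
count can reach before it rolls over to zero. A typical use is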
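call system_clock(count_start, count_rate, count_max)
...
call system_clock(count_end, count_rate, count_max)
! time_used is in clock counts; divide by count_rate to get seconds
time_used = count_end - count_start
! correct for a single roll-over (count wraps from count_max back to 0)
if (time_used < 0) time_used = time_used + count_max + 1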
If the code fragment runs for a really long time, then the clock may have
rolled over multiple times. In that case you would need to insert timer
calls into the timed section, and try to keep track of how many roll-overs
occurred. My recommendation is to use the Unix epoch time in C (which
won't roll over until 2038), and date_and_time() in Fortran.
And arrange for 2038 to be a vacation year for yourself.
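The roll-over arithmetic can be sketched in Python. It assumes the counter takes the values 0 through count_max inclusive, so one full cycle is count_max + 1 counts, and that the counter rolled over at most once between the two readings. The function names ticks_elapsed and seconds_elapsed are hypothetical.

```python
def ticks_elapsed(count_start, count_end, count_max):
    """Counts between two readings of a counter that wraps count_max -> 0.

    Only correct if the counter rolled over at most once in between.
    Python integers do not overflow, so count_max + 1 is always safe here.
    """
    return (count_end - count_start) % (count_max + 1)


def seconds_elapsed(count_start, count_end, count_rate, count_max):
    """Convert the roll-over-corrected count difference to seconds."""
    return ticks_elapsed(count_start, count_end, count_max) / count_rate


if __name__ == "__main__":
    # No roll-over: the result is a plain difference.
    print(ticks_elapsed(5, 9, count_max=99))
    # One roll-over: 98 -> 99 -> 0 -> 1 -> 2 -> 3.
    print(ticks_elapsed(98, 3, count_max=99))
    # The same interval in seconds at 10 counts per second.
    print(seconds_elapsed(98, 3, count_rate=10, count_max=99))
```

The modulo form handles both the ordinary case and the single roll-over case without a separate test for a negative difference.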