Project 3: Network Path Diagnostics (NPD)

Assigned: 11/05/09, Due: 11/24/09


Project Goal

The traceroute utility is commonly used for network path diagnostics and topology inference. In this project, you will build a toolkit that leverages traceroute to provide various pieces of information about network paths.

Project Specification

Input

Your program, npd, must be written in C or C++. It must run on CS Linux machines and accept the following four kinds of inputs:

  1. a host name
  2. an IP address
  3. a file containing multiple host names, one on each line
  4. a file containing multiple IP addresses, one on each line

For each type of input, your program must confirm that every destination IP address or host name is syntactically valid, that is, composed only of letters, digits, and the "." and "-" characters.
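The syntax check above can be sketched as follows. This is only one possible approach, not a required implementation. The function name npd_is_valid_destination is hypothetical, and rejecting empty strings is an assumption the rule above does not spell out. Because the destination is later handed to the traceroute command, rejecting every other character also keeps shell metacharacters out of the command line.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s is non-empty and consists only of
 * letters, digits, '.', and '-'; returns 0 otherwise.
 * Assumes the default "C" locale, where isalnum() accepts only ASCII
 * letters and digits. */
int npd_is_valid_destination(const char *s)
{
    if (s == NULL || *s == '\0')
        return 0;                       /* assumption: empty input is invalid */

    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        if (!isalnum(*p) && *p != '.' && *p != '-')
            return 0;                   /* any other character fails the check */
    }
    return 1;
}
```

When the input is a file, the same check would be applied to each line after stripping the trailing newline.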

Output

If the input is a host name or an IP address, your program should execute the traceroute command and report on the following statistics:

  1. Traceroute complete: Determine if the traceroute reached its destination by the following criteria:
  2. Extent of reachability: If the traceroute does not complete, infer the extent of reachability using the following criteria, which help determine if the reachability was hurt closer to the destination or somewhere in the core of the Internet:
  3. Number of hops: Report on the number of hops traversed irrespective of whether the traceroute was successful or not. This statistic gives you an idea of average path length in the Internet.
  4. Starred hops: For both successful and unsuccessful traceroutes, report on the percentage of hops that only returned *s. This statistic gives you an idea of how often traceroute is blocked or incorrectly implemented.
  5. Number of unique networks traversed: Using the IP addresses and host names contained in the traceroute output, find out the unique networks traversed in the following three ways:
  6. Time: Report on the total time taken for the traceroute to finish, for both successful and unsuccessful traceroutes. This time should be parsed from the traceroute output, not measured by your program. Because hops that return *s contribute no time, this total is a lower bound. Ignore '!' if it shows up after a valid time entry in a row of traceroute output.
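Several of the statistics above (number of hops, starred hops, and time) come down to parsing individual rows of traceroute output. The sketch below shows one way to parse a single row. It assumes the usual Linux layout, in which a hop row starts with the hop number, each round-trip time is a number followed by a separate "ms" token, and unanswered probes appear as "*". The names npd_hop, npd_parse_hop, and NPD_MAX_PROBES are hypothetical. How the per-hop times combine into the total time is left to you.

```c
#include <stdlib.h>
#include <string.h>

#define NPD_MAX_PROBES 8   /* assumption: at most 8 probes per hop (default is 3) */

typedef struct {
    int    hop;                     /* hop number from the first column */
    int    starred;                 /* 1 if the row returned only '*' */
    int    n_rtt;                   /* number of round-trip times found */
    double rtt_ms[NPD_MAX_PROBES];  /* round-trip times in milliseconds */
} npd_hop;

/* Parses one row of traceroute output into *out.
 * Returns 1 for a hop row, 0 for anything else (e.g. the header line).
 * Rows longer than the local buffer are truncated. */
int npd_parse_hop(const char *line, npd_hop *out)
{
    char buf[512];
    char *tok, *prev = NULL, *end;
    int stars = 0;

    memset(out, 0, sizeof *out);
    strncpy(buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    tok = strtok(buf, " \t\r\n");
    if (tok == NULL)
        return 0;
    long hop = strtol(tok, &end, 10);
    if (*end != '\0' || hop <= 0)
        return 0;                   /* first token is not a hop number */
    out->hop = (int)hop;

    while ((tok = strtok(NULL, " \t\r\n")) != NULL) {
        if (strcmp(tok, "*") == 0) {
            stars++;
        } else if (strcmp(tok, "ms") == 0 && prev != NULL
                   && out->n_rtt < NPD_MAX_PROBES) {
            /* The token before "ms" is the time. Annotations such as
             * "!H" follow "ms" as their own token and are skipped. */
            double v = strtod(prev, &end);
            if (end != prev && *end == '\0')
                out->rtt_ms[out->n_rtt++] = v;
        }
        prev = tok;
    }
    out->starred = (out->n_rtt == 0 && stars > 0);
    return 1;
}
```

Calling npd_parse_hop on every row gives the hop count (the largest hop value seen) and the starred-hop percentage (rows with starred set, divided by the hop count). Note that strtok is not reentrant, which is acceptable for a single-threaded tool.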

Statistics for files

If the input is a file containing host names or IP addresses, your program should report the same statistics as for an individual host name or IP address, but as aggregates, computed separately for successful and unsuccessful traceroutes. For example, instead of reporting the number of hops traversed for each destination, report how many successful and how many unsuccessful traceroutes traversed each hop count. Be sure to report the total number of successful traceroutes and the total number of unsuccessful traceroutes.
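As an illustration of the hop-count aggregate described above, the sketch below keeps one histogram for successful traceroutes and one for unsuccessful ones. The type and function names are hypothetical, and the bound NPD_MAX_HOPS is an assumption (traceroute on Linux stops at 30 hops by default, so 64 leaves room to spare). The other statistics can be aggregated in a similar way.

```c
#include <stdio.h>
#include <string.h>

#define NPD_MAX_HOPS 64   /* assumption: upper bound on hops per traceroute */

typedef struct {
    int total[2];                       /* [0] unsuccessful, [1] successful */
    int by_hops[2][NPD_MAX_HOPS + 1];   /* by_hops[s][h]: traces with h hops */
} npd_aggregate;

/* Resets all counters to zero. */
void npd_aggregate_init(npd_aggregate *a)
{
    memset(a, 0, sizeof *a);
}

/* Records one finished traceroute. */
void npd_aggregate_add(npd_aggregate *a, int success, int hops)
{
    int s = success ? 1 : 0;
    if (hops < 0)
        hops = 0;
    if (hops > NPD_MAX_HOPS)
        hops = NPD_MAX_HOPS;            /* clamp into the last bucket */
    a->total[s]++;
    a->by_hops[s][hops]++;
}

/* Prints the totals and the non-empty histogram buckets. */
void npd_aggregate_print(const npd_aggregate *a, FILE *out)
{
    static const char *label[2] = { "unsuccessful", "successful" };
    for (int s = 1; s >= 0; --s) {
        fprintf(out, "%s traceroutes: %d\n", label[s], a->total[s]);
        for (int h = 0; h <= NPD_MAX_HOPS; ++h)
            if (a->by_hops[s][h] > 0)
                fprintf(out, "  %d hops: %d\n", h, a->by_hops[s][h]);
    }
}
```

A driver would call npd_aggregate_init once, npd_aggregate_add after each destination in the file, and npd_aggregate_print at the end.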

Guidelines and Resources:

Extra Credit

For 10% extra credit, add the following statistics to your output for all types of input.

  1. Hops traversed at edges: To determine how the total hops in the traceroute output compare with those close to the source and the destination, find the number of hops that fall in the same BGP prefix, the same /24, and the same domain as the source, and likewise for the destination. For unsuccessful traceroutes, the number matching the destination may be zero.
  2. When computing the total time traceroute takes to complete, also report on how it compares with the following times:

Deliverables and Grading

The grading will be based on the following three deliverables, all of which are due by 11:55 pm on the due date: