Port scanner software can probe machines for open ports. Port scanners are often used by network administrators to verify the security of machines in their network. They are of interest to Internet miscreants as well, since attackers can use them to find machines to compromise simply by probing for software versions with known vulnerabilities. An example of a well-known, open-source port scanner is Nmap.
In this project you will create a basic port scanner, written for network administrators interested in ensuring that machines on their network run only expected services. In addition to helping you gain experience with socket programming, this project will help you appreciate the interplay of various implementations of firewalls, transport protocols, and operating systems.
Most of the scan types mentioned below are available only to privileged users, since the scanner sends and receives raw packets, which requires root access on Unix systems. While you can experiment on personal PCs where you have administrator privileges, we will release a list of CS machines on which you will have adequate privileges for this project. Also, note that each scan type is limited in what it can infer.
The basic idea behind a port scanner is simple: Given an IP address of a machine and a list of interesting TCP or UDP ports to scan, the scanner will connect on each port using TCP or UDP sockets, make a determination of whether or not the port is open based on success of the connection request and close the socket before moving on to the next port to scan.
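As a rough illustration of the connect-based idea, the sketch below probes one TCP port with a non-blocking connect() and a bounded wait. The function name probe_tcp_port, the probe_result codes, and the choice of poll() for the wait are assumptions made for this example, not part of the assignment.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum probe_result { PROBE_OPEN, PROBE_CLOSED, PROBE_NO_RESPONSE, PROBE_ERROR };

/* Attempt a full TCP connection to ip:port, waiting at most timeout_ms.
 * ip must be a dotted-quad IPv4 address. The socket is always closed
 * before returning. A poll() interrupted by a signal is reported as
 * PROBE_ERROR in this sketch. */
enum probe_result probe_tcp_port(const char *ip, uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return PROBE_ERROR;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return PROBE_ERROR;

    /* Non-blocking mode lets poll() bound how long we wait. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return PROBE_ERROR;
    }

    enum probe_result result = PROBE_ERROR;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        result = PROBE_OPEN;
    } else if (errno == ECONNREFUSED) {
        result = PROBE_CLOSED;
    } else if (errno == EINPROGRESS) {
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready == 0) {
            result = PROBE_NO_RESPONSE;
        } else if (ready > 0) {
            /* The handshake finished one way or the other: ask how. */
            int err = 0;
            socklen_t len = sizeof err;
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0) {
                if (err == 0)
                    result = PROBE_OPEN;
                else if (err == ECONNREFUSED)
                    result = PROBE_CLOSED;
                else if (err == ETIMEDOUT || err == EHOSTUNREACH ||
                         err == ENETUNREACH)
                    result = PROBE_NO_RESPONSE;
            }
        }
    }
    close(fd);
    return result;
}
```

A caller might invoke probe_tcp_port("127.0.0.1", 22, 2000) once per port of interest. A refused connection is reported as closed, while silence until the timeout is reported separately, since a firewall dropping packets looks different from a closed port.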
Your scanner should run on CS Linux machines and you must write it in C/C++. An administrator would invoke it as: "./portScanner [option1, ..., optionN]". Implement the following options:
Details of each option are given below:
Output: After each invocation, the portScanner should output a succinct summary of the list of open TCP and UDP ports on each IP address. Additionally, for each open port, it will include the name of the service that is likely running. When an IP prefix is scanned, this information should be reported at the prefix granularity. To find services associated with ports [1-1024], visit http://www.iana.org/assignments/port-numbers.
When multiple scans are used, the results for each scan should be included in the output. Additionally, the combined inference from all of the scans run should be listed. Doing this requires you to determine a way to combine possibly contradictory results from different scans. Be prepared to justify your method to the evaluator during the code review.
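One possible way to attach a service name to a port is to consult the system services database through getservbyport(). The sketch below assumes that approach. The helper name service_name and the small fallback table are hypothetical, and the table is deliberately incomplete.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <stdint.h>
#include <stdio.h>

/* Fallback names for a few well-known ports, used only when the system
 * services database has no entry. Illustrative, not complete. */
static const struct { uint16_t port; const char *name; } fallback[] = {
    { 22, "ssh" }, { 25, "smtp" }, { 43, "whois" },
    { 80, "http" }, { 110, "pop3" }, { 143, "imap" },
};

/* Copy the likely service name for port/proto ("tcp" or "udp") into buf
 * and return buf. Unknown ports yield "unknown".
 * Note: getservbyport() returns a pointer to static storage and is not
 * guaranteed to be thread safe, so a multi-threaded scanner would need
 * to serialize calls to this helper. */
const char *service_name(uint16_t port, const char *proto,
                         char *buf, size_t buflen)
{
    const char *name = "unknown";
    struct servent *se = getservbyport(htons(port), proto);
    if (se != NULL) {
        name = se->s_name;
    } else {
        for (size_t i = 0; i < sizeof fallback / sizeof fallback[0]; i++) {
            if (fallback[i].port == port) {
                name = fallback[i].name;
                break;
            }
        }
    }
    snprintf(buf, buflen, "%s", name);
    return buf;
}
```

The name is copied into a caller-supplied buffer so that the result does not alias the static storage used by getservbyport().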
Timeouts and malicious targets: As the machines you are expected to scan are not under your control, your program should be designed to handle arbitrarily bad responses from the remote machine. No response from the remote machine should cause your program to crash. Likewise, your program should time out within a reasonable period (generally a handful of seconds), no matter how the remote machine responds.
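One way to keep a silent or misbehaving peer from stalling the scanner is to bound every read. The sketch below assumes the SO_RCVTIMEO socket option is used for that purpose. The helper name recv_with_timeout is hypothetical, and other mechanisms (select or poll, for instance) would work as well.

```c
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Read at most buflen - 1 bytes from fd, waiting at most timeout_sec
 * seconds for data to arrive. The buffer is always NUL-terminated, so a
 * peer cannot make later string handling run off the end.
 * Returns the number of bytes read, 0 if the peer closed the connection,
 * or -1 on timeout or error (errno is EAGAIN or EWOULDBLOCK on timeout).
 * Caution: a timeout_sec of 0 disables the timeout entirely. */
ssize_t recv_with_timeout(int fd, char *buf, size_t buflen, int timeout_sec)
{
    if (buflen == 0)
        return -1;
    buf[0] = '\0';

    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        return -1;

    ssize_t n = recv(fd, buf, buflen - 1, 0);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

The length passed to recv() is capped by the buffer size rather than by anything the remote machine says, which is the property that matters when the peer may be hostile.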
Expectations on concurrency: We require in the speedup option that multiple threads be spawned to divide the workload. Threads should not be idle due to the static division of large amounts of work. For example, if requested to use 5 threads to scan 5,000 ports you should not simply initially assign each thread to scan 1,000 ports - some threads may finish much faster, and should then assist the slower running threads. Additionally, your program should not continually create new threads. For example, you should not create a thread to scan a port, scan the port, record the results, and then destroy the thread only to repeat the process for the next port. No possible execution path of these concurrent threads should lead to an error. Specifically, ensure that you protect all functions and library calls explicitly unless they are noted to be thread safe. All memory should be freed when your program exits. Specifically, there should not be any memory leaks or zombie threads. Do not rely on the OS to clean up terminated threads.
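The requirements above can be met in several ways. The sketch below shows one of them, assuming a fixed set of worker threads that each pull the next unscanned port from a shared, mutex-protected counter. The names work_queue, take_port, scan_range, and mark_port are hypothetical, and mark_port is only a stand-in for a real per-port probe.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical shared work queue: each port in [next, last] is handed
 * out exactly once, to whichever worker asks first. */
struct work_queue {
    pthread_mutex_t lock;
    uint32_t next;                           /* next port to hand out */
    uint32_t last;                           /* last port, inclusive  */
    void (*scan)(uint16_t port, void *ctx);  /* per-port callback     */
    void *ctx;
};

/* Fetch the next port under the lock. Returns 0 when no work remains. */
static int take_port(struct work_queue *q, uint16_t *port)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->next <= q->last) {
        *port = (uint16_t)q->next++;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Worker body: keep taking ports until the queue is drained, so a
 * thread that finishes early simply picks up more work. */
static void *worker(void *arg)
{
    struct work_queue *q = arg;
    uint16_t port;
    while (take_port(q, &port))
        q->scan(port, q->ctx);
    return NULL;
}

/* Scan [first, last] with up to nthreads workers created once and
 * joined before returning. Returns 0 on success, -1 on failure. */
int scan_range(uint16_t first, uint16_t last, int nthreads,
               void (*scan)(uint16_t, void *), void *ctx)
{
    if (nthreads < 1 || first > last)
        return -1;

    struct work_queue q = { .next = first, .last = last,
                            .scan = scan, .ctx = ctx };
    if (pthread_mutex_init(&q.lock, NULL) != 0)
        return -1;

    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    if (tids == NULL) {
        pthread_mutex_destroy(&q.lock);
        return -1;
    }

    int started = 0;
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &q) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);   /* no detached or zombie threads */

    free(tids);
    pthread_mutex_destroy(&q.lock);
    return started > 0 ? 0 : -1;
}

/* Placeholder callback: counts visits per port in a 65536-byte array
 * passed as ctx. Each port goes to exactly one thread, so no two
 * threads write the same element. */
void mark_port(uint16_t port, void *ctx)
{
    ((unsigned char *)ctx)[port]++;
}
```

The counter is 32 bits wide so that incrementing past port 65535 cannot wrap around and restart the scan. If fewer threads than requested can be created, the ones that did start still drain the whole queue.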
Extra Functionality Required of P538 Students: In addition to the above, for SSH, HTTP, SMTP, POP, IMAP, and WHOIS, your program should verify that the port is indeed running the standard service expected on that port. Upon verification, find out the specific version of software as well. Inferring this information will require you to do one of the following: 1) parse identifying information sent by the service if it sends it or 2) send appropriate queries to cause the service to reveal this information to you. For example, SSH sends you the version information immediately upon connection but for HTTP you must send a valid query to the remote server and read back the header of its response to find the version of the software it is running.
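For the two cases named above, the parsing step might look like the following sketch. It assumes the bytes have already been read from the socket into a NUL-terminated buffer. The helper names ssh_software_version and http_server_header are hypothetical. The SSH helper relies on the identification line having the form "SSH-protoversion-softwareversion", optionally followed by a space and comments. Keep in mind that a server may omit or falsify either string.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Copy at most outlen - 1 bytes of src[0..n) into out and terminate. */
static void copy_span(char *out, size_t outlen, const char *src, size_t n)
{
    if (n >= outlen)
        n = outlen - 1;
    memcpy(out, src, n);
    out[n] = '\0';
}

/* Extract the software version from an SSH identification line.
 * Returns 1 on success, 0 if the banner does not have the expected form. */
int ssh_software_version(const char *banner, char *out, size_t outlen)
{
    if (outlen == 0 || strncmp(banner, "SSH-", 4) != 0)
        return 0;
    size_t linelen = strcspn(banner, "\r\n");   /* first line only */
    const char *dash = memchr(banner + 4, '-', linelen - 4);
    if (dash == NULL)
        return 0;
    const char *ver = dash + 1;
    size_t n = strcspn(ver, " \r\n");           /* stop before comments */
    if (n == 0)
        return 0;
    copy_span(out, outlen, ver, n);
    return 1;
}

/* Extract the value of the first "Server:" header from an HTTP response.
 * Header names are matched case-insensitively, and the search stops at
 * the blank line that ends the headers. Returns 1 if found, else 0. */
int http_server_header(const char *resp, char *out, size_t outlen)
{
    static const char key[] = "Server:";
    if (outlen == 0)
        return 0;
    const char *line = resp;
    while (line != NULL && *line != '\0') {
        if (line[0] == '\r' || line[0] == '\n')
            break;                               /* end of headers */
        if (strncasecmp(line, key, sizeof key - 1) == 0) {
            const char *val = line + sizeof key - 1;
            while (*val == ' ' || *val == '\t')
                val++;
            copy_span(out, outlen, val, strcspn(val, "\r\n"));
            return 1;
        }
        const char *nl = strchr(line, '\n');
        line = nl ? nl + 1 : NULL;
    }
    return 0;
}
```

Both helpers truncate rather than overflow when the remote string is longer than the output buffer, in keeping with the requirement that hostile responses must not crash the scanner.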
Getting Started: Begin by familiarizing yourself with the Nmap software. A simple starting point is to scan your machine, aka, localhost, via "nmap 127.0.0.1". Another useful resource is telnet, which will allow you to interact with a server using a plain text command line. For example, "telnet burrow.cs.indiana.edu 22" will allow you to connect to the SSH service running on burrow.cs.indiana.edu on port 22. burrow.cs.indiana.edu will respond by telling you a bit about the SSH service and will wait for you to send the appropriate authentication.
Implementation Specifics: As stated earlier, you are required to use either C or C++ for this assignment. Additionally, you must use the native Linux/BSD socket system calls to open TCP and UDP sockets. You may not use other socket libraries. No credit will be given to solutions that do not adhere to these requirements. These restrictions are being made so you become familiar with the details of lower-level socket programming.
To create a multi-threaded version of your program, use the pthreads library. This library can be enabled with the gcc compiler by specifying the -pthread option. Additionally, the -D_REENTRANT gcc option, which asks library headers to expose their re-entrant declarations, may come in handy because it can help in cases where two threads call the same library function.
The following system and library calls will likely be required to complete the assignment: connect, htons, pthread_cancel, pthread_create, pthread_join, pthread_kill, recv, send, setsockopt, sleep, socket. You may also want to look at the more advanced synchronization methods available in pthreads.
Other Resources: You cannot copy any code from the Internet. However, you are encouraged to avail yourself of other Internet resources and the Linux manual pages when completing this project. Socket tutorials such as http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html will be helpful in understanding socket programming. However, you must ensure that you write your own code and that you are explicit about what resources you use, including web tutorials and discussions with individuals outside of your group.
Experiment with Caution: Exercise frugality in testing your program since system administrators of organizations whose machines you scan would likely get upset and complain to our system administrators, who in turn may have to limit the scope of the project beyond the point where you would find it a useful practical experience.
Roadmap: Build your program incrementally. Here is a roadmap to use:
Submit your code and project files as a single archive file (.tar or .tar.gz file formats only) via OnCourse by 11:59pm of the day of the deadline. Shortly after the submission deadline, demo slots will be posted on the Demonstration Scheduling System (a reminder will be posted on the Web Board). You must schedule an appointment to demonstrate your project. Groups that fail to demonstrate their project will not receive credit for the project. If a group member fails to attend his or her scheduled demonstration time slot, it will result in a 10 point reduction in his or her grade.
In addition to testing your code for various test cases, the AIs will be explicitly evaluating the contributions of individual project partners. In cases where they determine that partners have not contributed equally, differential grading will be used. The instructor and the AIs reserve the right to determine the appropriate penalty in such cases.