Finding System Facts


Most of the following is Linux-specific, but many other *nix systems also support the /proc or /sys virtual filesystems that provide much of the information. On Unix, devices, sockets, and other data streams are treated as files, and such "virtual" or fake filesystems simplify access to them. One such filesystem found on every Unix system I know of is /dev, which holds the device files used, among other things, for mounting and unmounting devices. One quirk of a virtual filesystem like /proc is that the 'files' it contains often report a size of zero: the kernel generates the contents only when a process reads the file, so the file only seems to be populated at that moment.

As an example, the /proc/cmdline file holds the boot command used in starting the system. Running a long listing on the file shows it is empty, with 0 bytes:

    > ls -l cmdline 
    -r--r--r-- 1 root root 0 Oct  9 11:11 cmdline
    
But concatenating the contents to the terminal shows the actual information:

    > cat cmdline 
    BOOT_IMAGE=/boot/vmlinuz-3.13.0-37-generic root=UUID=0e6ce68c-1d0b-4af0-9239-2091882d7176 ro quiet splash
    

After the cat command, a long listing will still show a size of 0 bytes, because the contents are generated on each read and never stored.
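The same mismatch between listed size and actual content shows up from a script. Here is a minimal Python sketch, assuming a Linux /proc is mounted. The helper names sizes and parse_cmdline are made up for this illustration, and the parser is naive: it splits on whitespace and ignores any quoting in the command line.

```python
import os


def sizes(path):
    """Return (size reported by stat, number of bytes actually read)."""
    reported = os.stat(path).st_size
    with open(path, "rb") as f:
        actual = len(f.read())
    return reported, actual


def parse_cmdline(text):
    """Split a kernel command line into a dict; bare flags map to True."""
    options = {}
    for word in text.split():
        # Split on the first '=' only, so root=UUID=... keeps its full value.
        key, sep, value = word.partition("=")
        options[key] = value if sep else True
    return options


if __name__ == "__main__" and os.path.exists("/proc/cmdline"):
    print(sizes("/proc/cmdline"))  # on procfs the reported size is 0
    with open("/proc/cmdline") as f:
        print(parse_cmdline(f.read()))
```

Any code that sizes a read buffer from the stat result will read nothing from such a file, so read until end-of-file instead.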

Another minor issue is permissions; not all of the files shown can be read by a non-root user. Doing ls /proc will show many directories that are named with numbers. Those are process IDs, and each active process has a subdirectory in /proc. The files in one of those directories give details about the given process: the owner and group, memory usage, open files, associated timers, etc. This information is a potential security hole if not protected by Unix file and group permissions. However, /proc has detailed information that a ps command does not easily provide. In fact, the ps command gets its information from procfs.
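To make the per-process directories concrete, here is a small Python sketch that lists the numeric directories and reads one status file. It assumes the Linux layout described in proc(5). The function names list_pids and parse_status are hypothetical helpers, not part of any library.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the numeric directory names under proc_root as sorted ints."""
    return sorted(
        int(name)
        for name in os.listdir(proc_root)
        if name.isdecimal() and os.path.isdir(os.path.join(proc_root, name))
    )


def parse_status(text):
    """Turn the 'Key: value' lines of a /proc/<pid>/status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__" and os.path.isdir("/proc/self"):
    # /proc/self always refers to the process doing the reading.
    with open("/proc/self/status") as f:
        me = parse_status(f.read())
    print(len(list_pids()), "processes visible")
    print(me.get("Name"), me.get("Uid"), me.get("VmRSS"))
```

Reading the status file of a process owned by another user may fail or show less, depending on how /proc is mounted on the system.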

You can (given sufficiently high privileges) change kernel settings dynamically by writing values into some /proc files, especially in /proc/sys. As an example,

    echo 3 >! /proc/sys/vm/drop_caches
will free pagecache, dentries and inodes so that the system will show the actual amount of free memory, not including unused cached data. [The >! redirection is csh/tcsh syntax that overrides noclobber; in bash, use a plain > instead. Dropping the caches makes the system reload data as needed, so it typically will make user response much slower until the flushed data is restored.]

Another example of using procfs: I often need to get the cache size of the current system, so the getcachesize script reads the information from /proc/cpuinfo and turns it into a line that can be inserted into a Matlab function or script.

However, the /proc filesystem is recreated on each boot, so any changes you make there are temporary. Most Unix systems now prefer changing kernel settings and related system values using the sysctl interface, or modifying contents of files in the /sys virtual filesystem.
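The getcachesize script itself is not reproduced here. The following Python sketch is a stand-in for the same idea, not the original script. It assumes the x86-style "cache size : N KB" line in /proc/cpuinfo, which other architectures may not provide, and the Matlab variable name cachesize is a hypothetical choice.

```python
import re


def cache_size_kb(cpuinfo_text):
    """Return the first 'cache size' value in KB, or None if the field is absent."""
    match = re.search(r"^cache size\s*:\s*(\d+)\s*KB", cpuinfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def matlab_line(kb):
    """Format the value as a Matlab assignment (the variable name is hypothetical)."""
    return f"cachesize = {kb} * 1024;  % bytes, from /proc/cpuinfo"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            kb = cache_size_kb(f.read())
    except OSError:
        kb = None  # no procfs on this system
    print(matlab_line(kb) if kb is not None else "% cache size not reported")
```

The value in /proc/cpuinfo is usually that of one cache level only, so treat it as a rough figure rather than a full description of the cache hierarchy.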

Wikipedia has an article about procfs, and on a Unix system man -s5 proc (or, if it is in a different man section, man -a proc) will show the details of the proc filesystem on the target platform. The Linux Documentation Project has a detailed article on procfs.


Basic Info

The uptime command gives how long it has been since the system was booted, how many "users" are logged in, and the load average for the past 1, 5, and 15 minutes. See man uptime for how load average is defined. A load average of 3.0 might mean a single-core system is heavily loaded, but a 24-CPU system is lightly loaded. By itself, uptime gives at most a crude measure.
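One way to make the load average comparable across machines is to divide it by the number of logical CPUs. Here is a minimal Python sketch; the name load_per_cpu is made up for this note, and the result is still only a rough indicator.

```python
import os


def load_per_cpu(load, ncpu):
    """Scale a load average by the number of logical CPUs."""
    return load / ncpu


if __name__ == "__main__":
    one, five, fifteen = os.getloadavg()  # the same three numbers uptime prints
    ncpu = os.cpu_count() or 1            # cpu_count() may return None
    print(f"1-min load {one:.2f} on {ncpu} CPUs: {load_per_cpu(one, ncpu):.2f} per CPU")
```

With the numbers from the text, a load of 3.0 is 3.0 per CPU on a single-core machine but only 0.125 per CPU on a 24-CPU machine.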

The top command gives a dynamic view of the processes running on the system. Because thousands of processes are often running on shared systems, top is best used with options to narrow down the information. For example,

    top -d 1.0 -u p573 -H
will show all of the running processes (and individual threads) owned by user p573, with a 1 second delay between updates. [How do I know all of those cryptic options? I don't. Instead I created a shell alias for the incantations that seem most useful. The last one is aliased to topthreads on systems I often use.] A useful variant of top is htop, which presents the information in a better organized manner. While top is installed on nearly all Unix systems, htop is an add-on that most distributions do not install by default. Another non-default tool, more useful for tracking down processes than for gathering performance information for scientific computing, is pstree.


Less Basic Info

The command lspci shows what hardware is attached to a system's motherboard via a PCI interconnect. This is particularly useful for getting information about a GPU if it is used for computations. As with many Unix commands, adding the options -v, -vv, or -vvv gives successively more information about the PCI subsystem.

lshw shows more details about all of the hardware, including the brand and timing of memory modules. It will give a warning that not all information can be provided to a non-root user.


Finding CPU Info

It is never enough to identify the hardware used in a performance timing and analysis by just giving a name like silo.cs.indiana.edu. The configuration and underlying hardware change constantly, especially for servers. The computer name shortly.cs.indiana.edu has been in use for 12 years, and at various times has been an SGI workstation, a Sun Microsystems computer, and eventually a gamer's IBuyPower system. The number of cores has ranged from one to six, and the memory subsystem has gone from 2 Gbytes to 64 and is now 32 Gbytes. The hardware configuration can change in mid-semester; about the only guarantee until recently is that the hardware will not change during a single process's run. Always acquire the information at the time a code is run.
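Capturing the hardware description at run time can be as simple as writing a small record next to the timing results. The Python sketch below assumes the x86-style "processor" and "model name" fields of /proc/cpuinfo, which differ on some architectures. The record layout and the function names summarize_cpuinfo and run_record are illustrative choices, not a standard.

```python
import json
import platform
import time


def summarize_cpuinfo(text):
    """Count 'processor' entries and collect the distinct 'model name' values."""
    logical, models = 0, []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "model name" and value not in models:
            models.append(value)
    return {"logical_cpus": logical, "models": models}


def run_record(cpuinfo_text):
    """Bundle host name, timestamp, and CPU summary for storing next to timings."""
    record = {
        "host": platform.node(),
        "time": time.strftime("%Y-%m-%dT%H:%M:%S%z"),
        "machine": platform.machine(),
    }
    record.update(summarize_cpuinfo(cpuinfo_text))
    return record


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # no procfs; the record then reports zero CPUs
    print(json.dumps(run_record(text), indent=2))
```

Note that "processor" counts logical CPUs, so a machine with hyperthreading reports more entries than it has physical cores.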


Finding Memory Info

Similar to the CPU information, the proc filesystem gives info about memory, cache sizes and cache alignment. cat /proc/meminfo will give a dump of the memory status and availability. If available, lscpu shows the number and sizes of caches, and whether or not a separate instruction cache (e.g. L1i) is used. In large-scale numerical computing the instruction cache is rarely a limiting factor for performance, because large vectors and arrays are processed using a small number of machine instructions repeated many times in a loop.
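The contents of /proc/meminfo are easy to pull into a script. This Python sketch assumes the usual "Name: value kB" layout; the function name parse_meminfo is a hypothetical helper.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            mem = parse_meminfo(f.read())
    except OSError:
        mem = {}  # no procfs on this system
    for field in ("MemTotal", "MemFree", "Cached"):
        if field in mem:
            print(f"{field}: {mem[field] / 1024:.0f} MiB")
```

A few fields, such as the HugePages counts, are plain numbers without a kB unit, so check the field before converting.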

Many Unix commands related to memory are focused on the virtual memory status, dating back to the low memory hardware days of the 1970s and 1980s.

The general tool is vmstat, which as with most commands has many options. The one I use most is /usr/bin/vmstat -s -S M, which displays the information in a slightly more readable format than vmstat alone does.

The sys virtual filesystem also provides extensive status information, and my vm-snapshot script reads that information and creates a file named "vmlist" which is then displayed in tabular format. Beware that it creates a file in the directory it is run from, and so might fail if you are not running from a directory where you have write permission.
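As an illustration of reading sysfs from a script (this is not the vm-snapshot script), the Python sketch below lists the caches that Linux describes under /sys/devices/system/cpu/cpu0/cache. That path is an assumption about the local system; some kernels and virtual machines do not populate it. The function name read_caches is hypothetical.

```python
import glob
import os


def read_caches(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return a list of (level, type, size) strings for one CPU's caches."""
    caches = []
    for index in sorted(glob.glob(os.path.join(cpu_dir, "cache", "index*"))):
        entry = []
        for name in ("level", "type", "size"):
            try:
                with open(os.path.join(index, name)) as f:
                    entry.append(f.read().strip())
            except OSError:
                entry.append("?")  # attribute missing or unreadable
        caches.append(tuple(entry))
    return caches


if __name__ == "__main__":
    # Prints to the terminal, so no file is created in the current directory.
    for level, kind, size in read_caches():
        print(f"L{level} {kind:<12} {size}")
```

If the directory does not exist, the function returns an empty list and nothing is printed.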