Assignment 11
Address Labels
Pair or individual programming in this lab and assignment.
In Lab
Fill in the ellipsis:
# lab11.py, by chaynes@indiana.edu
import random
def write_random(file_name, max_value, num_values):
"""
Writes num_values random numbers, one per line, to the named file.
The numbers are evenly distributed between 0 and max_value.
>>> write_random('random.txt', 10, 4)
>>> print file('random.txt').read()
6.1067434776
6.21454289438
1.89631011328
5.30140893185
<BLANKLINE>
>>>
"""
#.
#.
#.
#.
#.
#.
def sum_file(file_name):
"""
Return the sum of the numbers in the named file. Assume there is at most
one number per line, but that some lines may contain only whitespace, and
that the numbers have the syntax of Python floating point or integer
literals.
>>> print file('data.txt').read()
3
5.4
<BLANKLINE>
-2
<BLANKLINE>
>>> sum_file('data.txt')
6.4000000000000004
>>>
"""
#.
#.
#.
#.
#.
#.
#.
#.
#.
def filter_file(in_file_name, out_file_name, filter_string):
"""
Writes to the named output file every line of the named input file that
contains filter_string. You can think of it as filtering out the lines that
don't contain the filter string.
>>> print file('dir_list.txt').read()
60 -rwx------+ 1 Chris Ha None 61440 Apr 11 2005 Removell-xist.exe*
60 -rwx------+ 1 Chris Ha None 61440 Jul 16 18:00 Removeorg.exe*
60 -rwx------+ 1 Chris Ha None 61440 Oct 29 15:43 Removepy2exe.exe*
60 -rwx------+ 1 Chris Ha None 61440 Jul 20 14:52 Removepycrypto.exe*
60 -rwx------+ 1 Chris Ha None 61440 May 6 12:31 Removepymedia.exe*
60 -rwx------+ 1 Chris Ha None 61440 Jul 5 11:05 Removewebstring.exe*
0 drwx------+ 2 Chris Ha None 0 Oct 29 15:43 Scripts/
0 drwx------+ 5 Chris Ha None 0 Jul 7 12:43 Share/
0 drwxrwx---+ 7 Administ SYSTEM 0 Jul 17 21:06 Tools/
<BLANKLINE>
>>> filter_file('dir_list.txt', 'oct_files.txt', ' Oct ')
>>> print file('oct_files.txt').read()
60 -rwx------+ 1 Chris Ha None 61440 Oct 29 15:43 Removepy2exe.exe*
0 drwx------+ 2 Chris Ha None 0 Oct 29 15:43 Scripts/
<BLANKLINE>
>>>
"""
#.
#.
#.
#.
#.
#.
#.
#.
#.
#.
The dir_list.txt file is a sample Windows command shell directory listing from which the test example extracts those files created in October.
Here's some pseudo-code for the filter_file function body:
open the input file and assign the input file object to a variable
open the output file and assign the output file object to a variable
while True
    read a line
    if the empty string was read, break from the loop
    if the line contains the application argument string
        then write the line using the output file object
close both file objects
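The read-a-line loop in this pseudo-code can be illustrated on a smaller, related task. The following is only a sketch: the function count_matching_lines and its parameter names are hypothetical and are not part of the lab. It counts matching lines instead of writing them, so the output-file steps are left out.

```python
def count_matching_lines(file_name, search_string):
    """Return how many lines of the named file contain search_string."""
    in_file = open(file_name)          # reading is the default mode
    count = 0
    while True:
        line = in_file.readline()      # keeps the trailing newline, if any
        if line == '':                 # only end of file gives the empty string;
            break                      # a blank line is read as '\n'
        if search_string in line:
            count += 1
    in_file.close()
    return count
```

Because a blank line is read as a newline character rather than as the empty string, the loop does not stop early at blank lines.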
This week it is particularly important to study the class notes while completing the lab and assignment. The small amount of material in your text related to files uses mechanisms somewhat different from those presented in class, so it is best to ignore it.
Leave at least 20 minutes before the end of your lab to study this week's assignment and draft pseudo-code for its solution.
To get participation credit for this lab, show your program file and your pseudo-code to your lab instructor. There is no electronic lab submission this week.
Assignment
A Microsoft Outlook address book (contacts database) may be exported to a file in a comma-separated-value format. Here are the first few lines of a simplified address book example:
First Name,Last Name,Home Street,Home City,Home State,Home Postal Code,Telex
Sally,Porter,,4324 Hedrick Ln,,Jersy City,NJ,23403,HB
Sam,Proxmire,,125 Hampton Cr,,Jefferson,NY,32084,BA
Sue,Pinter,,32 Hawthorn Dr,,Jackson,MS,13034,HB
Each line contains the information for one contact. It corresponds to a table row, as when a csv file is viewed as a spreadsheet. (In database terminology each row is a "record".) The values in each column position of a row are separated by commas. (In database terminology, each column corresponds to a "field".) The first line is an exception: it is a header line, containing the names of each column.
An exported MS Outlook contacts database contains the column names in the example above and many others, most of which you're unlikely to ever use. You might use the column named Telex (whose original purpose is now thoroughly obsolete) for your own special purpose: by putting an H character in this column, you indicate that the contact is to be on your holiday mailing list. (For other purposes there might also be other characters in this column.) In the example above, Sue and Sally are to get a holiday card, but not Sam.
This assignment is to write a function named output_labels that takes as arguments the name of a csv file and the name of an output file, and writes mailing labels for your holiday cards to the output file. Assume the csv file contains, among others, the columns in the example above. Each label should be output in the standard address format, as in the following example:
Sally Porter
4324 Hedrick Ln
Jersy City, NJ 23403

Sue Pinter
32 Hawthorn Dr
Jackson, MS 13034

Each label is followed by a blank line, so they don't run together.
Be sure to output labels only for contacts with an H character in the Telex column. First assume the columns are just those in the example above, in the same order. Then, for A level work, modify your program so that there may be any number of columns, in any order. In any case, you are only concerned with the columns in the example, whose header names are exactly as given.
You could read the csv file and parse it into rows and columns using what you already know. But that would be a fair amount of work, and processing csv files is so common that the Python module library has code that does this for you.
Simply import the module csv, open the csv file for reading and store the file object in a variable, say csv_file, and then use the expression
list(csv.reader(csv_file))
to read the entire file and return it as a table. That is, as a list of row lists, where each row list contains the values associated with each column.
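As a small illustration of the table that this expression produces, here is a sketch using made-up data. The names sample_lines, header, first_row and cell are hypothetical. A list of lines stands in for the open file object, which works because csv.reader accepts any object that yields one line at a time.

```python
import csv

# Hypothetical sample data standing in for an opened csv file.
sample_lines = [
    'First Name,Last Name,Telex',
    'Sally,Porter,HB',
    'Sam,Proxmire,BA',
]

table = list(csv.reader(sample_lines))

header = table[0]      # ['First Name', 'Last Name', 'Telex']
first_row = table[1]   # ['Sally', 'Porter', 'HB']
cell = table[2][1]     # 'Proxmire': row index 2, column index 1
```

Every value in the table is a string, including values that look like numbers, such as postal codes.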
Begin by constructing a csv test file. You might extend the above example, or create one from scratch. Be sure to mix up the column order and add extra columns, which must have new column names. Data values may not contain commas, unless more advanced csv formatting possibilities are used. You may use a spreadsheet to generate your test file, selecting csv format when you save it. (Some spreadsheets will surround every cell value with double quotes, but that doesn't matter: the csv module reader will remove them.)
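The effect of double quotes can be checked with a short experiment. This is a sketch with invented data (the "Apt 2" street value is hypothetical). It shows the reader removing the surrounding quotes, and a quoted value keeping a comma that would otherwise split it into two columns.

```python
import csv

# Hypothetical lines, as a spreadsheet that quotes every cell might save them.
quoted_lines = [
    '"First Name","Home Street"',
    '"Sally","4324 Hedrick Ln, Apt 2"',
]

rows = list(csv.reader(quoted_lines))

# rows[0] is ['First Name', 'Home Street']
# rows[1] is ['Sally', '4324 Hedrick Ln, Apt 2']
```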
Develop this function in stages. For example, you might first just have it use the csv reader function to read your test file (indicated as the first command argument) and print the resulting table. Then write to a file named by the second argument just the first column of the table.
Use a helper function, called format_label, that takes the strings that make up the parts of the address label and returns the label as a single string (with newline characters in it).
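One possible shape for such a helper is sketched below. The parameter names and their order are assumptions, since the assignment only fixes the function name and the idea of returning one string. Whether the separating blank line belongs in the returned string or is written by the caller is also a design choice. This sketch leaves it to the caller.

```python
def format_label(first, last, street, city, state, postal_code):
    """Return a three-line address label as one string containing newlines."""
    name_line = first + ' ' + last
    city_line = city + ', ' + state + ' ' + postal_code
    # The final '\n' ends the third line. A caller can write one more '\n'
    # after each label to get the blank line that separates labels.
    return name_line + '\n' + street + '\n' + city_line + '\n'


label = format_label('Sally', 'Porter', '4324 Hedrick Ln',
                     'Jersy City', 'NJ', '23403')
# label is 'Sally Porter\n4324 Hedrick Ln\nJersy City, NJ 23403\n'
```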
Finally, for those table rows that have the H code, call the label formatting function with the appropriate string values from the table row and write the resulting strings to the output file.
Submit your program as usual via Oncourse.
Hints
Some rough pseudo-code:
import csv module
define the format_label function
define the output_labels function
    prompt for the input and output file names
    open the csv (input) file
    table = list(csv.reader(csv_file))
    close the csv file
    define a variable for each of the columns of interest containing the index of the column in the table (index of column values in the row lists)
    open the labels (output) file for writing
    for each index of the table list
        assign the table row indicated by the table index to variable row
        if 'H' is in the telex column of the row
            extract into a separate variable (such as city, state, etc) each column value that is needed for a label
            format the label
            print the formatted label
    close the labels file
define a test function
invoke the test function

To handle columns that are in any order, including additional columns of no interest (required for an A on this assignment), use the list index method.
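A brief sketch of how the list index method finds a column position from the header row follows. The header and row below are invented for illustration, with the columns deliberately shuffled and an unused column added. The variable names are hypothetical.

```python
# Hypothetical header and data rows, as they would appear in the table.
header = ['Telex', 'Home City', 'Extra', 'First Name']
row = ['HB', 'Jackson', 'ignored', 'Sue']

# index returns the position of the first matching element.
telex = header.index('Telex')              # 0
first_name = header.index('First Name')    # 3

on_holiday_list = 'H' in row[telex]        # True, since 'HB' contains 'H'
name = row[first_name]                     # 'Sue'
```

If the requested name is not in the list, index raises a ValueError, so a misspelled column name fails immediately rather than silently selecting the wrong column.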
The easiest way to manipulate the data, such as adding extra rows or columns, is using a spreadsheet and saving the result as a csv file.