Due Thursday, October 7th, 3:00 PM
Individual work this week.
>python lab4.py Monty test
Traceback (most recent call last):
File "lab4.py", line 12, in ?
main()
File "lab4.py", line 7, in main
file = open(fileName)
IOError: [Errno 2] No such file or directory: 'test'
This is not friendly to the user. Though the message generated by the system does indicate the problem is that the file does not exist, this is buried in a lot of information that may help a programmer to debug programs, but is likely to confuse or even alarm most users.
This program can be used to count not just single words, but also phrases consisting of words separated by spaces (but not newlines). However, if the user fails to put quotes around the phrase, the result is a message that is of no help to anyone who is not a programmer.
>python lab4.py Monty Python test.txt
Traceback (most recent call last):
File "lab4.py", line 12, in ?
main()
File "lab4.py", line 6, in main
word, fileName = sys.argv[1:]
ValueError: unpack list of wrong size
The shell (command line interpreter) parsed Monty and Python as two separate arguments, so the program was passed three arguments instead of the two it was designed for. Giving a command the wrong number of arguments is a common error, usually due to the user simply forgetting an argument. If the program had been written differently, it might have simply ignored the extra argument and tried to use the word Python as the file name, resulting in a different error message that might be even more confusing to the user.
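The splitting described above can be reproduced with Python's standard shlex module, which follows POSIX shell quoting rules. This is only an approximation of what your own shell does (the Windows command interpreter differs in some details), and the helper name program_args is made up for this illustration.

```python
import shlex

def program_args(command_line):
    """Return the arguments a program would see in sys.argv[1:].

    Assumes POSIX-style quoting. The first two tokens are taken to be
    the interpreter name and the script name, so they are dropped.
    """
    return shlex.split(command_line)[2:]

# Without quotes the phrase becomes two arguments, three in total.
print(program_args('python lab4.py Monty Python test.txt'))
# -> ['Monty', 'Python', 'test.txt']

# With quotes the phrase stays together as a single argument.
print(program_args('python lab4.py "Monty Python" test.txt'))
# -> ['Monty Python', 'test.txt']
```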
(Though widely used software should never present users with unintelligible error messages, it is all too common to encounter them. Knowing some programming often helps in understanding such messages, even if one is not familiar with the program or even the language it is written in. This is another benefit of learning some programming.)
Start with your lab4.py solution, or the one posted on the course web. Rename it lab5.py and modify it to provide a helpful message if the wrong number of arguments is given or the file does not exist. For example:
>python lab5.py Monty test
Error: test is not a readable file
>python lab5.py Monty Pyton test.txt
Usage: python lab5.py word file
Returns the number of times word occurs in file.
The word argument may be a phrase containing spaces if it is quoted.
>python lab5.py "Monty Python" test.txt
1
Notice that the file name was included in the first error message. The second error message might have been just Error: wrong number of arguments, but instead it follows a practice commonly used by Unix utilities of printing brief documentation for the command, beginning with a Usage line.
You will need a conditional statement to determine whether the right number of arguments was given. Before trying to open the file, use the expression os.access(fileName, os.R_OK), which returns true if and only if the file whose name (path) is stored in the variable fileName exists and is readable. The os module must be imported first.
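The two checks might be arranged as in the following minimal sketch. It is not a complete solution: the helper name check_args and the one-line usage text are assumptions made for illustration, and the word counting from lab4.py is left out.

```python
import os

def check_args(args):
    """Return an error message for bad arguments, or None if they are usable.

    args is expected to be sys.argv[1:], that is, [word, fileName].
    """
    if len(args) != 2:
        return 'Usage: python lab5.py word file'
    word, fileName = args
    # os.access with os.R_OK is true only if the path exists and is readable.
    if not os.access(fileName, os.R_OK):
        return 'Error: ' + fileName + ' is not a readable file'
    return None

# Three arguments instead of two: the usage text is returned.
print(check_args(['Monty', 'Python', 'test.txt']))
```

In a real program the caller would pass sys.argv[1:], print any message that comes back, and stop before trying to open the file.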
You may wish to use another feature of Python: if the first expression or statement in a module is a string, it is stored in a module variable named __doc__. If the program prints general documentation of its usage in response to errors, the documentation can then be placed where the programmer most expects to find it, at the beginning of the program module, and not have to be repeated for the error message. Python has a number of such handy features that enhance the pleasure of programming.
>>> numbersToStrings(3)
Traceback (most recent call last):
File "", line 1, in -toplevel-
lab3.numbersToStrings(3)
File "C:\home\202\a\5\lab3.py", line 11, in numbersToStrings
for n in numList:
TypeError: iteration over non-sequence
>>> getTitle('<html><body><titl>Holy Grail</titl>')
'<body><titl>Holy Grail</titl'
For a function intended to be used by programmers, rather than an application often used by non-programmers, the traceback information is appropriate, for it helps the programmer know where the error occurred. But the TypeError: iteration over non-sequence message is not very helpful, even to a programmer. It would be nice to know what the non-sequence was and that it was supposed to be a list. The behavior of getTitle is even less satisfactory from a programmer's perspective: it silently returns garbage! This will surely result in a problem at some later point in the program that calls this function, and it may be difficult then to determine where the error originated.
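One plausible way such garbage arises is sketched below. It assumes getTitle is built on the string method find, which returns -1 when the text it searches for is missing; the actual lab3.py code is not shown here, so the function name getTitleUnchecked and its body are a hypothetical reconstruction.

```python
def getTitleUnchecked(string):
    """Hypothetical reconstruction of a getTitle that does no checking."""
    startTitleTag = '<title>'
    endTitleTag = '</title>'
    # find returns -1 when the tag is missing, and that value is used as is.
    start = string.find(startTitleTag) + len(startTitleTag)
    end = string.find(endTitleTag)
    return string[start:end]

# Well-formed input gives the expected title.
print(getTitleUnchecked('<html><title>Holy Grail</title></html>'))
# -> Holy Grail

# With both tags misspelled, start becomes 6 and end becomes -1,
# so the slice silently returns part of the page.
print(getTitleUnchecked('<html><body><titl>Holy Grail</titl>'))
# -> <body><titl>Holy Grail</titl
```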
Modify these functions so they handle bad arguments more appropriately, as illustrated by the following:
>>> numbersToStrings(3)
Traceback (most recent call last):
File "", line 1, in -toplevel-
lab5.numbersToStrings(3)
File "C:\home\202\a\5\lab5.py", line 11, in numbersToStrings
raise 'Not a list: ' + str(numList)
Not a list: 3
>>> numbersToStrings((3,4))
['3', '4']
>>> numbersToStrings(['three', False])
['three', 'False']
>>> getTitle('<html><body><titl>Holy Grail</titl>')
Traceback (most recent call last):
File "<pyshell#19>", line 1, in -toplevel-
lab5.getTitle('<html><body><titl>Holy Grail</titl>')
File "C:\home\202\a\5\lab5.py", line 26, in getTitle
raise 'No ' + endTitleTag + ' in ' + string
No </title> in <html><body><titl>Holy Grail</titl>
This is much better, but does not try to catch every possible error. It turns out, at least the way the sample program was written, that numbersToStrings will work properly given a tuple instead of a list. There is probably no harm in allowing this, and it might be appreciated. (However, using features of a program that are not documented, but just happen to work, is always dangerous. Such features may, without warning, fail to work when the program is revised.) Also observe that this function actually converts a list of any values at all, not just numbers, to their printed representations. Taking advantage of this is even more likely to lead to trouble. If you wish, you may add a test to generate an error if the sequence elements are not numbers. Hint: the expression type(x) == list only returns true if x is a list. Corresponding expressions may be used to test for other types (such as int, float, or str).
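The checks discussed above might look like the following sketch. The transcripts raise plain strings, which the Python version used for them allowed; current Python versions require an exception object, so this sketch raises TypeError and ValueError instead. The function bodies are assumptions, since the lab3.py originals are not shown here, and the element test is the optional stricter variant.

```python
def numbersToStrings(numList):
    """Convert a list or tuple of numbers to a list of strings."""
    if type(numList) != list and type(numList) != tuple:
        raise TypeError('Not a list: ' + str(numList))
    for n in numList:
        # Optional stricter test: reject elements that are not numbers.
        if type(n) != int and type(n) != float:
            raise TypeError('Not a number: ' + str(n))
    return [str(n) for n in numList]

def getTitle(string):
    """Return the text between <title> and </title>, or raise ValueError."""
    startTitleTag = '<title>'
    endTitleTag = '</title>'
    start = string.find(startTitleTag)
    end = string.find(endTitleTag)
    # find returns -1 when the tag is absent, so test before slicing.
    if end == -1:
        raise ValueError('No ' + endTitleTag + ' in ' + string)
    if start == -1:
        raise ValueError('No ' + startTitleTag + ' in ' + string)
    return string[start + len(startTitleTag):end]

print(numbersToStrings((3, 4)))
print(getTitle('<html><title>Holy Grail</title></html>'))
```

With the stricter element test in place, a call such as numbersToStrings(['three', False]) raises an error instead of returning strings.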
Though it is often handy when the values of offending data are printed as part of an error message, it is possible that the bad data may be such a large data structure that it takes many pages to print its value! Though this sort of problem must sometimes be taken seriously, techniques for dealing with it are beyond the scope of this assignment.
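Although a full treatment is outside this assignment, one simple approach is to cut the printed value off at a fixed length. The sketch below is only an illustration; the helper name shortRepr and the limit of 60 characters are arbitrary choices, not part of the lab.

```python
def shortRepr(value, limit=60):
    """Return repr(value), cut off after limit characters."""
    text = repr(value)
    if len(text) > limit:
        # Mark the cut so the reader knows the value was shortened.
        text = text[:limit] + '...'
    return text

# A short value is shown in full.
print('Not a list: ' + shortRepr(3))
# A very large value is shortened to its first 60 characters.
print('Not a number: ' + shortRepr(list(range(10000))))
```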
When you have completed the last exercise above, or 15 minutes before the lab ends, whichever comes first, submit your lab5.py file as lab 5 in Vincent.
>python wc.py test.txt Monty Python wc.py
1 4 23 test.txt
Error: Monty is not the name of a readable file
Error: Python is not the name of a readable file
34 120 1006 wc.py
35 124 1029 total
For example:
>>> date('123456')
Traceback (most recent call last):
File "<pyshell#12>", line 1, in -toplevel-
date('123456')
File "C:\home\202\a\5\wc.py", line 35, in date
raise 'Bad day in: ' + mmddyy
Bad day in: 123456
When you are done, submit your final wc.py file as a5 using Vincent.