CSCI A114 / INFO I111
Lecture Notes Two

First semester 2000-2001


Relational databases: Terminology.

Recall that a database system is a computerized record-keeping system

(a kind of electronic file cabinet)
whose overall purpose is to The user of such a system will be given facilities to perform a variety of operations, the following among others: What have we done so far, in the lab?

We've learned how to

Let's take a look at our tables.

There are two of them.

First, we have the Marina table:

Marina table
Marina Number Name Address City State Zip Code Warranty Non-warranty Tech Number
AD57 Alan's Docks 314 Central Burton MI 49611 $1,248.00 $597.75 23
AN75 Afton's Marina 21 West 8th Glenview MI 48121 $1,906.50 $831.25 36
BL72 Brite's Landing 281 Robin Burton MI 49611 $217.00 $0.00 36
EL25 Elend Marina 462 River Torino MI 48268 $413.50 $678.75 49
FB96 Fenton's Boats 36 Bayview Cavela MI 47926 $923.20 $657.50 23
FM22 Fedder Marina 283 Waterfront Burton MI 49611 $432.00 $0.00 36
JB92 JT Boat Club 28 Causeway Torino MI 48268 $0.00 $0.00 36
NW72 Nelson's Wharf 27 Lake Masondale MI 49832 $608.50 $520.00 23
SM72 Solton's Marine 867 Bay Ridge Glenview MI 48121 $462.50 $295.00 49
TR72 The Reef 92 East Bay Woodview MI 47212 $219.00 $0.00 36

The Marina table has the following structure:

Structure of Marina table
Field Name Data Type Field Size Primary Key? Description
Marina Number Text 4 Yes Marina Number (Primary Key)
Name Text 20 Marina Name
Address Text 15 Street Address
City Text 15 City
State Text 2 State (Two-Character Abbreviation)
Zip Code Text 5 Zip Code (Five-Character Version)
Warranty Currency Current Warranty Amount
Non-warranty Currency Current Non-warranty Amount
Tech Number Text 2 Number of Marina's Technician

Then we have the Technician table:

Technician table
Tech Number Last Name First Name Address City State Zip Code Hourly Rate YTD Earnings
23 Anderson Trista 283 Belton Port Anton MI 47989 $24.00 $17,862.00
36 Nichols Ashton 978 Richmond Hewitt MI 47618 $21.00 $19,560.00
49 Gomez Teresa 2855 Parry Ashley MI 47711 $22.00 $21,211.50

The Technician table has the following structure:

Structure of Technician table
Field Name Data Type Field Size Primary Key? Description
Tech Number Text 2 Yes Technician Number (Primary Key)
Last Name Text 10 Last Name of Technician
First Name Text 8 First Name of Technician
Address Text 15 Street Address
City Text 15 City
State Text 2 State (Two-Character Abbreviation)
Zip Code Text 5 Zip Code (Five-Character Version)
Hourly Rate Currency Hourly Rate of Technician
YTD Earnings Currency YTD Earnings of Technician

The fact that Access stores data in tables is not accidental.

Relational Systems and Others

Almost all of the database products developed since the late 1970s have been based on what is called the relational approach (or the relational data model).

What does it mean for a system to be relational?

At this stage in the course the following answer can be given:

Briefly, a relational system is one in which:

  1. the data is perceived by the user as tables (and nothing but tables)

  2. the operators at the user disposal (e.g., for data retrieval) are operators that generate new tables from old.

Let's clarify what we mean by the second aspect, regarding operators. Essentially there will be an operator to extract a subset of the rows of a given table, and another to extract a subset of the columns, and of course a row subset and a column subset of a table can both in turn be regarded as tables themselves.

We can refines this even further, to be more precise.

In a relational database management system the operators at the user's disposal (e.g., for data retrieval) are operators that generate new tables from old, and those operators include at least the following three operators:

  1. SELECT (also know as RESTRICT) -- extracts specified rows from a table

  2. PROJECT -- extracts specified columns from a table

  3. JOIN -- joins together two tables on the basis of common values in a common column

Let's illustrate how we can use these operators.

Queries

Now that Bavant has created a database with marina and technician data, the management and staff of the organization hope to gain the benefits they expected when they set up the database. One of the more important benefits is the capability of easily asking questions concerning the data in the database and rapidly obtaining the answers. Among the questions they want answered are the following:

  1. What are the warranty and non-warranty amounts for marina EL25

  2. Which marinas' names begin with Fe?

  3. Which marinas are located in Burton?

  4. What is the total amount (warranty amount plus non-warranty amount) for each marina?

  5. Which marinas of technician 36 have warranty amounts of more than $1,000 dollars?

You'll learn how to ask these questions of Access in lab.

Until then let's repeat the questions, look at their answers, and think how we could express the queries in terms of the three operations that we have identified above:

Here are the questions again, with their answers:

  1. What are the warranty and non-warranty amounts for marina EL25
    Marina Number Name Warranty Non-warranty
    EL25 Elend Marina $413.50 $678.75
  2. Which marinas' names begin with Fe?
    Marina Number Name
    FB96 Fenton's Boats
    FM22 Fedder Marina
  3. Which marinas are located in Burton?
    Marina Number Name Address
    AD57 Alan's Docks 314 Central
    BL72 Brite's Landing 281 Robin
    FM22 Fedder Marina 283 Waterfront
  4. What is the total amount (warranty amount plus non-warranty amount) for each marina?
    Marina Number Name Total Amount
    AD57 Alan's Docks $1,845.75
    AN75 Afton's Marina $2,737.75
    BL72 Brite's Landing $217.00
    EL25 Elend Marina $1,092.25
    FB96 Fenton's Boats $1,580.70
    FM22 Fedder Marina $432.00
    JB92 JT Boat Club $0.00
    NW72 Nelson's Wharf $1,128.50
    SM72 Solton's Marine
    TR72 The Reef
  5. Which marinas of technician 36 have warranty amounts of more than $1,000 dollars?
    Marina Number Name
    AN75 Afton's Marina

You see that the results are all expressed in terms of tables.

Can you express the results in terms of the relational operators described before?

Some More Terminology

A table (with values, or data, in it) is a relation.

The columns in the table are called attributes.

The number of columns is called the degree of the relation.

Each row in the table is called a record, or a tuple.

The number of tuples in a table is called the cardinality of the relation.

Each attribute has a domain (or range) of values.

Database Design

We can use the database that we have created to introduce some more terminology.

Our database tries to model part of the world which can also be described as follows:

Database design refers to the arrangement of data into tables and fields. In the example in this project, the design was specified, but in many cases, you will have to determine the design based on what you want the system to accomplish. It is the purpose of this class to teach you both how to use Access effectively, as well as how to design well regardless of the specifics of (implementation tools such as) Access.

With large, complex databases, the database design process can be extensive. Major sections of advanced database textbooks are devoted to this topic. Often however you should be able to design a database effectively by keeping one simple principle in mind: Design to remove redundancy. Redundancy means storing the same fact in more than one place.

To illustrate we will look at the JOIN of the two tables we have.

The JOIN will be one new table, with 17 columns:

  1. Marina Number
  2. Name
  3. Address
  4. City
  5. State
  6. Zip Code
  7. Warranty
  8. Non-warranty

  9. Tech Number
  10. Last Name
  11. First Name
  12. Address
  13. City
  14. State
  15. Zip Code
  16. Hourly Rate
  17. YTD Earnings

Notice that the data for a given technician (number, name, and so on) occurs on more than one record.

We can always obtain this data, if we need it, by asking for the JOIN.

If we had stored the data in just one such table from the beginning, however, we would have caused ourselves a few problems, such as:

  1. Redundancy wastes space on disk.

    The address of technician 23 should be stored only once. Storing the address of technician 23 more than once is wasteful.

  2. Redundancy makes updating the database more difficult.

    If technician 23 moves we should only need to make one single change in the database, regardless of the number of marinas the technician is servicing.

  3. Redundancy creates the possibility of inconsistent data.

    Assume that the address is stored in more than one place, and the change of address does not update (by mistake) all of the records that contain the address of technician 23.

The solution to the problem is to place the redundant data in a separate table, and provide a connection between the two.

The connection is provided by the fact that the tech number appears in both tables.

But this redundancy is minimal, and necessary.

Without it, no way exists to tell which technician is associated with which marina.

All the other technician data, however, was removed from the Marina table and placed in the Technician table. This new arrangement corrects the problems of redundancy in the following ways:

  1. Because the data for each technician is stored only once, space is not wasted.

  2. Changing the address of a technician is easy. You have only to change one row in the Technician table.

  3. Because the data for a technician is stored only once, inconsistent data cannot occur.

Designing to omit redundancy will help you to produce good and valid database designs.


Last updated: August 31, 2000 by Adrian German for A114/I111