Fall Semester 2002


Database Design (II): Design Methodology
Learning Objectives

Key Terms

Chapter Outline

I. Introduction
II. User Views
III. Information-Level Design Methodology
IV. Represent a User View as a Collection of Tables
V. Normalize the Tables
VI. Represent All Keys
A. Database Design Language (DBDL)
B. Entity-Relationship Diagrams
VII. Merge the Result into the Design
VIII. Database Design Examples
IX. Physical-Level Design
X. Entity-Relationship Diagram Alternatives
XI. Top-Down Versus Bottom-up
XII. Survey Form
XIII. Obtaining Information from Existing Documents
XIV. One-to-One Relationship Considerations
XV. Many-to-Many Relationship Considerations
XVI. Nulls and Entity Subtypes
XVII. Avoiding Problems with Third Normal Form When Merging Tables
XVIII. Summary
XIX. Key Terms
XX. Review Questions
XXI. Premiere Products Exercises
XXII. Henry Books Case

Notes and Main Ideas

User Views
Database design is a two-step process. The first step is called information-level design, and it is completed independently of any particular DBMS. In the second step, physical-level design designers transform the information-level design into a design for a specific DBMS. A user view is a set of requirements that is necessary to support the operations of a particular database user. For each user view, designers must design the database structure to support the view and then merge it into a cumulative design, that is, a design that supports all the user views encountered thus far in the design process.

Information-Level Design Methodology
The information-level design methodology presented in this text involves representing individual user views, refining them to eliminate any problems, and then merging them into a cumulative design. When creating user views, a user can be a person or group that will use the system, a report, or a type of transaction. For each user view, the methodology requires you to complete the following steps:

  1. Represent the user view as a collection of tables
  2. Normalize these tables
  3. Identify all keys in these tables
  4. Merge the results of Steps 1 through 3 into the cumulative design.

Represent a User View as a Collection of Tables
When given a user view or some sort of stated requirement, you must develop a collection of tables that will support it. Determine entities involved and create a separate table for each type of entity. Determine the primary keys for each of these tables. Determine the properties for each of these entities. Determine the relationships among the entities.

Normalize the Tables
Once relationships are established between entities, the next task is to normalize each table, with the target being third normal form.

Represent All Keys
For each table, you must identify the primary key, any alternate keys, secondary keys, and foreign keys. Database Design Language (DBDL) is a mechanism for representing tables and keys. In DBDL, you represent a table by listing all columns and then underlining the primary key. Below the table definition, you list any alternate keys, secondary keys, and foreign keys. An entity-relationship (E-R) diagram visually represents the structure of a database. There are several different styles of E-R diagrams. This text uses a style called IDEF1X.

Merge the Result into the Design
As soon as you have completed steps 1 through 3 (represent view as a collection of tables, normalize, identify all key) for a given user view, you can merge the results into a cumulative design. You combine tables that have the same primary key to form a new table, eliminate duplicate columns, check the design to determine if the new tables are in third normal form. Repeat the process for each user view.

Database Design Examples
This section walks you through the information-level design of the two databases used in this text, Premiere Products and Henry Books.

Physical-Level Design
The physical-level design begins after the information-level design is completed. You must implement the design for a specific DBMS. Most DBMS support primary, alternate, secondary, and foreign keys. If a DBMS does not support these keys, you must devise an alternate scheme to ensure uniqueness of primary and alternate keys. You also must ensure that values in foreign keys are legitimate, that is, they match the value of the primary key in some row in another table. For secondary keys, it should be possible to retrieve data rapidly on the basis of a value of the secondary key.

Entity-Relationship Diagram Alternatives
This section covers alternative styles for E-R diagrams.

Top-Down Versus Bottom-Up
You can design a database using a bottom-up design methodology or a top-down design methodology. In a bottom-up design methodology, specific user requirements are synthesized into a design. A top-Down design methodology begins with a general database design that models the overall enterprise and that repeatedly refines the model to achieve a design that supports all necessary applications. One advantage of a top-down approach is that there is a more global feel to the project. A bottom-up approach provides a rigorous way of tackling each separate requirement and ensuring that it will be met. The ideal strategy combines the best of both approaches.

Survey Form
When designing a database, it may be helpful to design a survey form to obtain the required information from users. The survey form should contain the following information: entity information, attribute (column) information, relationships, functional dependencies, and processing information.

Obtaining Information from Existing Documents
Existing documents, such as invoices, can provide helpful information concerning the database design. If an organization has a computerized system, current file layouts can furnish additional information about entities and attributes.

One-to-One Relationship Considerations
This section discusses potential problems and possible solutions for implementing one-to-one relationships between tables.

Many-to-Many Relationship Considerations
This section discusses potential problems and possible solutions for implementing many-to-many relationships between tables. It also describes many-to-many-to-many relationships that involve three entities.

Nulls and Entity Subtypes
A null actually represents the absence of a value in a field. Nulls are used when a value is either unknown or inapplicable. Suggested methods for handling null values are discussed. When you remove a column that contain null values from a table and create a separate table containing only non-null values, you create an entity subtype. There are specific methods for representing entity subtypes in E-R diagrams.

Avoiding Problems with Third Normal Form When Merging Tables
When you combine third normal form tables, the result may not be in third normal form. You can attempt to avoid the problem of creating a table that is not in third normal form by being cautious when representing user views. The problem occurs when a column A in one user view functionally determines a column B in a second user view. Thus, column A is a determinant for column B, yet column A is not a column in the second user view.

Quick Quiz

  1. What is the set of requirements that is necessary to support the operations of a particular database user?
    ANSWER: (user view)

  2. What type of key is a column or collection of columns that could have been chosen as a primary key but was not?
    ANSWER: (alternate)

  3. What is a popular type of diagram that visually represents the structure of a database?
    ANSWER: (entity-relationship (E-R) diagram)

Important Aspects (Self-Study Tips)

But please also see above (a little).

Comprehensive Design Example

Chapter Outline

I. Introduction
II. Marvel College Requirements
A. General Description
B. Report Requirements
C. Update (Transaction) Requirements
III. Marvel College Information-Level Design
IV. Final Information-Level Design
V. Review Questions
Notes and Main Ideas

Marvel College Requirements
This appendix details the database requirements of a fictitious college, Marvel College. We are given a general description of the college environment and the report requirements. Reports include: report card, class list, grade verification report, time schedule, registration request form, student schedule, full student information report, faculty information report, work version of the time schedule, and course report. Update requirements also are specified.

Marvel College Information-Level Design
In this section, the design methodology discussed in Chapters 5 and 6 is used to develop an information-level design for Marvel College. Entities, attributes, and relationships are identified. Each report requirement and each update requirement is considered as a user view.

Final Information-Level Design
Once the cumulative design has been reviewed and approved, the physical-level design process can begin.

Solutions

Answers to Review Questions

1. (Changes on the Marvel College requirements)

a. Remove FacultyNum as a column in the Section table. Create a separate table whose key is the concatenation of SemesterCode, ScheduleCode, and FacultyNum. This new table will have no other columns.

b. One solution would be to include MajorNum and Description as columns in the Department table and delete the Major table from the design. In the Advises table, replace MajorNum with DepartmentNum. A second solution would be to make DepartmentNum an alternate key within the Major table and leave everything else as is.

c. Remove MajorNum from the Advises table and change the key to the concatenation of StudentNum and FacultyNum.

d. Include the number of the faculty member who advises the student in the Student table and delete the Advises table.

e. Designate the concatenation of LastName and FirstName as a secondary key in within the Student table.

f. Remove OfficeNum from the Faculty table and create a new table whose key is the concatenation of OfficeNum and FacultyNum.

g. Remove CreditsEarned and GradePoints from the StudentGrade table. (The CreditsEarned for a course can be obtained by finding the number of credits in the Course table for the given course provided the grade is a passing grade.)

h. Replace CourseNum with CourseCode in all tables. Remove DepartmentCode from any table in which CourseCode appears other than the Course table itself. Although DepartmentCode still appears in the Course table, it will no longer be part of the primary key. It will still be a foreign key identifying the Department table, however.

i. Create a new table whose key is the concatenation of StudentNum, PrimaryCode, SemesterCode, and AlternateCode. Remove AlternateCode from RegistrationRequest.

Holt Distributors Project Solution

General Comments:

  1. There is no single correct solution. Each design must be judged on its own merits. The following represents one possible solution. In addition, at a number of points in the design, we have made comments either clarifying some aspect of the design or indicating possible alternatives one may want to consider.

  2. There are cases where, if this were an actual design problem for a company, the designers would need to consult users for further information. There are basically two ways you might choose to handle this. You can make whatever assumptions you want to make in order to proceed with the design exercise, but then you need to justify the assumptions you have made. Alternatively, you might (when you work in groups) designate one or more of the group members to act as the user and have the rest of the group interview them to obtain any additional information that they feel is necessary.

    User view 1 (Enter/edit territories):
    Territory (TerritoryNum, TerritoryName)
    User view 2 (Enter/edit Reps):
    Rep (RepNum, LastName, FirstName, Address, City, State, Zip, MTDSales, 
         YTDSales, MTDCommission, YTDCommission, CommissionRate, TerritoryNum)
    
    FK TerritoryNum -> Territory
    User view 3 (Enter/edit customers):
    Customer (CustomerNum, CustomerName, Address1, Address2, City,
              State, Zip, MTDSales, YTDSales, CurrentBalance, CreditLimit, 
              ShipName, ShipAddress1, ShipAddress2, ShipCity, ShipState, 
              ShipZip, RepNum)
    
    FK RepNum -> Rep

    Comment:

    If the ship-to address fields are null for a particular customer, this customer has no special ship-to address. In this case, goods are to be shipped to the customer's regular address.
    User view 4 (Enter/edit parts):
    Part (PartNum, Description, UnitPrice, MTDSales, YTDSales, 
          OnHand, Allocated, ReorderPoint)
    User view 5 (Enter/edit vendors):
    Vendor (VendorNum, VendorName, Address, City, State, Zip)
    
    VendorPart (VendorNum, PartNum, Price, MinimumOrderQuantity, ExpectedLeadTime)
    
    FK VendorNum -> Vendor	
    FK PartNum -> Part	
    User view 6 (Order entry):
    Orders (OrderNum, OrderDate, CustomerNum, CustomerPONum)
    		
    FK CustomerNum -> Customer
    
    OrderDetail (OrderNum, SEQNum, PartNum, NumOrdered, QuotedPrice)
    
    FK OrderNum -> Orders	
    FK PartNum -> Part

    Comment:

    You might ask how the design would differ if users are to be able to override the ship-to address for a customer on a particular order; i.e., the ship-to address for the order is allowed to be different from the ship-to address currently on file for the customer. In this case, the Orders table would also include ship-to fields. These must accept nulls, since they will not apply to each order. If on a given order the values are not null, they are understood to be the ship-to address. If they are null, the ship-to values from the Customer table are used instead.

    Cumulative collection thus far:
    Territory (TerritoryNum, TerritoryName)
    
    Rep (RepNum, LastName, FirstName, Address, City, State, Zip, MTDSales, 
         YTDSales, MTDCommission, YTDCommission, CommissionRate, TerritoryNum)
         
    FK TerritoryNum -> Territory
    
    Customer (CustomerNum, CustomerName, $ddress1, Address2, City, State, 
              Zip, MTDSales, YTDSales, CurrentBalance, CreditLimit, ShipName, 
              ShipAddress1, ShipAddress2, ShipCity, ShipState, ShipZip, RepNum)
    
    FK RepNum -> Rep
    
    Part (PartNum, Description, UnitPrice, MTDSales, YTDSales, 
          OnHand, Allocated, ReorderPoint)
    
    Vendor (VendorNum, VendorName, Address, City, State, Zip)
    
    VendorPart (VendorNum, PartNum, Price, MinimumOrderQuantity, ExpectedLeadTime)
    
    FK VendorNum -> Vendor	
    FK PartNum -> Part	
    
    Orders (OrderNum, OrderDate, CustomerNum, CustomerPONum)
    
    FK CustomerNum -> Customer
    
    OrderDetail (OrderNum, SEQNum, PartNum, NumOrdered, QuotedPrice)
    
    FK OrderNum -> Orders	
    FK PartNum -> Part
    User view 7 (Invoicing cycle):
    Customer (CustomerNum, CurrentInvoiceTotal, ...)
    
    Orders (OrderNum, ...)
    
    OrderDetail (OrderNum, SEQNum, NumShipped, ...)
    Note:
    Other fields in the Customer, Orders, and OrderDetail tables are already in the cumulative collection and will not add anything new.
    Invoice(InvoiceNum, ShipDate, Freight, OrderNum, ShipStatus)
    
    FK OrderNum -> Orders	
    
    InvoiceSummary (InvoiceNum, InvoiceDate, InvoiceTotal, Freight, CustomerNum)
    
    FK CustomerNum -> Customer
    Comment:
    Invoice records are created during the first phase of the invoicing cycle, when users are entering the numbers of the orders to release for invoicing. These records are used as invoices are printer, files are updated, and invoice summary records are created. Finally, as released orders are deleted from the database, these records are deleted also. Because of the limited life of these records and the relatively simple manner in which they are used, they are often kept external to the database in a simple flat file. In this case, deleting these records is accomplished by deleting the entire file. This is just an alternative approach we may keep in mind.

    Another thing worth paying attention to concerns the relationship, or, rather, the lack of a relationship between Invoice and InvoiceSummary. Both tables have the same primary key and yet they should not be combined. To appreciate the reason there is no relationship, you have to consider the sequence of events: an Invoicerecord is created that must match some order, the invoice is printed, files are updated, an InvoiceSummary record is created, and then the order and corresponding Invoice record are deleted. Thus, an Invoice record cannot be required to match an InvoiceSummary record since the InvoiceSummary record is created after the corresponding Invoice record. An InvoiceSummary record cannot be required to match an Invoice record, since the Invoice record will be deleted long before the InvoiceSummary record will. (InvoiceSummary records are not deleted until the end of the month.) For similar reasons the Invoice and InvoiceSummary tables cannot be combined. Note that if the Invoice information is kept outside of the database, we are not even faced with such an issue.

    This is a good place to discuss about system-generated numbers, that is, users may wish to assign invoice numbers themselves or, what is actually more likely, they might prefer that the system automatically assigns these numbers. Assuming that the numbers are assigned sequentially, the system merely needs to store the last invoice number assigned. Then, when a new invoice is created, the system will retrieve this number, increment it by one giving the new invoice number, and then update the field containing the last number assigned with this new number. A natural question to raise at this point is: Where should this number be stored? Should it be a field in some special record in the database or in some separate file? It could actually go either place. If the number is only used by programs that access the database, placing it in the database is preferable. If other programs also use it, it is usually preferable to place it in a separate file.

    User view 8 (Receive payments):
    Customer (CustomerNum, CurrentPaymentTotal, ...)
    Note:
    Other fields in the Customer table are already in the cumulative collection and will not add anything new.
    Payment (PaymentNum, CustomerNum, Payment Date, Amount)
    
    FK CustomerNum -> Customer
    Comment:
    The same comment made concerning the system generating invoice numbers also applies to payment numbers.

    User view 9 (Statements):
    Customer (CustomerNum, CustomerName, Address1, Address2, City, 
              State, Zip, CurrentBalance, CreditLimit, RepNum, CurrentInvoiceTotal, 
              CurrentPaymentTotal, CurrentAmount, Over30, Over60, Over90, 
              PreviousBalance)
    Final cumulative collection:
    Territory (TerritoryNum, TerritoryName)
    
    Rep (RepNum, LastName, FirstName, Address, City, State, Zip, 
         MTDSales, YTDSales, MTDCommission, YTDCommission, 
         CommissionRate, TerritoryNum)
    
    FK TerritoryNum -> Territory
    
    Customer (CustomerNum, CustomerName, Address1, Address2, City,
              State, Zip, MTDSales, YTDSales, CurrentBalance, 
              CreditLimit, ShipName, ShipAddress1, ShipAddress2, 
              ShipCity, ShipState, ShipZip, CurrentInvoiceTotal,
              CurrentPaymentTotal, CurrentAmount, Over30, Over60, 
              Over90, PreviousBalance, RepNum)
    
    FK RepNum -> Rep	
    
    Part (PartNum, Description, UnitPrice, MTDSales, YTDSales, 
          OnHand, Allocated, ReorderPoint)
    
    Vendor (VendorNum, VendorName, Address, City, State, Zip)
    
    VendorPart (VendorNum, PartNum, Price, MinimumOrderQuantity,
                ExpectedLeadTime)
    
    FK VendorNum -> Vendor	
    FK PartNum -> Part	
    
    Orders (OrderNum, OrderDate, CustomerNum, CustomerPONum)
    
    FK CustomerNum -> Customer
    
    OrderDetail (OrderNum, SEQNum, PartNum, NumOrdered, QuotedPrice, NumShipped)
    
    FK OrderNum -> Orders	
    FK PartNum -> Part
    
    Invoice(InvoiceNum, ShipDate, Freight, OrderNum, ShipStatus)
    
    FK OrderNum -> Orders	
    
    InvoiceSummary (InvoiceNum, InvoiceDate, InvoiceTotal, Freight, CustomerNum)
    
    FK CustomerNum -> Customer	
    
    Payment (PaymentNum, CustomerNum, PaymentDate, Amount)
    
    FK CustomerNum -> Customer
Once again, you should feel free to use this set of notes in any way you like to prepare But please turn in an assignment that (even if it relies heavily on this material, should that be the case) is yours beyond any reasonable doubt. We're interested in your exposure to core problems of relational database management systems, and their solutions.


Last updated: Oct 13, 2003 by Adrian German for A114/I111