CSCI A114 / INFO I111
Homework Two

First semester 2001-2002


Due date
At the end of lab 5.

Late policy
No late policy. You need to turn this on time.

Work policy
Working in groups is accepted but you need to turn in your own version of the assignment. Also, please make sure you read, understand, and comply with the Computer Science Department's Statement on Academic Integrity before turning in your assignment.

Task
Explained in detail below.
Grading
You need to come up with a number of tables, each with a certain structure, then populate the tables with as much data as it's known. We'll deduct points for missing tables, columns, or missing data in the tables. The AI will provide you with feedback in a week.

Grades will be posted on-line.


Homework Two

First, let's review the rules of relational database design.

We use the entity-relationship model. We also said that this model does an adequate, albeit imperfect, job of modeling real-world situations, such as business enterprises or the records kept by schools, hospitals, governments, and so on, where database systems are likely to be used. However, if we view the structures defined in the entity-relationship model as conceptual schemes, we shall not be grossly deceived.

So we will have the following recipe:

  1. first list the entities, with attributes
  2. then list the relationships Make a table for each entity, identify the key attributes.

    Make a table for each relationship, in which the columns are keys in the tables for the entities that the relationship is connecting.

    Please apply this rule to the following problem for which the E-R diagram is given. Then store in the database (by hand) the facts listed at the bottom of the assignment.

    Assume a database that records baseball players, the teams they played for, their batting averages and positions played. Consider the following analysis of entities and relationships.

    The entities are:

    1. PLAYERS with attributes:
      NAME
      HOME   /* place of birth */ 
      BDATE  /* date of birth */ 
      Attribute NAME will be the key.

    2. POSITIONS with attributes:
      POSNAME    /* e.g., pitcher */
      POSNUMBER  /* 1 for pitcher, 2 for catcher ... */ 
      Let's take POSNAME to be the key.

    3. TEAMS with attributes:
      FRANCHISE /* explained below */ 
      CITY 
      YEAR 
      The FRANCHISE attribute is a unique identifier we give to a baseball franchise. As a franchise is a property, it has existence as an entity even when it moves to another city or changes its name. A team (as opposed to a franchise) is the collection of players, coaches, etc., working for a franchise in a given year. The key for the entity set TEAM is FRANCHISE and YEAR.

    4. BA, the set of batting averages.

      This entity set has one attribute, PCT, whose values are three digit decimal numbers between 0 and 1. Clearly, PCT is the key.

    Relationships:

    1. A relationship SEASON between PLAYERS, TEAMS, and BA. Notice that this is a ternary relationship (arity 3). It is many-one from PLAYERS and TEAMS to BA in the sense that given a player and a team that player has a unique batting average. Recall that a "team" exists for one particular year, so the typical player is on many teams and has many batting averages, but just one batting average per year.

    2. The relationship PLAYS between PLAYERS and POSITIONS. This is a many-many relationship indicating what positions were played by a player, over the course of his career.

    Here's the E-R diagram:

    Now please enter the following data in the database:

    George Ruth was born in Baltimore, MD, in February 6, 1895. He played for the Red Sox and averaged 0.325 in 1916. He also averaged 0.393 for the Yankees in 1923. Tyrus Cobb was born in Narrows, GA, on February 18, 1866. He could play center field and first base. He played for the Tigers in 1911, with a batting average of 0.420 in a year when the Tigers where located in Detroit. Jack Robinson was born in Cairo, GA, on January 31, 1919. He played for the Dodgers (in Brooklyn) in 1949, and had a 0.342 batting average that year. The Red Sox were located in Boston in 1917. Stanley Musial had a batting average of 0.376 in 1948 for the Cardinals. Stanley Musial used to play first base. The position number for first base is 3. Other position numbers are: 1 for pitcher, 2 for catcher, and so on.
    Note: you need to draw the tables and actually enter the data in the tables, on paper, by hand.


    Last updated: September 6, 2001 by Adrian German for A114/I111