# Databse Lecture Note - L2 Relational Model

## Table of Contents

- 1. What is relational model
- 2. Structure of Relational Databases
- 3. Fundamental Relational-Algebra Operations
- 3.1. Select
- 3.2. Project
- 3.3. Union
- 3.4. Set difference
- 3.5. Cartesian product
- 3.6. Rename
- 3.7. Some Examples
- 3.7.1. E.g. 1: Find all loans of over $1200.
- 3.7.2. E.g. 2: Find the loan number for each loan of an amount greater than $1200.
- 3.7.3. E.g. 3: Find the names of all customers who have a loan, or an account, or both, from the bank.
- 3.7.4. E.g. 4: Find the names of all customers who have a loan at the Perryridge branch.
- 3.7.5. E.g. 5: Find the largest account balance.

- 4. Additional Relational-Algebra Operations
- 5. Extended Relational-Algebra Operations
- 6. Modification of the Database

## 1 What is relational model

- Relational databse: A collection of one or more relations, which are based on the relational model.
- Relation: A table with rows and columns.

## 2 Structure of Relational Databases

### 2.1 Attribute Types

- Each attribute of a relation has a name.
- The set of allowed values for each attribute is called the domain (域) of the attribute.
- Attribute values are (normally) required to be atomic, i.e., indivisible
- The special value null is a member of every domain.
- The null value causes complications in the definition of many operations.

### 2.2 Concepts about Relation

A relation is concerned with two concepts:

- relation schema, which describes the structure of the relation
- relation instance, which corresponds to the snapshot of the data in the relation at a given instant in time.

在**关系型数据库**中，因为 table 就是 relation 的表现形式，所以个人理解
relation schema 就等同于 table 的结构，relation instance 就等同于 table 里的数据。

Formal definition: Assume \( A_1, A_2, \dots, A_n \) are attributes, \( R = A_1, A_2, \dots, A_n \) is a relation schema. \( r(R) \) is a relation on the relation schema \( R \).

E.g. \[ instructor = (ID, name, dept\_name, salary) \] \[ instructor(instructor\text{-}schema) = instructor(ID, name, dept\_name, salary) \]

### 2.3 Properties of Relation

- The order of tuples is irrelevant.
- No duplicated tuples in a relation.
- Attribute values are atomic.

### 2.4 key

Let \( K \subseteq R \), then \( K \) is:

**super key**of \( R \) if values for \( K \) are sufficient to identify a unique tuple of each possible relation \( r(R) \).

E.g., both`{ID}`

and`{ID, name}`

are super keys of the relation instructor.**candidate key**of \( R \) if \( K \) is minimal super key.

E.g.,`{ID}`

is a candidate key for the relation instructor.**primary key**of \( R \) if \( K \) is a condidate key and is defined by user explicitly.

### 2.5 Foreign Key

Assume there exists relations \( r \) and \( s \): \( r(\underline{A}, \boldsymbol{B}, C), s(\underline{\boldsymbol{B}}, D) \), we can say that attribute \( B \) in relation \( r \) is foreign key referencing \( s \), and \( r \) is a referencing relation (参照关系), and \( s \) is a referenced relation (被参照关系).

E.g. \[ account(\underline{\boldsymbol{account\text{-}number}}, branch\text{-}name, balance) \] \[ depositor (\underline{customer\text{-}name}, \boldsymbol{account\text{-}number}) \]

下划线为主键、粗体为外键

### 2.6 Schema Diagram

### 2.7 Cartesian product

Given sets \( D_1, D_2, \dots, D_n (D_i = a_{ij|j=1 \dots k}) \), a relation r is a subset of \( D_1 \times D_2 \times \dots D_n \).

### 2.8 Schema Diagram

## 3 Fundamental Relational-Algebra Operations

### 3.1 Select

- Notation: \( \sigma_p(r) \), where \( p \) is called the selection predicate
- Definition: \( \sigma_p(r) = \left\{ t| t \in r \text{ and } p(t) \right\} \)

### 3.2 Project

- Notation: \( \prod_{A_1, A_2 , \dots, A_k}(r) \), where \( A_1, A_2, \dots, A_k \) are attribute names and \( r \) is a relation name.
- Definition: The result is defined as the relation of \( k \) columns obtained by erasing the columns that are not listed.

### 3.3 Union

- Notation: \( r \cup s \)
- Definition: \( r \cup s = \left\{ t| t \in r \text{ or } t \in s \right\} \)

### 3.4 Set difference

- Notation: \( r - s \)
- Definition: \( r-s = \left\{ t \in r \text{ and } t \notin s \right\} \)
- Notice: set differens must be taken between compatible relations.

### 3.5 Cartesian product

- Notation: \( r \times s \)
- Definition: \( r \times s = \left\{ \left\{ t, q \right\} | t \in r \text{ and } q \in s \right\} \)
- Notice: attributes of \( r(R) \) and \( s(S) \) are disjoint, if not, rename the attributes.

### 3.6 Rename

- Notation: \( \rho_x(E) \), rename \( E \) under the name \( x \).

### 3.7 Some Examples

#### 3.7.1 E.g. 1: Find all loans of over $1200.

\[ \sigma_{amount>1200}(loan) \]

#### 3.7.2 E.g. 2: Find the loan number for each loan of an amount greater than $1200.

\[ \prod\nolimits_{loan\text{-}number}(\sigma_{amount>1200}(loan)) \]

#### 3.7.3 E.g. 3: Find the names of all customers who have a loan, or an account, or both, from the bank.

\[ \prod\nolimits_{customer\text{-}name}(borrow) \cup \prod\nolimits_{customer\text{-}name}(depositor) \]

#### 3.7.4 E.g. 4: Find the names of all customers who have a loan at the Perryridge branch.

\[ \prod\nolimits_{customer\text{-}name}( \sigma_{borrower.loan\text{-}number = loan.loan\text{-}number}(borrower \times (\sigma_{branch\text{-}name='Perryridge'}(loan))) ) \]

#### 3.7.5 E.g. 5: Find the largest account balance.

\[ \prod\nolimits_{balance}(account) - \prod\nolimits_{account.balance}( \sigma_{acount.balance < d.balance}(account \times \rho_d(account)) ) \]

## 4 Additional Relational-Algebra Operations

### 4.1 Set intersection

- Notation: \( r \cap s \)
- Definition: \( r \cap s = \left\{ t| t \in r \text{ and } t \in s \right\} \)
- Notice: set differens must be taken between compatible relations.

### 4.2 Natural join

### 4.3 Division

### 4.4 Assignment

## 5 Extended Relational-Algebra Operations

### 5.1 Generalized Projection

Extends the projection operation by allowing arithmetic functions to be used in the projection list.

- Notation: \( \prod_{F_1, F_2, \dots, F_n}(E) \)

\( E \) is any relational-algebra expression, and each of \( F_1, F_2, \dots, F_n \) are arithmetic expressions involving constants and attributes in the schema of \( E \).

### 5.2 Aggregate Functions and Operations

- Notation: \( G_1, G_2, \dots, G_n g_{F_1(A_1), F_2(A_2), \dots, F_n(A_N)}(E) \)

\( E \) is any relational-algebra expression, \( G_1, G_2, \dots, G_n \) is a list of attributes on which to group (can be empty), each \( F_i \) is an aggregate function, and each \( A_i \) is an attribute name.

## 6 Modification of the Database

### 6.1 Deletion

\(r \leftarrow r-E \)

### 6.2 Insertion

\( r \leftarrow r \cup E \)

### 6.3 Updating

\( r \leftarrow \prod_{F_1, F_2, \dots, F_n}(R) \)