Breadcrumbs

Relational Databases Introduction

What is SQL? SQL is a language for communication with databases on your computer. After reading this series you will have a decent understanding of SQL. Armed with what you will learn in this series, you will know enough to extend your SQL knowledge on your own.

If you've never heard about SQL before, then it is likely that you have never worked with databases on the computer either. Databases are used to store and manipulate large quantities of information easily. You can add, change, or edit the information within the database, but most importantly you can view its data. Programs designed to carry out these task are called database management systems (DBMS). phpMyAdmin (external link) and Microsoft Access (external link) are examples of such programs.

Do It Yourself

To experiment with the examples in these lessons, you can use any relational DBMS which uses SQL. However, you will most likely want to use a MySQL database in combination with PHP. To make database management easier it is good to install phpMyAdmin too. As you haven't started yet, you can easily set them up on your computer now. For instructions on this, refer to Lowter's excellent installation guide.

A Little History

In the past, hierarchical systems (external link) were most often used to sort the information in a database. Just think of the average family tree and you'll understand how difficult a hierarchical system is to manage.

It was as far back as 1970 when Edgar F. Codd (external link), working for IBM at the time, came up with an idea completely different to what had been used so far: the relational model. In the relational model a database exists of one or more tables which are subdivided into rows and columns.

Let's Get The Party Started

The term relational means that there is a relation between the various columns holding information in the database. Take a look at the following table, it illustrates how a table holding various voting topics and their results could be constructed.

Screenshot of what the SQL looks like

Take a look at the first row of the table. This is called the "heading", showing the field names. All the fields of one column are of the same type. Except the first row, each table row exists of one group of informational elements, divided over the columns of a table. In this example, the information is about a poll in progress. One row of a table is referred to as a record.

The voting table uses vote_id to identify a row. A column, or a combination of columns, which can identify a row in a table is called a key. A key has to be unique, which means that one value may only be assigned to a key once.

How Redundancy Causes Inconsistency

Now imagine that we also want to store who voted. Should we add an extra column labeled vote_name? Although, then you would have to create multiple rows to give enough information about the same voting topic. When information is present more than once in a table it is called redundancy. Redundancy can be useful now and then, but it is preferred to avoid it as much as possible.

You might wonder why? Well, let's continue on the previous example. We could decide to put a closing date on the Screen Resolutions poll (the current value of 0 means there is none). Since six people have already voted, you would have to modify six entries associated with that poll! That is surely annoying and a lot of extra work will be involved. Even worse, if you forget to change it for one, the database will tell you two different things. One will say there is no closing date, and the other will say there is a closing date. This is called inconsistency.

Yet another problem is that we would need another key. Vote_id would no longer be unique, so we would need to think of something else. A combination of vote_id with vote_name would be a good unique identifier.

Functional Redundancy

It is better to alter the table in a different way, which makes such a form of inconsistency impossible. This can be done by creating two tables.

Screenshot of what the SQL looks like

Here we see our second table, which shows which people have voted for which voting option. As you can see, there is redundancy in the table, but only the redundancy we require. Every form of non-functional redundancy is avoided by this construction.

A practical system would build yet another table, which would contain the voting options. This way, redundancy is used functionally and there is no chance inconsistency will ever exist.

Conclusion

To work with a relational database you need a database management system. All popular DBMS's contain the same language to query them, namely SQL. In the following tutorials you will learn how to use SQL.