Flat file databases
Introduction - What is a relational database and why bother?
The NoKats dog club, formed in 1973, currently keeps a record of its members and the dogs they own on little index cards (an example of which is shown below). The cards are kept together by the secretary and can be used for reference, e.g. to inform members of a show. This was an excellent system for the first few decades of the dog club when membership was around 100. In the last few years, however, membership has grown to over 5000. This is proving a real headache for the secretary.
Problems for the secretary
- It takes a long time to find an individual member’s record because there are now over 5000 of them. This is especially true if you search using multiple search criteria.
- It takes even longer to find a dog’s record, especially if a member’s details aren’t known for some reason.
- Finding all dogs who meet certain criteria, e.g. all poodles born after a certain date takes a long time manually.
- Occasionally, all of the records need to be sorted into a different order. This can take a long time.
- When records are filled in, some details are recorded over and over again. For example, a spaniel, its origins and its life expectancy are recorded in a number of different records. This is clearly a waste of the secretary’s time. It would be far better if the details for each breed could be stored just once.
- Details for personalised letters and address labels have to be manually transferred over to the letters and labels from the index cards. This is another time-consuming job.
- When new records are entered, spelling mistakes or incorrect details are sometimes entered onto the cards.
Why a computerised database would help the secretary
- When new records are entered into the database using a data input form, validation rules could trap some errors.
- All records can be easily sorted by different fields and multiple fields. A ‘field’ is an attribute of a thing or person, a piece of information held in a record. In this case, fields include Record number, Initial, Surname, Breed, Origin of breed and so on.
- The secretary can easily and quickly search the entire database for records that meet certain criteria.
- The results of sorting and searching can be presented in well-designed reports.
- Details held in the database can be merged into a word processor to produce personalised letters and labels.
What the dog club needs to think about when setting up a database
- Buying computer equipment is initially expensive. Savings to the dog club may not materialise for years.
- Computer equipment can break down. Who will repair it? How will the dog club function without a computer?
- The secretary may need to be trained and retrained.
- The secretary may find using computers a stressful experience if he or she does not have an IT background.
- How will software problems be dealt with? Does anyone in the dog club have the necessary skills?
- Computerised equipment is attractive to thieves. Measures will need to be taken to protect the hardware and software.
- The data in the system will be very valuable. A backup procedure will need to be put in place.
- The Data Protection Act will need to be complied with.
A simple flat file database has been created. A ‘flat file’ means that all the records are held in one table. Each row in the table corresponds to one record. Each column in the table corresponds to a different field. Tables are two-dimensional structures. They are therefore also referred to as ‘flat’, hence the name ‘flat file’.
The good points about flat files
- Flat files are relatively quick and easy to set up and use.
- They are ideal for smaller databases.
- They provide many of the sorting and searching tools commonly needed by users of the database.
The problems with flat files
- You can see that in the flat file, every member has an ID number. This should be unique for each member and is known as the Primary Key. The problem is that some records (some rows) have the same ID number. This means that you cannot pick out one and only one record if you searched for a member by their membership number.
- Dogs do not have their own unique identity number. It is therefore impossible to find an individual dog. Consider Gov in the above flat file. There are different two dogs, both called ‘Gov’. Their personal details just happen to be the same! You cannot tell them apart from the details held in the flat file.
- Because dogs do not have their own ID number, you cannot enter in a ‘new’ breed until a member owns a dog of that breed.
- Lots of details are held over and over again, for example, the origins and life expectancy of poodles, and the details about the member known as ‘A fish’. This is known as ‘data redundancy’ and is a waste of (hard disk) space. Data redundancy also contributes to larger files, which means longer search times. In addition, if you enter the same data over and over again in different places in the flat file, you are more likely to make a mistake when entering in one of the entries. This results in some of your data being inconsistent.
- If ‘A Fish’ got married and changed her name to ‘A Haddock’, then the secretary would have to make four changes to the database rather than just one. This is clearly a waste of time. The secretary could also introduce some data inconsistencies if he or she changed the name in one of A Haddock’s records to A Hadock (with one ‘d’). This is referred to as an ‘amendment anomaly’.
- If one member such as ‘Y Nott’ left the club, the secretary would have to delete five records (not just one). This is sometimes referred to as a ‘deletion anomaly’.
- If one new member joined with more than one dog, you would have to store more than one record (each one holding details about the member) - even though only one member has joined! We will call this an ‘adding anomaly’.
To summarise, flat flies are easy and quick to set up. They are easy to use and are ideal for small databases. They have a series of problems. These relate to the repetition of data stored in the database (known as ‘data redundancy’) as well as problems to do with adding and removing records or changing the data in records. These are referred to as ‘adding, deleting and amendment anomalies’ and can result in the data in the records becoming inconsistent.