Transaction processing, ACID, record locking and redundancy
Introduction
To be able to discuss transaction processing, you need to know what a transaction is. A transaction is an atomic unit of work that either fails or succeeds - there is no half-way house! A transaction can be made up of many parts, each part in the transaction must be completely successful for the whole transaction to be successful. If any one part of the transaction does not succeed, then the whole transaction fails. In this situation, when a transaction fails, the system will need to return back to the state that it was in before the transaction begun. This is called a 'rollback'. When a transaction fails, any changes made up to that point are 'rolled back'.
Transaction processing and ACID
Sometimes, you need to change a database. This is called a transaction and carrying out the transaction is known as 'transaction processing'. Transaction processing should be done by the system automatically by applying a set of properties known as ACID. Developers should that use a database system should not have to check any of the ACID properties themselves - they are done by the system.
-
- Atomicity
- Consistency
- Isolation
- Durability
By applying ACID properties, you are helping to ensure successful tranasction processing.
Atomicity means that the database guarantees that all of a particular transaction occurs, or none of it does; a transaction is made up of lots of steps and they must all be successfully carried out or a rollback occurs. If there is a power failure, any error, or anything else, this won't result in a state in which some parts of a transaction have been applied whilst others haven't.
Consistency means that all the rules you have set up in your database will be applied at all times. Your data will be consistent. There can never be a situation where any of the rules are broken. When a transaction is finished, the database will still be consistent because all the rules set up for it have been successfully and correctly applied.
Isolation means that all transactions happening are happening in isolation, and each transaction has exclusive access to the whole system. Whilst any one transaction is being processed, it may temporarily not be in a consistent state. It would be wrong for another transaction to have access to the database system whilst this inconsistency was in place. If two transactions are happening at the same time, and one of them needs some data from the other one, then they have to happen in sequence. The one that needs the data must wait until the other is finished.
Durability The steps in a transaction are carefully logged as they proceed. If there was a failure of any kind, the system can be recovered to a known state. This means that there will be no data loss and the system cannot be corrupted. Once a transaction has been successfully completed, all of the changes that were made are saved to a permanent secondary storage device like a hard drive, along with the fact that the transaction was completed successfully.
ACID is a mechanism that when applied to a transaction, ensures the successful completion of the transaction.
Locking
Databases occasionally need to be 'locked'. This is important when multiple users potentially have access to the same file at the same time. For example, if a file is being updated by User A, you don't want User B accessing the file until User A has finished with it. Reserving a file is known as 'locking' and ensures that the integrity of the file contents is not compromised. An example of a file needing to be locked is in a real time booking system. You wouldn’t want two different booking agents who book cinema tickets booking the same seat at the same time! This will lead to overbooking. One of them must have the right to modify the file and to book tickets. The other agent should have the right to look at the current seat availability, but not the right to book seats, until the file has been updated and released as a result of any bookings taken by the first booking agent.
Redundancy
This topic has been covered in great detail in the Normalisation notes in 1.3.2 as well as the Normalisation Workbook found here:
http://theteacher.info/index.php/a-level-computing/normalisation
Briefly, however, the aim of normalising a database design is to remove the possibility of redundant data from any of the tables. Redundant data is data that has been duplicated somewhere - the same data appears in more than one database table. The result of this is that one set of data is not actually needed i.e. it is redundant and taking up unnecessary storage space. If you have redundant data in a database then problems will start occuring. If you need to delete or edit some data and it appears in more than one place, then you need to remember to delete or edit it in all the different places that it occurs. Data redundancy can confuse the results of searches and can result in searches taking longer than they should (searching a big database with lots of redundant data in takes longer than the same smaller database with no redundant data in).
Please refer to the Normalisation Workbook for a much more detailed explanation of data redundancy.