Character sets
Introduction
We know that each time we press a key on our keyboard, it gets saved as a binary code. If we wrote out a list of all the possible letters, numbers, symbols, special characters and punctuation on the keyboard and the code that was associated with each one, then we would have a character set.
Character set
A character set is the set of symbols that we can represent on a computer system at any particular time. If you look at your keyboard, each of the possible keys you could press has its own code. If you wrote down every character you could type, together with its code, you would have a character set. The best-known character set is called ASCII.
Standard ASCII and Extended ASCII
The most important set of codes for representing the key presses on a computer keyboard is the American Standard Code for Information Interchange, or ASCII (pronounced 'ass-key'). It is the set of codes used by Personal Computers. In Standard ASCII, each character is represented by a 7-bit code. There are 95 printable characters (including the space) and 33 control codes, which are used for controlling devices, e.g. printers. In Standard ASCII, for example, the letter 'A' is represented by the 7-bit code 1000001 (65 in decimal), the letter 'a' is 1100001 (97 in decimal), '?' is 0111111 (63 in decimal), a space is 0100000 (32 in decimal) and Null is 0000000 (0 in decimal). All of the different possible codes together make up what is known as the ASCII character set. If you are using 7 bits to represent a code, you have a total of 2^7, or 128, possible combinations. That means you can represent 128 different characters!
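The codes quoted above can be checked with a short Python sketch. The helper name ascii_code is made up for this illustration; ord() and format() are built into Python.

```python
def ascii_code(ch):
    """Return (decimal code, 7-bit binary string) for one Standard ASCII character.

    ascii_code is a hypothetical helper name used only for this example.
    """
    code = ord(ch)  # ord() gives the numeric code of a character
    if code > 127:
        raise ValueError("not a Standard ASCII character")
    return code, format(code, "07b")  # pad the binary form to 7 bits


# Show the characters mentioned in the text.
for ch in ["A", "a", "?", " "]:
    decimal, binary = ascii_code(ch)
    print(repr(ch), decimal, binary)
```

Running it prints the decimal code and 7-bit pattern for 'A', 'a', '?' and the space.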
Standard ASCII uses 7 bits for each code. Programmers and computer people like to work in nice easy packets of 8 bits (called a byte). That means that we have an extra, 8th bit to play with! We can use this extra 8th bit in Standard ASCII for some error checking, looking for errors when bits are sent from one place to another. When bits are being transmitted, there is a real possibility that they get 'corrupted'. In other words, the bits and therefore the codes change and so the message changes, too! It is necessary to check for errors when data is transmitted. The error-checking that takes place using the 8th bit is known as ‘parity checking’.
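As a rough illustration of parity checking, here is a minimal Python sketch. It assumes even parity (the total number of 1s in the byte must be even) and assumes the parity bit is stored as the leftmost of the 8 bits; the text does not say which scheme is used, and the function names are made up for this example.

```python
def add_even_parity(code7):
    """Attach an even-parity bit in front of a 7-bit code, giving 8 bits.

    Assumption: even parity, with the parity bit in the leftmost position.
    """
    ones = bin(code7).count("1")   # how many 1s are in the 7-bit code
    parity_bit = ones % 2          # 1 only if that count is odd
    return (parity_bit << 7) | code7


def parity_ok(byte):
    """Return True if the 8-bit value contains an even number of 1s."""
    return bin(byte).count("1") % 2 == 0


sent = add_even_parity(ord("A"))   # 'A' is 1000001, which has two 1s
corrupted = sent ^ 0b00000100      # flip one bit to simulate a transmission error

print(format(sent, "08b"))         # the byte that would be transmitted
print(parity_ok(sent))             # the check on the uncorrupted byte
print(parity_ok(corrupted))        # the check on the corrupted byte
```

A single flipped bit changes the count of 1s from even to odd, so the check fails. If two bits are flipped, the count is even again and this simple check would not notice the error.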
An alternative is to use all 8 bits for a code instead of 7, so you would have a total of 2^8, or 256, different combinations in your character set. In other words, you can represent 128 more characters than in 7-bit ASCII. You can have a code for the letters that appear in other languages but not in the English alphabet, or for graphics symbols, for example. All of the 8-bit codes together are known as the Extended ASCII character set. Many computers use Extended ASCII so that extra characters can be represented. ASCII isn't the only character set in use in computing but it is the most important one.
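The arithmetic behind these totals is simply 2 raised to the number of bits. A small sketch, using a made-up helper name:

```python
def combinations(bits):
    """Number of different codes that can be made with the given number of bits."""
    return 2 ** bits


standard = combinations(7)   # 7-bit Standard ASCII
extended = combinations(8)   # 8-bit Extended ASCII

print(standard, extended, extended - standard)  # 128 256 128
```

Each extra bit doubles the number of possible codes, which is why going from 7 bits to 8 bits adds another 128.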
How the ASCII table is organised
Although memorising all of the ASCII codes might be a little difficult, knowing some key facts about it is useful:
- 0 is the ASCII code for null.
- 32 is the ASCII code for the space.
- 65 is the ASCII code for A.
- 97 is the ASCII code for a.
Generally speaking, the ASCII code table is organised like this:
- control codes
- punctuation
- digits
- capital letters
- small letters
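The layout above can be explored with a short Python sketch that puts any Standard ASCII code into one of these groups. The function name ascii_group and the group labels are invented for this illustration. Note that the list above is only a general guide: some punctuation and symbols also sit between the digits and the letters, and after the small letters.

```python
def ascii_group(code):
    """Say which broad group a Standard ASCII code (0 to 127) belongs to.

    ascii_group and its labels are hypothetical names for this example.
    """
    if not 0 <= code <= 127:
        raise ValueError("Standard ASCII codes run from 0 to 127")
    if code < 32 or code == 127:
        return "control code"
    ch = chr(code)  # chr() turns a numeric code back into a character
    if ch == " ":
        return "space"
    if ch.isdigit():
        return "digit"
    if ch.isupper():
        return "capital letter"
    if ch.islower():
        return "small letter"
    return "punctuation or symbol"


# Try the key codes from the list of facts, plus a digit and a question mark.
for code in [0, 32, 48, 63, 65, 97]:
    print(code, ascii_group(code))
```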