Video interframe compression algorithms, spatial and temporal redundancy
Introduction
Data compression involves applying a Maths formula (an algorithm) to a data stream so that it takes up less space than it would uncompressed. A video recording typically involves a huge amount of data being stored, so it is very important to try to reduce the file size. Smaller files help with transmission and video streaming (they take less time to transmit than larger files), need less storage space and give better playback (media players can handle smaller files better than much larger ones).
A video can be reduced in size by looking at each frame in the video and seeing what has changed since the previous frame. Very often, only a very small part of a frame has actually changed since the previous one was displayed, so you do not need to store all the information about that frame, only the changes. This can make considerable savings in the amount of data that needs to be stored. 'Video interframe compression' is the term used to describe these methods of compressing the frames in a stream of video data. We will discuss them below.
Video interframe compression
An 'interframe' is a frame in a stream of video frames that is defined in terms of the frames before or after it. Typically, a video is made up of a series of keyframes, which contain the entire image. In between these are delta frames, which hold only the information about what has changed since the previous frame. By using as few keyframes and as many delta frames as possible, you can compress, or reduce, the overall size of a video file.
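To give a feel for the idea, here is a minimal Python sketch of storing keyframes and delta frames. It assumes the frames are NumPy arrays of signed integers so the differences do not wrap around, and the keyframe interval of 10 is made up for illustration; it is not how any real codec is implemented.

```python
import numpy as np

KEYFRAME_INTERVAL = 10  # illustrative: store a full keyframe every 10 frames

def encode(frames):
    """Store keyframes in full and every other frame as a delta from the previous frame."""
    encoded = []
    previous = None
    for i, frame in enumerate(frames):
        if i % KEYFRAME_INTERVAL == 0:
            encoded.append(("key", frame.copy()))        # keyframe: the entire image
        else:
            encoded.append(("delta", frame - previous))  # delta frame: only the changes
        previous = frame
    return encoded

def decode(encoded):
    """Rebuild the full frames from the keyframes and deltas."""
    frames = []
    for kind, data in encoded:
        if kind == "key":
            frames.append(data.copy())
        else:
            frames.append(frames[-1] + data)             # apply the changes to the previous frame
    return frames
```

Because most deltas are mostly zeros, they compress far better than the full frames would.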
Spatial redundancy and temporal redundancy
Spatial redundancy is the term used to describe the relationship between neighbouring pixels within a single frame: a pixel can usually be predicted from its neighbouring pixels. Temporal redundancy is the term used to describe the similarity between successive frames. By removing spatial redundancy and temporal redundancy, a file can be greatly compressed.
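The two kinds of redundancy can be seen in a small sketch (the two tiny example frames and the left-neighbour prediction are just for illustration):

```python
import numpy as np

frame1 = np.array([[10, 11, 12], [10, 11, 12], [10, 11, 13]], dtype=np.int16)
frame2 = np.array([[10, 11, 12], [10, 12, 12], [10, 11, 13]], dtype=np.int16)

# Spatial redundancy: within one frame, predict each pixel from its left-hand neighbour.
spatial_residual = frame1[:, 1:] - frame1[:, :-1]   # mostly small values

# Temporal redundancy: predict the whole of frame2 from frame1.
temporal_residual = frame2 - frame1                  # mostly zeros

print(spatial_residual)
print(temporal_residual)
```

The residuals (the leftovers after prediction) are mostly small numbers or zeros, and small, repetitive values are exactly what compresses well.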
Algorithms used
There are a number of ways of approaching video interframe encoding, depending on the codec used. The algorithm (the Maths formula) can involve:
- looking at only a few previous frames and at areas within frames (called 'macroblocks'); a simple macroblock split is sketched after this list.
- looking at a number of macroblocks in various frames, both before and after the target frame, and trying to predict the changes.
- looking at the size of the macroblock used in any particular frame.
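Splitting a frame into macroblocks is straightforward. The sketch below (assuming NumPy, a greyscale frame and the common 16 x 16 block size, with dimensions that divide exactly) just shows the idea:

```python
import numpy as np

MACROBLOCK_SIZE = 16  # 16 x 16 is a common macroblock size; purely illustrative here

def split_into_macroblocks(frame):
    """Split a frame into MACROBLOCK_SIZE x MACROBLOCK_SIZE blocks, keyed by their top-left corner."""
    height, width = frame.shape
    blocks = {}
    for y in range(0, height, MACROBLOCK_SIZE):
        for x in range(0, width, MACROBLOCK_SIZE):
            blocks[(y, x)] = frame[y:y + MACROBLOCK_SIZE, x:x + MACROBLOCK_SIZE]
    return blocks

frame = np.zeros((48, 64), dtype=np.uint8)   # a tiny made-up greyscale frame
blocks = split_into_macroblocks(frame)
print(len(blocks))                            # 3 rows x 4 columns = 12 macroblocks
```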
Block-based difference encoding and block-based motion encoding
These use a similar idea. Each frame being compressed is divided up into areas called 'macroblocks'. When a compression algorithm is trying to compress a particular frame, it will look through the macroblocks of frames that it has previously encoded to try to find one that is identical or nearly identical. If it finds one that is identical, it does not have to store the information about the macroblock it is encoding; it just stores a pointer (called a vector) to the identical macroblock. Clearly, this cuts down on the amount of data that has to be stored. When the file is being decompressed, decompressing a macroblock that has a vector associated with it simply involves retrieving the macroblock pointed at by the vector.
Typically, a matching macroblock will not be identical, only 'nearly identical'. The encoding algorithm will then compare the two and record the differences, called the 'prediction error'. It will then store the vector to the matching macroblock along with the prediction error. When a macroblock with a vector and prediction error associated with it is being decompressed, the macroblock pointed at by the vector will again be retrieved, along with the prediction error. Using this information, the decoder can reconstruct the macroblock.
Ideally, the encoder wants to find as many matching macroblocks as possible, with small prediction errors, because this gives the greatest compression. If it cannot find a matching or near-matching macroblock, the whole of the raw data in the macroblock is stored with no compression. The encoding algorithm also has to take care not to match macroblocks against macroblocks that were themselves built from a vector and prediction error; each time this happens there is a small loss of quality, and these losses can quickly build up into a large loss of quality.
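The matching step is often done by searching a small window around the macroblock's own position in a previously encoded frame. Here is a rough sketch of one block being encoded in this way; the block size, search range, threshold and the simple sum-of-absolute-differences measure are all assumptions for illustration, and real encoders are far more sophisticated.

```python
import numpy as np

BLOCK = 16               # macroblock size
SEARCH = 8               # how far to search around the block's own position
MATCH_THRESHOLD = 500    # made-up cut-off for deciding a match is 'near identical'

def sad(a, b):
    """Sum of absolute differences: a simple measure of how alike two blocks are."""
    return int(np.abs(a.astype(np.int16) - b.astype(np.int16)).sum())

def encode_block(current, reference, y, x):
    """Encode the macroblock of 'current' at (y, x) against a previously encoded 'reference' frame."""
    block = current[y:y + BLOCK, x:x + BLOCK]
    height, width = reference.shape
    best_vector, best_cost = None, None
    # Search nearby positions in the reference frame for the best matching macroblock.
    for dy in range(-SEARCH, SEARCH + 1):
        for dx in range(-SEARCH, SEARCH + 1):
            ry, rx = y + dy, x + dx
            if 0 <= ry <= height - BLOCK and 0 <= rx <= width - BLOCK:
                cost = sad(block, reference[ry:ry + BLOCK, rx:rx + BLOCK])
                if best_cost is None or cost < best_cost:
                    best_vector, best_cost = (dy, dx), cost
    if best_cost is not None and best_cost <= MATCH_THRESHOLD:
        dy, dx = best_vector
        match = reference[y + dy:y + dy + BLOCK, x + dx:x + dx + BLOCK]
        error = block.astype(np.int16) - match.astype(np.int16)   # the prediction error
        return ("vector", best_vector, error)
    return ("raw", block)                                          # no good match: store the raw data
```

A decoder reverses this: it fetches the macroblock the vector points at and adds the prediction error back on to rebuild the block.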
Bi-directional motion encoding
Bi-directional motion encoding uses matching blocks from both past frames and future frames to encode the current frame.
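Once the matching blocks have been found in a past frame and a future frame, a simple way to combine them is to average them. This is only a sketch of the prediction step, not how any particular codec defines it:

```python
import numpy as np

def bidirectional_prediction(past_block, future_block):
    """Predict the current macroblock as the average of its matches in the past and future frames."""
    return (past_block.astype(np.int16) + future_block.astype(np.int16)) // 2

# The encoder would then store only the (usually small) prediction error:
# error = current_block - bidirectional_prediction(past_block, future_block)
```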
Conditional replenishment (pixel-based difference encoding)
This is similar to block-based difference encoding, except that it works at the level of individual pixels: each frame in a sequence is analysed and only the pixels that have changed since the previous frame are stored.
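A sketch of the idea (the frames are assumed to be NumPy arrays, and the change threshold is made up for illustration):

```python
import numpy as np

def conditional_replenishment(previous, current, threshold=0):
    """Record only the pixels that have changed since the previous frame."""
    changed = np.abs(current.astype(np.int16) - previous.astype(np.int16)) > threshold
    ys, xs = np.nonzero(changed)
    # Each entry records where a pixel changed and what its new value is.
    return list(zip(ys.tolist(), xs.tolist(), current[changed].tolist()))

def apply_replenishment(previous, changes):
    """Rebuild the current frame from the previous frame plus the stored changed pixels."""
    frame = previous.copy()
    for y, x, value in changes:
        frame[y, x] = value
    return frame
```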
Sub-sampling
Sub-sampling involves transmitting only certain frames, perhaps only every 10th frame. The missing frames would need to be interpolated (worked out from the existing data), either by a decoder or by allowing the viewer's brain to fill in what the missing frames would have been.
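A sketch of the idea, keeping every 10th frame and letting a decoder blend neighbouring kept frames to fill the gaps (the linear blending is just one simple assumption about how the interpolation might be done):

```python
import numpy as np

STEP = 10  # keep only every 10th frame

def sub_sample(frames):
    """Keep every STEP-th frame and discard the rest."""
    return frames[::STEP]

def interpolate(kept_frames):
    """Rebuild the dropped frames by blending each pair of kept frames (a crude decoder-side guess)."""
    rebuilt = []
    for a, b in zip(kept_frames, kept_frames[1:]):
        for i in range(STEP):
            weight = i / STEP
            rebuilt.append(((1 - weight) * a + weight * b).astype(a.dtype))
    rebuilt.append(kept_frames[-1])
    return rebuilt
```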