Frame

A still image. Video is a series of frames played in rapid succession that we perceive as movement. Like a flip book.


Resolution

The dimensions, in pixels, of each frame of video. You could have video that is just 100 pixels across and 200 pixels high - but you wouldn’t be able to make out a lot of detail.

High Definition (HD) video is usually 1080p (1920x1080) or 720p (1280x720).
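To get a feel for the numbers, here is a rough sketch (in Python) of how big a single uncompressed frame is at these resolutions, assuming 24-bit RGB colour (3 bytes per pixel) - an assumption for illustration, since real video usually stores colour more compactly:

```python
def raw_frame_bytes(width, height, bytes_per_pixel=3):
    """Size of one uncompressed frame: one value per pixel per colour channel."""
    return width * height * bytes_per_pixel

print(raw_frame_bytes(1920, 1080))  # 1080p: 6220800 bytes (~6 MB per frame)
print(raw_frame_bytes(1280, 720))   # 720p: 2764800 bytes (~2.8 MB per frame)
```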

Framerate (FPS)

How many frames (still images) there are in each second of video. This can be very high (thousands of frames per second) but is usually pretty low (around 30 frames per second).

Film (as in, the cinema) is usually shot at 24fps, which is why it looks a little different from television or most of what you see on the internet.
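As a quick worked example (the film length here is made up), the total number of frames in a video is just the frame rate multiplied by the duration:

```python
def total_frames(fps, seconds):
    """Number of still images in a video of the given length."""
    return fps * seconds

# A hypothetical 90-minute film shot at 24fps:
print(total_frames(24, 90 * 60))  # 129600 frames
```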


Bitrate

The number of bits of video data per second (1 byte is 8 bits). A higher bitrate means more data every second, and therefore (usually) better image quality.

If you were trying to watch a football match on the internet, you might see a selection of streams available at different bitrates. For example: 5000kbps (1080p), 2500kbps (720p), and a low-res 700kbps option. Note that those are just plausible examples; 5000kbps is not equal to 1080p.

1080p and 720p are resolutions (pixel dimensions). The bitrate is typically higher for higher resolutions because there are more pixels in each frame, and therefore more bits of information being sent and received each second.

Note that the bitrate also depends on the frame rate of the video. HD footage at only 24 frames a second might have a lower bitrate than low-resolution footage at thousands of frames a second.
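A rough sketch of why this all matters: completely uncompressed 1080p video at 30fps would need a bitrate hundreds of times higher than the 5000kbps stream above, which is why video gets compressed at all. Assuming 24 bits per pixel for illustration:

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=24):
    """Bits per second for completely uncompressed video."""
    return width * height * bits_per_pixel * fps

raw = raw_bitrate_bps(1920, 1080, 30)
print(raw)               # 1492992000 bps (~1.5 Gbps)
print(raw / 5_000_000)   # ~299x the 5000kbps stream above
```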

Coding/compression format

Compressed video comes in many flavours.

Examples: H.264. AV1. MPEG-4 Part 2.


Container format

A file format that bundles our compressed video with other stuff, like an accompanying audio stream (which has its own compression format - such as AAC or MP3).

Examples: AVI, FLV, MP4.


Codec

COmpressor - DECompressor.

Something that understands a certain compression format, and can perform the compression-decompression of video to and from that format.

Examples: Xvid. x264. QuickTime H.264.


Encoding

Strictly speaking - you encode video into a format.

Generally speaking - an encoder is something that converts information (any kind of information) between different formats. It’s the same for video. ‘Encoding’ is sometimes used to describe all of this stuff about codecs and compression formats as a whole topic, or someone might say that video is “H.264 encoded”, or they might refer to part of a codec as an encoder. For example: “x264 is an H.264 encoder”.


Decoding

You decode video from a format.


Transcoding

Taking encoded video, decoding it, altering it, re-encoding it. Again, sometimes people just call this encoding. It’s a conversion of formats.

Note that you could actually re-encode your video back to the same format. You’d do that if you just wanted to change the video somehow, but not the format it’s compressed with. For example you might want to change the bitrate or resolution of some H.264 video, but still store it as H.264 afterwards.
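As a sketch of what that H.264-to-H.264 transcode might look like using the ffmpeg tool (described further below), the command can be built as an argument list. The filenames are hypothetical, and this assumes ffmpeg with the x264 encoder is installed:

```python
# Sketch of a transcode: re-encode H.264 video at a lower bitrate and
# resolution, keeping H.264 as the output format. Filenames are hypothetical.
# To actually run it: subprocess.run(cmd, check=True)
cmd = [
    "ffmpeg",
    "-i", "input.mp4",        # source file (H.264 in an MP4 container)
    "-c:v", "libx264",        # encode the video stream with the x264 encoder
    "-b:v", "2500k",          # target video bitrate
    "-vf", "scale=1280:720",  # downscale to 720p
    "-c:a", "copy",           # leave the audio stream untouched
    "output.mp4",
]
print(" ".join(cmd))
```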


Interlacing

Interlaced video doesn’t display one frame at a time. Instead it basically displays half of one frame and half of the next at the same time. Then it displays the other half of that next frame alongside half of the frame after that, and so on. It does this by splitting each frame into lots of thin horizontal bands, which are then interlaced. The purpose is to trick the human eye into perceiving a higher frame rate than the video was actually recorded at, making motion look smoother.
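A toy sketch of the idea, treating a frame as a list of scan lines (the frames and line contents here are made up for illustration):

```python
# Each drawn image mixes the even-numbered lines of one frame with the
# odd-numbered lines of the next.
def weave(frame_a, frame_b):
    """Even-numbered scan lines come from frame_a, odd-numbered from frame_b."""
    return [frame_a[i] if i % 2 == 0 else frame_b[i]
            for i in range(len(frame_a))]

f1 = ["1a", "1b", "1c", "1d"]   # four scan lines of frame 1
f2 = ["2a", "2b", "2c", "2d"]   # four scan lines of frame 2
print(weave(f1, f2))            # ['1a', '2b', '1c', '2d']
```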


Artifacts

Various ways the video can look wrong (distorted).


Tearing

When information from multiple frames of video is shown in a single screen draw.


Streaming

When video is sent over the network and can be viewed in real-time. Video doesn’t have to be live to be streamed. Netflix for example is a streaming service that only streams pre-recorded content. Here, ‘real-time’ refers to the fact that a user is watching content as it arrives over the network, before the rest of the video has downloaded. The opposite of streaming would be to download a video file to your device, and then open that file in a media player when it has finished downloading in its entirety.


RTP

Real-time Transport Protocol. A network protocol designed specifically for the real-time delivery of media.


RTSP

Real-Time Streaming Protocol. A network control protocol for controlling media servers. RTSP doesn’t actually transport the media, but instead sets up and controls connections between the server and client.


HLS

HTTP Live Streaming. An alternative to RTP for streaming media using only standard HTTP transactions.
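As a rough illustration, an HLS ‘master playlist’ is just a plain-text file pointing the player at variant streams encoded at different bitrates - much like the football-match example earlier. The paths and bandwidth figures below are made up:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=700000
low/index.m3u8
```

The player picks whichever variant its connection can sustain, and fetches the video in small chunks over ordinary HTTP.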


FFmpeg

FFmpeg (with big ‘F’s) is a suite of tools (with little ‘f’s) for doing all manner of things with media:

  • ffmpeg: convert it
  • ffprobe: analyse it
  • ffserver: stream it (removed from FFmpeg in version 4.0)
  • ffplay: play it


PTS and DTS

Presentation Timestamp and Decoding Timestamp.

Aptly named, these tell us when a piece of information needs to be decoded, and when it needs to be displayed. To understand why those times might be different, you just need to know that most video (and audio) formats store the information in much more complex ways than a big list of frames to be decoded and played one after another - specifically, frames come in different flavours and might depend on each other. So you might have to decode frames 1, 2, 3 and 4 before you can actually display frame 1.
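A toy illustration of the two orderings (the frame labels and timestamps are invented): packets are decoded in DTS order but shown in PTS order, so a frame that others depend on can be decoded well before it is displayed:

```python
# P3 is displayed last but decoded early, because the B-frames that follow
# it in decode order reference it. All names and timestamps are made up.
packets = [
    {"frame": "I0", "dts": 0, "pts": 0},
    {"frame": "P3", "dts": 1, "pts": 3},  # decoded early, displayed later
    {"frame": "B1", "dts": 2, "pts": 1},  # depends on I0 and P3
    {"frame": "B2", "dts": 3, "pts": 2},  # depends on I0 and P3
]

decode_order = [p["frame"] for p in sorted(packets, key=lambda p: p["dts"])]
display_order = [p["frame"] for p in sorted(packets, key=lambda p: p["pts"])]
print(decode_order)   # ['I0', 'P3', 'B1', 'B2']
print(display_order)  # ['I0', 'B1', 'B2', 'P3']
```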