GStreamer
November 20, 2019
There are some tools out there that are extraordinarily powerful, but with a steep learning curve and infrequent usage, it’s hard to justify investing the time to learn them thoroughly. GStreamer is one of them. Here is a basic introduction to what it’s all about.
You can look up the full article on Wikipedia and this historic LWN article I found, but the gist of it is that GStreamer is a media pipeline utility composed of many available plugins. Its principal usage is from the command line via gst-launch-1.0, where you arrange a giant string using syntax that looks like Unix shell pipes and pass it into the program. It starts with a source (src) and ends with a sink.
For example:
gst-launch-1.0 autovideosrc ! autovideosink
will automatically give you an input and output pair that is highly likely to “just work”. Usually, this will just open up your webcam if you have one and attempt to pipe that feed out to a window on your desktop.
The autovideosrc and autovideosink element pair is most useful for testing/debugging one or both ends. You can start with them and incrementally add other plugins in between, and if the pipeline fails you’ll know what you added that broke it.
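The incremental approach can be sketched as a small shell helper. This is only an illustration: build_pipeline is a hypothetical function name, and videoconvert and videoflip are stand-ins for whatever elements you actually want to add. It prints launch lines rather than running them, so each step can be pasted after gst-launch-1.0 and tested by hand.

```shell
# Join a source, any number of middle elements, and a sink with " ! ".
# Usage: build_pipeline SRC SINK [ELEMENT...]
# (hypothetical helper, not part of GStreamer)
build_pipeline() (
  src=$1
  sink=$2
  shift 2
  desc=$src
  for element in "$@"; do
    desc="$desc ! $element"
  done
  printf '%s\n' "$desc ! $sink"
)

# Step 0: the known-good pair.
build_pipeline autovideosrc autovideosink
# Step 1: add a single element in between and re-test.
build_pipeline autovideosrc autovideosink videoconvert
# Step 2: add the next one; if this step fails, videoflip is the suspect.
build_pipeline autovideosrc autovideosink videoconvert "videoflip method=horizontal-flip"
```

Each printed line is a pipeline description. Because only one element changes between steps, a failure points directly at the most recent addition.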
Plugins and Elements
An element is an individual pipe in the pipeline. A plugin is just another layer of organization around elements, but it is often the case that a single plugin encapsulates a single element. Sometimes the terms are used interchangeably. The official documentation goes into more detail.
Each element has one or more pads, and each pad has a direction: source (output) or sink (input). The pads define how the element is meant to be used. Since a GStreamer launch line is read from left to right, the mental model is:
[src] => [sink | src] => [sink | src] => [sink]
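To make the mental model concrete, here is a rough sketch that labels each element in a launch line purely by its position. describe_roles is a hypothetical helper, and the position-based labelling is a simplification: it ignores elements with several pads, such as muxers. For the real pad list of an element, gst-inspect-1.0 followed by the element name prints its pad templates.

```shell
# Label each element of a launch line by its position in the chain.
# (hypothetical helper; assumes a simple linear pipeline)
describe_roles() {
  printf '%s\n' "$1" | awk -F' *! *' '{
    for (i = 1; i <= NF; i++) {
      split($i, words, " ")   # the element name is the first word
      role = (i == 1) ? "[src]" : ((i == NF) ? "[sink]" : "[sink | src]")
      printf "%s\t%s\n", role, words[1]
    }
  }'
}

describe_roles "v4l2src device=/dev/video0 ! videoconvert ! xvimagesink"
```

The first element only needs a source pad, the last only a sink pad, and everything in between needs both.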
Use Cases
There are a few general use cases that I’ll briefly cover. More advanced use cases can be derived from them.
Opening a webcam
You can use GStreamer to open a video capture device and display the stream in a window. It’s a little overkill if this is all you wanted to do and you already had some other desktop application for it, but it’s a building block for other usage.
On Linux, an example launch line is:
gst-launch-1.0 v4l2src device=/dev/video0 ! xvimagesink
v4l2src has only one source pad that outputs a video stream from a capture device on the system. On Linux, it defaults to device /dev/video0 and can be supplied with property overrides to constrain the capabilities of the device.
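Besides element properties, one common way to constrain what the device delivers is a caps filter written inline between two elements. The sketch below assembles such a launch line; the 640x480 at 30 fps values are assumptions about what your camera supports, and the added videoconvert is there only as a safety net in case the sink does not accept the camera's pixel format.

```shell
# Assumed values; adjust to what your capture device actually supports.
DEVICE=/dev/video0
WIDTH=640
HEIGHT=480
FPS=30

# A caps filter is written like an element, but it is just a media type
# plus the fields you want to pin down.
CAPS="video/x-raw,width=$WIDTH,height=$HEIGHT,framerate=$FPS/1"
PIPELINE="v4l2src device=$DEVICE ! $CAPS ! videoconvert ! xvimagesink"

# Print the full command; paste it into a terminal to run it.
echo "gst-launch-1.0 $PIPELINE"
```

If the device cannot produce the requested format, the pipeline fails to negotiate instead of silently picking something else, which is usually what you want while debugging.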
xvimagesink behaves like a subset of autovideosink: it specifically tries to draw the video frames to a desktop window. autovideosink, on the other hand, tries to make that easier by falling back to other sinks in case that isn’t possible (like on a headless server).
Or record from it straight to a file:
gst-launch-1.0 v4l2src num-buffers=50 ! queue ! x264enc ! mp4mux ! filesink location=video.mp4
and play it back:
gst-launch-1.0 -v filesrc location=video.mp4 ! qtdemux ! avdec_h264 ! videoconvert ! autovideosink
The num-buffers property just sets a limit on the source so that it doesn’t run forever, since we’re recording to a file. Several new elements to talk about here:
queue is an element that provides buffering between elements that can run at different rates. In this case, it’s used to buffer the raw input, which will usually outpace an encoder with significant latency.
x264enc is an encoder element that compresses the raw video using the H.264 codec, but the data remains a format-less stream until…
mp4mux is a stream-to-container multiplexer element. In its simplest and most common use case, it merges a video stream and an audio stream into a file.
filesink is a tail-end sink element that writes the stream to a file on disk.
filesrc is basically the opposite of filesink: it reads raw bytes from a file on disk.
qtdemux is the counterpart of mp4mux. It unpacks the MP4 container and exposes the encoded stream inside; without it, the decoder would be handed container bytes it cannot interpret.
avdec_h264 decodes an H.264-encoded stream.
videoconvert is one of the “auto-magical” plugins that will fix most format incompatibilities between adjacent elements. Otherwise, it simply passes data right through.
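One practical note on num-buffers: if you drop it and stop the recording with Ctrl-C instead, the MP4 file is typically left unplayable because mp4mux never gets to write its index. The -e (--eos-on-shutdown) flag of gst-launch-1.0 sends an end-of-stream through the pipeline on shutdown so the muxer can finish. A minimal sketch of both variants follows; record_cmd is a hypothetical helper that only prints the command.

```shell
# Print a recording command for the given output file.
# $1 = output file, $2 = optional buffer limit (hypothetical helper)
record_cmd() {
  if [ -n "${2:-}" ]; then
    # Fixed length: the source stops by itself after $2 buffers.
    echo "gst-launch-1.0 v4l2src num-buffers=$2 ! queue ! x264enc ! mp4mux ! filesink location=$1"
  else
    # Open-ended: -e makes Ctrl-C send EOS so mp4mux can finalize the file.
    echo "gst-launch-1.0 -e v4l2src ! queue ! x264enc ! mp4mux ! filesink location=$1"
  fi
}

record_cmd video.mp4 50
record_cmd video.mp4
```

The first form matches the launch line used in this post; the second records until you interrupt it.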
We could just as easily have replaced the explicit decoding elements with decodebin and used more magic in the pipeline. In fact, it’s usually easier to start out with the more magical plugins and then swap them out as you need finer-grained control. The *bin plugins are the most magical kind: they do almost everything you need out of the box. See also: playbin.
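For comparison, here is a sketch of what the two "magical" variants could look like for the file recorded earlier. The file name is carried over from the example; the URI construction is a simplification that assumes the path contains no spaces or other characters needing percent-encoding.

```shell
FILE=video.mp4

# decodebin picks the demuxer and decoder by inspecting the stream.
DECODEBIN_LINE="filesrc location=$FILE ! decodebin ! videoconvert ! autovideosink"

# playbin is a whole player in one element, but it wants a URI, not a path.
ABS_DIR=$(cd "$(dirname "$FILE")" && pwd)
PLAYBIN_LINE="playbin uri=file://$ABS_DIR/$(basename "$FILE")"

# Print both commands; paste either one into a terminal to run it.
echo "gst-launch-1.0 $DECODEBIN_LINE"
echo "gst-launch-1.0 $PLAYBIN_LINE"
```

The trade-off is visibility: the fewer elements you spell out, the less you can tune, and the harder it is to see which element a failure came from.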
Video transcoding
You can use GStreamer to open a file and transcode it, that is, decode from one codec and then re-encode into a different codec.
The following is a launch line where we decode our MP4 file from the previous example and re-encode it into an MKV file:
gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! queue ! vp8enc ! matroskamux ! filesink location=video.mkv
vp8enc compresses a video stream using the VP8 codec, which is common in web applications that use WebRTC (historically its default codec).
matroskamux is another multiplexer plugin that formats the data stream into a Matroska container. The Matroska container, or format, is a superset of WebM.
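Since WebM is a restricted form of Matroska, the same pipeline can target a .webm file by swapping the muxer for webmmux. The sketch below picks the muxer from the output extension; transcode_line is a hypothetical helper, and it assumes a video-only input like the file recorded earlier.

```shell
# Print a VP8 transcoding pipeline, choosing the muxer by output extension.
# $1 = input file, $2 = output file (hypothetical helper)
transcode_line() (
  case "$2" in
    *.mkv)  mux=matroskamux ;;
    *.webm) mux=webmmux ;;
    *) echo "unsupported output extension: $2" >&2; exit 1 ;;
  esac
  echo "filesrc location=$1 ! decodebin ! queue ! vp8enc ! $mux ! filesink location=$2"
)

transcode_line video.mp4 video.mkv
transcode_line video.mp4 video.webm
```

Prefix either printed line with gst-launch-1.0 to run it. Only the muxer and the file name differ between the two outputs; the decoding and encoding stages stay the same.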
Wrap Up
By now you might realize that many tasks you would normally do with a graphical program like good old HandBrake or VLC Media Player can be accomplished by merely piecing together a GStreamer launch line. GStreamer can be used for audio-only tasks as well.
Because of GStreamer’s plugin architecture, third-party plugins enable many more interesting and advanced use cases, like streaming video between multiple pipelines across Unix domain sockets or over a network via UDP. These are cases where you actually would want to get your hands dirty tuning your encoder of choice around a desired balance between video quality and latency. I might cover these in separate blog posts.
Lastly, I’m obligated to mention that ffmpeg is usually a viable (and sometimes superior) alternative to GStreamer. Weigh the needs of the project against how the two differ in licensing and architecture (pipeline vs. executable vs. library), and use the right tool for the job.