Best Practices

From Theora Playback Library
Revision as of 03:19, 28 November 2014 by Kspes (talk | contribs) (Encoding Videos)

Before reading this page, make sure you've integrated libtheoraplayer in your app. More details on the Tutorial page.

Number of precached frames

When creating a TheoraVideoClip object, you can choose how many frames you'd like to "precache". Precaching means decoding frames in advance and storing them until they are ready to be displayed. This mechanism ensures smooth playback but it consumes memory. Because some frames take more time to decode than others, depending on how much of the picture has changed in between frames, you should always precache at least a few frames.

How many frames you should precache depends on many factors, the most important of which are:

  • How much memory your target device / platform has
  • How fast the CPU that decodes the video is
  • How much rapid movement your video has (quick changes in picture require more decoding time)

If you have a slow CPU, or Theora is slow to decode on a given system, you can smooth out the decoding of complex frames by increasing the number of precached frames. Fast CPUs generally don't require much precaching because they can decode fast enough, but if you have many videos playing concurrently, decoding performance can vary.

In our experience, you should precache 4-16 frames and make the decision dynamically based on 2 factors: how much RAM your device has and how many CPU cores (or hardware threads) you have available. The more RAM you have, the more precached frames you can afford; on multi-core CPUs decoding is faster, so you can get away with fewer precached frames.

Keep in mind that each precached frame takes a lot of RAM. For example, a single frame of a 512x512 RGB video consumes 768 kB, which means a frame cache of 16 frames takes 12 MB. A 1080p video with a 16 frame cache would take 95 MB!
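The arithmetic above is easy to check in code. The following is a minimal sketch, not part of libtheoraplayer: frameCacheBytes() reproduces the numbers from the example, and precacheFramesForBudget() is a hypothetical helper that picks a frame count within the 4-16 range from a memory budget you choose yourself. The budget-based heuristic is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cstddef>

// Bytes needed to hold `frames` decoded frames.
// bytesPerPixel is 3 for RGB and 4 for RGBX / BGRX.
std::size_t frameCacheBytes(std::size_t width, std::size_t height,
                            std::size_t bytesPerPixel, std::size_t frames)
{
    return width * height * bytesPerPixel * frames;
}

// Hypothetical heuristic: fit as many frames as the budget allows,
// but stay within the 4-16 range suggested above.
std::size_t precacheFramesForBudget(std::size_t width, std::size_t height,
                                    std::size_t bytesPerPixel,
                                    std::size_t budgetBytes)
{
    const std::size_t oneFrame = frameCacheBytes(width, height, bytesPerPixel, 1);
    if (oneFrame == 0)
    {
        return 4; // degenerate video size, fall back to the minimum
    }
    const std::size_t affordable = budgetBytes / oneFrame;
    return std::min<std::size_t>(16, std::max<std::size_t>(4, affordable));
}
```

For instance, frameCacheBytes(1920, 1080, 3, 16) gives 99,532,800 bytes, which is the "95 MB" figure quoted above. You would still want to lower the result on fast multi-core machines and raise it on slow ones.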

Number of decoder threads

When initialising the TheoraVideoManager, you can choose how many decoder threads the system creates to handle background video decoding. As with the number of precached frames, there is no single good answer, that's why you can choose :) If you plan to display only one video in your project, or at least only one at a time, then you're good with a single decoder thread. If, however, you plan to play multiple videos at the same time, then you should create at least 2 threads, and up to the number of hardware threads your CPU supports (the number of cores, or twice the number of cores if the CPU supports hyper-threading).

The more threads you have (assuming you have at least as many active videos as decoder threads), the faster videos are decoded. If you have fewer videos than decoder threads, you gain no performance bonus by adding more threads.
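The rule of thumb above can be sketched as a small function. chooseDecoderThreads() is a hypothetical helper, not a libtheoraplayer call; it only computes the number you might then pass on when setting up the video manager. Capping at the hardware thread count even when several videos play is a choice made in this sketch.

```cpp
#include <algorithm>

// Hypothetical helper: one thread per concurrently playing video,
// capped at the number of hardware threads.
// Pass std::thread::hardware_concurrency() (from <thread>) as
// hardwareThreads; it may return 0 when the value is unknown,
// so 0 is treated as 1 here.
unsigned chooseDecoderThreads(unsigned concurrentVideos, unsigned hardwareThreads)
{
    if (hardwareThreads == 0)
    {
        hardwareThreads = 1;
    }
    if (concurrentVideos == 0)
    {
        concurrentVideos = 1;
    }
    // More threads than active videos brings no benefit.
    return std::min(concurrentVideos, hardwareThreads);
}
```

With one video this always yields 1 thread, and with 4 videos on a dual-core CPU without hyper-threading it yields 2.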

Videos in Games

If you're using libtheoraplayer for a game project and are using a 3D graphics API such as OpenGL or DirectX then there are many performance factors you have to be aware of.

Texture Swapping

Unlike primitive 2D graphics frameworks, modern display cards tend to parallelize everything. When you're done issuing draw calls for one frame in OpenGL, for instance, the device probably hasn't finished rendering that frame before you start the next iteration. Because of this, if you use a single texture to upload video frames and draw them, there could be performance issues: the texture you're uploading video data to could still be in use by your previous display frame, so you'll have to wait until that frame is drawn before you can upload the next video frame onto it. Therefore it is suggested that you use 2 textures to display video data, swapping their roles each time you upload new video frame data.
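The bookkeeping for the two-texture approach can be sketched like this. VideoTexturePair and TextureHandle are hypothetical names used only for illustration; the actual upload and draw calls depend on your graphics API and are left out.

```cpp
#include <array>
#include <cstddef>

// Hypothetical handle type; in OpenGL this would be a GLuint texture name.
using TextureHandle = unsigned int;

// Tracks which of two textures receives the next upload and which one
// holds the most recently uploaded frame.
class VideoTexturePair
{
public:
    VideoTexturePair(TextureHandle a, TextureHandle b)
        : mTextures{{a, b}}, mUploadIndex(0) {}

    // Texture that should receive the next decoded video frame.
    TextureHandle uploadTarget() const { return mTextures[mUploadIndex]; }

    // Texture holding the most recently uploaded frame; draw this one.
    TextureHandle drawSource() const { return mTextures[1 - mUploadIndex]; }

    // Call once after each upload to uploadTarget().
    void swap() { mUploadIndex = 1 - mUploadIndex; }

private:
    std::array<TextureHandle, 2> mTextures;
    std::size_t mUploadIndex;
};
```

Each time a new video frame is ready, upload it to uploadTarget(), call swap(), and draw with drawSource(). The texture used by the previous display frame is then never the one being written to.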

Texture format

Choosing the correct texture format for your platform is very important. The simplest format to convert the video buffer to is RGB, but that format is rarely the optimal one for textures. For example, DirectX prefers BGRX, while OpenGL ES on iOS and Android prefers RGBX. If you use any other format, the display card first has to convert your pixels into its native format when you upload pixel data, which slows down texture uploads. So be careful which format you choose, as poor video playback performance isn't always caused by slow video decoding!

Texture size

If you're targeting low end devices, it's best not to have videos larger than 1024x1024 (or 2048x1024 if you're using an alpha channel) because low end video cards may not support textures larger than that. A low end device is also very unlikely to be fast enough to decode such a big video anyway :)

Audio / Video sync

Don't ever just play an audio file and a matching video file and expect them to stay in sync :) The proper way to do A/V sync in libtheoraplayer is either to use the audio track embedded in the video file, or to provide your own audio file and use a TheoraTimer instance to control your video timer. In the second case, it's best to use the current timestamp of the playing audio file as the time indicator for the video file. That way you're ensuring sync. However, as we have been unpleasantly surprised to learn, be wary of the following fact: some audio APIs don't give you a very precise audio timestamp! For example, OpenAL on Android and iOS gives a very crude timestamp, resulting in many frame drops because the timer doesn't progress continuously. In this case, we recommend using a float or double timer variable that increases with elapsed time and is corrected by the audio timestamp. Don't override the timer whenever the timestamp changes; instead, slowly sync it over the next second or so to avoid sudden jumps.
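The smoothing idea can be sketched as follows. SmoothedAudioClock is a hypothetical standalone class, not part of libtheoraplayer, and the way you feed its value into your TheoraTimer subclass is left out. The one second correction window and the proportional correction are assumptions; any scheme that closes the gap gradually would do.

```cpp
#include <algorithm>

// Hypothetical helper: a clock that advances with elapsed time and is
// gently pulled towards a (possibly coarse) audio timestamp.
class SmoothedAudioClock
{
public:
    explicit SmoothedAudioClock(double correctionSeconds = 1.0)
        : mTime(0.0), mCorrectionSeconds(correctionSeconds) {}

    // dt:        seconds elapsed since the previous call
    // audioTime: playback position reported by the audio API, in seconds
    double update(double dt, double audioTime)
    {
        const double previous = mTime;
        mTime += dt; // advance continuously, like a normal timer

        // Close only a fraction of the gap to the audio timestamp,
        // so the whole gap is absorbed over roughly correctionSeconds.
        const double error = audioTime - mTime;
        const double fraction = std::min(1.0, dt / mCorrectionSeconds);
        mTime += error * fraction;

        // Never let the video time run backwards.
        mTime = std::max(mTime, previous);
        return mTime;
    }

    double time() const { return mTime; }

private:
    double mTime;
    double mCorrectionSeconds;
};
```

Call update() once per frame with your frame time and the latest audio timestamp, then use the returned value as the video time. If the audio API reports the same timestamp for several frames, the clock still moves forward, just slightly slower, instead of stalling and dropping frames.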

Encoding Videos

Your decoding performance and file size will greatly vary based on the file you've encoded and how you've encoded it.

Video size

The larger the video is in terms of resolution, the more time it takes to decode. Keep in mind that Theora isn't the fastest codec around and decodes on the CPU, not on dedicated video hardware. Although this library uses a lot of CPU multimedia extensions to speed up the decoding process, it's still not comparable to a native hardware accelerated codec (such as decoding MP4 files on iOS and Mac, which this library also supports, see the Codecs page).

So always use smaller videos where applicable, or be sure you're decoding on a fast machine.

If you're targeting both low end and high end devices, it's a good idea to pack several different versions of the video file. For instance, use a 1080p video on a fast CPU and a 576p on a slower one.

Dimensions divisible by 16

Many codecs (Theora included) process pixels in blocks. Blocks are usually 8x8 or 16x16. Theora uses 16x16. If possible, you should make your video dimensions divisible by 16, otherwise when the video is encoded, the extra pixels will be filled with pixels taken from the borders of the video. libtheoraplayer supports this feature, so if you have a video with padding, you can extract the actual resolution and offset within the video frame. If you're uploading such an image to a display card texture, or to any other target that you can draw using a sub-rect, it's best to upload the whole image and then draw the sub-rect, rather than to make a copy of the sub-rect and upload that, to avoid extra processing.
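Here is a minimal sketch of the sub-rect approach. padTo16(), SubRect and visibleSubRect() are hypothetical helpers, not libtheoraplayer functions; the offset and actual resolution are assumed to be values you have already read from the clip.

```cpp
// Rounds a dimension up to the next multiple of 16.
unsigned padTo16(unsigned size)
{
    return (size + 15u) & ~15u;
}

// Normalized texture coordinates (0..1) of the visible picture.
struct SubRect
{
    float u0, v0, u1, v1;
};

// offsetX/offsetY: position of the visible picture in the padded frame
// width/height:    actual (visible) video resolution
// paddedWidth/paddedHeight: size of the uploaded frame, must be non-zero
SubRect visibleSubRect(unsigned offsetX, unsigned offsetY,
                       unsigned width, unsigned height,
                       unsigned paddedWidth, unsigned paddedHeight)
{
    SubRect r;
    r.u0 = static_cast<float>(offsetX) / static_cast<float>(paddedWidth);
    r.v0 = static_cast<float>(offsetY) / static_cast<float>(paddedHeight);
    r.u1 = static_cast<float>(offsetX + width) / static_cast<float>(paddedWidth);
    r.v1 = static_cast<float>(offsetY + height) / static_cast<float>(paddedHeight);
    return r;
}
```

A 1920x1080 video, for example, is padded to 1920x1088 because 1080 is not divisible by 16. Upload the full 1920x1088 frame and draw your quad with the coordinates from visibleSubRect(), so the 8 rows of padding never appear on screen and no per-frame copy is needed.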

Don't transcode an already compressed video

The biggest mistake you can make is to transcode an already compressed video into the Theora format. If you do that, then you're not only encoding the video itself, but the compression artefacts as well. This will make your Theora file significantly bigger and much slower to decode.

So, ALWAYS encode from an uncompressed source! Read more about encoding video on the Encoding Videos page.

Avoid using filters if possible

If you're making your own video or doing post processing, be careful with filters. Grain, noise and other filters that distort the frames a lot have a big impact on the video file. Because video codecs encode only the parts of the picture that have changed, applying such a filter makes the codec think the entire frame has changed, resulting in a bigger file, which in turn leads to slower decoding. Some codecs deal with this better than others, but if possible, avoid using such filters. If you really have to, it's better to encode a separate video containing the filter effect and to apply it over the original one when playing the videos in your app than to merge it all into one file.