Video Lecture Post-Production

Voxel-based Viterbi Active Speaker Tracking (V-VAST) with Best View Selection for Video Lecture Post-Production

This page presents video demos of Sigmedia's automated tool for video lecture post-production. The tool uses a novel speaker tracking algorithm, Voxel-based Viterbi Active Speaker Tracking (V-VAST), to track conversational interactions between lecture participants. V-VAST uses measurements from multiple cameras and multiple microphones. Using the tracking results, the system selects from the available multi-camera views to create a best-view summary of the people who speak over the duration of the lecture. An example output of the system is shown in Fig. 1. A demo video showing the system applied to different lecture recordings can be viewed through the "Demo Video" link below.
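
This page does not give the details of V-VAST, so the following is only a generic illustration of the Viterbi idea the name refers to, not the tool's implementation. It assumes a tiny set of three candidate speaker positions (standing in for voxels), made-up per-frame audio-visual scores, and a hypothetical "sticky" transition model that favours staying in place. All names and numbers (`viterbi_path`, `obs`, `trans`, `view_quality`) are illustrative assumptions.

```python
import numpy as np

def viterbi_path(log_obs, log_trans):
    """Return the highest-scoring state sequence.

    log_obs:   (T, S) log-scores of S candidate positions in each of T frames
    log_trans: (S, S) log-scores, log_trans[i, j] = moving from i to j
    """
    T, S = log_obs.shape
    score = log_obs[0].copy()            # best score ending in each state
    back = np.zeros((T, S), dtype=int)   # best predecessor per frame/state
    for t in range(1, T):
        cand = score[:, None] + log_trans        # cand[i, j]: end in i, go to j
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(S)] + log_obs[t]
    # Trace back from the best final state.
    states = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        states.append(int(back[t][states[-1]]))
    return states[::-1]

# Hypothetical scores: frame 2 is noisy and briefly favours position 2.
obs = np.array([[0.8, 0.1, 0.1],
                [0.8, 0.1, 0.1],
                [0.4, 0.1, 0.5],
                [0.8, 0.1, 0.1],
                [0.8, 0.1, 0.1]])
# Hypothetical transition model: staying put is far more likely than jumping.
trans = np.full((3, 3), 0.05) + np.eye(3) * 0.85

naive = [int(s) for s in np.argmax(obs, axis=1)]   # frame-by-frame choice
path = viterbi_path(np.log(obs), np.log(trans))    # temporally smoothed choice

# Hypothetical view selection: rows are cameras, columns are positions;
# pick the camera with the best assumed view of the tracked position.
view_quality = np.array([[0.9, 0.2, 0.1],
                         [0.3, 0.8, 0.7]])
views = [int(np.argmax(view_quality[:, s])) for s in path]

print("naive:", naive)
print("viterbi:", path)
print("camera per frame:", views)
```

In this toy setting the frame-by-frame choice jumps to position 2 for a single noisy frame, while the Viterbi path stays at position 0, so the selected camera does not cut away for one frame.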

Figure 1: Sample output of the automated video lecture post-production tool. The example lecture footage shown is from the CHIL database [1].

Demo Video


[1] ELRA catalogue, "CHIL 2005 Evaluation Package", catalogue reference: ELRA-E0010.

Page last modified on October 28, 2010