Thrive Audio: Ultra-realistic Spatial Audio for Virtual Reality
What is it?
THRIVE is a headphone audio system that, unlike other "3D audio" technologies, actually recreates an ultra-realistic soundfield at the listener's ears, including height and depth. We use the Oculus Rift's head-tracking to inform our soundfield reproduction, so our technology is intrinsically linked to the Rift and complements it, creating a truly immersive audio-visual experience. To enable this, we have developed a Unity plug-in for game developers along with a standalone API for everyone else. We know that the user is not experiencing true VR unless the sounds match the sights, and we believe our solution is the most advanced out there.
Who are we?
Thrive Audio is a company made up of a group of research engineers from Trinity College Dublin who live and breathe 3D virtual audio environments. Our technology, on which two patent applications were recently filed, stems from almost a decade of internationally recognised research into spatial audio, reported in over 20 peer-reviewed publications by the group in the past three years alone. Thrive can now deliver highly realistic, dynamic and very efficient soundfields which react in real time to the listener's movements, providing the most realistic VR experience. Before the advent of the Oculus Rift, our team experimented with virtual surround audio over headphones using the Microsoft Kinect and other head-tracking technologies, and from this experience we have worked with multiple developers on the vibrant indie games scene in Dublin, Ireland. Now, the fusion of VR headset technology with our own has finally made it possible to deliver a truly immersive experience.
We are thrilled to announce that in early 2015 the Thrive Audio team joined Google, with the shared objective of bringing immersive audio to VR!
A virtual environment in which visual and auditory cues are not spatially coincident is obviously flawed. The perceptual effect of this spatial disjunction is often unpredictable, as it depends on the events in the virtual environment and the nature of the actions expected of the user or participant. For example, in Virtual Reality gaming that combines audio over headphones with an immersive HMD video presentation, the disconnect between the displayed audio source and the within-head perception of the audio devalues the experience. In a more complex multi-player game, by contrast, the player's immersive experience is greatly enhanced by the presentation of spatially accurate audio cues that remain stable under voluntary or random head movements.
The THRIVE solution to this problem has its origins in its developers’ research success in establishing three novel concepts:
Our proposition that approximately equal, angle-independent sub-systems exist within the large data sets of response functions known as head related impulse responses (HRIRs), or equivalently head related transfer functions (HRTFs), associated with binaural hearing has been shown to be correct. This is important both as a means of reducing the length of these angle-dependent responses and for the examination of angle-dependent features in the response. We have recently reported a theoretical underpinning of this work by showing its equivalence to the approximate factorisation of sets of polynomials with random coefficients, and we have also shown that the method can be used to equalise headphone transfer functions. Our recent work shows that applying our analysis to room impulse responses links the root clusters of the room transfer function to its RT60 acoustic characteristic. Our research will simplify binaural transfer functions and, in particular, allow us to extract response features associated with acoustic elevation cues.
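To give a flavour of this idea, the sketch below is an illustration only, not the patented THRIVE factorisation: one simple way to expose an angle-independent sub-system is to take the geometric mean of the HRTF magnitude spectra across all measured directions and divide it out of each response, leaving shortened angle-dependent residuals. The function names and the frequency-domain representation are assumptions made for this example.

```cpp
// Illustrative sketch only -- not the patented THRIVE factorisation.
// Splits a set of measured HRTFs (complex spectra, one per direction) into a
// common angle-independent factor and angle-dependent residuals.
#include <complex>
#include <cmath>
#include <cstddef>
#include <vector>

using Spectrum = std::vector<std::complex<double>>;

// Angle-independent part: geometric mean of the magnitude spectra (zero phase).
Spectrum commonFactor(const std::vector<Spectrum>& hrtfs) {
    const std::size_t bins = hrtfs.front().size();
    Spectrum common(bins);
    for (std::size_t k = 0; k < bins; ++k) {
        double logMagSum = 0.0;
        for (const Spectrum& h : hrtfs)
            logMagSum += std::log(std::abs(h[k]) + 1e-12);   // avoid log(0)
        common[k] = std::exp(logMagSum / hrtfs.size());
    }
    return common;
}

// Angle-dependent residuals: each HRTF with the common factor divided out.
// These are the shortened, direction-specific responses mentioned above.
std::vector<Spectrum> angleDependentResiduals(const std::vector<Spectrum>& hrtfs,
                                              const Spectrum& common) {
    std::vector<Spectrum> residuals(hrtfs.size(), Spectrum(common.size()));
    for (std::size_t d = 0; d < hrtfs.size(); ++d)
        for (std::size_t k = 0; k < common.size(); ++k)
            residuals[d][k] = hrtfs[d][k] / common[k];
    return residuals;
}
```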
Our research into Room Impulse Response (RIR) functions has produced an algorithm, based on dynamic time warping (DTW), that enables location-dependent sparse early reflections to be interpolated to model the response at room locations within a grid of measurements. Our recent research has shown that DTW is a robust method for adapting the direct (first-arrival) pulse in a matched-pursuit-type algorithm to estimate the location of arrivals in a room impulse response. Both the interpolation and the detection algorithms provide methods for identifying and refining the RIR functions required to create an accurate and enveloping spatial audio environment.
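As a hedged illustration of the general approach (the proprietary interpolation details are not shown), the sketch below computes a classic DTW alignment path between the early portions of two measured RIRs; such a path pairs up corresponding reflections so that their arrival times and amplitudes can be interpolated for positions between the measurement points. The function name is illustrative.

```cpp
// Illustrative sketch: classic DTW alignment between the early parts of two
// measured room impulse responses. The warping path pairs up corresponding
// reflections, which can then be interpolated for positions between the two
// measurement points. Not the proprietary THRIVE algorithm.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Returns the DTW alignment path between sequences a and b as index pairs.
std::vector<std::pair<std::size_t, std::size_t>>
dtwPath(const std::vector<double>& a, const std::vector<double>& b) {
    const std::size_t n = a.size(), m = b.size();
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> D(n + 1, std::vector<double>(m + 1, inf));
    D[0][0] = 0.0;
    // Accumulate the cheapest alignment cost for every pair of prefixes.
    for (std::size_t i = 1; i <= n; ++i)
        for (std::size_t j = 1; j <= m; ++j) {
            const double cost = std::fabs(a[i - 1] - b[j - 1]);
            D[i][j] = cost + std::min({D[i - 1][j],       // insertion
                                       D[i][j - 1],       // deletion
                                       D[i - 1][j - 1]}); // match
        }
    // Trace the optimal path back from (n, m) to (1, 1).
    std::vector<std::pair<std::size_t, std::size_t>> path;
    std::size_t i = n, j = m;
    while (i > 0 && j > 0) {
        path.emplace_back(i - 1, j - 1);
        if (D[i - 1][j - 1] <= D[i - 1][j] && D[i - 1][j - 1] <= D[i][j - 1]) {
            --i; --j;
        } else if (D[i - 1][j] <= D[i][j - 1]) {
            --i;
        } else {
            --j;
        }
    }
    std::reverse(path.begin(), path.end());
    return path;
}
```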
We have also investigated the signal-processing challenge of rendering spatial audio in real time, in combination with head-tracking, for delivery over a set of regular headphones. Our system is based on a spatial array of virtual loudspeakers implementing holophonic soundfield synthesis that can incorporate factorised, i.e. shortened, angle-dependent HRIRs. This allows the soundfield to be rotated so that head-tracking can be used to maintain a stable orientation. This novel spatial audio system has been used in our research to explore the importance of head-tracking in creating accurate source distance perception, both in audio-only presentations and in audio with 3D video.
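The rotation step can be illustrated with a first-order, B-format-style soundfield, a common representation for virtual-loudspeaker rendering; whether THRIVE uses this exact representation internally is an assumption here. Compensating a tracked head yaw then reduces to a 2-D rotation of the horizontal soundfield components:

```cpp
// Illustrative sketch: counter-rotating a first-order (B-format-style)
// soundfield about the vertical axis so that sources stay fixed in the world
// while the listener turns their head. The representation and sign convention
// are assumptions; THRIVE's internal soundfield format may differ.
#include <cmath>
#include <cstddef>
#include <vector>

struct SoundfieldFrame {
    std::vector<float> w, x, y, z; // omni + three first-order components
};

// Rotate the soundfield by -yaw (radians) to compensate the tracked head yaw.
void compensateHeadYaw(SoundfieldFrame& f, float yawRadians) {
    const float c = std::cos(yawRadians);
    const float s = std::sin(yawRadians);
    for (std::size_t n = 0; n < f.x.size(); ++n) {
        const float x = f.x[n];
        const float y = f.y[n];
        f.x[n] =  c * x + s * y;   // rotation about the vertical (z) axis
        f.y[n] = -s * x + c * y;
        // w (omni) and z (height) are unaffected by a pure yaw rotation.
    }
}
```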
Under the Hood
Thrive comprises a set of core functions implemented in C++, based on proprietary Digital Signal Processing (DSP) algorithms, which faithfully recreate an auditory environment at the listener's ears. The algorithms can roughly be divided into four key stages (a simplified sketch of the encoding and decoding stages follows the list below):
Encoding the incoming audio into a Thrive Soundfield format. This allows Thrive to localize an infinite number of sources with no loss of efficiency.
Processing the encoded sounds with sets of advanced dynamic audio filters which can account for all aspects of spatial hearing, from room reflections to anthropometrics.
Dynamic rotation of the complex soundfield around the listener while maintaining all room acoustic cues. This step is controlled by the user movement data collected from the VR headset.
Decoding of the Thrive Soundfield data into a pair of binaural spatial headphone channels. These are then fed to the user's headphones just like conventional left/right audio channels.
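The sketch below illustrates the encoding and decoding stages in a highly simplified form. It assumes a first-order, B-format-style soundfield and a small set of virtual loudspeakers with pre-measured HRIR pairs; THRIVE's actual soundfield format, filters and loudspeaker layout are proprietary, and every name in the sketch is hypothetical.

```cpp
// Illustrative end-to-end sketch of the encode and decode stages, assuming a
// first-order (B-format-style) soundfield and a handful of virtual
// loudspeakers with measured HRIR pairs. All names are hypothetical.
#include <cmath>
#include <cstddef>
#include <vector>

struct SoundfieldFrame {
    std::vector<float> w, x, y, z;           // omni + first-order components
};

struct VirtualSpeaker {
    float azimuth, elevation;                // fixed position on the virtual array
    std::vector<float> hrirLeft, hrirRight;  // measured HRIR pair for that position
};

// Stage 1: encode a mono source at (azimuth, elevation), in radians, into the
// soundfield. The frame buffers are assumed to be pre-sized to the block length.
void encodeSource(SoundfieldFrame& f, const std::vector<float>& mono,
                  float az, float el) {
    const float gx = std::cos(el) * std::cos(az);
    const float gy = std::cos(el) * std::sin(az);
    const float gz = std::sin(el);
    for (std::size_t n = 0; n < mono.size(); ++n) {
        f.w[n] += mono[n];                   // omnidirectional component
        f.x[n] += gx * mono[n];              // front/back
        f.y[n] += gy * mono[n];              // left/right
        f.z[n] += gz * mono[n];              // up/down (height cue)
    }
}

// Plain time-domain convolution; a real-time implementation would use the FFT.
static std::vector<float> convolve(const std::vector<float>& sig,
                                   const std::vector<float>& ir) {
    std::vector<float> out(sig.size() + ir.size() - 1, 0.0f);
    for (std::size_t n = 0; n < sig.size(); ++n)
        for (std::size_t k = 0; k < ir.size(); ++k)
            out[n + k] += sig[n] * ir[k];
    return out;
}

// Stage 4: decode the (already filtered and rotated) soundfield through the
// virtual loudspeakers and their HRIR pairs to produce the headphone channels.
void decodeBinaural(const SoundfieldFrame& f,
                    const std::vector<VirtualSpeaker>& speakers,
                    std::vector<float>& left, std::vector<float>& right) {
    for (const VirtualSpeaker& sp : speakers) {
        // Simple projection decode: the feed is the soundfield sampled in the
        // speaker's direction.
        const float gx = std::cos(sp.elevation) * std::cos(sp.azimuth);
        const float gy = std::cos(sp.elevation) * std::sin(sp.azimuth);
        const float gz = std::sin(sp.elevation);
        std::vector<float> feed(f.w.size());
        for (std::size_t n = 0; n < feed.size(); ++n)
            feed[n] = 0.5f * (f.w[n] + gx * f.x[n] + gy * f.y[n] + gz * f.z[n]);

        const std::vector<float> l = convolve(feed, sp.hrirLeft);
        const std::vector<float> r = convolve(feed, sp.hrirRight);
        if (left.size() < l.size())  left.resize(l.size(), 0.0f);
        if (right.size() < r.size()) right.resize(r.size(), 0.0f);
        for (std::size_t n = 0; n < l.size(); ++n) left[n] += l[n];
        for (std::size_t n = 0; n < r.size(); ++n) right[n] += r[n];
    }
}
```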
Our audio system has been developed in native C++ to provide the best possible performance and the widest range of targetable platforms. In keeping with this, we have integrated it into the Unity 3D game development environment in the form of a plugin. Unity offers a similarly broad range of applicability, with support for desktop computers, consoles and mobile devices.
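To give a rough idea of what such a native integration can look like (the real THRIVE plugin interface is not public, so every symbol below is hypothetical), the C++ side typically exports a plain C surface that Unity scripts can bind to via P/Invoke, for example declaring the same functions in C# with [DllImport] and calling them from an audio callback:

```cpp
// Hypothetical sketch of a native plugin surface for Unity. None of these
// symbols are the real THRIVE API; they only illustrate the extern "C"
// boundary that a Unity C# script would bind to via P/Invoke.
#include <cstdint>

extern "C" {

// Create/destroy a renderer instance for a given output sample rate.
void* ThriveDemo_Create(int32_t sampleRate);
void  ThriveDemo_Destroy(void* renderer);

// Feed the latest head orientation from the HMD (radians).
void  ThriveDemo_SetHeadOrientation(void* renderer,
                                    float yaw, float pitch, float roll);

// Render the next block of interleaved stereo output samples.
void  ThriveDemo_Process(void* renderer, float* interleavedStereoOut,
                         int32_t frameCount);

} // extern "C"
```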
The Oculus Rift virtual reality headset gives us rapid and accurate data on the user's head movement, which is integral to our goal of immersive and realistic audio.
We are excited to announce that in July we started to collaborate with Abydos Entertainment in order to develop a Unity game demo using our full 3-D ultra-realistic sound library! More to come, but here are some early screenshots:
This is our first attempt to implement THRIVE in a large-scale virtual audio-visual model of a cathedral. We have modelled the Dublin landmark Christ Church Cathedral, the spiritual heart of the city. To test the algorithm we use 20 directional sources, real-time generated early reflections, and a long reverberation time of 3.5 seconds. Initial tests, both objective and subjective, show a very good match between measured and real-time synthesised acoustic responses!
Currently, we are working on the THRIVE + Unity 3D + Oculus Rift version of the demo, so stay tuned for any updates!
This is a project that we developed using Blender 3D and Pure Data and used extensively as a working test bench for researching multiple real-time immersive audio rendering issues, such as optimised 3-D spatial reproduction, efficient and accurate rendering of early reflections and late reverberation, and approximation of sound source directivity. Many of the outcomes of these investigations became the solid scaffold on which THRIVE in its current form is built.