Ambisonics

Sound is three dimensional. We hear sounds from all around us, and from above and below, and we can distinguish their direction reasonably accurately. We use several mechanisms to do this, which I will not discuss here.

The first recordings were mono; just a single speaker was used to reproduce the sound. Stereo is a means to spread reproduced sounds along a line between two speakers, and in music recording "5.1 surround" commonly does little more than add some reverberation behind the listener to make the experience more "enveloping". These techniques are inherently limited.

Ambisonics is a technique for representing the sound present at a single position, giving equal importance to all directions in three dimensions. When an ambisonic representation of a sound field is well decoded to a suitable number of speakers (which needs to be "sufficient", but the number is not fixed), it provides a listener at the central point with a sound field similar to that which was recorded, such that as many as possible of the cues used by the ears are present.

"First-order" ambisonics uses just four channels, which correspond in microphone terms to an omnidirectional microphone and three orthogonal figure-of-eight (bi-directional) microphones placed at the same point. "Higher-order" ambisonics uses more channels (9, 16, 25... you get the idea), but the additional channels do not correspond to any easily visualised physical microphone (technically they are "spherical harmonics"). All you need to know is that more channels, in this usage, is like more pixels in a picture - it corresponds to a more focussed and precise representation of direction in the sound.

Ambisonic recordings can be used as a basis for extracting stereo or surround mixes which are what most people are able to play back. Higher-order ambisonic recordings can be effectively transformed to binaural for headphone use, as well (first-order, not so much).

Who uses ambisonics?

Although ambisonics was invented way back in the 1970s, it never took off in music recording and reproduction, because of the practical constraints on the number of speakers in a home music system. Nonetheless, a considerable number of recordings of classical music have been made using the technique in a restricted form without height, using an encoding called "UHJ" which can be played as stereo but can also be expanded to the three channels required for full horizontal surround (with a slight loss of directional precision). In particular, all the recordings made by the Nimbus record company are ambisonic, encoded as UHJ.

The advent of computer gaming and virtual reality, together with the availability of computers able to handle the necessary maths, has caused an explosion in the use of the technology. The comparative simplicity of the representation makes rotation of a sound field in three dimensions in real time entirely practical, and combined with the ability to generate binaural signals for headphone listening it is fast becoming the standard way of handling sound in these environments.
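
To give a feel for why rotation is so cheap, here is a minimal Python sketch of turning a first-order sound field about the vertical axis. It assumes the conventional B-format axes (X forward, Y to the left, Z up) and works on one sample at a time; the function name is my own and not taken from any library. Only X and Y change, and W and Z pass straight through.

```python
import math


def rotate_about_vertical(w, x, y, z, angle_deg):
    """Rotate one first-order B-format sample anticlockwise (seen from above).

    Assumes X points forward, Y points left and Z points up.
    """
    a = math.radians(angle_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    # W (omnidirectional) and Z (height) are unaffected by a turn about the vertical axis.
    return w, x_rot, y_rot, z


# A sound from straight ahead (X = 1, Y = 0) moves to the left (+Y) after a 90 degree turn.
w, x, y, z = rotate_about_vertical(0.707, 1.0, 0.0, 0.0, 90.0)
```

For head tracking, the field would be rotated by the opposite of the listener's head turn, so that sounds appear to stay put. Tilt and tumble work in the same way on other pairs of channels.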

Additionally, there's a community of composers who create music and soundscapes directly in ambisonics for use in immersive installations.

But there are also people like me who simply record in ambisonics because they can, without compromising their use of stereo for practical distribution, in the hope that someone might be able to enjoy their recordings to the fullest extent. In fact, some of my recordings have been used in demonstrations of the technique at conferences around the world.

How is ambisonics recorded?

First-order ambisonics is recorded using a microphone with four capsules arranged as closely as possible in a tetrahedral formation; these are made by SoundField, Core Sound, Sennheiser, and others. If height is to be omitted, it is also possible to use a "native B-format" arrangement of an omnidirectional capsule and two figure-of-eight capsules placed closely on a vertical axis.
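
The four capsule signals (often called "A-format") are combined by simple sums and differences to give the omnidirectional and three figure-of-eight signals (W, X, Y, Z). The Python sketch below shows the idea for one sample. It assumes the common capsule layout of front-left-up, front-right-down, back-left-down and back-right-up, and an arbitrary scale factor of 0.5; the function name is my own. Real microphones also apply frequency-dependent correction for the spacing between the capsules, which is left out here.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Combine four tetrahedral capsule samples into first-order W, X, Y, Z.

    flu = front-left-up, frd = front-right-down,
    bld = back-left-down, bru = back-right-up (an assumed layout).
    """
    w = 0.5 * (flu + frd + bld + bru)  # everything: omnidirectional
    x = 0.5 * (flu + frd - bld - bru)  # front minus back
    y = 0.5 * (flu - frd + bld - bru)  # left minus right
    z = 0.5 * (flu - frd - bld + bru)  # up minus down
    return w, x, y, z


# A sound reaching only the two front capsules gives a positive X and no Y or Z.
print(a_to_b_format(1.0, 1.0, 0.0, 0.0))
```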

Higher-order recordings can be made using a microphone with a larger number of capsules placed evenly on the surface of a sphere. The Core Sound OctoMic has 8 capsules for second-order recording (though with reduced vertical resolution), the Zylia microphone has 19 capsules which can be used up to third-order, and the Eigenmike has 32 capsules and can generate a full fourth-order recording; however, the higher orders involve substantial compromises in noise and low-frequency response.
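
The capsule counts matter because a full three-dimensional representation of order N has (N + 1) squared channels, and a microphone cannot supply more independent channels than it has capsules. The short Python sketch below compares the two for the microphones just mentioned; the function name and the table are my own, with the figures taken from the paragraph above.

```python
def full_sphere_channels(order: int) -> int:
    """Channels in a full 3D ambisonic signal of the given order: (N + 1) squared."""
    if order < 0:
        raise ValueError("order must be zero or greater")
    return (order + 1) ** 2


# (capsules, highest order) as quoted in the text above.
microphones = {"OctoMic": (8, 2), "Zylia": (19, 3), "Eigenmike": (32, 4)}

for name, (capsules, order) in microphones.items():
    channels = full_sphere_channels(order)
    print(f"{name}: {capsules} capsules, order {order} has {channels} channels")
```

Eight capsules fall one short of the nine channels of full second order, which fits with the reduced vertical resolution noted for the OctoMic.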

In gaming and VR usage, and in musical composition, mono or stereo sounds may be placed as desired in a higher-order ambisonic sound field, and this doesn't involve the compromises that higher-order microphones have.
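
Placing a mono sound amounts to multiplying it by one gain per channel, where the gains depend only on the chosen direction. The Python sketch below does this for first order only, to keep it short; higher orders add further gains in the same manner. It assumes the traditional B-format convention (W reduced by a factor of the square root of 2, azimuth measured anticlockwise from straight ahead, elevation upwards from the horizontal), and the function name is my own.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Place one mono sample in a first-order sound field (traditional B-format)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)   # front-back
    y = sample * math.sin(az) * math.cos(el)   # left-right
    z = sample * math.sin(el)                  # up-down
    return w, x, y, z


# A sound placed directly to the left appears only in W and Y.
print(encode_first_order(1.0, 90.0))
```

Several sounds are combined simply by adding their encoded channels together.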

How is Ambisonics reproduced?

Unlike other forms of surround recording, Ambisonics uses a description of the soundfield to be reproduced. The playback system can be set up with any suitable loudspeaker configuration, and it will be possible to generate suitable signals for each speaker to give optimum results. Of course "suitable" does imply enough speakers to give the spatial resolution that the order of the Ambisonic recording contains - typically something greater than the number of Ambisonic channels but no more than twice as many - and these speakers should be more-or-less uniformly distributed on a sphere. In practice, systems with speakers above but not below can give good results. These requirements are already arduous for home use in three dimensions - but for first-order horizontal-only, even four speakers in a square can give good results.
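
As an illustration of generating speaker signals, the Python sketch below feeds each speaker of a horizontal square with what a cardioid microphone pointing in that speaker's direction would have picked up. This is a deliberately naive decode and not the optimised kind used in practice, which treats low and high frequencies differently. It assumes traditional B-format scaling (W reduced by a factor of the square root of 2) and azimuths measured anticlockwise from straight ahead; the names are my own.

```python
import math

# Front-left, back-left, back-right, front-right.
SQUARE_AZIMUTHS_DEG = (45.0, 135.0, 225.0, 315.0)


def decode_horizontal(w, x, y, speaker_azimuths_deg=SQUARE_AZIMUTHS_DEG):
    """Return one feed per speaker from a horizontal first-order sample."""
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        # Virtual cardioid pointing at this speaker.
        feeds.append(0.5 * (math.sqrt(2.0) * w + x * math.cos(az) + y * math.sin(az)))
    return feeds


# A sound lying exactly in the front-left speaker's direction.
a = math.radians(45.0)
print(decode_horizontal(1.0 / math.sqrt(2.0), math.cos(a), math.sin(a)))
```

The front-left speaker gets the strongest feed, its two neighbours get half as much, and the opposite speaker gets essentially nothing.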

It is also possible to generate binaural signals from the Ambisonic data. This can be done in real time, tracking the movement of the listener's head to maintain the apparent position of the soundfield.

How is Ambisonics delivered?

Ambisonics requires more channels than stereo, so cannot be delivered on fundamentally 2-channel media such as LP, CD or cassette tape (though a matrixed version of first-order horizontal-only Ambisonics known as UHJ has been used effectively by the Nimbus recording company and some others). But these days it is easy to deliver files containing arbitrary numbers of channels, or to stream them from a server across the Internet. Because most people do not yet have systems which can process the Ambisonic signals into speaker signals (though this may be built in to virtual reality systems or games), some streaming sites can deliver stereo or binaural streams which reflect the position of an image being displayed.

Who invented ambisonics?

The name of Michael Gerzon, based in Oxford, is the first that comes to mind in the history of ambisonics. He formalised the mathematics as well as many of the practicalities of decoding the signals to play most effectively on any particular arrangement of speakers.

Michael Gerzon and Peter Craven (university friends of mine) together invented the soundfield microphone, with its arrangement of four capsules.

But the initial development of ambisonics arose in parallel in several places - so other names should be mentioned: Peter Fellgett of Reading University, and John Wright of IMF Electronics in particular. All those involved would also acknowledge the pioneering work done in stereo by Alan Blumlein of EMI around 1930.

There is a documentary video on the Oxford University Music Department's web site which talks about the early experiments of Michael Gerzon and friends, to which I contributed: watch it here. The diaries of Stephen Thornton, who documented that period, are also now hosted on that site.