Although most descriptions of WebRTC tout its rich media sharing capabilities, the discussion usually centers on video and data sharing. As it happens, WebRTC’s audio codec, Opus (at one time called “Harmony”), is among the most remarkable of its features. A free audio codec in development since 2007 by Octasic, Mozilla and the Xiph.Org Foundation in collaboration with Google, Microsoft’s Skype unit and Broadcom, Opus was standardized by the IETF in 2012 as RFC 6716. Moreover, after considering ten codecs, the IETF also reached “strong consensus” to adopt Opus as a mandatory-to-implement (MTI) codec for WebRTC.
Update: Since then there has been a push to adopt more legacy codecs to facilitate interworking. See http://tools.ietf.org/html/draft-marjou-rtcweb-audio-codecs-for-interop-00
In any case, WebRTC demands more than a run-of-the-mill codec that is locked into a specific bitrate and narrowly optimized for either speech or high quality music. Opus can handle everything from low bitrate voice to high bitrate music coding because it is an amalgam of two codecs: Skype’s SILK, which is optimized for low bitrate voice coding, and Xiph.org’s CELT (Constrained Energy Lapped Transform), a very low delay successor to Vorbis with low CPU and memory requirements that handles higher bitrates and thus higher quality audio.
As Wikipedia puts it: “Psychoacoustics is the scientific study of sound perception”. While there’s a lot of theoretical research on the topic, one of the main applications of psychoacoustics is lossy audio coding. One of the first codecs to make use of psychoacoustic tricks, long before MP3 was born, is G.711 (u-law/A-law). In general, lossy audio codecs attempt to reduce the bitrate by coding the audio signal with just enough accuracy to keep the distortion inaudible.
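To make the G.711 idea concrete, here is a minimal sketch of u-law style companding in Python. It assumes the continuous u-law curve with mu = 255 and a plain uniform 8-bit quantizer, which is a simplification: real G.711 uses a segmented, bit-exact approximation of this curve. The function names are made up for illustration. The point is only that a logarithmic curve spends more of the 8 bits on quiet samples, where quantization noise would otherwise be most noticeable.

```python
import math

MU = 255.0  # mu value of the u-law variant of G.711


def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto the logarithmic u-law curve (also [-1, 1])."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize_8bit(value: float) -> int:
    """Uniform quantizer: map [-1, 1] to an integer code in [-127, 127]."""
    return max(-127, min(127, round(value * 127)))


def dequantize_8bit(code: int) -> float:
    return code / 127


def companded_roundtrip(x: float) -> float:
    """Compress, quantize to 8 bits, then expand again."""
    return mulaw_expand(dequantize_8bit(quantize_8bit(mulaw_compress(x))))


def linear_roundtrip(x: float) -> float:
    """Quantize to 8 bits without any companding, for comparison."""
    return dequantize_8bit(quantize_8bit(x))


if __name__ == "__main__":
    for sample in (0.01, 0.1, 0.9):
        err_companded = abs(companded_roundtrip(sample) - sample)
        err_linear = abs(linear_roundtrip(sample) - sample)
        print(f"x={sample}: companded error={err_companded:.6f}, "
              f"linear error={err_linear:.6f}")
```

Running it shows the trade-off: quiet samples come back with far less error than with plain 8-bit quantization, while loud samples come back with somewhat more, which is acceptable because a loud signal tends to mask its own quantization noise.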
What you can get away with
There are many types of distortion that can be inflicted on an audio signal without causing too much audible degradation. Here are some examples.
The human ear is almost completely insensitive to the phase of signals. For example, we can’t distinguish between a waveform and its inverted version (the only reason loudspeakers have a red and a black connector is to avoid wiring them 180 degrees out of phase with each other and getting cancellation effects). As long as the phase distortion is constant (or nearly constant) in time and the variation in group delay across frequencies isn’t enough to cause temporal smearing, the phase can take a lot of …
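The phase point can be illustrated with a small Python sketch. It builds a test tone, an inverted copy and a phase-shifted copy, then compares their magnitude spectra using a naive DFT. The tone length, frequency and helper names are arbitrary choices for illustration. Note what this does and does not show: the waveforms differ sample by sample while their magnitude spectra match, which fits the claim above, but it is not by itself evidence of what a listener hears.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) DFT returning only the magnitude of each frequency bin."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, s in enumerate(samples)))
        for k in range(n)
    ]


def make_tone(n=64, cycles=4, phase=0.0):
    """A sine wave with a whole number of cycles in the analysis window."""
    return [math.sin(2 * math.pi * cycles * t / n + phase) for t in range(n)]


original = make_tone()
inverted = [-s for s in original]        # 180 degree phase inversion
shifted = make_tone(phase=math.pi / 3)   # constant 60 degree phase shift

mag_original = dft_magnitudes(original)
mag_inverted = dft_magnitudes(inverted)
mag_shifted = dft_magnitudes(shifted)

# Largest sample-by-sample difference between the waveforms themselves.
waveform_diff = max(abs(a - b) for a, b in zip(original, inverted))

# Largest per-bin difference between the magnitude spectra.
spectrum_diff_inverted = max(
    abs(a - b) for a, b in zip(mag_original, mag_inverted))
spectrum_diff_shifted = max(
    abs(a - b) for a, b in zip(mag_original, mag_shifted))

if __name__ == "__main__":
    print("waveform difference (inverted):", waveform_diff)
    print("spectrum difference (inverted):", spectrum_diff_inverted)
    print("spectrum difference (shifted): ", spectrum_diff_shifted)
```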