Audio/video editing systems are used by millions of people worldwide to work with all
types of digital audio and video content. These systems commonly represent audio tracks
with an audio waveform display, which is the focal point of user interaction when editing
audio.
One of the greatest challenges for users of these systems is navigation,
that is, finding one's way around the audio or video file being edited. Users try
to recognize and remember the shape of amplitude envelopes, which are the only visual
landmarks in the waveform display. Such mental gymnastics are taxing and usually
unsuccessful: users must repeatedly audition segments to orient themselves within the file.
The Comparisonics® patented invention for coloring the waveform display is a
dramatic and revolutionary enhancement.
(See examples in the Sound Gallery.)
The coloring communicates a wealth of information to the user, making the editing process
faster, easier, more accurate, and more enjoyable. Imagine editing music and being able to
see each note of a melody in the waveform display. Video editors are frequently tasked
with synchronizing audio with video; now they can see in the waveform display the onset of
a sound to be aligned with a particular video frame.
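The Comparisonics coloring method itself is patented, and its exact mapping is not described here. Purely as an illustration of the general idea of frequency-based waveform coloring, each short frame of audio can be assigned a hue from its spectral centroid, so that low-pitched and high-pitched sounds receive visibly different colors. The function name `frame_colors` and the centroid-to-hue mapping below are assumptions for this sketch, not the patented technique.

```python
import numpy as np

def frame_colors(samples, sr, frame_len=1024):
    """Assign each audio frame a hue in [0, 1] from its spectral centroid.

    Illustrative sketch only: the actual Comparisonics color mapping is
    proprietary. Here, low-frequency frames map toward 0, high toward 1.
    """
    hues = []
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, 1.0 / sr)
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        spectrum = np.abs(np.fft.rfft(frame * window))
        # Spectral centroid: amplitude-weighted mean frequency of the frame.
        centroid = (freqs * spectrum).sum() / max(spectrum.sum(), 1e-12)
        hues.append(min(centroid / (sr / 2), 1.0))
    return hues
```

A display layer would then paint each frame of the waveform with its hue, so a melody's successive notes appear as successive bands of color.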
While the colored waveform display facilitates visual navigation, sound matching enables
automatic navigation. Document editing systems (word processors) permit users to
search a text file for specific words. Audio/video editing systems have had no counterpart
for searching the content of audio and video files. However, using the Comparisonics
sound-matching technology, an editor can select any sound in the
waveform display and automatically locate similar sounds in the file. Editors can search
for occurrences of any sound, including speech, applause, sound effects, and musical
elements (e.g., notes, chords, drum beats, cymbal crashes).
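The matching technology is likewise proprietary, but the general shape of content-based audio search can be sketched: reduce the selected sound and each candidate position in the file to a compact spectral signature, then report positions whose signatures are sufficiently similar. The banded-spectrum feature, the cosine-similarity threshold, and the names `spectral_signature` and `find_similar` are all assumptions made for this illustration.

```python
import numpy as np

def spectral_signature(segment):
    """Coarse, unit-length magnitude-spectrum signature of a segment.

    Hypothetical feature: the real Comparisonics matching features are
    proprietary. Pooling into 16 bands tolerates small pitch differences.
    """
    spec = np.abs(np.fft.rfft(segment * np.hanning(len(segment))))
    bands = np.array([b.sum() for b in np.array_split(spec, 16)])
    return bands / max(np.linalg.norm(bands), 1e-12)

def find_similar(samples, query_start, query_len, hop=512, threshold=0.95):
    """Return offsets whose signature is close to the selected sound's."""
    query = spectral_signature(samples[query_start:query_start + query_len])
    matches = []
    for start in range(0, len(samples) - query_len + 1, hop):
        sig = spectral_signature(samples[start:start + query_len])
        if float(np.dot(query, sig)) >= threshold:  # cosine similarity
            matches.append(start)
    return matches
```

In an editor, the query would come from the user's selection in the waveform display, and the returned offsets would be highlighted for review.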
Like a word processor, an audio editing system can offer a "search-and-replace"
capability. That is, the system could locate the occurrences of a sound and apply a
user-specified operation to each occurrence. This operation might be the application of
an effect or filter, or simply a deletion. For example, a user might wish to find and
suppress breath sounds in a speech recording.
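Given the offsets produced by such a search, the hypothetical search-and-replace step is just a loop that applies a user-chosen operation to each occurrence, for example attenuating matched breath sounds by 20 dB or zeroing them out entirely. The names `replace_occurrences` and `suppress` below are illustrative, not part of any real product.

```python
import numpy as np

def replace_occurrences(samples, offsets, length, op):
    """Apply `op` to each matched segment (sketch of search-and-replace).

    `op` takes a segment and returns a processed segment of equal length,
    e.g. an attenuation, a filter, or silence (deletion by zeroing).
    """
    out = samples.copy()
    for start in offsets:
        out[start:start + length] = op(out[start:start + length])
    return out

# Example operation: 20 dB attenuation (gain 0.1) to suppress breath sounds.
suppress = lambda seg: seg * 0.1
```

The same loop accommodates any per-occurrence effect the editor exposes; only `op` changes.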
It is possible to search beyond the material being edited. An editor can select any
sound in the waveform display and automatically find similar sounds in audio/video
databases and on the Web.
This capability gives the editor unprecedented access to available sounds.