The Multi-Scale Short-Time Fourier Transform

Overview

This is the companion page to the paper "Enhancing the Quality of Audio Transformations Using The Multi-Scale Short-Time Fourier Transform". It contains various musical excerpts illustrating digital audio effects using the multi-scale short-time Fourier transform (MS-STFT), and comparing it with the standard short-time Fourier transform (STFT).

Background - The Time-Frequency Trade-off

Several audio effects can only achieve high quality by processing the audio signal in the frequency domain. To convert an audio signal to and from the frequency domain, the Short-Time Fourier Transform (STFT) is commonly used. However, frequency-domain processing is subject to the time-frequency trade-off: it is not possible to have arbitrarily good time and frequency resolutions at the same time. Time resolution can only be increased at the expense of frequency resolution, and vice versa. The STFT can be configured for various time-frequency resolutions, but the chosen resolution then applies throughout the whole processing and at all frequencies.

The use of a constant time-frequency resolution (with the STFT) introduces artifacts in the audio signal as soon as non-trivial audio effects are applied: high frequency resolution (such as with 4096-point DFTs) smears the transients (drums, attacks, etc.); high time resolution (such as with 1024-point DFTs) makes steady tones sound dirty; and the best trade-off (such as with 2048-point DFTs) usually introduces both problems.
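To make the trade-off concrete, the following sketch computes the frequency-bin spacing and the window duration for the three DFT sizes mentioned above. The 44100 Hz sample rate and the function name stft_resolution are assumptions made for illustration; this page does not state them.

```python
# Assumption: CD-quality sample rate. The page does not specify one.
SAMPLE_RATE_HZ = 44100


def stft_resolution(dft_size, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (bin spacing in Hz, window duration in ms) for one DFT size.

    A smaller bin spacing means finer frequency resolution;
    a shorter window means finer time resolution.
    """
    bin_spacing_hz = sample_rate_hz / dft_size
    window_duration_ms = 1000.0 * dft_size / sample_rate_hz
    return bin_spacing_hz, window_duration_ms


for dft_size in (1024, 2048, 4096):
    spacing_hz, duration_ms = stft_resolution(dft_size)
    print(f"{dft_size}-point DFT: {spacing_hz:.1f} Hz per bin, "
          f"{duration_ms:.1f} ms per window")
```

Under the assumed sample rate, going from 1024 to 4096 points narrows the bin spacing from about 43.1 Hz to about 10.8 Hz, but lengthens the analysis window from about 23.2 ms to about 92.9 ms. The product of the two quantities is constant, which is why one cannot be improved without degrading the other.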

The Multi-Scale Short-Time Fourier Transform (MS-STFT) aims at:

Paper and presentation

The paper is published by ACTA Press.
Here you can download the paper.

Implementations

The PitchTech collection of LADSPA and VST plugins features several effects implemented with either the STFT or the MS-STFT. These are the effects that have a "Quality" parameter: quality values below the middle value use the STFT, while quality values greater than or equal to the middle value use the MS-STFT.
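The selection rule described above can be sketched as follows. This is only an illustration of the stated rule, not the plugins' actual code: the function name, the parameter names, and the assumption that the "middle value" is the midpoint of the parameter range are all hypothetical.

```python
def transform_for_quality(quality, min_quality, max_quality):
    """Pick the transform for a given "Quality" setting.

    Assumption: the middle value is the midpoint of the parameter range.
    """
    middle = (min_quality + max_quality) / 2
    # Below the middle: STFT. At or above the middle: MS-STFT.
    return "MS-STFT" if quality >= middle else "STFT"


# Hypothetical parameter range of 0 to 10.
for quality in (0, 4, 5, 10):
    print(quality, transform_for_quality(quality, 0, 10))
```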

Related work

For the special case of audio time stretching, a more recent and faster technique is available.
