SoX数字音频处理的瑞士军刀

SoX is a cross-platform (Windows, Linux, MacOS X, etc.) command line utility that can convert various formats of computer audio files in to other formats. It can also apply various effects to these sound files, and, as an added bonus, SoX can play and record audio files on most platforms.

最近在翻译ffmpeg的文档,发现ffmpeg主要是用来操作多媒体文件的流,对于处理流的能力并不很强,主要依赖于滤镜组。容易发现ffmpeg的滤镜组主要提供的是相对简单和粗糙的滤镜功能,如果需要强大的图像和音频处理功能的情况,ffmpeg确实相对困难。这个时候就是SoX出力的时候了,正如前面的SoX项目自己的介绍,她可以完成多格式之间的转换,添加多种声音效果,还可以播放和录制音频文件。

当然,处理音频Audacity是很多人第一个想法,毕竟Audacity的功能更强大,还有良好的界面。但是,我不觉得SoX是用来替换Audacity的,因为Audacity更多的是用来做数字音频的,而不是用来批量化处理音频的。(虽然Audacity也有批量化功能,不过很多效果都没有加到那个里面)相比之下,SoX可以明确的写出需要的音频处理的效果,可以方便的重复使用,在目前的条件下是一个比较方便使用的项目。不过相信随着Audacity的发展,很有可能在未来可以逐渐替代SoX的功能。

对于SoX主要关心的是她的音频效果功能,因为文件格式转换,播放,录音功能在ffmpeg中已经更大程度上的得到实现了。SoX的效果列表如下:

Tone/filter effects [滤波器啦]

allpass: RBJ all-pass biquad IIR filter
bandpass: RBJ band-pass biquad IIR filter
bandreject: RBJ band-reject biquad IIR filter
band: SPKit resonator band-pass IIR filter
bass: Tone control: RBJ shelving biquad IIR filter
equalizer: RBJ peaking equalisation biquad IIR filter
firfit+: FFT convolution FIR filter using given freq. response (W.I.P.)
highpass: High-pass filter: Single pole or RBJ biquad IIR
hilbert: Hilbert transform filter (90 degrees phase shift)
lowpass: Low-pass filter: single pole or RBJ biquad IIR
sinc: Sinc-windowed low/high-pass/band-pass/reject FIR
treble: Tone control: RBJ shelving biquad IIR filter

Production effects [合成效果]

chorus: Make a single instrument sound like many
delay: Delay one or more channels
echo: Add an echo
echos: Add a sequence of echos
flanger: Stereo flanger
overdrive: Non-linear distortion
phaser: Phase shifter
repeat: Loop the audio a number of times
reverb: Add reverberation
reverse: Reverse the audio (to search for Satanic messages ;-)
tremolo: Sinusoidal volume modulation

Volume/level effects [音量效果]

compand: Signal level compression/expansion/limiting
contrast: Phase contrast volume enhancement
dcshift: Apply or remove DC offset
fade: Apply a fade-in and/or fade-out to the audio
gain: Apply gain or attenuation; normalise/equalise/balance/headroom
loudness: Gain control with ISO 226 loudness compensation
mcompand: Multi-band compression/expansion/limiting
norm: Normalise to 0dB (or other)
vol: Adjust audio volume

Editing effects [音频编辑效果,主要是添加、删除一类的功能]

pad: Pad (usually) the ends of the audio with silence
silence: Remove portions of silence from the audio
splice: Perform the equivalent of a cross-faded tape splice
trim: Cuts portions out of the audio
vad: Voice activity detector

Mixing effects [混音效果]

channels: Auto mix or duplicate to change number of channels
divide+: Divide sample values by those in the 1st channel (W.I.P.)
remix: Produce arbitrarily mixed output channels
swap: Swap stereo channels

Pitch/tempo effects [节奏和音高的功能,pitch 就是那个传说中调音高的算法了]

bend: Bend pitch at given times without changing tempo
pitch: Adjust pitch (= key) without changing tempo
speed: Adjust pitch & tempo together
stretch: Adjust tempo without changing pitch (simple alg.)
tempo: Adjust tempo without changing pitch (WSOLA alg.)

Mastering effects [这个不知道是啥意思,就下面两个效果,一个是添加抖动来提升信噪比的,一个效果采样率的。第一个效果不错]

dither: Add dither noise to increase quantisation SNR
rate: Change audio sampling rate

Specialised filters/mixers [比较奇怪的效果]

deemph: ISO 908 CD de-emphasis (shelving) IIR filter
earwax: Process CD audio to best effect for headphone use
noisered: Filter out noise from the audio
oops: Out Of Phase Stereo (or `Karaoke`) effect
riaa: RIAA vinyl playback equalisation

Analysis effects [用来显示统计信息的功能]

noiseprof: Produce a DFT profile of the audio (use with noisered)
spectrogram: graph signal level vs. frequency & time (needs `libpng`)
stat: Enumerate audio peak & RMS levels, approx. freq., etc.
stats: Multichannel aware `stat`

Miscellaneous effects [附加功能, 比较有意思]

ladspa: Apply LADSPA plug-in effects e.g. CMT (Computer Music Toolkit)
synth: Synthesise/modulate audio tones or noise signals
newfile: Create a new output file when an effects chain ends.
restart: Restart 1st effects chain when multiple chains exist.

Low-level signal processing effects [底层的处理功能,感觉和滤波器那组比较像啦]

biquad: 2nd-order IIR filter using externally provided coefficients
downsample: Reduce sample rate by discarding samples
fir: FFT convolution FIR filter using externally provided coefficients
upsample: Increase sample rate by zero stuffing
亲,给点评论吧!