SOX(1)
NAME
sox - SOund eXchange - universal sound sample translator
SYNOPSIS
sox infile outfile
sox infile outfile [ effect [ effect options ... ] ]
sox infile -e effect [ effect options ... ]
sox [ general options ] [ format options ] ifile
[ format options ] ofile [ effect [ effect options ... ] ]
General options: [ -h ] [ -V ] [ -v volume ]
Format options: [ -t filetype ] [ -r rate ] [ -s/-u/-U/-A ]
[ -b/-w/-l/-f/-d/-D ] [ -c channels ] [ -x ]
Effects:
copy
rate
avg [ -l | -r ]
resample [ rolloff [ beta ] ]
mask
stat
echo delay volume [ delay volume ... ]
vibro speed [ depth ]
lowp center
highp center
band [ -n ] center [ width ]
DESCRIPTION
Sox translates sound files from one format to another,
possibly doing a sound effect.
OPTIONS
The option syntax is a little grotty, but in essence:
sox file.au file.voc
translates a sound sample in SUN Sparc .AU format into a
SoundBlaster .VOC file, while
sox -v 0.5 file.au -r 12000 file.voc rate
does the same format translation but also lowers the
amplitude by 1/2 and changes the sampling rate from 8000
hertz to 12000 hertz via the rate sound effect loop.
File type options:
-t filetype
gives the type of the sound sample file.
-r rate Give sample rate in Hertz of file.
-s/-u/-U/-A
The sample data is signed linear (2's comple-
ment), unsigned linear, U-law (logarithmic), or
A-law (logarithmic). U-law and A-law are the
U.S. and international standards for logarithmic
telephone sound compression.
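The logarithmic compression behind the -U option can be sketched
as follows. This is only the continuous u-law curve with the usual
constant of 255; it is not the 8-bit codeword layout that telephone
hardware or sox itself uses, and the function names are made up for
illustration.

```python
import math

MU = 255.0  # companding constant of the U.S. (u-law) telephone standard


def ulaw_compress(x):
    """Map a linear sample in [-1.0, 1.0] onto the logarithmic u-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def ulaw_expand(y):
    """Inverse of ulaw_compress: recover the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
```

Quiet samples receive a disproportionate share of the output range,
which is why 8 bits of u-law data sound better than 8 bits of
linear data.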
-b/-w/-l/-f/-d/-D
The sample data is in bytes, 16-bit words,
32-bit longwords, 32-bit floats, 64-bit double
floats, or 80-bit IEEE floats. Floats and dou-
ble floats are in native machine format.
-x The sample data is in XINU format; that is, it
comes from a machine with the opposite word
order than yours and must be swapped according
to the word-size given above. Only 16-bit and
32-bit integer data may be swapped. Machine-
format floating-point data is not portable.
IEEE floats are a fixed, portable format.
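The swapping that -x asks for can be sketched as below. The helper
name swap_words is hypothetical and the code is an illustration of
the idea, not the routine sox uses.

```python
def swap_words(data, word_size):
    """Reverse the byte order of each word_size-byte integer in data."""
    if word_size not in (2, 4):
        # Only 16-bit and 32-bit integer data may be swapped.
        raise ValueError("word_size must be 2 or 4")
    if len(data) % word_size:
        raise ValueError("data length is not a multiple of word_size")
    out = bytearray()
    for i in range(0, len(data), word_size):
        out += data[i:i + word_size][::-1]  # reverse one word
    return bytes(out)
```

For example, swap_words(b"\x01\x02\x03\x04", 2) yields
b"\x02\x01\x04\x03".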
-c channels
The number of sound channels in the data file.
This may be 1, 2, or 4; for mono, stereo, or
quad sound data.
General options:
-e after the input file allows you to avoid giving
an output file and just name an effect. This is
only useful with the stat effect.
-v volume Change amplitude (floating point); less than 1.0
decreases, greater than 1.0 increases. Note: we
perceive volume logarithmically, not linearly.
Note: see the stat effect.
-h Print version number and usage information.
-V Print a description of processing phases. Use-
ful for figuring out exactly how sox is mangling
your sound samples.
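To relate a -v argument to the logarithmic scale mentioned above,
the standard amplitude-to-decibel formula can be used. This helper
is not part of sox; it is a small sketch for reasoning about volume
arguments.

```python
import math


def gain_to_db(volume):
    """Convert a linear amplitude factor (as given to -v) to decibels."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return 20.0 * math.log10(volume)
```

A volume of 0.5 is a drop of about 6 dB, and 2.0 is a rise of
about 6 dB.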
The input and output files may be standard input and out-
put. This is specified by '-'. The -t type option must
be given in this case, else sox will not know the format
of the given file. The -t, -r, -s/-u/-U/-A,
-b/-w/-l/-f/-d/-D and -x options refer to the input data
when given before the input file name. After, they refer
to the output data.
If you don't give an output file name, sox will just read
the input file. This is useful for validating structured
file formats; the stat effect may also be used via the -e
option.
FILE TYPES
Sox needs to know the formats of the input and output
files. File formats which have headers are checked; if
the header doesn't seem right, the program exits with an
appropriate message. Currently supported formats are raw
(no header) binary and textual data, IRCAM Sound Files,
Sound Blaster .VOC, SPARC .AU (w/header), Mac HCOM, PC/DOS
.SOU, Sndtool, Sounder, NeXT .SND, Windows 3.0 RIFF/WAV,
Turtle Beach .SMP, CD-R, Apple/SGI AIFF, and 8SVX.
.aiff AIFF files used on Apple IIc/IIgs and SGI.
Note: the AIFF handler supports only one SSND
chunk. It does not support multiple sound
chunks, or the 8SVX musical instrument descrip-
tion format. AIFF files are multimedia archives
and can have multiple audio and picture
chunks. You may need a separate archiver to
work with them.
.au SUN Microsystems AU files. There are apparently
many types of .au files; DEC has invented its
own with a different magic number and word
order. The .au handler can read these files but
will not write them. Some .au files have valid
AU headers and some do not. The latter are
probably original SUN u-law 8000 hz samples.
These can be dealt with using the .ul format
(see below).
.hcom Macintosh HCOM files. These are (apparently)
Mac FSSD files with some variant of Huffman com-
pression. The Macintosh has wacky file formats
and this format handler apparently doesn't han-
dle all the ones it should. Mac users will need
their usual arsenal of file converters to deal
with an HCOM file under Unix or DOS.
.raw Raw files (no header).
The sample rate, size (byte, word, etc), and
style (signed, unsigned, etc.) of the sample
file must be given. The number of channels
defaults to 1.
.ub, .sb, .uw, .sw, .ul
These are several suffixes which serve as a
shorthand for raw files with a given size and
style. Thus, ub, sb, uw, sw, and ul correspond
to "unsigned byte", "signed byte", "unsigned
word", "signed word", and "ulaw" (byte). The
sample rate defaults to 8000 hz if not explic-
itly set, and the number of channels (as always)
defaults to 1. There are lots of Sparc samples
floating around in u-law format with no header
and fixed at a sample rate of 8000 hz. (Certain
sound management software cheerfully ignores the
headers.) Similarly, most Mac sound files are
in unsigned byte format with a sample rate of
11025 or 22050 hz.
.sf IRCAM Sound Files.
SoundFiles are used by academic music software
such as the CSound package, and the MixView
sound sample editor.
.voc Sound Blaster VOC files.
VOC files are multi-part and contain silence
parts, looping, and different sample rates for
different chunks. On input, the silence parts
are filled out, loops are rejected, and sample
data with a new sample rate is rejected.
Silence with a different sample rate is gener-
ated appropriately. On output, silence is not
detected, nor are impossible sample rates.
.auto This is a ``meta-type'': specifying this type
for an input file triggers some code that tries
to guess the real type by looking for magic
words in the header. If the type can't be
guessed, the program exits with an error mes-
sage. The input must be a plain file, not a
pipe. This type can't be used for output files.
.cdr CD-R
CD-R files are used in mastering music Compact
Disks. The file format is, as you might expect,
raw stereo unsigned samples at 44.1 kHz. But,
there's some blocking/padding oddity in the for-
mat, so it needs its own handler.
.dat Text Data files
These files contain a textual representation of
the sample data. There is one line at the
beginning that contains the sample rate. Subse-
quent lines contain two numeric data items: the
time since the beginning of the sample and the
sample value. Values are normalized so that the
maximum and minimum are 1.00 and -1.00. This
file format can be used to create data files for
external programs such as FFT analyzers or graph
routines. SOX can also convert a file in this
format back into one of the other file formats.
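The layout described for .dat files can be sketched as below. The
exact text of the first line, the number of decimal places, and the
full_scale parameter are assumptions made for illustration; the
description above only says that the first line carries the sample
rate and that each later line holds a time and a normalized value.

```python
def to_dat_lines(samples, rate, full_scale):
    """Render integer samples as text lines: a rate line, then time/value pairs."""
    lines = ["; Sample Rate %d" % rate]  # assumed header layout
    for n, s in enumerate(samples):
        t = n / rate           # seconds since the start of the sample
        v = s / full_scale     # normalized to the range -1.00 .. 1.00
        lines.append("%.6f %.6f" % (t, v))
    return lines
```

With 16-bit data, full_scale would be 32768, so a sample of 16384
is written as 0.500000.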
.smp Turtle Beach SampleVision files.
SMP files are for use with the PC-DOS package
SampleVision by Turtle Beach Softworks. This
package is for communication to several MIDI
samplers. All sample rates are supported by the
package, although not all are supported by the
samplers themselves. Currently loop points are
ignored.
.wav Windows 3.0 .WAV RIFF files.
These appear to be very similar to IFF files,
but not the same. They are the native sound
file format of Windows 3.0. (Obviously, Windows
3.0 was of such incredible importance to the
computer industry that it just had to have its
own sound file format.) Normally .wav files
have all formatting information in their head-
ers, and so do not need any format options spec-
ified for an input file. If any are, they will
override the file header, and you will be warned
to this effect. You had better know what you
are doing! Output format options will cause a
format conversion, and the .wav will be written
appropriately. Note that it is possible to
write data of a type that cannot be specified by
the .wav header, and you will be warned that you
are writing a bad file!
.maud An Amiga format
An IFF-conformant sound file type, registered by MS
MacroSystem Computer GmbH, published along with
the "Toccata" sound-card on the Amiga. Allows
8bit linear, 16bit linear, A-Law, u-law in mono
and stereo.
.vwe Psion 8-bit alaw
These are 8-bit a-law 8khz sound files used on
the Psion palmtop portable computer.
EFFECTS
Only one effect from the palette may be applied to a sound
sample. To do multiple effects you'll need to run sox in
a pipeline.
copy Copy the input file to the
output file. This is the
default effect if both files
have the same sampling rate,
or the rates are "close".
rate Translate input sampling
rate to output sampling rate
via linear interpolation to
the Least Common Multiple of
the two sampling rates.
This is the default effect
if the two files have
different sampling rates.
This is fast but noisy: the
spectrum of the original
sound will be shifted
upwards and duplicated
faintly when up-translating
by a multiple. Lerp-ing is
acceptable for cheap 8-bit
sound hardware, but for CD-
quality sound you should
instead use:
resample [ rolloff [ beta ] ] Translate input sampling
rate to output sampling rate
via simulated analog filtra-
tion. This method is slow
and uses lots of RAM, but
gives much better results
than rate.
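The linear interpolation ("lerp-ing") behind the rate effect can be
sketched as below. This is a simplified version that interpolates
directly at each output position; it does not step through the
Least Common Multiple of the two rates as described above, and the
function name is hypothetical.

```python
def lerp_resample(samples, in_rate, out_rate):
    """Resample by drawing straight lines between neighbouring input samples."""
    if not samples:
        return []
    n_out = len(samples) * out_rate // in_rate
    out = []
    for k in range(n_out):
        pos = k * in_rate / out_rate   # output position in input-sample units
        i = int(pos)                   # input sample at or before pos
        frac = pos - i                 # distance past that sample, 0 <= frac < 1
        nxt = samples[min(i + 1, len(samples) - 1)]  # hold last sample at the end
        out.append(samples[i] + frac * (nxt - samples[i]))
    return out
```

Going from 8000 to 12000 hertz turns every 2 input samples into 3
output samples. Because nothing is filtered, the artifacts noted
above remain.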
mask Add "masking noise" to sig-
nal. This effect deliber-
ately adds white noise to a
sound in order to mask quan-
tization effects, created by
the process of playing a
sound digitally. It tends
to mask buzzing voices, for
example. It adds 1/2 bit of
noise to the sound file at
the output bit depth.
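The idea behind mask can be sketched as adding a little random
noise before rounding to the output step size. The noise shape
(uniform, spanning half a step either way) is an assumption; the
description above only says that 1/2 bit of noise is added.

```python
import math
import random


def mask(samples, rng=random):
    """Add up to half a step of noise, then round to whole output steps.

    samples are floats measured in units of one output quantization step.
    """
    out = []
    for s in samples:
        noisy = s + rng.uniform(-0.5, 0.5)     # assumed noise distribution
        out.append(int(math.floor(noisy + 0.5)))  # round to nearest step
    return out
```

A constant input of 10.4 steps then comes out as a mix of 10s and
11s whose average is close to 10.4, instead of a flat run of 10s.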
avg [ -l | -r ] Reduce the number of chan-
nels by averaging the sam-
ples, or duplicate channels
to increase the number of
channels. Valid combina-
tions are 1 - 2, 1 - 4, 2 -
4, 4 - 2, 4 - 1, 2 - 1. The
-l or -r option averages
from just left or right
channels/duplicates to just
the left or right channels.
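The two directions of avg, for the stereo and mono case only, can
be sketched as below. The function names are hypothetical and the
-l and -r variants are not shown.

```python
def stereo_to_mono(frames):
    """2 - 1: average each (left, right) pair into one sample."""
    return [(left + right) / 2 for left, right in frames]


def mono_to_stereo(samples):
    """1 - 2: duplicate each sample into a (left, right) pair."""
    return [(s, s) for s in samples]
```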
stat Do a statistical check on
the input file, and print
results on the standard
error file. stat may copy
the file untouched from
input to output, if you
select an output file. The
"Volume Adjustment:" field
in the statistics gives you
the argument to the -v
option which will make the
sample as loud as possible.
echo [ delay volume ... ] Add echoing to a sound sam-
ple. Each delay/volume pair
gives the delay in seconds
and the volume (relative to
1.0) of that echo. If the
volumes add up to more than
1.0, the sound will melt
down instead of fading away.
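Why volumes above 1.0 melt down can be seen in a small sketch. It
assumes the echoes are fed back from the output, which is what the
warning above suggests; the real effect may be structured
differently, and the function signature is made up for illustration.

```python
def echo(samples, rate, taps):
    """Add echoes; taps is a list of (delay_seconds, volume) pairs."""
    offsets = [(max(1, round(delay * rate)), vol) for delay, vol in taps]
    out = []
    for n, x in enumerate(samples):
        y = x
        for off, vol in offsets:
            if n >= off:
                y += vol * out[n - off]  # feed back an earlier output sample
        out.append(y)
    return out
```

A single click echoed at volume 0.5 comes back as 0.5, 0.25, 0.125
and so on. At volume 1.5 each repeat is louder than the last.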
vibro speed [ depth ] Add the world-famous Fender
Vibro-Champ sound effect to
a sound sample by using a
sine wave as the volume
knob. Speed gives the Hertz
value of the wave. This
must be under 30. Depth
gives the amount the volume
is cut into by the sine
wave, ranging 0.0 to 1.0 and
defaulting to 0.5.
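Using a sine wave as the volume knob can be sketched as below. The
exact gain curve is an assumption: here the sine swings the gain
between 1.0 and 1.0 - depth.

```python
import math


def vibro(samples, rate, speed, depth=0.5):
    """Modulate the volume with a sine wave of `speed` hertz."""
    if not 0 < speed < 30:
        raise ValueError("speed must be under 30 hertz")
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    out = []
    for n, x in enumerate(samples):
        lfo = 0.5 + 0.5 * math.sin(2 * math.pi * speed * n / rate)  # 0..1
        out.append(x * (1.0 - depth * lfo))  # cut into the volume by depth
    return out
```

With depth 0.0 the sound passes through unchanged; with depth 1.0
the volume dips all the way to silence once per cycle.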
lowp center Apply a low-pass filter.
The frequency response drops
logarithmically with center
frequency in the middle of
the drop. The slope of the
filter is quite gentle.
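A filter with a gentle slope of this kind can be sketched as a
one-pole low-pass. This is a generic textbook filter chosen for
illustration, not necessarily the design sox uses, and the
coefficient formula is an assumption.

```python
import math


def lowp(samples, rate, center):
    """One-pole low-pass: each output moves part of the way toward the input."""
    a = 1.0 - math.exp(-2.0 * math.pi * center / rate)  # smoothing factor
    y = 0.0
    out = []
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out
```

A lower center frequency gives a smaller smoothing factor, so the
output follows fast changes in the input more sluggishly.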
highp center Apply a high-pass filter.
The frequency response drops
logarithmically with center
frequency in the middle of
the drop. The slope of the
filter is quite gentle.
band [ -n ] center [ width ] Apply a band-pass filter.
The frequency response drops
logarithmically around the
center frequency. The width
gives the slope of the drop.
The frequencies at center +
width and center - width
will be half of their origi-
nal amplitudes. Band
defaults to a mode oriented
to pitched signals, i.e.
voice, singing, or instru-
mental music. The -n (for
noise) option uses the
alternate mode for un-
pitched signals. Band
introduces noise in the
shape of the filter, i.e.
peaking at the center fre-
quency and settling around
it.
cut loopnumber Extract loop #N from a sam-
ple.
map Display a list of loops in a
sample, and miscellaneous
loop info.
pick Select the left or right
channel of a stereo sample,
or one of four channels in a
quadrophonic sample.
split Turn a mono sample into a
stereo sample by copying the
input channel to the left
and right channels.
Sox enforces certain effects. If the two files have dif-
ferent sampling rates, the requested effect must be one of
copy or rate. If the two files have different numbers of
channels, the avg effect must be requested.
reverse Reverse the sound sample
completely. Included for
finding Satanic subliminals.
BUGS
The syntax is horrific. It's very tempting to include a
default system that allows an effect name as the program
name and just pipes a sound sample from standard input to
standard output, but the problem of inputting the sample
rates makes this unworkable.
FILES
SEE ALSO
NOTICES
The echoplex effect is: Copyright (C) 1989 by Jef
Poskanzer.
Permission to use, copy, modify, and distribute this soft-
ware and its documentation for any purpose and without fee
is hereby granted, provided that the above copyright
notice appear in all copies and that both that copyright
notice and this permission notice appear in supporting
documentation. This software is provided "as is" without
express or implied warranty.