
Music File Processing on Linux

Music (or audio) files are available in a bewildering array of formats other than the well-known MP3 format, and as bandwidths and storage capacity keep increasing the lossless compression formats will come increasingly into use, especially by those concerned with maintaining the integrity of the original analog or digital recordings for as long as possible. This is a guide - designed to be as brief yet complete as possible - to converting amongst these formats via the command line on a Linux platform.

We'll now briefly cover the gist of the philosophical issues. If you are also interested in maintaining the integrity of the original recordings as they pass through successive generations and hands, then always keep and make available the highest quality version of each recording you obtain. That is, if you obtain something in CD, i.e. WAV, format, then make it available either in that format or in one of the lossless compression formats, e.g. SHN, APE, FLAC, etc. If you obtain something in a lossless compression format, then keep a copy in that format to make available or to make further copies. That is, if you make MP3 copies of, say, SHN audio files, keep the original SHN files around.

The philosophical issues that verge on religious issues concerning the sound quality of various MP3 bit rates can be pragmatically resolved only on an individual basis. Why? Because hearing capabilities vary widely from person to person. An instructive and fun exercise is to make CDs from MP3s encoded at several different bit rates, e.g. 96, 128, 192 and 256 kbps, and listen to them in a blind listening test. This will tell you something about your hearing capabilities, e.g. if you can't tell the difference between, say, 128 and 192 then you can get a whole lot more music on your portable listening device without sacrificing discernible sound quality.

A final point to make: Don't confuse the quality of the recording with the quality of the performance. There are thousands of marvelous performances by masterful musicians that - for one reason or another - were recorded with less than the crystal clarity of just about any of the latest releases. For instance, the Louis Armstrong Hot Five recordings of the 1920s and 1930s were recorded on the equipment available at the time, and myself and countless other fan(atics) would give our (or at least someone else's) left cojone for those to have been recorded on the technology of even the 1950s. But it ain't so, and will remain so until one of us finishes a time machine. That doesn't make Armstrong's performances any less incredible than they were; it just makes the medium on which they've been transmitted to us less than completely satisfactory. Many crystal-clear recordings (as well as many murky recordings, to be fair) might as well be of a toilet flushing for all the pleasure they give, and many murky recordings (as well as crystal-clear recordings) are of stunningly good performances. Don't make the mistake of judging a performance solely on the quality of the recording.


Specific Tasks


Converting an MP2 File into a WAV File

Use lame, e.g.

lame --decode file.mp2

or

mlame -o "--decode" *.mp2


Converting the Sample Rate from 48000 to 44100

Use sox, e.g.

sox file-48000.wav -r 44100 file-44100.wav


Creating a Music CD from MP3 Files

The goal is to create a music CD from MP3 files. By music CD we mean one that will play in an "old-fashioned" CD player, i.e. one that won't play MP3 files burned onto a CD-ROM as data files. There are philosophical, i.e. religious, issues surrounding this sort of conversion, and you can find reams of debate about them, by and about angels and pinheads, via any search engine. We are not concerned with that here (which, by the way, is not to imply that we are not concerned about the topic at all). Here we are concerned merely with getting from point A to point B with a minimum of hair loss and wall damage.

Software Needed

The software needed for this task is lame, shntool, normalize and cdrecord.

Convert the MP3 Files Into WAV Files

That being accomplished, we'll start processing the MP3 files. First, we must convert the MP3 files into WAV files via:

lame --decode -v musicfile.mp3

which will by default create a file named musicfile.mp3.wav, which should be much bigger than the original file musicfile.mp3. You can also do this in batch mode via mlame, e.g.

mlame -o "--decode" *.mp3
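
If mlame isn't installed, a plain shell loop does much the same job. This is only a sketch: it relies on lame accepting an explicit output name as its second argument (lame --decode in.mp3 out.wav), and the function name decode_all and the LAME variable are made up here, the latter so the loop can be tried with echo before running it for real.

```shell
# Decode each MP3 named on the command line to a WAV next to it,
# e.g. musicfile.mp3 -> musicfile.wav.
# LAME is a hypothetical override: LAME=echo gives a dry run that only
# prints the arguments lame would receive.
LAME=${LAME:-lame}

decode_all() {
    for f in "$@"; do
        "$LAME" --decode "$f" "${f%.mp3}.wav"
    done
}
```

Called as decode_all *.mp3, this names the output musicfile.wav rather than musicfile.mp3.wav, which saves a rename later.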

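Process MP3 files into WAV files until you have enough to fill a blank CD. Audio capacity is best counted in playing time, 74 or 80 minutes depending on the blank; because audio sectors hold more bytes than data sectors, that works out to roughly 750 or 800 megabytes of WAV data rather than the 650 or 700 megabytes printed on the label.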
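
When planning a disc, playing time is a handier yardstick than file size. Here is a minimal sketch, assuming CD-format WAV files (44.1 kHz, 16-bit, stereo, i.e. 176400 bytes per second) with a plain 44-byte header; the function name wav_seconds is made up for illustration.

```shell
# Rough total playing time, in whole seconds, of CD-format WAV files,
# derived from file size alone.
# Assumes 176400 bytes of audio per second and a 44-byte header; files
# carrying extra header chunks will be slightly overestimated.
wav_seconds() {
    total=0
    for f in "$@"; do
        [ -f "$f" ] || continue
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -gt 44 ]; then
            total=$((total + (size - 44) / 176400))
        fi
    done
    echo "$total"
}
```

Running wav_seconds *.wav and comparing the result against 4440 (a 74-minute blank) or 4800 (an 80-minute blank) tells you whether the batch will fit.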

Tweak the WAV Files

Next, we must fix the boundaries of the WAV files with shntool, e.g.

shntool fix -noskip *.wav

The standard for music CDs requires that track breaks occur at multiples of 2352 bytes. When lame converts an MP3 file to a WAV file, one of the things it doesn't do is ensure that the length of the resulting WAV file is a multiple of 2352. Thus, we must use shntool to do it.
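
The arithmetic behind that boundary rule is simple enough to sketch. The helper name sector_padding is hypothetical, and it works on the length of the audio data only, i.e. the file size minus the WAV header; it illustrates the rule and makes no claim about what shntool does internally.

```shell
# Bytes of padding needed to bring an audio data length up to the next
# multiple of 2352 (one CD audio sector). Prints 0 when the length is
# already aligned.
sector_padding() {
    data_bytes=$1
    remainder=$((data_bytes % 2352))
    if [ "$remainder" -eq 0 ]; then
        echo 0
    else
        echo $((2352 - remainder))
    fi
}
```

A non-zero result means the track would end partway through a sector, which is the situation the fix step is there to repair.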

Next, we might want to normalize all of the boundary-corrected WAV files, that is, check the sound level of each and adjust them all so the average sound level is the same in each. This is done via normalize, e.g.

normalize -m *.wav

This step is usually not needed, although may be required for some live recordings wherein the sound recording levels vary drastically amongst the tracks. If you're not sure, run a test. If the indicated corrections are no more than a few percent for all tracks, then you can either skip this step or go ahead and burn the tracks containing what will be unnoticeable level corrections. If you're getting corrections in the tens of percents, then you probably need to do this to prevent future episodes of constantly having to twiddle with the volume level when playing the CD.

Write the Finished WAV Files to a CD

Finally, we can write them to disk using cdrecord, e.g.

cdrecord -v speed=10 dev=0,3,0 -audio *.wav

The number after dev can be found by running:

cdrecord -scanbus

This will give you a list resembling:

scsibus2:
        2,0,0   200) 'HP      ' 'CD-Writer+ 9300 ' '1.0c' Removable CD-ROM
        2,1,0   201) *
        2,2,0   202) *
        2,3,0   203) *
        ...

with the number you need being the comma-separated triplet preceding the name of the CD writer you are using, in this case 2,0,0. If you have multiple burners on your system you'll need to pick out the one you want for a given burn.
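
If you'd rather not pick the triplet out by eye, the listing can be filtered. This is a minimal sketch, assuming the output looks like the sample above (a triplet, an index, then either a quoted vendor string or a lone asterisk for an empty slot); the function name list_devices is made up.

```shell
# Read `cdrecord -scanbus` output on stdin and print the dev= triplet of
# every slot that has a device attached (empty slots show a lone "*").
list_devices() {
    awk '$1 ~ /^[0-9]+,[0-9]+,[0-9]+$/ && $3 != "*" { print $1 }'
}
```

Typical use would be cdrecord -scanbus | list_devices. Note that it prints every attached device on the bus, not just burners, so with several devices you still have to choose.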

The maximum value of the speed number will be based on the capability of your burner, i.e. if it's rated 10X then you can specify a maximum of speed=10. You can also set this to lower numbers, which will cut down on the number of write errors, although probably not at any noticeable level, i.e. to the point where you'll hear any difference. You can also set this higher than the rated capacity of your hardware, and it may go a bit faster than the actual rating. If speed is set higher than the hardware can support, it will default to the highest rate that cdrecord can detect.

Another very highly recommended option - especially if you're trying to do this on a busy computer - is driveropts=burnfree. This will prevent buffer underruns on drives that support Buffer Underrun Free technology, i.e. it will keep the buffer from emptying and stopping your CD write in mid-burn. Use this as a default and this won't happen to you, i.e.

cdrecord -v speed=10 dev=0,3,0 driveropts=burnfree -audio *.wav

A less necessary option is -overburn. This allows the burner to attempt to write more than the official size of the medium you're using. Most blank media hold a bit more than the officially listed size, usually around 90 seconds extra, although whether you can use it depends on whether your drive supports overburning. If the needed extra space is greater than the available extra space, the last track or WAV file will get truncated.


Creating MP3 Files from Music CDs

Software Needed

The software needed for this task is cdparanoia and lame.

Ripping the Tracks

The first step in creating MP3 files from music CDs is to extract the 0s and 1s from the CD with cdparanoia. The quick and dirty way is to put a CD into your drive and enter the command:

cdparanoia -B

This will create a separate WAV file for each track on the CD, e.g.

track01.cdda.wav
track02.cdda.wav
...
track16.cdda.wav

You can choose to rip individual tracks as well. For instance, if you want only the second track you specify:

cdparanoia -B 2

If you want a range of tracks, for instance the 2nd through the 5th, you would specify:

cdparanoia -B "2-5"

If your argument is going to be more than a single digit, e.g. 2, it's a good idea to place it within quotations as is shown. You can also rip spans within tracks, with the syntax explained in the man page for cdparanoia, accessible via the command man cdparanoia.

Converting the Tracks into MP3s

Once you have all the WAV files, you can convert them into MP3 files one at a time using lame. For example, the command:

lame track01.cdda.wav

will by default convert WAV file track01.cdda.wav into MP3 file track01.cdda.wav.mp3.

The batch command mlame can be used to process multiple files with one command, e.g.

mlame *.wav

will process all the WAV files in your current directory. For example, it will by default convert:

track01.cdda.wav
track02.cdda.wav
track03.cdda.wav

into:

track01.cdda.mp3
track02.cdda.mp3
track03.cdda.mp3

The default behavior for mlame is to convert the WAV files into 44.1 kHz, variable bit rate MP3 files. The -l option flag will cause it to instead create 44.1 kHz, 128 kbps MP3 files. The mlame script default options can be overridden via the -o option flag. For example, if you want to create 44.1 kHz, 192 kbps MP3 files you would specify:

mlame -o "-b 192" *.wav

There are many, many further options for lame that can be perused via man lame.
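
For those without mlame, the same batch encoding can be approximated with a short loop around lame itself. This is a sketch only: encode_all and the LAME override variable are names invented here, and the only lame option used is -b, as in the example above.

```shell
# Encode each WAV to a constant-bit-rate MP3 alongside it,
# e.g. track01.cdda.wav -> track01.cdda.mp3.
# The first argument is the bit rate in kbps; the rest are WAV files.
# LAME is a hypothetical override: LAME=echo prints the arguments only.
LAME=${LAME:-lame}

encode_all() {
    kbps=$1
    shift
    for f in "$@"; do
        "$LAME" -b "$kbps" "$f" "${f%.wav}.mp3"
    done
}
```

For instance, encode_all 192 *.wav would aim for the same result as the mlame -o "-b 192" command.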


Splitting/Concatenating MP3 Files

Software Needed

The software needed for this task is mp3splt and mp3wrap.

Occasionally one wants to either concatenate several individual MP3 tracks or files into a single, larger file, or to split such a larger file into the individual tracks contained therein. These tasks can be accomplished with mp3splt and mp3wrap.

Concatenating Single Files

Entering mp3wrap without any arguments gives a quick summary of the available options, of which there are only a few.

USAGE
        mp3wrap [options] OUTPUTFILE MP3FILE1 MP3FILE2 [MP3FILE3]...
OPTIONS
        -a  Add the specified files to an existing wrap file
        -l  List files wrapped in OUTPUTFILE. (-lv for complete infos)
        -v  Verbose mode. Will display additional informations.

So to concatenate several files, say:

track01.mp3
track02.mp3
...
track07.mp3

you would use the command:

mp3wrap album.mp3 track*.mp3

which will create a file named album_MP3WRAP.mp3 containing all 7 of the desired tracks. The software creators recommend that you do not remove the string MP3WRAP from the name of the concatenated file since it provides mp3splt with useful information should you want to recover the original 7 files.

Splitting Concatenated Files

Splitting concatenated files can be a more complex task than concatenating them. This is illustrated by the veritable plethora of options one gets - as opposed to those for mp3wrap - when entering mp3splt without any arguments.

USAGE (Please read man page for complete documentation)
      mp3splt [OPTIONS] FILE... [BEGIN_TIME] [END_TIME...]
      TIME FORMAT: min.sec[.0-99], even if minutes are over 59.
OPTIONS
 -w   Splits wrapped files created with Mp3Wrap or AlbumWrap.
 -l   Lists the tracks from file without extraction. (Only for wrapped mp3)
 -e   Error mode: split mp3 with sync error detection. (For concatenated mp3)
 -f   Frame mode (mp3 only): process all frames. For higher precision and VBR.
 -c + file.cddb, file.cue or "query". Get splitpoints and filenames from a
      .cddb or .cue file or from Internet ("query"). Use -a to auto-adjust.
 -t + TIME: to split files every fixed time len. (TIME format same as above).
 -s + PARAMETERS (th,nt,off,min,rm or auto). Scan the file to find silence
      points and splits all or a user number (nt parameter) tracks.
 -a + PARAMETERS (th,gap,off or auto) try to adjust splitpoints with silence.
 -o + FORMAT: output filename pattern. Can contain those variables:
      @a: artist, @p: performer (only CUE), @b: album, @t: title, @n: number
 -d + DIRNAME: to put all output files in the directory DIRNAME.
 -k   Consider input not seekable (slower). Default when input is STDIN (-).
 -n   No Tag: does not write ID3v1 or vorbis comment. If you need clean files.
 -q   Quiet mode: do not prompt for anything and print less messages.

The easiest example of the use of mp3splt is to simply undo what was done in the previous mp3wrap example, i.e.

mp3splt -w album_MP3WRAP.mp3

which will recover all of the original MP3 files used to create album_MP3WRAP.mp3.
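
Splitting by time needs split points in the min.sec notation from the usage text, with minutes allowed to run past 59. A small helper can convert plain seconds; splt_time is a made-up name, and the sketch assumes mp3splt reads zero-padded seconds (3.05) the same as unpadded ones.

```shell
# Convert a whole number of seconds to mp3splt's min.sec notation.
# Minutes are deliberately not wrapped at 59, matching the usage text.
splt_time() {
    secs=$1
    printf '%d.%02d\n' $((secs / 60)) $((secs % 60))
}
```

Something like mp3splt album.mp3 "$(splt_time 0)" "$(splt_time 215)" should then cut out the first 3 minutes and 35 seconds; check the man page for the exact behavior of your version.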

Many further examples of the more complex use of this utility can be found by entering man mp3splt and looking through the man page. I'll extract a few of those examples sometime in the next year or so.


Converting Amongst Lossless Formats

The shntool is nicely configured to convert amongst the lossless audio formats, e.g. WAV, FLAC, SHN, APE, etc. The general format is:

shntool conv -o [fmtout] *.[fmtin]

where [fmtout] is the desired output audio file format (amongst the available wav, aiff, shn, flac, ape, etc.) and [fmtin] is the audio format of the input files. For example, if you want to convert a bunch of SHN files (*.shn) into WAV files, use:

shntool conv -o wav *.shn

Since WAV is the default output format for conversions, you can omit the -o wav piece when converting to WAV files.


Creating/Modifying Metadata within Sound Files

Software Needed


Converting Real Audio Streams to MP3

Software Needed

Download the Audio Stream into a File

The easiest way to convert Real Audio streams to MP3 format is via MPlayer, the incredibly wonderfully marvelous multimedia player for Linux. If you install it correctly, i.e. with the correct Real Audio libraries, the process is as simple as:

mplayer -playlist file.ram -ao pcm -aofile file.wav -vc dummy -vo null

where file.ram is the file containing the actual URL of the Real Audio file. An example URL from the always interesting New Sounds radio show is:

pnm://66.150.15.101:7070/realimpact/wnyc/rans/newsounds2073.ra?cloakport=80,554,7070

You can usually obtain the files containing the URLs by shift clicking on the desired Real Audio stream file rather than just clicking on it. This will allow you to save the URL to a file that will replace the file.ram placeholder file in the mplayer command. It's a good idea to replace file in the string file.wav with the same thing you used to replace it in file.ram. And by the way, there's nothing sacred about the ram suffix. You can use any file name you want for the URL file, with or without a suffix. It's a good idea, though, to retain the wav suffix since it's a WAV file you're going to create and that's a standard suffix for such things.
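
Keeping the WAV name in step with the URL file name is easy to automate. The sketch below wraps the mplayer command exactly as given above; capture_stream and the MPLAYER override variable are invented names. As far as I know, later MPlayer releases dropped -aofile in favor of -ao pcm:file=NAME, so adjust to suit your version.

```shell
# Capture the stream listed in a URL file to a WAV named after it:
# show.ram -> show.wav, and a suffix-less "show" -> show.wav as well.
# MPLAYER is a hypothetical override: MPLAYER=echo prints the arguments
# instead of starting a download.
MPLAYER=${MPLAYER:-mplayer}

capture_stream() {
    url_file=$1
    case $url_file in
        *.ram) base=${url_file%.ram} ;;
        *)     base=$url_file ;;
    esac
    "$MPLAYER" -playlist "$url_file" -ao pcm -aofile "$base.wav" -vc dummy -vo null
}
```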

Converting to MP3

Once you've obtained your desired URL file and run the above command, mplayer will download the Real Audio file in real time, i.e. if the Real Audio file is an hour long it will take an hour, and convert it on the fly into WAV format. After the file's been completely downloaded - and these can get pretty big, e.g. 600 or more megabytes for an hour-long show - you can convert the WAV file to an MP3 file via the usual procedure, e.g.

lame file.wav file.mp3

It's just that easy.

Further details such as how to automate the process and what all the command-line options actually do can be found at Recording Streaming Audio with MPlayer, where I originally found the information.


Working with M4A/MP4 Files

Software Needed

Useful Sites


Working with 24-Bit Files

An open source package for creating 24-bit DVD-Audio discs is DVD-Audio Tools.

A couple of useful sites:


Renaming Several Files

Sometimes MP3 files for, say, many songs off of the same album have the same long and very annoying prefix, e.g.

	Rat Bastards - Time to Smash Your Face with a Brick - 01 - Feelings.mp3
	Rat Bastards - Time to Smash Your Face with a Brick - 02 - Body and Soul.mp3
                             ...

If you desire to remove that very long prefix containing the group name and album title - for instance, because such things can make xcdroast barf unless you trim them - it can be easily and quickly accomplished via the rename command, e.g.

rename "Rat Bastards - Time to Smash Your Face with a Brick - " "" *.mp3

This will leave the much more concise and compact:

	01 - Feelings.mp3
	02 - Body and Soul.mp3
	  ...
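
One caveat: rename comes in two incompatible flavors. The from/to syntax shown here belongs to the util-linux version, while Debian-family systems commonly ship a Perl-based rename that expects a Perl expression instead. A plain shell loop sidesteps the question. The following is a sketch only, and strip_prefix is a made-up name:

```shell
# Strip a fixed prefix from each named file, using only the shell and mv.
# Files that don't start with the prefix, or that consist of nothing but
# the prefix, are left alone.
strip_prefix() {
    prefix=$1
    shift
    for f in "$@"; do
        case $f in
            "$prefix"?*)
                new=${f#"$prefix"}
                mv -- "$f" "$new"
                ;;
        esac
    done
}
```

It would be called as strip_prefix "Rat Bastards - Time to Smash Your Face with a Brick - " *.mp3 from the directory holding the files.
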

Character Set Conversion

Occasionally you find file names containing strange, unreadable characters. The utility convmv is good for converting these into better human and/or machine readable character sets. The command:

convmv --notest -f utf7 -t utf8 *

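will usually convert your mess into standard UTF-8 format. The -f option names the encoding the file names are currently in, so substitute whatever yours actually use (iso-8859-1 is a common one). Without --notest, convmv only prints what it would rename, which makes a handy dry run.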


Software


cdrecord

"Cdrecord is used to record data or audio Compact Discs on an Orange Book CD-Recorder."


cdparanoia

"Cdparanoia is a Compact Disc Digital Audio (CDDA) extraction tool, commonly known on the net as a 'ripper'. The application is built on top of the Paranoia library, which is doing the real work (the Paranoia source is included in the cdparanoia source distribution). Like the original cdda2wav, cdparanoia package reads audio from the CDROM directly as data, with no analog step between, and writes the data to a file or pipe in WAV, AIFC or raw 16 bit linear PCM."


easytag

"EasyTAG is an utility for viewing and editing tags for MP3, MP2, FLAC, Ogg Vorbis, MusePack and Monkey's Audio files. Its simple and nice GTK+ interface makes tagging easier under GNU/Linux."


flac

FLAC stands for Free Lossless Audio Codec. Grossly oversimplified, FLAC is similar to MP3, but lossless, meaning that audio is compressed in FLAC without any loss in quality. This is similar to how Zip works, except with FLAC you will get much better compression because it is designed specifically for audio, and you can play back compressed FLAC files in your favorite player (or your car or home stereo) just like you would an MP3 file.


grip

"Grip is a cd-player and cd-ripper for the Gnome desktop. It has the ripping capabilities of cdparanoia builtin, but can also use external rippers (such as cdda2wav). It also provides an automated frontend for MP3 (and other audio format) encoders, letting you take a disc and transform it easily straight into MP3s. Internet disc lookups are supported for retrieving track information from disc database servers. Grip works with DigitalDJ to provide a unified "computerized" version of your music collection."


id3lib

"id3lib is an open-source, cross-platform software development library for reading, writing, and manipulating ID3v1 and ID3v2 tags. It is an on-going project whose primary goals are full compliance with the ID3v2 standard, portability across several platforms, and providing a powerful and feature-rich API with a highly stable and efficient implementation."


id3v2

"A command-line editor for id3v2 tags."

Usage: ./id3v2 [OPTION]... [FILE]...
Adds/Modifies/Removes/Views id3v2 tags, modifies/converts/lists id3v1 tags
 
  -h,  --help               Display this help and exit
  -f,  --list-frames        Display all possible frames for id3v2
  -L,  --list-genres        Lists all id3v1 genres
  -v,  --version            Display version information and exit
  -l,  --list               Lists the tag(s) on the file(s)
  -R,  --list-rfc822        Lists using an rfc822-style format for output
  -d,  --delete-v2          Deletes id3v2 tags
  -s,  --delete-v1          Deletes id3v1 tags
  -D,  --delete-all         Deletes both id3v1 and id3v2 tags
  -C,  --convert            Converts id3v1 tag to id3v2
  -1,  --id3v1-only         Writes only id3v1 tag
  -2,  --id3v2-only         Writes only id3v2 tag
  -a,  --artist  "ARTIST"   Set the artist information
  -A,  --album   "ALBUM"    Set the album title information
  -t,  --song    "SONG"     Set the song title information
  -c,  --comment "DESCRIPTION":"COMMENT":"LANGUAGE"
                            Set the comment information (both
                             description and language optional)
  -g,  --genre   num        Set the genre number
  -y,  --year    num        Set the year
  -T,  --track   num/num    Set the track number/(optional) total tracks
											        
You can set the value for any id3v2 frame by using '--' and then frame id
For example:
        id3v2 --TIT3 "Monkey!" file.mp3
would set the "Subtitle/Description" frame to "Monkey!".


lame

An LGPL MP3 encoder.

Install by entering ./configure, make and (after becoming root) make install. This will by default install the binary lame in /usr/local/bin.

Additional useful programs can be found in the misc subdirectory. They must be individually installed in /usr/local/bin. They are:


shntool

A multi-purpose WAVE data processing and reporting utility. File formats are abstracted from its core, so it can process any file that contains WAVE data, compressed or not - provided there exists a format module to handle that particular file type.

It consists of three parts - its core, mode modules, and format modules. This helps to make the code easier to maintain, as well as aid other programmers in developing new functionality. The distribution archive contains a file named 'modules.howto' that describes how to create a new mode or format module, for those so inclined.

The core of shntool is a wrapper around the mode modules, which are:

The following formats are supported:

When reading files for input, shntool automatically discovers which, if any, format module handles each file. In modes where files are created as output, you can specify what the output format should be - otherwise, shntool decides for you by selecting the first format module it finds that supports output (in a default installation, this will be the wav format).


lpac

LPAC is a codec (coder / decoder) for lossless compression of digital audio files. "Lossless" means that any compressed file can be decompressed in a way it will be bit-wise identical with the original. This is the main advantage of LPAC compared to lossy formats like MP3, WMA or RealAudio. On the other hand, lossy codecs can achieve higher compression ratios. For example, MP3 at 128 kbit/s achieves a (fixed) compression ratio of 11, whereas LPAC's compression ratios range from 1.5 to 4, strongly depending on the audio material. Typically they are around 2 for pop music and 2.5 for classical music. This may not seem much, but remember you will get back every single bit, no matter how often you subsequently compress and decompress a file. It is true that general archivers (Zip, LZH, gzip) are lossless, too, but they often achieve nearly no compression on audio files.


mac

Monkey's Audio is a fast and easy way to compress digital music. Unlike traditional methods such as mp3, ogg, or lqt that permanently discard quality to save space, Monkey's Audio only makes perfect, bit-for-bit copies of your music. That means it always sounds perfect - exactly the same as the original. Even though the sound is perfect, it still saves a lot of space. (think of it as a beefed-up Winzip for your music) The other great thing is that you can always decompress your Monkey's Audio files back to the exact, original files. That way, you'll never have to recopy your CD collection to switch formats, and you'll always be able to recreate the original music CD if something ever happens to yours.

This additionally requires the installation of nasm.


mpeg4ip

"MPEG4IP provides an end-to-end system to explore streaming multimedia. The package includes many existing open source packages and the "glue" to integrate them together. This is a tool for streaming video and audio that is standards-oriented and free from proprietary protocols and extensions.

Provided are a live MPEG-4/H.261 MP3/AAC broadcaster and file recorder, command line utilities such as an MP4 file creator and hinter, and a player that can both stream and playback from local file."


mplayer

"MPlayer is a movie player for Linux (runs on many other Unices, and non-x86 CPUs, see the documentation). It plays most MPEG, VOB, AVI, OGG/OGM, VIVO, ASF/WMA/WMV, QT/MOV/MP4, FLI, RM, NuppelVideo, YUV4MPEG, FILM, RoQ, PVA files, supported by many native, XAnim, and Win32 DLL codecs. You can watch VideoCD, SVCD, DVD, 3ivx, DivX 3/4/5 and even WMV movies, too (without the avifile library).

Another great feature of MPlayer is the wide range of supported output drivers. It works with X11, Xv, DGA, OpenGL, SVGAlib, fbdev, AAlib, DirectFB, but you can use GGI, SDL (and this way all their drivers), VESA (on every VESA compatible card, even without X11!) and some low level card-specific drivers (for Matrox, 3Dfx and ATI), too! Most of them support software or hardware scaling, so you can enjoy movies in fullscreen. MPlayer supports displaying through some hardware MPEG decoder boards, such as the Siemens DVB, DXR2 and DXR3/Hollywood+.

MPlayer has an onscreen display (OSD) for status information, nice big antialiased shaded subtitles and visual feedback for keyboard controls. European/ISO 8859-1,2 (Hungarian, English, Czech, etc), Cyrillic and Korean fonts are supported along with 12 subtitle formats (MicroDVD, SubRip, OGM, SubViewer, Sami, VPlayer, RT, SSA, AQTitle, JACOsub, PJS and our own: MPsub). DVD subtitles (SPU streams, VOBsub and Closed Captions) are supported as well."


mp3splt

Mp3Splt is a command line utility for splitting - without decoding - mp3 (VBR supported) and ogg files by selecting beginning and ending time positions. It's useful for splitting large mp3/ogg files to make smaller files, or for splitting entire albums to obtain the original tracks. If you want to split an album, you can select split points and filenames manually or you can get them automatically from CDDB (internet or a local file) or from .cue files. It also supports automatic silence splitting, which can be used to adjust cddb/cue split points. If you have a file created with either Mp3Wrap or AlbumWrap you can extract the individual tracks in just a few seconds.


mp3wrap

Mp3Wrap is a command-line utility that can wrap two or more mp3 files into a single large playable mp3 file, without losing the filenames or ID3 information (and without the need for decoding/encoding). You can also include other non-mp3 files such as PlayLists, info files, cover images, inside the mp3. Files wrapped with Mp3Wrap can be easily split using Mp3Splt to obtain the original files. The utility of Mp3Wrap is that it can be used to concatenate several mp3 files into a single file for easier downloading.


normalize

Normalize is a tool for adjusting the volume of audio files to a standard level. This is useful for things like creating mixed CD's and mp3 collections, where different recording levels on different albums can cause the volume to vary greatly from song to song.


ofr

OptimFROG is a lossless audio compression program. Its main goal is to reduce the size of audio files as far as possible, while permitting bit-identical restoration for all input. It is similar to ZIP compression, but it is highly specialized to compress audio data.

OptimFROG uses a new audio compression technology, the generalized stereo decorrelation concept (together with the optimal predictor), which was first introduced with OptimFROG 4.0b in December 2001. At the time of its introduction, the new technology yielded significantly better (~1.5%) compression than existing state of the art lossless audio compressors.


shorten

A fast, low complexity waveform coder (i.e. audio compressor), originally written by Tony Robinson at SoftSound. It can operate in both lossy and lossless modes.


sox

The Swiss army knife of sound processing programs. SoX is a command line utility that can convert various formats of computer audio files into other formats. It can also apply various effects to these sound files during the conversion. As an added bonus, SoX can play and record audio files on several Unix-style platforms.


streamripper


wavpack

WavPack allows you to losslessly compress (and restore) both 16 and 24-bit audio files in the .WAV format. Unlike "lossy" compression schemes (like MP3) that discard information, WavPack converts the audio data into a more compact form so that the restored files are digitally identical to the original source. It's somewhat like the file compression portion of WinZIP except that it's optimized for audio data. Like other lossless compression schemes the data reduction varies with the source, but it is generally between 25% and 50% for typical popular music and somewhat better than that for classical music and other sources with greater dynamic range.

