OGG vs. LAME
Today everyone knows what a modern desktop or laptop computer can do with sound. A music library on a computer, alongside audio cassettes and CDs, surprises no one.
We all know that a CD is simple and easy to use. Why? Because it has a single, well-defined format that is the same everywhere, and the sound quality depends only on the recording studio and is usually high. That is very convenient.
And what about music on a computer? The PCM (pulse-code modulation) format established for CD-DA discs is not very compact and is poorly suited to delivering music over the Net. That is why developers are working on many complex compression algorithms. They differ greatly in sound quality, so a user always has to choose an algorithm for recording his or her favorite music.
Despite the great variety of algorithms and formats, MPEG 1.0 Audio Layer III, or simply MP3, is the absolute leader. There are many encoder programs for recording music in this format, and the Net contains many benchmark tests comparing programs that encode into MP3. The best one today is LAME, a free open project without any license restrictions.
MP3’s position, however, may soon be shaken. A new algorithm called Ogg Vorbis has appeared on the scene. Since the Beta 3 version of this encoder appeared at the end of the summer of 2000, users have faced a choice. At the beginning of 2001 two new versions, LAME 3.88 and Ogg Vorbis 1.0 Beta 4, appeared almost simultaneously. Both differ greatly from their predecessors, and today we will compare them.
Before the Start
The testing was carried out in 5 quality zones:
- “super” (the highest bitrate, 320/350 Kbit);
- “good” (high bitrate, 256 Kbit);
- “not bad” (average bitrate, 192 Kbit);
- “middling” (average bitrate, 160 Kbit);
- “satisfactory” (the favorite, used for delivering music over the Net, 128 Kbit).
The OrlSoft MPeg eXtension 2.0 program was used as an encoder and a decoder. The following programs were used to analyze the results: CoolEdit Pro 1.2, Steinberg WaveLab 3.02, SpectraLab 4.32.13.
We took the following compositions for our testing, among others: Roxette “Crush On You”, Richard Clayderman “Mano a Mano”, DJ BOBO “What A Feeling”, Bluemchen “Ist Deine Liebe Echt” and “Sehnsucht@herz.de”, and Chicane “Autumn Tactics”. The most characteristic results for each zone will be used for conclusions and illustrations.
Terms of comparison
Unfortunately, it is impossible to compare LAME and OGG on equal terms at the same bitrate. The Ogg Vorbis format and encoder are not intended for coding with a constant bitrate (as in MP3): each of the 6 pre-established modes (112, 128, 160, 192, 256, 350) uses a variable bitrate (VBR), though it would be more correct to call it ABR, an average bitrate. Moreover, apart from the bitrate, which can be set as you wish, OGG allows no manipulation of its parameters, while LAME lets you control nearly everything. That is why we will compare both coders in the coding modes recommended by the developers. We will deviate from this rule only for the lowpass filter (which leaves only the frequencies below a given level in the signal), in order to specify for LAME which frequencies should be kept and so avoid cutting them too severely.
An average bitrate mode appeared in LAME not so long ago, though it has existed in OGG for a long time. According to the developers, at the same bitrate coding in the ABR mode should be no worse than in the usual mode. That is why, to estimate the LAME quality, we will use two sets of coded samples: one with a constant bitrate and one with a variable bitrate.
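Since the nominal ABR modes only promise an average, it can help to check what a coded file actually averaged out to. The sketch below is only an illustration: the function name and the example file size are hypothetical, and it assumes 1 Kbit = 1000 bits and ignores any tag or container overhead.

```python
def average_bitrate_kbit(file_size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate of a coded file in Kbit per second (1 Kbit = 1000 bits)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return file_size_bytes * 8 / duration_seconds / 1000


# Hypothetical example: a 4-minute track that takes 3,840,000 bytes on disk.
print(average_bitrate_kbit(3_840_000, 240))  # 128.0
```

Comparing this figure with the nominal mode shows how far a VBR/ABR coder has drifted from the bitrate that was requested.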
It is impossible to code non-stop albums into the MP3 format without pauses, since the files always get pauses at the beginning and at the end of tracks. LAME can correct the beginning of files. In the Ogg Vorbis format the files coincide with the originals completely. But this coder can code files only at the sampling frequency of 44100 Hz, i.e. in the CD-DA format, while LAME can also work at 48000 Hz.
Sample – an audio file, a fragment of a musical composition.
Original, original sample – a fragment in the WAV format taken from an audio CD.
Coded sample – a sample coded (compressed) into one of the formats in question, here MP3 (LAME) or OGG.
Decoded sample – a coded sample converted from the compressed format back into ordinary WAV for our testing.
Coder, encoder – a program that compresses (codes) a sample from one format into another, here from WAV into MP3 or OGG.
AFC – the amplitude-frequency characteristic of the sound, a graph of amplitude vs. frequency.
Sonogram – a representation of the sound as a graph of frequency vs. time.
Delta-signal – a difference (delta) signal obtained by subtracting one sample from another, which characterizes the difference between them. Here it is used to calculate the difference between the original and coded samples.
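The delta-signal defined above can be sketched in a few lines. This is a minimal illustration, not the procedure used by the analysis programs listed earlier: the function names and sample values are hypothetical, the two signals are assumed to be 16-bit PCM of equal length, and they are assumed to be already time-aligned (which matters for MP3, where the coder adds a pause at the beginning of the file).

```python
import math


def delta_signal(original, decoded):
    """Sample-by-sample difference of two equally long, time-aligned signals."""
    if len(original) != len(decoded):
        raise ValueError("signals must have the same length")
    return [o - d for o, d in zip(original, decoded)]


def rms_db(signal, full_scale=32768.0):
    """RMS level in dB relative to 16-bit full scale; -inf for pure silence."""
    if not signal:
        raise ValueError("signal is empty")
    mean_square = sum(s * s for s in signal) / len(signal)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square / full_scale ** 2)


# Hypothetical sample values, chosen only to show the arithmetic.
original = [0, 1000, 2000, 1000, 0, -1000, -2000, -1000]
decoded = [0, 990, 2010, 1005, 0, -995, -1990, -1010]

delta = delta_signal(original, decoded)
print(delta)  # [0, 10, -10, -5, 0, -5, -10, 10]
print(rms_db(delta) < rms_db(original))  # True: the delta is far quieter
```

The quieter the delta-signal, the closer the decoded sample is to the original.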
In the highest quality zone the aim of coding is to get as close to the original sound as possible. That is why we set the maximum parameters for both coders. For LAME we take the 320 Kbit mode with the full frequency range up to 22 KHz and the highest quality setting (-q0); the other parameters are chosen by the coder. For OGG we also set the highest quality mode, 350 Kbit. Unfortunately, its other parameters can’t be adjusted.
So, what is the result? The psychoacoustic models of both coders have clearly undergone major changes, which are easy to spot when analyzing decoded samples. High frequency processing has changed a lot. Earlier, in the 320 Kbit mode LAME kept the complete range up to 22 KHz, and now these frequencies also pass through the psychoacoustic model. This is well illustrated by the sonogram. Compare the original and the decoded sample:
Interestingly, the sound is so close to the original that it is difficult to tell the three versions apart. With the maximum parameters both coders sound practically identical to the original CD. The only thing I noticed is that OGG sounds more transparent and reproduces upper middle frequencies better. But this difference is so tiny that it can be noticed only on high quality equipment. So both coders get the highest score, the only difference being that OGG has an average bitrate higher than 320 Kbit (usually within the range from 340 to 380). The averaged AFCs of the original and coded samples differ little despite such aggressive handling of high frequencies.
Now let’s evaluate the delta-signals, i.e. calculate the differences between the original and coded samples and compare them.
The delta-signal of the LAME samples sounds like a quiet wide-band noise through which one can hear the weak main sound with a hoarse crackle and severely distorted high frequencies. For the OGG samples the picture is much more complicated: it resembles not just noise but a highly distorted original with phase distortion effects (like a flanger or phaser). I think that in OGG the different ranges are processed better; they seem to be well thought out compared with LAME, which uses very similar psychoacoustic model parameters for most subranges. This is clearly noticeable when analyzing the AFCs of the delta-signals (the red graph is LAME’s, the white one is OGG’s). The lower the delta-signal, the better the sound quality at the corresponding frequencies.
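To make the idea of an AFC concrete, here is a rough sketch that computes the amplitude of each frequency component of a short frame with a naive discrete Fourier transform. It is an illustration only: the function names are hypothetical, the test tone is synthetic, and real analyzers use an FFT with windowing and averaging over many frames rather than this slow O(n²) loop.

```python
import cmath
import math


def amplitude_spectrum_db(signal):
    """Naive DFT: amplitude of each frequency bin in dB (short frames only)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitude = abs(acc) / n
        # Clamp to avoid log10(0) on bins that contain no energy.
        spectrum.append(20 * math.log10(max(magnitude, 1e-12)))
    return spectrum


def bin_frequency(k, n, sample_rate=44100):
    """Centre frequency in Hz of DFT bin k for a frame of n samples."""
    return k * sample_rate / n


# Synthetic frame: a pure tone that falls exactly on bin 8 of a 64-sample frame.
n = 64
frame = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]

spectrum = amplitude_spectrum_db(frame)
peak_bin = spectrum.index(max(spectrum))
print(peak_bin, bin_frequency(peak_bin, n))  # 8 5512.5
print(round(spectrum[peak_bin], 2))          # -6.02
```

Applied to a delta-signal instead of a test tone, the same curve shows at which frequencies a coder deviates most from the original.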
LAME, unlike OGG, provides wide possibilities for controlling the coding process, psychoacoustic parameters and filters. If we cut frequencies above 20 KHz (which are inaudible anyway) when coding at 320 Kbit, we can achieve better sound. Let’s look at the graph of averaged AFCs of the delta-signals for the full-range sample and the one cut at 20 KHz.
So, in the highest quality zone OGG and LAME are very close to each other, and music lovers can choose whichever coder they prefer.
The highest bitrate gives excellent sound quality, but it is not the most popular of the high bitrates, since the files are very big: at 320 Kbit the stream takes about 2.4 MBytes per minute. Many prefer the 256 Kbit bitrate as a rational compromise between good quality and file size. So let’s compare the quality of both coders in this mode and estimate the losses relative to the highest bitrate. Here I will again cut frequencies above 20 KHz for LAME in order to improve the quality of the main audible range. LAME was tested in two modes: with a constant bitrate and with a variable one.
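The trade-off between bitrate and file size is simple arithmetic. The sketch below uses a hypothetical helper name and assumes 1 Kbit = 1000 bits and 1 MByte = 1,000,000 bytes; uncompressed CD-DA PCM (44100 Hz, 16 bit, stereo, i.e. 1411.2 Kbit) is included for comparison.

```python
def megabytes_per_minute(bitrate_kbit: float) -> float:
    """Size of one minute of audio in MBytes at the given bitrate."""
    bytes_per_second = bitrate_kbit * 1000 / 8
    return bytes_per_second * 60 / 1_000_000


for bitrate in (128, 160, 192, 256, 320, 1411.2):
    print(bitrate, round(megabytes_per_minute(bitrate), 2))
# 128 0.96
# 160 1.2
# 192 1.44
# 256 1.92
# 320 2.4
# 1411.2 10.58
```

Going from 320 to 256 Kbit saves about half a megabyte per minute, which is why the lower mode is attractive when its quality is acceptable.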
First let’s look at the frequency dynamics of the samples obtained (the change of the AFC over time, averaged over small intervals of 20 to 100 milliseconds). The trend of losing high frequencies persists. OGG cuts the high frequencies (above 18 KHz) to a greater degree than LAME does:
The LAME developers have changed the coder a lot: the difference in the sound of the high frequencies is gone. Samples coded in the ABR mode are better than those in the standard mode, so I recommend giving up the constant bitrate in favor of ABR. OGG’s cutting of frequencies above 18 KHz doesn’t much affect the general sound of the samples, and the difference from LAME is also insignificant. If you prefer 256 Kbit, keep using it, but you should take the newer version of LAME.
As for the delta-signals, they sound as in the 320 Kbit mode. The OGG samples have some hoarse sounds in the high frequency range, and in the LAME samples we can hear a general increase in the noise level.
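The frequency dynamics mentioned above are obtained by cutting the signal into short intervals of 20 to 100 milliseconds and analyzing each one separately. The sketch below shows only that first step, the framing; the function names are hypothetical, the frames are assumed not to overlap, and a real analyzer would go on to compute a spectrum for every frame.

```python
def frame_length(sample_rate: int, frame_ms: float) -> int:
    """Number of samples in a frame of the given duration."""
    return int(sample_rate * frame_ms / 1000)


def split_frames(signal, size):
    """Cut a signal into non-overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


print(frame_length(44100, 20))   # 882
print(frame_length(44100, 100))  # 4410

# One second of (silent) CD-rate audio split into 20 ms frames.
frames = split_frames([0] * 44100, frame_length(44100, 20))
print(len(frames))  # 50
```

Shorter frames follow fast changes in the music more closely, while longer frames give finer frequency resolution.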
I will try to explain the discrepancy between the two comparisons. The first sample is very loud and dense, with the level set nearly to 0 dB, while the second sample is an orchestral composition recorded on average at -3 dB, i.e. without compression of the dynamic range. The denser the recording, the higher the quality of OGG compared with LAME. But remember that LAME has a kind of metallic effect at high frequencies, and OGG cuts frequencies above 18 KHz. Below you can see the AFCs of the delta-signals for the sample of average density.
So, when choosing a coder for the 256 Kbit mode you should decide what is more important for you: middle frequencies or those above 17-18 KHz. Also note that in the ABR mode the coding quality of LAME is much better than with a constant bitrate. In my opinion, the leader is OGG.
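The levels quoted above for the two samples (nearly 0 dB and -3 dB) can be checked on any WAV fragment. The sketch below assumes that these figures are levels relative to digital full scale and that the samples are 16-bit PCM values; the function name and the sample values are hypothetical.

```python
import math


def peak_dbfs(samples, full_scale=32768):
    """Peak level in dB relative to 16-bit full scale; -inf for pure silence."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / full_scale)


print(round(peak_dbfs([0, -32768, 12000]), 1))  # 0.0
print(round(peak_dbfs([0, 23198, -12000]), 1))  # -3.0
```

A peak of about 23198 out of 32768 corresponds to roughly -3 dB, so a recording mastered that way leaves some headroom unused.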
256 vs 320/350: who wins?
Now I think we should compare the coding quality of these two zones.
The listening tests of the LAME samples confirm my conclusion about the high quality of the ABR 256 Kbit mode. The difference between the 256 ABR and 320 samples is noticeable: the sound becomes harsher, excessively sharp, but not critically so. So if maximum similarity to the original is not important to you and you just need high quality, the 256 ABR mode will suit you. With the OGG samples the situation is different: the sound becomes diffuse, though again not critically. The differences, however, can be heard only on special equipment.
And now look at these differences in a graphical form.
In conclusion I should note that coding in the 256 Kbit mode is of such high quality that you can safely use this mode, unless it is vital for you to have the sound as close to the original as possible.
The 192 Kbit mode is a middle-of-the-road solution, since it offers neither really high quality nor a small file size (about 1.5 MBytes per minute). Let’s check what has changed in the new versions.
The situation with high frequencies has changed radically: OGG reproduces them better, while LAME cuts them. ABR coding is still better than constant bitrate coding, but it is still worse than OGG.
The general sound is of course worse than at 256 Kbit: there is some metallic effect, smearing of high frequencies, and a small loss of depth with both coders. But this time the coders sound quite different. LAME reproduces high frequencies fairly well, while OGG deserves a good mark in the majority of cases. The biggest gap in their sound is noticeable on the Richard Clayderman sample. OGG only smears the sound, while LAME introduces a metallic tone, which is a disadvantage of LAME. At the same time, in OGG the middle frequencies drop a bit, which did not happen at higher bitrates. Nevertheless, on the whole the sound in the 192 Kbit mode is not that bad.
Let’s look at the delta-signals of the coded samples where the difference in the sound was greatest. In the LAME samples, apart from the general increase of noise, noticeable distortions appeared in the high frequency range. In the OGG samples a strongly distorted original in the mids is heard quite well through the noise and distorted highs. In the LAME samples the level of middle frequencies is not so high. This shows that the level of delta-signals, noise and distortion grows as the bitrate decreases: the lower the bitrate, the bigger the difference between coded samples and originals.
The coders sound different. LAME reproduces the sound fairly well in general, but it makes the highs metallic. OGG smears them and falls behind LAME in the reproduction of middle frequencies. That is why I think that LAME in the ABR mode codes music at 192 Kbit better than OGG, while in the constant bitrate mode it is beaten by OGG in all tests. Once again, the ABR mode gives the better quality.
So, the LAME in the ABR mode leads at 192 Kbit.
Now we turn to the most popular bitrates, 160 and 128 Kbit. The 128 Kbit mode is not enough for high quality sound, since the stream is too narrow to code the highs. 160 Kbit, however, is quite enough for acceptable reproduction of frequencies up to 16-17 KHz.
OGG sounds excellent on all samples! I couldn’t have imagined that such quality could be achieved at 160 Kbit. LAME can’t handle the highs, despite artificial suppression of frequencies above 18 KHz. They not only sound metallic, there is also a “chewing” effect. But at middle frequencies LAME sounds better than OGG. Look at the averaged AFCs of the delta-signals.
The absolute leader at 160 Kbit is OGG, but LAME in the ABR mode falls just a bit behind it.
The 128 Kbit mode is used mainly for music on the Net, where it is the most popular choice. Let’s look at our contestants. The LAME coding is carried out with suppression of frequencies above 16500 Hz in order to improve the sound in the main frequency range. Setting the cutoff higher would not improve the highs, but the general sound would become worse.
When listening I hear the same picture as at 160 Kbit: the OGG samples sound much more pleasant and of higher quality than the LAME ones. The general sound is of course worse than at 160, but the trend persists: OGG reproduces high and low frequencies better, while LAME shines in the middle ones. For the 128 Kbit mode the coding quality is very good.
Let’s look at the AFCs.
So, the leader is the new version of the OGG coder.
128 vs 160: who wins?
The rivalry between the 128 and 160 bitrates is no weaker than that between 256 and 320. Let’s compare the two modes separately for each coder. We will start with OGG.
The difference is considerable. The lows sound quite different at 128 Kbit: they become diffuse, and the sharpness of percussion and bass is lost. The highs seem more metallic and diffuse at the same time. The middle frequencies are not much different because, in my opinion, the developers are again saving on highs and lows in favor of the middle range. On the whole, the sound in the 160 Kbit mode is richer, so you had better take it; for the Net this size is still acceptable. Look at the spectra of the delta-signals.
Now comes LAME. Note that LAME started to code quite well in the 128 Kbit mode only recently, so the comparison should be very interesting. The quality of the samples with a lot of high frequencies differs greatly. While at 160 Kbit the reproduction was normal, at 128 Kbit the sound becomes jerky, rough and metallic, with a “chewing” effect and strong phase distortions. On samples with fewer highs the difference is not so severe, but the sound is rarely satisfactory. There is also an advantage, though: low and middle frequencies don’t differ between these two modes! Even the difference in middle frequencies that can be seen on the sonogram doesn’t affect the sound.
LAME has the most problems with the reproduction of high frequencies, while OGG also has trouble with the lows. In general, I would recommend 160 Kbit wherever possible, since the difference between 160 and 128 is not that big in size but matters a lot as far as quality is concerned.
- The new versions of both coders handle their tasks much better than the previous ones.
- The quality of LAME in the ABR mode is never worse than in the standard mode, and it is often much better, at any bitrate.
- In the mode of the maximum similarity with an original (320/350 Kbit) both coders perform excellently.
- In the ABR 256 Kbit mode LAME reproduces highs better, and OGG works better with middle frequencies. I would rather recommend OGG, but it has no solid advantage as far as compatibility is concerned.
- Coding in the 256 Kbit mode for the OGG and in the ABR 256 Kbit mode for the LAME corresponds to high quality and it is a good choice if you don’t want the maximum similarity with an original.
- In the ABR 192 Kbit mode the LAME is on the whole better than the OGG.
- In case of 160 and 128 Kbit the OGG is an obvious leader.
- The coding quality at 160 Kbit is much better than at 128, so I recommend using 160 wherever possible.
The recommended versions of LAME and OGG, as well as some useful software, can be found on the OrlSoft site in the “download” section.