Part 2. 24-bit Crystalizer
As you may remember, according to the manufacturer, Creative X-Fi is based on two key sound processing technologies: 24-bit Crystalizer and CMSS-3D. Creative has focused on promoting just these two key features of its new generation of sound cards instead of enumerating lots of features and innovations. That is why, after we published “Creative SoundBlaster X-Fi – a New Generation of Processors and Xtreme Fidelity Sound Cards”, in which the new technologies were subjected to severe criticism, Darragh O’Toole, brand manager of the European office, suggested that I direct my criticism and any questions to the creator of 24-bit Crystalizer himself: Mark Dolson, Director of Audio Research, Advanced Technology Center.
I must say that I have nothing against subjective sound improvements such as tone controls, equalization to compensate for speaker imperfections, or pseudo-surround processing to widen the sound stage. Users like to switch magic buttons ON and OFF and to move mysterious faders. But I think we should use the right words, without marketing tales. We must provide correct information to competent, qualified people, mustn’t we?
Please take a look at the part of my X-Fi review that covers Crystalizer. You are welcome to comment on it if you wish.
Can you tell us how Crystalizer works first-hand, please? With technical details, if possible. Is it a linear phase multi-band compressor or maybe an equalizer + compressor pair, or some kind of mastering maximizer?
Say, a cheap speaker has a small linear amplitude zone, so its dynamic range is also small. Thus, if we reduce the dynamic range of a recording, it will be easier for such speakers to reproduce it pleasantly.
If we are talking about re-mastering, that is quite another matter. Maximizers are used at FM stations, but recordings should be mastered in a studio with dynamic range in reserve. Compression is also commonly used in Dolby Digital decoders, where a user can choose among several dynamic range compression modes: for example, one mode for laptop speakers or for watching a movie at night, and another for compact speakers or a noisy room. Finally, with an expensive home theater and hi-fi/hi-end speakers, one can use a mode with the full amplitude and frequency range. But soundtracks also keep a reserve and are not maximized before the compressor in the decoder.
What if we re-master a CD-DA track that has already been mastered by SONY Music? How can a $100 sound card with automatic real-time processing compete with a $1,000,000 mastering station capable of multi-pass analysis of the recording and expensive 64-bit processing, set up manually by an experienced sound producer?
How much distortion would we get after two aggressive masterings performed one after another (zero dB)? Does a recording become clearer after two or more masterings, or do its nonlinear distortions simply add up?
Creative’s advertising says that Crystalizer expands the dynamic range of 16-bit recordings to 24 bits. I ran some further experiments in RMAA 5.5. I played 16-bit and 24-bit WAV files on the X-Fi and recorded them via S/PDIF on the professional LynxTwo-B sound interface in 24-bit mode, with Crystalizer ON and OFF. The results were compared with the reference 16-bit and 24-bit WAV files generated by the test application without any playback.
|Test|Reference 16/48|Reference 24/48|Crystalizer OFF|Crystalizer ON|
|Dynamic range, dB (A)|||||
|IMD + Noise, %|||||
As we can see, Crystalizer does not expand the dynamic range of the 16-bit signal; it even reduces it. At the default setting (50%), Crystalizer amplifies the signal so much that the two sines, 60 Hz @ -5 dB and 7000 Hz @ -17 dB (a standard SMPTE IMD test signal), produce about 1% distortion (IMD + Noise). Distortion cannot possibly make the sound crystal clear. What are your comments?
How can we make the best use of the Crystalizer technology? Does it suit all recordings and all speaker systems?
Does Creative have any plans to improve Crystalizer in future versions? Perhaps a look-ahead buffer for anticipatory analysis, so that the Crystalizer processing level could be varied to protect the sound from distortion on really high-quality speakers?
I am writing in response to the review of the Sound Blaster X-Fi and in particular of Creative’s 24-Bit Crystalizer technology that you have referred to my attention. This review includes a number of inaccurate statements about our proprietary technology that I would like to provide corrections and clarifications for.
As I see it, there are two crucial questions that need to be addressed here:
1. What is the goal of the 24-bit Crystalizer?
2. How does the 24-bit Crystalizer achieve this?
I will start with the first of these two issues, the goal.
The primary goal of the 24-bit Crystalizer is to partially compensate for the inherent dynamic range limitations of 16-bit audio content by applying appropriate post-processing during playback. This is admittedly a very ambitious goal with an unavoidably subjective component. However, this in no way prevents us from formulating a clear technical statement of the problem.
The problem is that we are presented with a composite mix of individual audio tracks. Each of these tracks was potentially subjected to its own independent dynamic range compression in the recording studio prior to mixing.
The audio on the released CD contains no clues as to the compression applied to individual tracks nor as to the relative volume levels at which these tracks were mixed together. This lack of auxiliary information makes it physically impossible to rigorously reverse engineer the recording process.
Nevertheless, it is still possible to design post-processing that produces an effect that is qualitatively similar to the one that we would ideally have obtained. This post-processing should selectively enhance transients in the composite audio without perceptually altering other aspects of the audio, and independent of the absolute level of the transients.
From this problem statement alone, it should be clear that:
(a) A multiband compressor will not provide a good solution (because its processing will be strictly level-dependent, and because we desire expansion rather than compression)
(b) The enhancement will be appropriate for auditioning on any playback system, no matter how expensive or cheap (because we are specifically correcting for limitations in the 16-bit content delivery medium, independent of the playback system).
This brings me to a discussion of our proprietary solution.
Creative’s 24-Bit Crystalizer is best understood as a signal-dependent, dynamic EQ. The source of its intelligence is an analysis front end that continuously calculates dynamically-normalized, separate low-frequency and high-frequency energy flux signals, based on nonlinear processing of the input audio streams. These two flux signals are used to apply proportionally-weighted, transient, low-frequency and high-frequency boosts to the input audio.
The careful design of the front-end analysis and the proportional response of the dynamic EQ are both critical to allowing audio signals to be perceptibly altered without introducing objectionable processing-induced artifacts. There is also an additional static component to the EQ, which contributes to the overall perceptual effect.
It is difficult to determine from the limited description in your article precisely what testing you attempted to perform. In one test, it appears that you ran a standard sine wave through our 24-bit Crystalizer process and compared it to Waves LinMB. This is fundamentally flawed in that the content provided (a test tone) does not trigger the majority of the enhancement process. Basically there are no dynamics or transients to enhance, and the only enhancement you see (which your test highlighted) is the static increase in low and high frequency response.
Taken as a whole, the 24-bit Crystalizer processing has the effect of selectively enhancing transients, independent of their absolute level, while leaving the bulk of the audio (i.e., the mid range of the spectrum) unaltered. This is significantly different from what would result from a multiband compressor (which would literally compress the dynamic range) and also from what would happen with a multiband expander (whose effect would still be signal-level dependent in a way that the 24-bit Crystalizer is not).
In the end, the extent to which the 24-bit Crystalizer achieves its goal (of partially compensating for the limited 16-bit dynamic range of CD audio) is a matter of subjective assessment.
It is certainly possible to find sound examples and/or listeners for which the results are not satisfying. For this reason we implemented this technology in a flexible way. We allow users who listen to very high-quality content to reduce or even to turn off the enhancement should they so desire. Meanwhile, others who listen to lower-quality content, or to specific types of music that 24-bit Crystalizer works well with, can increase the level of enhancement.
Creative’s experience to date is that actual listener response has been overwhelmingly positive in the general user market.
This does not mean that the 24-bit Crystalizer can literally transform the bits in a CD (or, even more challenging, MP3) audio stream into those that would be found on a corresponding DVD-Audio recording (and we have not made any such claim). However, it does, in our view, validate our claim that we perceptibly “improve” legacy audio through the application of innovative and well-motivated, proprietary signal processing. This is the basis of our claim that 24-bit Crystalizer (when combined with our X-Fi CMSS-3D technology) can make MP3 music sound better than CD.
I was not satisfied with this reply and carried on with my tests. Aristotle said: “Amicus Plato, sed magis amica veritas”. Hold on, Creative!
I can understand the desire to advertise the new technology and promote it to the mass market with sensational announcements about the superiority of MP3 over CD. That may be justified as a way to attract the attention of a wide audience, much like Intel’s announcement that Pentium III processors accelerated the Internet. But in my opinion, technically educated people need intelligible, ad-free arguments. Otherwise, the advertising campaign will have the opposite effect.
You can chant 1000 times that the dynamic range is increased, but standard tests that follow AES methods and are approved in ANSI/IEC standards show that the “Dynamic Range” gets narrower with Crystalizer.
Next, I strongly object to the various scary tales about CD-DA quality. I declare with all due responsibility that the 16-bit format does not suffer from the terrible dynamic range problems described by Creative. After noise shaping at the final mastering stage, the psychoacoustically weighted dynamic range of a 16-bit recording in the audible band may exceed 100 dB (A-weighted, AES17 filtered) and even approach that of a 20-bit recording! Any processing that intrudes on the bit-for-bit transfer of the final recording from CD to DAC damages these efforts.
On the other hand, need I mention that the dynamic range of ordinary speaker systems is much worse than even 96 dB (20 log 2^16), and that room noise under normal conditions is about 20-30 dB? So compressing the dynamic range, rather than expanding it, may make the sound subjectively clearer and more detailed. CD recordings do not use even the dynamic range that is available! Applying an expander after mastering compression and maximization will not help, because the details have already been lost to dynamics compression and the recording has already picked up a bunch of distortions. It’s a nonlinear, irreversible process!
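As a quick check of the 96 dB figure quoted above, here is a minimal Python sketch of the textbook estimate 20·log10(2^N) for N-bit linear PCM. The function name `pcm_dynamic_range_db` is made up for this illustration, and the estimate deliberately ignores dither and noise shaping, so it says nothing about psychoacoustically weighted measurements.

```python
import math

def pcm_dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in dB: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)

# Compare the word lengths discussed in the article.
for bits in (16, 20, 24):
    print(f"{bits}-bit: {pcm_dynamic_range_db(bits):.1f} dB")
```

This gives roughly 96.3 dB for 16 bits, 120.4 dB for 20 bits and 144.5 dB for 24 bits, or about 6 dB per bit.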
Even though Mark writes that limitations of the 16-bit format are compensated, a recording processed by Crystalizer gets more aggressive, as the RMS signal power grows.
Mark places the emphasis on the dynamic enhancement of the recording level, calling Crystalizer a signal-dependent, dynamic EQ. A multiband compressor is also dynamic, having band-adjustable attack and release times.
I consulted Alexei Lukin, one of the authors of the iZotope Ozone 3 mastering package. He surmised that Crystalizer uses the rate of signal change, rather than the signal level, as the side-chain threshold. Thus, it predicts and amplifies signal peaks regardless of the current level. Studios use the same procedure on percussion tracks to accentuate the attack. For example, the Waves Transform and Waves Diamond plugins include a so-called multiband signal shaper, Trans-X Multi (a multi-band transient processor).
Auditioning proved that Crystalizer’s effect on a recording with sharp peaks, especially percussion, is indeed similar to that of the Trans-X Multi shaper with the settings shown in the screenshot.
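To make the idea of a side chain driven by the rate of change rather than by the level more concrete, here is a toy single-band transient shaper in Python. It is only a sketch under stated assumptions: it is not Creative's algorithm and not the Waves Trans-X implementation. The function names, the time constants (1 ms and 50 ms), the `amount` parameter and the gain ceiling are all invented for illustration. The gain is derived from the ratio of a fast envelope follower to a slow one, so scaling the whole input leaves the gain curve unchanged.

```python
import math

def smoothing_coeff(time_ms, rate):
    """One-pole smoothing coefficient for a given time constant."""
    return math.exp(-1.0 / (time_ms * 0.001 * rate))

def transient_gains(samples, rate=48000, fast_ms=1.0, slow_ms=50.0,
                    amount=1.0, max_gain=2.0):
    """Per-sample gain driven by how fast the envelope rises, not by its level."""
    a_fast = smoothing_coeff(fast_ms, rate)
    a_slow = smoothing_coeff(slow_ms, rate)
    fast = slow = 0.0
    gains = []
    for x in samples:
        level = abs(x)
        fast = a_fast * fast + (1.0 - a_fast) * level  # reacts quickly
        slow = a_slow * slow + (1.0 - a_slow) * level  # reacts slowly
        # fast/slow does not change when the whole input is scaled, so the
        # boost follows the attack rather than the absolute level.
        rise = max(0.0, fast / slow - 1.0) if slow > 0.0 else 0.0
        gains.append(min(max_gain, 1.0 + amount * rise))
    return gains

def shape_transients(samples, **kwargs):
    """Apply the transient-driven gain to the signal itself."""
    return [x * g for x, g in zip(samples, transient_gains(samples, **kwargs))]

# Demo: 0.1 s of silence, then a constant-level step (a crude "attack").
rate = 48000
step = [0.0] * 4800 + [0.5] * 24000
loud = transient_gains(step, rate)
quiet = transient_gains([0.1 * x for x in step], rate)  # same shape, 20 dB lower

print("gain at onset:", loud[4800])
print("gain 0.5 s later:", round(loud[-1], 4))
print("same gains 20 dB lower:",
      all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(loud, quiet)))
```

The gain jumps to the ceiling at the onset and settles back towards unity once the level is steady, whatever the absolute level. On real music a detector this crude would also modulate the gain within each cycle of a sustained tone, which is one way such processing can add distortion.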
However, even with several seconds of a static multitone sine signal, Crystalizer significantly equalizes the signal, increasing the RMS level by approximately 3 dB.
Does it increase the dynamic range, that is, the level difference between the peaks and the noise floor, including all noise and distortion? As you can see in the spectra of the single sine and of the multitone signal, Noise+IMD grow together with the amplitude of the main oscillation, and they grow much faster. This is because any processing introduces distortion to some degree, especially raising the signal level, which threatens to clip signal peaks unless additional tools (a compressor or limiter) are used.
In this case there is some distortion, but it is not critical, at least for a multitone signal like this one, which is a good imitation of a recording with an average dynamic range. Unfortunately, modern recordings tend to leave less and less dynamic range in reserve.
Here is what Bob Katz, the well-known mastering expert, wrote:
In the music and broadcast industries, chaos currently prevails. Here is a waveform taken from a digital audio workstation, showing three different styles of music recording. The time scale is about 10 minutes total, and the vertical scale is linear, +/- 1 at full digital level, 0.5 amplitude is 6 dB below full scale. The “density” of the waveform gives a rough approximation of the music’s dynamic range and Crest Factor. On the left side is a piece of heavily compressed pseudo “elevator music” I constructed for a demonstration at the 107th AES Convention. In the middle is a four-minute popular compact disc single produced in 1999, with sales in the millions. On the right is a four-minute popular rock and roll recording made in 1990 that’s quite dynamic-sounding for rock and roll of that period.
The perceived loudness difference between the 1990 and 1999 CDs is greater than 6 dB, though both peak to full scale. Auditioning the 1999 CD, one mastering engineer remarked “this CD is a light bulb! The music starts, all the meter lights come on, and it stays there the whole time.” To say nothing about the distortion. Are we really in the business of making square waves?
The average level of popular music compact discs continues to rise. Popular CDs with this problem are becoming increasingly prevalent, coexisting with discs that have beautiful dynamic range and impact, but whose loudness (and distortion level) is far lower. There are many technical, sociological and economic reasons for this chaos.
Figure: The Insane Increase in “Hottest” Pop CD Levels. Red: average level. White: headroom for peaks. The height of the red bar reflects perceived loudness and potential loss of quality and clarity.
Thus, applying Crystalizer to modern maximized recordings will lead to a manifold increase in distortion, resulting in crackling and pollution of the recording with spurious harmonics.
Let’s check this with the standard SMPTE IMD test. We shall vary the sine amplitudes and look at the results. With -6/-18, -5/-17, -4/-16 and -3/-15 dB there is still no clipping; the peak levels are -4.05, -3.05, -2.05 and -1.05 dB respectively. The loudest signal has an RMS level of -4.5 dB, to imitate modern recordings. IMD varied from 0.04% to 10%.
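The peak levels quoted above can be reproduced with a few lines of Python, assuming the usual SMPTE arrangement in which the 7000 Hz tone sits 12 dB (a 4:1 amplitude ratio) below the 60 Hz tone. The helper names are invented for this sketch, and the result is the worst-case peak, reached when both sines hit their maxima at the same instant.

```python
import math

def db_to_amp(db):
    """Convert a level in dB (relative to full scale) to linear amplitude."""
    return 10 ** (db / 20)

def amp_to_db(amp):
    """Convert linear amplitude to dB relative to full scale."""
    return 20 * math.log10(amp)

def two_tone_peak_db(low_db, ratio_db=12.0):
    """Worst-case peak of two sines: both reach their maxima at the same time."""
    high_db = low_db - ratio_db
    return amp_to_db(db_to_amp(low_db) + db_to_amp(high_db))

for low_db in (-6, -5, -4, -3):
    peak = two_tone_peak_db(low_db)
    print(f"60 Hz at {low_db} dB: peak {peak:.2f} dB, headroom {-peak:.2f} dB")
```

The headroom column shows how much broadband gain each variant can take before its peaks reach digital full scale. Only the quietest variant has more than 4 dB to spare.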
At the default setting (50%), Crystalizer clips every signal variant except -6/-18. Before applying Crystalizer, it would have been reasonable to convert the signal to floating-point format and reduce its amplitude in the digital domain in order to avoid overflow. But then the recording would not sound subjectively louder, which would weaken Crystalizer’s position in a direct comparison with the original recording at the same volume level.
In view of the above, that is, knowing Crystalizer’s nasty habit of raising the overall level in the digital domain, I compared the Crystalizer-processed recordings with the originals amplified by 3 dB to equalize the levels.
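Level matching of this kind is easy to sketch in Python. The example below is an illustration only: the helper names are invented, and a 1 kHz sine peaking at -6 dBFS stands in for the original recording. It applies the 3 dB gain mentioned above in floating point, then checks the RMS change and whether the boosted peaks would exceed full scale.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to a full-scale amplitude of 1.0."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10 * math.log10(mean_square)

def peak_db(samples):
    """Peak level in dB relative to a full-scale amplitude of 1.0."""
    return 20 * math.log10(max(abs(x) for x in samples))

def apply_gain_db(samples, gain_db):
    """Scale floating-point samples by a gain given in dB."""
    gain = 10 ** (gain_db / 20)
    return [x * gain for x in samples]

# Hypothetical stand-in for the original recording: 0.1 s of a 1 kHz sine.
rate = 48000
amp = 10 ** (-6 / 20)
original = [amp * math.sin(2 * math.pi * 1000 * n / rate) for n in range(4800)]

matched = apply_gain_db(original, 3.0)
print(f"RMS change: {rms_db(matched) - rms_db(original):+.2f} dB")
print(f"new peak: {peak_db(matched):.2f} dBFS, clipping: {peak_db(matched) > 0}")
```

With a heavily maximized recording the same 3 dB boost would push the peaks above 0 dBFS, so the comparison would have to be matched by turning the processed version down instead.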
The comparison was based on two typical recordings: an old analog recording with a great dynamic range reserve and a modern recording with maximized levels. We used an X-Fi Elite Pro sound card and active studio monitors.
The original recording sounded much better to me in both cases. Crystalizer heavily equalized the recording, causing an evident dip in the midrange. All sharp sounds (such as percussion, guitar strings and the attack of a saxophone) were emphasized, often masking the rest of the recording (the automatic processing failed to detect that the vocals were more important than the cymbals in this recording). Perhaps such processing does some good only when you listen to recordings on small computer speakers, which add distortions of their own: through those distortions you will still make out the equalization and the sharper percussion attack. Turning Crystalizer down has a positive effect: it reduces the equalization and the overall volume level and increases resistance to overload.
At last we have learnt enough about the 24-bit Crystalizer technology. It is quite a complex combination of static and dynamic signal processing that cannot be summed up in a single term. That is why Creative patented Crystalizer as an independent invention. However, whatever Crystalizer is inside, we are more interested in its effects and applications.
We have seen no miracle: the X-Fi sound card with Crystalizer failed to miraculously remaster a recording carefully crafted by experts in a mastering studio. Even if a hundred housewives from a focus group in Creative’s marketing research proved the contrary, in my tests MP3 did not surpass CD-DA and did not catch up with DVD-Audio.
Nevertheless, we should admit that Crystalizer, a hardware “betterizer”, is a plus for ordinary users, just like an equalizer or tone controls. With these tools they can try to make up for the shortcomings of their speakers or headphones. In any case, you should not expect a 128 kbit/s MP3 to miraculously sound better than the original CD-DA. Whether this technology is justified is up to you to decide, based on how it makes your recordings sound.
I doubt that modern popular recordings need any further violation of their dynamic range. You had probably better raise the Crystalizer level carefully, and only until you hear the first signs of overload. You must also take into account that Crystalizer increases the volume level by 3-4 dB in the digital domain. So perhaps all you need is to increase the volume by the same amount and avoid the distortion of tonal balance and the other problems.