Why Direct Drivers are better than Balanced Armatures

OK, many know that we are NOT fans of balanced armatures (BAs).  If you've ever been to a show and seen us, you've probably heard us loudly state BAs are of the Devil.  Well - time to show you why we hold that position...

Fundamentally, a direct driver (DD) is a very simple device.  A voice coil (a coil of wire) is attached to a diaphragm (or speaker cone).  A current (representing the musical signal) is passed through the voice coil, and that generates a magnetic field around the voice coil.  This interacts with the static magnetic field from the rest of the DD, and you get motion.

Yes, this is a speaker, but the tiny little DD in your IEM has the same basic design.

So it's a pretty direct thing - voice coil gets current, creates a force, force moves diaphragm, diaphragm displaces air... SPL (Sound Pressure Level) happens.  Direct radiation - thus the "Direct Driver" name.  Really simple and straightforward, and mechanically very stable (voice coil is always held in the field, well damped design, etc).

What about a balanced armature?  Well, it's a bit more complex.  It still has a voice coil (and thus the current representing the musical signal), and a permanent magnet.  But the design is radically different.  For starters, both the magnet AND the voice coil are fixed.


Yep, you heard me.  Fixed coil!  Fixed magnet!

So how does it make sound?  Simple.  Take a look at this image below:

Cutaway of a balanced armature


Basically, the voice coil's magnetic field induces a current in the armature (that bent, U-shaped thing in the middle).  This causes the armature to subsequently create an opposite magnetic field.  And that field is then attracted/repelled by the magnets themselves, over to the right of the voice coil.  This makes the armature move up and down.  A small pin connected to the armature then flexes a stretched membrane (like if you held a blanket, and pushed up and down on it) and that displaces air.  The displaced air is channeled around the motor (all those moving and electrical bits) and comes out a hole.


Anyone see any possible issues there?  We have induced currents, re-built magnetic fields, little tiny levers (the pin), a variable diameter cone (the stretched diaphragm) and a bunch of twists and turns for the sound.

Whereas the DD just has a voice coil, its magnetic field, the field of the magnets, and cone.  That's it.  Voice coil, magnetic field, cone - sound.

So BAs are considerably MORE finicky to design and build, and a lot more fraught with errors.  Lots of fiddly bits and 6 layers of stuff before that sound comes out the end: voice coil, armature, armature field, magnets, pin and diaphragm, channel - sound.  Twice the stuff to go wrong.

The other thing to recognize is that the DD has the entire moving assembly well suspended and constrained; it's held axially and radially by a suspension.  Keeps everything nice and linear, well centered, and under control.  The only thing that will cause motion (or stop it!) is the voice coil.  And that's directly bonded to the diaphragm.  Direct, quick, complete.

The BA?  Everything is suspended from a little pin, and is suspended in two magnetic fields.  There really is no lateral (radial) control of motion, and the axial (vertical) control is only by the tension of the diaphragm.

So what does this do, and other than the massive complexity involved, why do we call BAs of the Devil?  Simple.  They screw up the sound.  Yes - that is a hard, measurable fact, and one we will show in a minute! 

Now, there are PLENTY of places where BAs are great, like when you need something super tiny.  Think "hearing aid".  We're talking about something that reproduces sound, in a super tiny package, and is really fairly efficient over a narrow range (like speech - 500 Hz to 3 kHz is about all you need).  We're not talking accuracy, we're talking intelligibility.  

Think Big Mac and Filet Mignon.  Both are foodstuffs, but one is a LOT more tasty than the other.

So how does this show in the audio world?  Well, lots of people will talk about frequency response variations, or limited low end output - both of which are true for BAs.  It's why you need lots of them.  And that leads to crossover tolerances, etc.  But there's one go-to measurement we use for nearly everything we do audio-wise: the Cumulative Spectral Decay, better known as "the waterfall".

What is a CSD, or a waterfall?  Think of it as a COMPLETE look at the performance of your transducer.  Let's do a car analogy!  In a car, you can accelerate, sustain, and brake.  Lots of car magazines test "0-100-0" for supercars - how fast can they accelerate to 100 MPH, and then how fast can they stop from that speed.

With normal audio measurements, you can think of the frequency response as simply the "100" - it's a measure of what the transducer is doing when it's fully running.  But what does that tell you about how it got to 100, or how it comes down from that?

That's where a CSD comes in.  Basically, it shows how the driver starts, maintains, then stops.  Here's a CSD for our Carbon transducer:

What the heck are you looking at here?  Well, we used our trusty Audio Precision measurement system, with some Crysound 711 couplers, to generate it.  And this shows basically the way it accelerates, holds, and stops all in one graph!

The vertical axis (Level) is SPL, relative.  It's in dB.  And I have it dialed in to a 50 dB window (-10 to -60), so that we get about 40 dB (-20 to -60) on most of the response.

The left-right (Frequency) axis shows the frequency response from 200 Hz to 20 kHz.  Pretty much all of the audio range.

The front-back (Time) axis shows how that frequency response changes after we turn the signal off.  Basically this shows how the unit accelerates, holds, then stops.
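
For the curious, here's roughly how a waterfall gets built from a measured impulse response.  This is only a minimal sketch in Python with NumPy, NOT what our measurement software does internally: the function name `csd`, its parameters, and the fake "driver" (a decaying 3 kHz tone) are all assumptions made up for illustration.  Real analyzers also use a smoothed window edge instead of the hard cut used here.

```python
import numpy as np

def csd(impulse, fs, n_slices=20, step_ms=0.2, n_fft=4096):
    """Simple cumulative spectral decay: one spectrum per delayed start time.

    Hypothetical helper for illustration only.
    Returns (freqs_hz, times_ms, levels_db), levels relative to the loudest
    point of the first slice.
    """
    step = max(1, int(round(step_ms * 1e-3 * fs)))   # samples between slices
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    times_ms = np.arange(n_slices) * step / fs * 1e3
    mags = np.empty((n_slices, freqs.size))
    for k in range(n_slices):
        # Throw away everything before this slice's start time, then look at
        # the spectrum of whatever is still ringing afterwards.
        tail = impulse[k * step : k * step + n_fft]
        mags[k] = np.abs(np.fft.rfft(tail, n=n_fft))
    ref = mags[0].max()
    levels_db = 20.0 * np.log10(np.maximum(mags, 1e-12) / ref)
    return freqs, times_ms, levels_db

# Fake impulse response (an assumption, not a real driver):
# a 3 kHz tone that dies away with a 1 ms time constant.
fs = 48_000
t = np.arange(int(0.1 * fs)) / fs
impulse = np.exp(-t / 1e-3) * np.sin(2 * np.pi * 3000 * t)

freqs, times_ms, levels_db = csd(impulse, fs)
peak = levels_db[0].argmax()
print(f"peak near {freqs[peak]:.0f} Hz")
print(f"level at {times_ms[-1]:.1f} ms: {levels_db[-1, peak]:.1f} dB")
```

Each row of `levels_db` is one "slice" of the waterfall: the first row is the ordinary frequency response, and every later row shows what is left once the earlier part of the signal has been cut away.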

So - we can see that our Carbon accelerates really well - see that little "rise", early in time, down in the lower frequencies?  We start moving and reach "top speed" in 1 millisecond - wow!  Really fast.  And we see the typical in-ear frequency response above that.

But look how things "stop".  Yeah, lower frequencies take longer to stop; think about it, a 200 Hz wave is 5 milliseconds long.  So it will take longer to stop.  But you'll also drop that time as you increase in frequency.
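
The 5 milliseconds is just one divided by the frequency.  A quick sketch of that arithmetic (the helper name and the extra frequencies are only there for illustration):

```python
def period_ms(freq_hz):
    """Length of one full cycle of a tone, in milliseconds."""
    return 1000.0 / freq_hz

for f in (200, 1_000, 5_000, 20_000):
    print(f"{f:>6} Hz -> {period_ms(f):.2f} ms per cycle")
```

So a single 200 Hz cycle lasts as long as 25 cycles at 5 kHz, which is why the low end always hangs around longer on the time axis.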

The key is how well it stops.  THAT'S what matters.  When you want to stop a note, or change tone/texture, then you need to stop the preceding one.  If you're still playing the previous note, well, things get congested.

Check the upper registers - it's all pretty much gone in 4 milliseconds (gone being more than 40 dB down - that's less than 1%; -20 dB is 10%, -40 dB is 1%, -60 dB is 0.1%).  So basically, the Carbon is 99% done within 4 milliseconds.  Not bad!
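
If you want to check those percentages yourself, here's a tiny sketch.  It assumes the percentages are in amplitude (sound pressure) terms, which is the usual 20*log10 convention for SPL; the helper name is made up for this example.

```python
def db_to_percent(level_db):
    """Relative level in dB -> percent of the original amplitude (pressure)."""
    return 100.0 * 10 ** (level_db / 20.0)

for db in (-20, -40, -60):
    print(f"{db} dB -> {db_to_percent(db):g}% of the original level")
```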

What about a BA? Well - here we go.  This is a hybrid IEM, using a DD for the bass (BAs are bad at low frequencies), and a pair of BAs for the top end:

Same drive levels (179 mV), same gear, same ranges - vastly different responses!  The bass rings a LONG time.  We see it stops in the same time as the DD - but then it starts up again!  That's partially because of the presence of a crossover (which itself stores and releases energy - magnetic for the inductors, electrical for the caps) and the other drivers (released energy from the other elements can radiate back into the woofer stage).  The amp said stop, but the other parts of the IEM said "let's give it one more go".

And we see that the BAs themselves also "ring" quite well.  See all the "Loch Ness Monsters" in the range above 1 kHz?  Remember everything hanging from a single pin, from a stretched piece of plastic?  Guess what - that rings like a bell!  Meaning it keeps radiating sound well after it's been told to stop. 
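
To put a rough number on "rings like a bell": for a simple, ideal resonance at frequency f0 with quality factor Q, the envelope dies away as exp(-pi * f0 * t / Q).  The sketch below uses that textbook model to estimate how long it takes to fall 40 dB.  The Q values are made up to show the trend; they are NOT measurements of any particular driver, and a real BA is messier than a single resonance.

```python
import math

def ring_time_ms(f0_hz, q, drop_db=40.0):
    """Time (ms) for an ideal second-order resonance to decay by drop_db."""
    # Envelope ~ exp(-pi * f0 * t / Q); solve for the time at which the
    # amplitude ratio reaches 10 ** (-drop_db / 20).
    ratio = 10 ** (drop_db / 20.0)
    return 1000.0 * q * math.log(ratio) / (math.pi * f0_hz)

for q in (1, 5, 20):  # hypothetical Q values, for illustration only
    print(f"Q={q:>2} at 3 kHz: {ring_time_ms(3000, q):.2f} ms to fall 40 dB")
```

The takeaway is that ring time scales directly with Q: the less damping a resonance has, the longer it keeps talking after the amp has said stop.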

This is a bunch of sonic energy hanging around at 1-10% of the level of the sound you're supposed to be hearing - it's a bleed over of sound from one moment to the next.  We're still listening to the opening act when the headliner takes the stage and starts.

When people talk about "microdynamics" and "resolution" - here is how you cover it, by literally covering it with previous information, in wonderful Technicolor!

Looking at just the frequency response (the highest single trace - all by itself), these two don't look too different.  You'd think they sound really similar.  But when you listen to them, there is clearly a difference in the sound, even the perceived brightness.  Why?  All that excess energy, continuing well past the time it was supposed to quit, adding another 1-10% to the total sound you hear.  Doesn't show in the static frequency response - but is very audible with music!

So that's why we love the CSD (it shows you not just that your supercar will do 100 MPH, but that it will get there - and leave there - in a hurry), and why DDs are vastly superior to BAs when it comes to audio reproduction.  You have a LOT of things going against you when you use a BA, and have to work like crazy to make them "small".  But with a DD you have to really, really work HARD to make them sound inferior to a BA!

Hope this sheds some light on why we do a DD only, and no BAs.  BAs are wonderful for many things where you need absolutely small sizes, but when it comes to accuracy like in the CSD plots - they will always be a distant second place.

Go with a quality, single DD (like our entire IEM line) and just smile knowing you've bypassed a ton of complexity, sonic confusion, and cost (BAs aren't cheap!) to experience some of the best sound you'll ever encounter.