Wednesday, December 12, 2012

Analog test-anti-test path

Earlier in Street Smart Analog Lingo I mentioned an analog test-bus or analog test-path.  These animals are excellent for debugging silicon, particularly in deep-submicron processes where probe pads are REALLY huge.  A 2u probe-pad was no big deal back in the day, but that area is really valuable in geometries below 0.13u.

The test-path issue came up recently, so I figured I would blog about this useful debug tool.

The Good:
DC analog signals such as currents and voltage references can be sent on/off chip, helping to isolate DC bias problems.  An "analog mux" is placed on the test-bus: normally each block has a little mux that allows a signal to be passed from the INSIDE to a PAD on the outside of the chip.  Outside the chip, the appropriate test device or current/voltage source can be attached to the test pin.  This is useful for tuning band-gaps and bias generators, and for debugging low-frequency clocks or slow-speed ADCs.

High-speed (differential) signals can also be sent out an analog test-bus.  These are trickier to deal with, but I have seen an 800MHz test-bus employed on a 12Gbps receiver (ISSCC 2006 - Keyeye 12Gbps).  That test-bus had a dedicated output buffer created from a thin-oxide PMOS transistor.  This was a "source-follower" with a several-kOhm off-chip resistor as its load.  With the correct (~10V) power supply, the circuit could be tuned to an impedance of 50 ohms to match the board trace.  The poor little transistor was biased well beyond 10-year lifetime limits; however, it allowed us to "tune in" our analog DFE, NEXT and ECHO cancellers.  This circuit also made a fine figure for our ISSCC paper.  We achieved about 8-bit linearity with a bandwidth of nearly 800MHz.  If you left it on too long or raised the voltage too high, the chip would blow.  The eye got cleaner until it popped.  Later on we included EQ on scope capture data to reduce the burn-out problem.
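
For anyone wondering why the supply voltage "tunes" the impedance: the output impedance of a source follower is roughly 1/gm, and gm rises with the bias current that the off-chip resistor pulls through the device.  Here is a rough sketch of that relationship, assuming a simple square-law transistor.  The device constant `k_dev`, the source voltage `v_source` and every number below are made up for illustration; they are not from the Keyeye design.

```python
import math

def follower_rout(v_supply, v_source, r_pullup, k_dev):
    """Small-signal output resistance seen at the pad (ohms).

    Assumes a square-law device: I = 0.5 * k_dev * Vov^2, so gm = sqrt(2 * k_dev * I).
    The off-chip pull-up resistor appears in parallel with 1/gm.
    """
    i_bias = max(v_supply - v_source, 0.0) / r_pullup
    gm = math.sqrt(2.0 * k_dev * i_bias)
    return 1.0 / (gm + 1.0 / r_pullup)

def supply_for_rout(target_rout, v_source, r_pullup, k_dev):
    """Supply voltage that biases the follower to the target output resistance."""
    gm_needed = 1.0 / target_rout - 1.0 / r_pullup
    i_bias = gm_needed ** 2 / (2.0 * k_dev)
    return v_source + i_bias * r_pullup

# Hypothetical values: 3 kOhm pull-up, pad sitting near 1 V, k_dev = 66 mA/V^2.
v_needed = supply_for_rout(50.0, v_source=1.0, r_pullup=3000.0, k_dev=0.066)
print(f"supply for 50 ohms: {v_needed:.2f} V")
print(f"check: {follower_rout(v_needed, 1.0, 3000.0, 0.066):.2f} ohms")
```

Raising the supply raises the bias current and lowers the impedance, which is also why it is so tempting to keep turning the knob until the transistor pops.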

Medium-bandwidth signals can also be sent through a mux into the front-end of a receiver.  The transformer in an Ethernet chip served a dual purpose as a balun.  You could put a single-ended RF generator (with associated filter network) on the differential input side of the transformer.  Then on the "chip side" you could adjust the center-tap to give whatever common-mode was required for the internal block being tested.  A "leap-frog" test-path was included to send the signals to the various front-end blocks, helping to debug harmonic-distortion problems, AGC ranges, low-pass filter bandwidths and ADC linearity.  This path should be simulated before tape-out.

One advantage of an analog test-bus is that you can always disconnect it in a metal-rev, so reliability is not a concern, especially in the early stages of analog-front-end (AFE) bring-up.

The Bad:
I have also seen the analog test-bus cause failures.  These failures are subtle, but that is the point of street smart analog.  The test-bus needs to be verified like any other circuit.  Neglecting to do so can cause bad things to happen.

The ultimate sin of the "test-bus" is to reduce the performance of the circuit's primary function.

Failure #1:  Some pads on chips have voltages that go "above the rail".  These are called "open-drain" pads, where an off-chip pull-up resistor or transformer is required to supply current.  A common mistake is to connect a PMOS switch to the pad with its body tied to the chip supply.  If you take a PMOS terminal above the highest supply, a diode will turn on inside the chip and steal current in its characteristic nonlinear, temperature-dependent way, often puzzling the layman.  Also, these parasitic diodes can blow.  We learn in college that the PMOS body needs to be connected to the highest supply.  (Source-body connections are do-able, but they have pitfalls, are tricky, and may affect a circuit in its normal mode.)  So as a general rule, unless you really have to, never use a PMOS switch, especially if you have an open-drain pad or a transformer.  Dan Ray said "No P on the Pad".  Notice my " ad", it has no P.

Failure #2:  Bad neighbor behavior.  What I mean by this is that several blocks normally share an analog test-bus such as a "DC" bus.  There is a desire to prevent noise from coupling back in from the test-bus, so often we would employ a "T" switch.  This is a switch that consists of a T network with three switches.  When the bus is "off", the middle switch is on and shunts noise away so it cannot couple through.  When the test-bus is "on", the middle switch is off and the two outer switches connect the internal node to the outside.  I have seen a case where someone left out one of the switches in the T.  So when that block's test-bus connection was disabled, the bus was pulled to ground, preventing other blocks from using it.  So if you have an analog test-bus, a "test-case" should include "open".  I would do this in simulation by loading the test-bus with a 1Meg resistor to a mid-rail voltage.  You can also pull the resistor above the rail (on an open-drain pin) to check for P on the pad if that is a concern.
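
The "open" test-case can be sketched with nothing more than a resistor divider.  This is a toy model, not a substitute for a real simulation: it assumes the missing switch is the bus-side series switch, so a disabled block's shunt switch ends up tying the bus straight to ground.  The switch on-resistance and the 3.3V rail are assumptions for illustration; the 1Meg load to mid-rail is the one described above.

```python
def bus_voltage_all_disabled(has_bus_side_switch, v_mid, r_load=1e6, r_on=100.0):
    """Bus voltage with every block disabled and a 1Meg load to mid-rail.

    has_bus_side_switch: one bool per block, False if that block's T switch
    is missing its bus-side series switch (the assumed failure).
    """
    g_load = 1.0 / r_load
    # A disabled block with a complete T is an open circuit (ideal switches).
    # A block missing its bus-side switch shorts the bus to ground through
    # the shunt switch on-resistance.
    g_gnd = sum(1.0 / r_on for ok in has_bus_side_switch if not ok)
    return v_mid * g_load / (g_load + g_gnd)

def open_test_passes(has_bus_side_switch, v_mid, tol=0.05):
    """The bus should float to the mid-rail load voltage when nobody drives it."""
    v_bus = bus_voltage_all_disabled(has_bus_side_switch, v_mid)
    return abs(v_bus - v_mid) <= tol * v_mid

v_mid = 3.3 / 2
print(open_test_passes([True, True, True], v_mid))   # all T switches complete
print(open_test_passes([True, False, True], v_mid))  # one block missing a switch
```

The high-value load resistor is the point of the test: a 1Meg pull is weak enough that any stray path to ground on a supposedly open bus shows up immediately.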

Failure #3: Low priority verification.  The first shot at that 800MHz differential test-bus did not work all that well.  We had hired an excellent consultant to design a repeater to send a signal to the source-follower pad.  This IP never did make the first tape-out.  The focus was on tape-out and verification of the main function, but the missing test-bus prevented debug later on, forcing a quicker spin.  So if you are going to put a test-bus in, you should "Do it like you mean it" and verify it too.  If there are buffers, they should be design-reviewed and plot-reviewed.  The test-bus methodology should be done "up front" in the design and not snuck in at the last moment, since it could ruin your floor plan.  Thinking ahead and planning are always a good idea when it comes to analog chip design.  You can try to substitute long hours but you'll always lose to the thinker-planner.  Think tortoise and hare...

Keyeye Ref:  http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1696054

Sunday, December 2, 2012

Missing Teeth

With switched-capacitor circuits, one of the most critical parts of the design is the clock generator.  As a friend of mine once said:

"When your switched-capacitor circuit doesn't work, check your clocks.  After that, check your clocks again."   (Perry Heedley-1998)

Back in 1999 we had our first-generation gigabit SOC back in the lab.  The process was 0.35u.  Supply 3.3V.  We had a strange problem with non-uniform sampling.  When we sent the clock out the "test-bus" we saw that it had missing pulses.  Missing pulses are not a good thing and the Flash ADC ENOB was terrible.  Lots of tones!  On the scope the clock looked like a boxer who was missing teeth.  We also had supply dependence, where high supply and cold spray made it worse.  What was going on?

On Friday I met a new friend who had a similar story.  (So sorry buddy!)  This inspired me to write this blog post on this common screw-up.  If I have seen one common mess-up, it is that something goes wrong with a reference clock.

In these larger Ethernet chips, we distribute the clock as a differential signal.  The advantage of going differential is that the signal is not affected by clock skew and the rise/fall times match perfectly (by design).  If you distribute critical clocks with single-ended circuits, stop reading now since you are hopeless.  The differential approach gives you a uniform sensitivity to noise on the chip and in the environment (see Ali Hajimiri's wonderfully written "Low-Noise Oscillators").  Another advantage of using a differential clock is that ideally you can send it across power-supply domains (when things are normal).

Now if you want a good non-overlapping clock, it's easy to go overboard.  Normally you have a "non-overlapping clock generator".  It's a circuit whose job is to make sure a set of clocks do not occur at the same time.  A trade-off in those designs is the rise/fall time.  If the clock coming out of the block has a fast rise and fall time, the clocks are less apt to overlap.  However, this comes at a cost.  The designer keeps increasing the size of the generator to make the output edges faster and faster, eventually coming to a solution.  There is a trade-off between non-overlap time and Operational Transconductance Amplifier (OTA) settling.  It almost always seems easier to use big clock buffer transistors than to beef up your amplifier bandwidth.

A huge pitfall of these "massive" clock generators is that they can generate huge amounts of noise and "ground bounce".  Or as Stephen Lewis (UC Davis) would say, "Making sparks".  The huge clock buffer circuits create massive amounts of dI/dt.  Huge current spikes with peaks approaching an amp can find their way into your big clock buffer.  These currents hit your package (with inductance), which translates them into huge voltage spikes.
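
The arithmetic behind those voltage spikes is just V = L * dI/dt.  Here is a back-of-the-envelope sketch; the 2nH bondwire inductance and 1ns edge time are assumed round numbers, not measurements from the chip, and mutual inductance between parallel wires is ignored.

```python
def ground_bounce(l_bondwire, delta_i, delta_t, n_parallel=1):
    """Voltage spike (V) across the package inductance: V = L * dI/dt.

    n_parallel splits the inductance across identical parallel bondwires,
    ignoring mutual coupling (an optimistic assumption).
    """
    l_eff = l_bondwire / n_parallel
    return l_eff * delta_i / delta_t

# Assumed: 2 nH per bondwire, a 1 A current spike, a 1 ns edge.
v_single = ground_bounce(2e-9, 1.0, 1e-9)
v_four = ground_bounce(2e-9, 1.0, 1e-9, n_parallel=4)
print(f"one bondwire:   {v_single:.2f} V")
print(f"four bondwires: {v_four:.2f} V")
```

With numbers like these it does not take much to get bounce measured in volts, and adding ground pins only divides the problem down; it does not make it go away.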

When it comes to "noisy neighbors" on a chip, it always takes an aggressor and a receptor.  In this case, I was able to debug this animal by putting the clock-generator into a schematic along with a simple package model consisting of package inductance.  I then put the clock source on a different power-supply in my schematic to see what happened.  I did this by hand in HSPICE since I am not the hugest fan of schematic capture.  I did this hand-written test-bench in real time in the lab, right next to an oscilloscope with the bad clock on it.  It was me, Sailesh Rao, Jim Parker and Dave Nack all gathered around the setup.  I kept tweaking the test-bench and the Q factor (4) on the bondwires until BINGO.  I was able to match the waveform from the scope in HSPICE.  High-five from Dr. Rao!  What happened?
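
If you want to set up a similar bondwire model, one way to turn a Q factor into a component value is to treat the bondwire and the on-chip supply capacitance as a series RLC loop, where Q = sqrt(L/C) / R.  This is a generic sketch, not the actual test-bench from the lab; only the Q of 4 comes from the story above, and the inductance and capacitance are assumed placeholders.

```python
import math

def bondwire_series_r(l_bond, c_onchip, q_factor):
    """Series resistance (ohms) that gives a series RLC loop the requested Q."""
    return math.sqrt(l_bond / c_onchip) / q_factor

def ring_frequency(l_bond, c_onchip):
    """Natural ringing frequency (Hz) of the bondwire / on-chip capacitance loop."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_bond * c_onchip))

# Assumed: 2 nH bondwire, 50 pF of on-chip supply capacitance, Q = 4.
r_series = bondwire_series_r(2e-9, 50e-12, 4.0)
f_ring = ring_frequency(2e-9, 50e-12)
print(f"series R for Q=4: {r_series:.2f} ohms")
print(f"ring frequency:   {f_ring / 1e6:.0f} MHz")
```

Matching both the ringing frequency and how fast the ringing dies out on the scope is what pins down L, C and R in a model like this.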

The ground-bounce was so big that it was measured in VOLTS.  Yes, our 3.3V supply had volts of ground-bounce on it from a huge clock generator.  By increasing the temperature or lowering the supply on the clock generator, we could work around the problem.  This part wasn't going to sample in this state.  The ground-bounce from the uber-big clockgen was too big!

The main PLL and the ADC with the uber-clockgen were on different power supply pins.  Analog guys like to use A BUNCH of power supplies, normally to keep noise from coupling around.  However, this can sometimes backfire.  When breaking up power supplies, it's important to visualize the return paths of all the currents and how they will affect each other.  In this case, the PLL sent the clock to the ADC, which caused so much ground-bounce that the buffer amplifier receiving the clock in the ADC missed pulses.  This happened because the amplifier only had a common-mode range of about a volt, with more than a volt of ground bounce between the supplies.
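
Here is a toy model of how the "missing teeth" come about.  The receiving buffer only sees an edge if the incoming common-mode, measured against its own bouncing ground, stays inside its common-mode window.  The roughly one-volt window and the more-than-a-volt bounce follow the description above; the specific voltages and the per-edge bounce values are invented for illustration.

```python
def edges_received(bounce_per_edge, v_cm_sent, cm_lo, cm_hi):
    """Return one bool per clock edge: True if the receiver catches it.

    bounce_per_edge: receiver ground shift (V) relative to the sender's
    ground at each clock edge.
    """
    received = []
    for bounce in bounce_per_edge:
        v_cm_seen = v_cm_sent - bounce  # common-mode as the receiver sees it
        received.append(cm_lo <= v_cm_seen <= cm_hi)
    return received

# Assumed: PLL sends a 1.6 V common-mode; the buffer accepts 1.0 V to 2.0 V.
bounce = [0.1, 0.2, 1.3, 0.1, 1.5, 0.2]
got = edges_received(bounce, v_cm_sent=1.6, cm_lo=1.0, cm_hi=2.0)
print(got)
print("missing pulses:", got.count(False))
```

Either shrinking the bounce or widening the window fixes the model, which maps directly onto the two families of fixes for the real chip.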

So now, hopefully everyone knows that you can make a clock-generator "too big".  A technique for finding these is to just turn on the base layers in your layout and look for huge MOSFETs.  Always ask yourself why you have a big transistor, since everything in the area will know about it.  Also, people should be aware that more supplies are not always better.

So what is a solution?
A.  NERF your clockgen - Simulate it with bondwires
B.  Add on-chip bypass capacitors to prevent dI/dT from hitting the bondwire
C.  Improve the common-mode range of your clock buffer.
D.  Design a set of inter-supply "repeaters" with huge common-mode range
E.  Use DC Blocking capacitors
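
For option B, a first-pass estimate of how much on-chip bypass capacitance is needed comes from charge conservation: the capacitor has to supply the charge in the current spike without the local supply drooping too far.  This sketch assumes a triangular current pulse and ignores whatever the bondwire delivers during the edge; all the numbers are placeholders.

```python
def bypass_cap_for_droop(i_peak, t_pulse, max_droop):
    """On-chip bypass capacitance (F) that supplies one current spike.

    Assumes a triangular pulse, so the charge is 0.5 * i_peak * t_pulse,
    and that all of it comes from the capacitor: C = Q / dV.
    """
    charge = 0.5 * i_peak * t_pulse
    return charge / max_droop

# Assumed: 1 A peak, 1 ns wide spike, 100 mV of allowed droop.
c_needed = bypass_cap_for_droop(1.0, 1e-9, 0.1)
print(f"bypass capacitance: {c_needed * 1e9:.1f} nF")
```

Nanofarads of on-chip capacitance cost real area, which is why option A, shrinking the clockgen so the spike is smaller to begin with, goes hand in hand with option B.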

We solved this one with A and B.  The ADC worked much better after we fixed that.  We still had more challenges but....

"When your switched-capacitor circuit doesn't work, check your clocks.  After that, check your clocks again."