Street Smart Analog: November 2012

Sunday, November 18, 2012

Street Smart Analog Terminology

In everyday life I basically use these terms over and over again. I assume that everyone knows what these things mean, but that is probably not the case. So I am going to publish and maintain a list of my "Streetsmart Analog Lingo".

Most of this lingo comes from other smart people in this business. I can't give credit to everyone. The late Dan Ray was an expert in this area. Dan was a founder at Level One and one of my first mentors. He was awesome in analog. Tim Dyer (my identical twin brother), Perry Heedley (CSUS), David Viera, Patrick Isakanian, Paul Hurst, Stephen Lewis, Bob Pease, and Dave Nack contributed some of these over the years.

Street-smart Analog Lingo:

A1 Release: Release of silicon on the very first version. This happens very rarely, maybe 1% of the time with analog circuits. Assuming it will happen is unrealistic and can actually be discouraging.

All Layer: A design that requires all mask layers to be changed or all new masks

APR: Automatic place-and-route. Machine generated layout. Also called DDA (Digital Design Automation)

Antenna: Long piece of metal touching gate-poly - can damage poly leading to huge offsets. Also can be a single-ended wire (test-point) on a circuit board operating at over 200 MHz.

Bake It: Temperature cycle

Boomerang: Bad evaluation board returned from the customer

Brown-thumb: A designer who uses "unconventional" techniques or "special tricks" to do design. Often these characters are associated with unreliable circuit designs and poor execution. Also associated with using poor methodology and bad practice.

Change layer: Metal layer dedicated for changes/programmability

Carpet Bomb It: see NFS

Chip Designer: There are only chip designers. All block designers should be interested in how their circuit affects the chip they are working on. Especially important in a System on a Chip (SOC).

Cup-cake: Cross sectional shape of copper metalization

Expert-layout schematic: an analog schematic without any layout hints or notes

E^Overnight: Bit-error rate test requiring no errors if left overnight.

Fib Slut: A part that has been in and out of a "Focused Ion Beam" (FIB) more than twice

Follow the Dollar: The process of following the customer's money to your paycheck. It should be easy to "follow the Dollar" unless you know you are operating at a loss.

FOS Schedule: Full of shXt schedule. Normally used to get management pregnant on a chip design program

50% Schedule: Project schedule that requires everything to go perfectly. No competent marketing person or design manager ever commits to a 50% schedule unless it's a FOS Schedule.

Hair-dryer: Heat gun

Hare: high-power low-impedance approach to design. Opposite of tortoise.

Hidden state: Circuit state not designed - normally from a bad reset circuit. Often appears when an over-confident analog person does digital design.

Leapfrog test-path: Analog test-path that allows blocks in an analog signal chain to be bypassed for debug.

Luck: When the mistakes you made didn't matter

Irregular layout: Any circuit block that is not square or rectangular. Also called "donut block" or "block with a tit on it"-Dan Ray.

Magic-fingers: The opposite of brown-thumb. Someone who executes

Magic-circuit: Circuit designed by someone who doesn't understand it. Often a Brown-thumb.

Magic smoke: When it leaves the chip it no longer works.

Maskview: Job-deck view which is a manual mask check. Ideally a "Zen" moment and never to be done in a panic.

Metal-up; Metal: A design that requires just metal changes - quicker and bypasses HTOL. Often a way to patch a design for a quick fab turn.

MAS Document: Micro-architecture spec document or "chip Bible". So useful but so shunned, a one-stop shop for circuit-block and interface information.

Nack Hack: {named after the late Dave Nack} A circuit board without a "toe-tag" containing unknown changes or hacks.

Nail: Type of die probe that is simply a piece of metal

Noisy Neighbor: Noisy circuit block interacting with nearby circuits

NFS: Nuke it From Space {Aliens II}. A circuit that is flakey or has unknown issues that should be completely re-designed. A circuit or a layout that is fundamentally flawed.

Onion Peel: (peel the onion) When a chip revision comes back and a new problem is uncovered (often hidden by what was fixed)

Pencil Tap: Before GPIB and Labview we would tap knobs with pencils for fine-tuning

Pizza Mask: A multi-reticle run of silicon. Can be "metal-up" or all-layer. Normally used for debug or system level designs without a full set of models.

Poke it with a stick: Low risk vehicle in trying a new technique or new process. Also used in debug to determine if the problem is sensitive to external stimulus.

Popcorn: Moisture in the package that causes trouble in re-flow, splitting the part

Put a fork in it: Basically done, any more effort spent on it is wasted

Leakage: sub-threshold drain-source current in MOS that makes a sensitive temperature sensor.

Relentless beating: To solve a problem with several simultaneous solutions

Sim Slave: Design resource asked to do simulations without understanding

Smoke Test: First power-up of new silicon

Spoiled-via: Connection between metals that is open or flakey

Tape-out: Sending the plans of the chip to the FAB for mask generation. An important milestone for non-experts but meaningless for true experts, since you may have a "Turd"

Team Analog: Design, debug or architecture work done in an interactive manner.

Testbus: Test path snaked through an analog design that goes to an external pin for debugging

Trainwreck: When two layouts crash into each other. Also a machine-generated schematic or one in which the wires cross.

Thump test: Finding a signal integrity problem (loose connection) by thumping your fist on the bench.

Turd: An incompletely verified piece of silicon. Due to time-constraints, not all simulations are run.

Works by accident: Analog circuit or subsystem with a flaw that works fine anyway. For example, a critical layout sensitivity that happens to balance, and that may fail by surprise when the layout is later edited.

Works in simulations: Analog always works in sims... famous last words.

You can't bullshit electrons: Just because you designed it doesn't mean it will work

Zap: ESD testing

Zorch: Catastrophic failure of a DC DC converter IC. (Parasitic Zener)

Thursday, November 15, 2012

Hiroshi's Desk

Trust is the glue that holds business relationships together.

Today I made a visit to R2 Semiconductor, where I visited an old friend and met a new one. I saw intelligence, perseverance, and focus. Both are very focused and technically solid people; you often find the cream of the crop in small outfits like R2. It's tough working in a start-up. There are so many issues to deal with, many of them non-technical. I really appreciate what their team has done. My visit reminded me of the story of Hiroshi's wallet.

I joined a start-up, Keyeye, in the late 2002 or early 2003 time-frame. At my previous company we had been doing research and development on communication circuits until that changed. At the time we were trying to start our family, and my wife didn't want to move. Keyeye was one of very few ways to stay in Sacramento and still do cutting-edge mixed-signal design outside of university research. I took a huge pay cut with the goal of making it back in stock.

Hiroshi Takatori was a founder and the CTO of Keyeye at the time. We don't talk much anymore unfortunately, but what I can say about Hiroshi is that he is a brilliant and incredibly hard-working man. Hiroshi basically dedicated a big chunk of his life toward the success of Keyeye. He was very careful in who he hired. He had many criteria, but one was to bring aboard straight-shooters (like himself) and people he could trust. His style is Japanese, and he liked to do all the system simulations himself at his desk, which was located right in the center of our office building. He had no cube walls around his desk; he would sit watching the company work from his central location. We all had cube walls, fortunately. You could not go into the break-room, the front door or the lab without passing by his desk. He pretty much had the same set of items on his desk all the time: his computer, butcher paper (for system diagrams), a bucket of pens, FORTRAN print-outs, a container of dried seaweed and his wallet sitting on the edge of the desk.

What I found interesting was that over the first 3 years I worked there (before the move) his wallet basically sat in the same place every day. It was a fat wallet with lots of notes, business cards and money popping out the sides. It was always there, always in the same spot. We would all walk by it multiple times every day. Guests visiting Keyeye would sometimes comment on it since it was so big and bulky looking; no wonder he didn't leave it in his pocket.

Nobody ever touched Hiroshi's wallet. We all feared the dried seaweed.

I found that Hiroshi's wallet symbolized one of the key elements of character in a start-up, which is trust. When in a start-up you wear many hats and do many functions. You focus on the success of the company your funding partners are helping you to create. There are few checks and balances. Your responsibility is huge, and your risk is high. If your character is weak, then you do not belong there. You do not deserve the responsibility. You need to trust each other. Your funding partners need to trust you to deliver. If you are ever at a start-up and looking to hire someone, ask yourself whether you would trust them with your wallet. If not, then keep looking.

Sunday, November 11, 2012

$100,000+ Frizbee

After an IC design is completed, the plans are "taped-out" to a FAB that processes the wafer. The first step is to generate the masks. Depending on the type of process there could be from 10 to over 40 masks. Each mask, combined with photoresist and a light source, is used to pattern a layer on a wafer. These patterns together form transistors, capacitors, resistors, inductors, diodes and the interconnect layers used to connect the elements together.

The cost of a wafer depends on several things, but for an older process technology, say something 15 years old, the wafer cost may be between $600 and $1000 each. Now if you have a small die, you can get thousands of ICs (or dice) on the wafer, giving them a cost of pennies each. If you can shrink the design (with a more advanced process) you can fit more die on a wafer and lower the cost. Yield also tends to improve with a smaller die size, since each die is less likely to land on a defect. This all makes sense and is documented in many a textbook. However this is when things go right...
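The "pennies each" arithmetic above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the 200 mm wafer diameter, the die areas and the defect density are numbers I picked for the example, the die-per-wafer formula is a common first-order textbook estimate, and the Poisson yield model is one simple model among several. Only the $600 to $1000 wafer cost range comes from the text.

```python
import math


def gross_die_per_wafer(wafer_diameter_mm, die_area_mm2):
    """First-order estimate: wafer area / die area, minus an edge-loss term."""
    radius_mm = wafer_diameter_mm / 2.0
    whole_die = math.pi * radius_mm ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(whole_die - edge_loss)


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Simple Poisson model: fraction of die that contain zero defects."""
    die_area_cm2 = die_area_mm2 / 100.0
    return math.exp(-die_area_cm2 * defects_per_cm2)


def cost_per_good_die(wafer_cost, wafer_diameter_mm, die_area_mm2, defects_per_cm2):
    """Wafer cost spread over the die expected to be defect-free."""
    good_die = gross_die_per_wafer(wafer_diameter_mm, die_area_mm2) * poisson_yield(
        die_area_mm2, defects_per_cm2
    )
    return wafer_cost / good_die


# Assumed example values (not from the original post, except the wafer cost range).
WAFER_COST = 800.0        # dollars, inside the $600-$1000 range quoted above
WAFER_DIAMETER_MM = 200.0 # assumption
DEFECTS_PER_CM2 = 0.5     # assumption

for area_mm2 in (4.0, 16.0, 64.0):
    dice = gross_die_per_wafer(WAFER_DIAMETER_MM, area_mm2)
    y = poisson_yield(area_mm2, DEFECTS_PER_CM2)
    cost = cost_per_good_die(WAFER_COST, WAFER_DIAMETER_MM, area_mm2, DEFECTS_PER_CM2)
    print(f"{area_mm2:5.1f} mm^2: {dice:5d} gross die, yield {y:.2f}, ${cost:.2f} per good die")
```

With these assumed numbers a small die comes out at well under a dollar, and both the die count and the yield get worse as the die area grows, which is the trend described above.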

Several times in my career I have seen the misprocessed wafer. Normally you are waiting for the wafer to get back from the FAB and you get a funny email. There are WAT (wafer acceptance test) structures on the wafer, normally off to the side or between the "dice". The FAB probes these structures and records the data. They compare the WAT data to a specification table. If there is a problem, then the FAB lets you know; this is all part of their quality control process. Of course, how bad the failure is and how far off spec it is are important. I will discuss one such case. This pretty much represents every case I have seen with bad material.

Case#1: Year~1999. Process 0.13u. Failure: "Transistor threshold off due to incorrect oxide thickness module". In the FAB, the process steps are often called "modules". These modules are sometimes mixed up or done incorrectly. In this case, the wrong oxide was used for the IO transistors, which were also used inside the analog front-end. The FAB apologized and was making new material. Now, I was young and new in my career at the time. The chip was a "huge" SOC with more than 16 million transistors. About 1/2 of the content was analog, 1/2 digital. We all worked hard and had been waiting for months to get the silicon on the bench. I thought, "What could it hurt to get an early look?"

Got the packaged parts and plugged in the first one, and nothing happened. Plugged in another, nothing. Anyone who knows me understands I don't give up easy, so I asked for a "pile". After going through 20 I found one that "wiggled" or gave evidence of activity. We then trained a tech in the screen process and out of 100 parts we found 3 that wiggled. Only one of the 3 actually did anything interesting. The bad parts had a strange problem in that the IO pads would oscillate in different patterns at about 100 Hz.

Why was the chip IO oscillating at 100 Hz? Why was the chip performance so bad? Was it due to the process mistake or was there a problem in the design? We have an early look, so let's use this time while we wait for new material. Since it was a base-layer screw-up at the FAB it took 2 months to get new material, so we plowed forward. We found that parts with different packages had better or worse yield. The ADC (which I did the architecture for) worked fine on the good parts. However, other strange behavior existed. So we took the 100 Hz problem and decided to debug it.

The team was me and 3 other people. We used the "company" debugging process. The team worked for two months (32 man-weeks) to find out what was going on. We started with an "ebeam" prober, which tracks activity at junctions in the IC in a vacuum. We isolated a section of the pad-ring called JTAG, also known as "boundary scan". Since we didn't scan the analog, we could still investigate the analog while the digital was not available because the IO interface was oscillating. We used FIB (Focused Ion Beam) to isolate the elements of the JTAG circuitry. To get to this point took us about 4 man-months of work with the expensive debug equipment. We finally got to a point where we found a logic gate that appeared to have a floating input. I got an HSPICE simulation to demonstrate that the floating gate and its surrounding layout can oscillate at around 100 Hz. I thought we had found our smoking gun. We had our FA (Failure Analysis) FIB expert cut the metals around the gate to identify what appeared to be a bad via (connection between layers). We then sent the photo to the FAB (at 24 man-weeks of debug) to ask if this was related to the oxide defect. The FAB said they didn't believe the photo since we did the FA ourselves. So we went back, found another part, isolated the bad spot and had the FAB do the FA. Now we were 32 man-weeks into the debug, and the new wafers were due back soon. I got a "sheepish" email from the FAB saying that the bad oxide layer affected the vias (don't ask me how). Attached was a photo of the bad spot without any name or record of the FAB or the design. The open via caused a "relaxation" oscillator to be formed by a combination of gate leakage and parasitic coupling in a logic gate in the JTAG circuitry.
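To give a feel for why a floating node on a chip can oscillate as slowly as 100 Hz, here is a back-of-the-envelope sketch. It is not the HSPICE simulation mentioned above, and every number in it is an assumption chosen for illustration. It models an idealized sawtooth in which a constant leakage current charges a node capacitance through a fixed voltage swing and then resets instantly, so the only physics used is t = C * dV / I.

```python
def ramp_frequency_hz(leak_current_a, node_cap_f, swing_v):
    """Frequency of an idealized sawtooth relaxation oscillator.

    A constant current charges the node capacitance through swing_v,
    then the node is assumed to reset instantly (reset time ignored).
    """
    ramp_time_s = node_cap_f * swing_v / leak_current_a
    return 1.0 / ramp_time_s


# Hypothetical values, not measurements from the chip in the story:
LEAK_CURRENT_A = 1e-12  # 1 pA of leakage onto the floating node (assumption)
NODE_CAP_F = 10e-15     # 10 fF of gate plus parasitic capacitance (assumption)
SWING_V = 1.0           # 1 V between the gate's switching thresholds (assumption)

freq = ramp_frequency_hz(LEAK_CURRENT_A, NODE_CAP_F, SWING_V)
print(f"Estimated oscillation frequency: {freq:.0f} Hz")
```

The values were picked to land near 100 Hz. The point is only that picoamps of leakage into femtofarads of capacitance naturally give time constants in the millisecond range, far slower than anything the logic was designed to do.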

The new material showed up and the part worked as designed.

So what did we learn during the 32 man-week exercise? We learned that the misprocessed wafer caused the problem. During this time the FA team left other priorities aside. Schedules slipped on the next generation part. Engineers and management were worried about the design, yet we learned nothing new. Or did we? It certainly was educational. What else could have been done with those resources is never to be known. What other things could we have done with that company time and money?

Well, I sure learned something. We spent over $100,000 in labor and FA to prove that a bad wafer was bad. We also proved that this delayed the progress of the team and hurt the next generation part. The wafers were scrapped, or "Frizbees".

Now, be careful if you work with me and have a "known bad wafer" shipped. I am very clumsy around those things these days. I tend to smash them against the wall or throw them in the parking lot. It's hard enough to get a mixed-signal IC working when it is processed correctly, but when it's not, it's pointless, especially with more complex designs.

Hopefully I just saved someone a few hundred thousand dollars... I have never seen the damage of a mis-processed wafer debug be any cheaper. I have seen it a total of three times in my career, and in every case greed, impatient people and disappointed customers were involved. In no case was it ever worth the effort.