Monday 21 July 2014

Electromigration & Self Heating



Electromigration 

Electromigration (EM) is the movement of material that results from the transfer of momentum between electrons and metal atoms under an applied electric field. This momentum transfer displaces the metal atoms from their original positions. The effect increases with current density in a wire, and at higher temperatures the momentum transfer becomes more severe. Thus, in sub-100nm designs, with higher device currents, narrower wires, and increasing on-die temperatures, the reliability of interconnects and their possible degradation from EM is a serious concern.

The transfer of metal ions over time from EM can lead to either narrowing or hillocks (bumps) in the wires. Narrowing of the wire can degrade performance or, in extreme cases, completely open the conduction path. Hillocks and bumps in the wire can cause shorts to neighboring wires, especially if they are routed at the minimum pitch in the newer technologies.

Foundries typically specify the maximum amount of current that can flow through a wire under varying conditions. These EM limits depend on several design parameters, such as wire topology, width, and metal density. EM degradation and EM limits depend on the temperature at which interconnects operate, as well as on the material properties of the wires and vias, on the direction of current flow in the wire, and on the distance of the wire segment from the driver(s).

One common EM check employed is to measure the average or DC current density flowing through a wire and compare it against foundry-specified limits.
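As an illustration, the average-current EM check could be sketched as below. The current-density limit and the wire dimensions are made-up numbers, not from any real foundry deck:

```python
# Sketch of a simple EM average-current-density check.
# The limit (2.0 mA/um^2) and dimensions are illustrative only.
def em_check(i_avg_ma, width_um, thickness_um, j_limit_ma_per_um2=2.0):
    """Compare average current density in a wire against an EM limit."""
    area = width_um * thickness_um      # wire cross-section in um^2
    j = i_avg_ma / area                 # current density in mA/um^2
    return j, j <= j_limit_ma_per_um2

# A 0.1 um wide, 0.2 um thick wire carrying 0.05 mA on average
# has a density of about 2.5 mA/um^2, which exceeds the limit:
j, ok = em_check(0.05, 0.1, 0.2)
```

Widening the wire (the first fix method below) lowers the density back under the limit.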

i) EM is a physical design issue.
ii) Under high electric fields, fast-moving electrons transfer momentum to the metal ions forming the interconnect, thereby eroding the interconnect.
iii) EM takes place over a long period of time and becomes a reliability issue, a factor in determining the lifetime of an IC.
iv) It usually occurs in interconnects where electron movement is continuous and unidirectional (e.g., the power network of an IC).
v) As technology nodes keep shrinking, EM is becoming a physical roadblock in the scaling of interconnects.

Fix methods:

  1. Increasing the metal width to reduce the current density is a typical solution.
  2. For a via EM violation, increase the number of vias.
  3. Add additional straps for the current supply.
  4. Layer switching is another option; upper metal layers in the technology typically have higher current-driving capability due to their greater thickness.

Self heating


i) It is a physical design issue.
ii) It takes place in the output nodes/interconnects of circuits that charge and discharge frequently.
iii) It leads to other heat-related problems, such as an increase in the resistance of the interconnect and hence an increase in the charging time of the node.
iv) It also causes thermal reliability issues.
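The resistance increase in point (iii) can be sketched with the standard linear temperature-coefficient model R(T) = R0 * (1 + alpha * dT). The coefficient for copper is a textbook value; the base resistance and temperature rise below are made up:

```python
# Sketch: interconnect resistance rise due to self-heating, using
# R(T) = R0 * (1 + alpha * dT). alpha ~0.0039 /degC is the textbook
# value for copper; R0 and dT are illustrative.
def heated_resistance(r0_ohm, delta_t_c, alpha=0.0039):
    return r0_ohm * (1 + alpha * delta_t_c)

# A 10-ohm net that self-heats by 40 degC rises to ~11.56 ohm,
# lengthening the RC charging time of the node proportionally.
r = heated_resistance(10.0, 40.0)
```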

Monday 14 July 2014

Why does hold not depend on clock frequency?


If there is a setup violation, the frequency of the chip can be reduced and the chip can still function; but if there is a hold violation, the chip is lost forever. This is one of the most used phrases in physical design. But have you ever analysed why hold violations do not depend on the frequency of the chip? First, let us understand setup and hold checks completely. Imagine data travelling from FF1 to FF2 as shown in the figure.


Look at the timing diagram below. Data1 (clock cycle 1 data of FF1) is being sampled at FF2 in clock cycle 2, while Data2 (clock cycle 2 data of FF1) is already on its way to FF2.


 

From the figure:

Setup check: the data launched from FF1 at cycle 1 should reach FF2 in cycle 2 before the setup time of FF2. Equation: Tc2q(FF1) + Tcomb <= Tclk - Tsetup.

Hold check: the current data (Data2), launched from FF1 at cycle 2, should not arrive at FF2 in cycle 2 before the hold time of FF2, because that would corrupt Data1, which is currently being captured by FF2 in cycle 2. In other words, Data2 from FF1 should not interfere with Data1, which is already at FF2 and is currently being sampled. Equation: Tc2q(FF1) + Tcomb >= Thold.

The clock-to-Q delay of FF1 and Tcomb do not depend on the clock period at all. Hence hold is independent of clock frequency.
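The two checks can be written as slack computations (all times in ns; the numbers are illustrative). Note that the clock period appears only in the setup slack:

```python
# Sketch of the setup and hold slack equations. Positive slack means
# the check is met. Times are in ns; values are illustrative.
def setup_slack(t_clk, t_c2q, t_comb, t_setup):
    # Data must arrive before the capture edge minus the setup time.
    return (t_clk - t_setup) - (t_c2q + t_comb)

def hold_slack(t_c2q, t_comb, t_hold):
    # No t_clk term here -- hold is independent of the clock period.
    return (t_c2q + t_comb) - t_hold

s = setup_slack(t_clk=2.0, t_c2q=0.2, t_comb=1.2, t_setup=0.1)  # positive
h = hold_slack(t_c2q=0.2, t_comb=0.1, t_hold=0.05)              # positive
```

Halving the frequency (doubling t_clk) increases the setup slack but leaves the hold slack untouched, which is exactly the point of this post.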

Routing in a VLSI chip

Routing is an important step in the design of integrated circuits:

  1. It involves generating metal wires to connect the pins of the same signal while obeying manufacturing design rules. 
  2. Before routing is performed on the design, cell placement has to be carried out, wherein the cells used in the design are placed. 
  3. The connections between the pins of the cells pertaining to the same signal need to be made. At the time of placement, there are only logical connections between these pins. 
  4. The physical connections are made by routing. More generally, routing locates a set of wires in the routing space so as to connect all the nets in the netlist, taking into consideration routing channel capacities, wire widths, crossings, etc. 
  5. The objective of routing is to minimize the total wire length and the number of vias while ensuring that each net meets its timing budget. The tools that perform routing are termed routers. 
  6. You typically provide them with a placed netlist along with a list of timing-critical nets. These tools, in turn, provide you with the geometry of all the nets in the design.
VLSI routing is generally considered to be a complex combinatorial problem. Several algorithms have been developed for routing, each having its own pros and cons. The complexity of the routing problem is very high. To make it manageable, most routers usually take a two-step approach of global routing (approximation of routing wires) followed by detailed routing (actual routing of wires).

Global routing: 

Using a global routing algorithm, the router divides the design into tiles, each tile having a limited number of tracks, and generates a "loose" route for each connection by finding tile-to-tile paths (as shown in figure (ii)). The routes are not finalized, but the approximate length is known from the distance between the tiles. For example, if a tile has 12 tracks, the global router can assign at most 12 routes through that tile; the final assignment of tracks within the tile is not done during global routing.
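The tile-to-tile search above can be sketched as a breadth-first search over the tile grid, treating tiles whose track capacity is exhausted as blocked. This is a toy illustration, not how a production global router works:

```python
from collections import deque

# Sketch of tile-to-tile global routing: BFS over a grid of tiles,
# skipping tiles whose track capacity is already used up.
def tile_path(rows, cols, full_tiles, src, dst):
    """Return a shortest tile-to-tile path from src to dst, or None."""
    prev, seen = {}, {src}
    q = deque([src])
    while q:
        r, c = q.popleft()
        if (r, c) == dst:
            path = [(r, c)]
            while path[-1] != src:       # walk back through predecessors
                path.append(prev[path[-1]])
            return path[::-1]
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and \
               nxt not in seen and nxt not in full_tiles:
                seen.add(nxt)
                prev[nxt] = (r, c)
                q.append(nxt)
    return None                          # no capacity-feasible path

# Route around a congested tile at (1, 1) on a 3x3 tile grid:
p = tile_path(3, 3, {(1, 1)}, (0, 0), (2, 2))
```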

Detailed routing: 

Using detailed routing, the router determines the exact route for each net by searching within the tile-to-tile path. It involves providing the actual physical path to a net from one connected pin to another (as shown in figure (iii)). Hence, a detail-routed wire represents the actual resistance, capacitance, and length of the net.

What the router has to take care of: while routing, a router has to adhere to specific constraints, such as the timing budget for each critical net, also called performance constraints. There are other performance constraints too; for example, the router has to route in such a way as not to cause any crosstalk issues, and there should not be any antenna issues. Also, there is a set of design rules covering resistance, capacitance, and wire/via width/spacing that needs to be followed. For instance, a technology may be limited by the minimum feature size it can support: in a 65 nm technology, the foundry cannot have wire widths less than 65 nm, so the wires in the design have to be constrained to widths of at least 65 nm. Similarly, there are foundry-specific constraints for other parameters. Each of these is termed a Design Rule. Any violation of these in the design is termed a DRC (Design Rule Check) violation.

Grid based and gridless routing:

In grid-based routing, a routing grid is superimposed on the routing region, and routing takes place along the grid lines. The space between adjacent grid lines is called the wire pitch and is equal to the sum of the minimum wire width and the minimum wire spacing. On the other hand, any model that does not follow grid-based routing is termed a gridless routing model. This model is suitable for wire sizing and perturbation, but it is more complex and slower than grid-based routing; in other words, grid-based routing is much easier and simpler to implement.

We have discussed here routing in VLSI designs. Although many advanced tools are available for achieving the purpose, most of these compromise the quality of results to save run-time. Almost all tools have the option of routing with more emphasis on meeting timing or on congestion. With most of the tools, in present-day multi-million-gate designs, perfect DRC-free routing (without opens and shorts) is generally not obtained in the first pass; you have to route incrementally a few times to achieve it.

Layout Versus Schematic (LVS)


Layout Versus Schematic (LVS) is the class of electronic design automation (EDA) verification software that determines whether a particular integrated circuit layout corresponds to the original schematic or circuit diagram of the design. A successful Design Rule Check (DRC) ensures that the layout conforms to the rules designed/required for faultless fabrication. However, it does not guarantee that the layout really represents the circuit you desire to fabricate. This is where an LVS check is used.

LVS:

LVS checking software recognizes the drawn shapes of the layout that represent the electrical components of the circuit, as well as the connections between them. The software then compares them with the schematic or circuit diagram.

LVS Checking involves following three steps:

1. Extraction:

 The software program takes a database file containing all the layers drawn to represent the circuit during layout. It then runs the database through many logic operations to determine the semiconductor components represented in the drawing by their layers of construction. It then examines the various drawn metal layers and finds how each of these components connects to others.

2. Reduction: 

During reduction the software combines the extracted components into series and parallel combinations if possible and generates a netlist representation of the layout database.
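The series/parallel combination performed during reduction can be sketched for resistors, the simplest case. This is an illustration of the idea, not of any particular LVS tool's algorithm:

```python
# Sketch of the "reduction" step: collapsing series and parallel
# resistor chains into single equivalent devices before comparison.
def series(*rs):
    """Equivalent resistance of resistors in series."""
    return sum(rs)

def parallel(*rs):
    """Equivalent resistance of resistors in parallel."""
    return 1.0 / sum(1.0 / r for r in rs)

# Two 100-ohm resistors in series, in parallel with a 200-ohm resistor,
# reduce to a single ~100-ohm equivalent device:
r_eq = parallel(series(100.0, 100.0), 200.0)
```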

3. Comparison: 

The extracted layout netlist is then compared to the netlist taken from the circuit schematic. If the two netlists match, the circuit passes the LVS check. At this point it is said to be "LVS clean." In most cases the layout will not pass LVS the first time, requiring the layout engineer to examine the LVS software's reports and make changes to the layout. Typical errors encountered during LVS include:

1. Shorts: 

Two or more wires that should not be connected have been joined and must be separated.

2. Opens: 

Wires or components that should be connected are left dangling or only partially connected. These must be connected properly.

3. Component Mismatches: 

Components of an incorrect type have been used (e.g. a low Vt MOS device instead of a standard Vt MOS device)

4. Missing Components: 

An expected component has been left out of the layout.

5. Property Errors: 

A component is the wrong size compared to the schematic.
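A heavily simplified version of the comparison step, reporting component mismatches and missing components by counting device types, could be sketched like this (real LVS tools match full connectivity graphs, not just counts):

```python
from collections import Counter

# Toy LVS comparison: compare device counts by type between the
# schematic netlist and the extracted layout netlist.
def lvs_compare(schematic_devices, layout_devices):
    sch, lay = Counter(schematic_devices), Counter(layout_devices)
    # Report every device type whose count differs between the two.
    mismatches = {t: (sch[t], lay[t])
                  for t in sch.keys() | lay.keys() if sch[t] != lay[t]}
    return mismatches          # empty dict: "LVS clean" at this level

# Layout uses a low-Vt NMOS where the schematic expects standard-Vt:
errs = lvs_compare(["nmos_svt", "pmos_svt"], ["nmos_lvt", "pmos_svt"])
```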

Design Rule Checks (DRC)

Design Rule Checking or Check(s) (DRC) is the area of Electronic Design Automation that determines whether a particular chip layout satisfies a series of recommended parameters called Design Rules. Design rule checking is a major step during Physical verification of the design, which also involves LVS (Layout versus schematic) Check, XOR Checks, ERC (Electrical Rule Check) and Antenna Checks.

1) Logic DRC contains the following three checks:

  1. Net fanout
  2. Net capacitance
  3. Net transition

FIXING TECHNIQUES: 

1. Max transition:

  1. Add a buffer in the middle of the long wire.
  2. Reduce the wire length.
  3. Add a chain of buffers.
2. Max capacitance:
  1. Decrease the wire length at the output side.
3. Max fanout:
  1. Cloning: add a copy of the same cell so the load is divided.
  2. Share the load.
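The three logic DRC checks can be sketched as simple limit comparisons. The limit values below are made up; real limits come from the cell library:

```python
# Sketch of the three logic DRC checks against illustrative library
# limits (the numbers are invented, not from a real library).
LIMITS = {"max_fanout": 16, "max_cap_ff": 50.0, "max_transition_ns": 0.5}

def logic_drc(fanout, cap_ff, transition_ns, limits=LIMITS):
    violations = []
    if fanout > limits["max_fanout"]:
        violations.append("max_fanout")
    if cap_ff > limits["max_cap_ff"]:
        violations.append("max_capacitance")
    if transition_ns > limits["max_transition_ns"]:
        violations.append("max_transition")
    return violations

# A net with 20 loads and a slow 0.7 ns edge violates two of the checks:
v = logic_drc(fanout=20, cap_ff=30.0, transition_ns=0.7)
```

Cloning the driver (fix 3.1) halves the fanout seen by each copy, which is how the listed fixes map back onto these checks.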

2) Physical DRC contains various rules, which depend on the technology; they are described in the technology file (tf file). A tf file usually contains technology information such as the minimum width of all layers, along with rules like:

  1. Fat Metal Width Spacing Rule
  2. Fat Metal Extension Spacing Rule
  3. Maximum Number Minimum Edge Rule
  4. Adjacent Via Rule
  5. Fat Metal Contact Rule
  6. Fat Metal Extension Contact Rule
Design Rules are a series of parameters provided by semiconductor manufacturers that enable the designer to verify the correctness of his or her mask set. Design rules are specific to a particular semiconductor manufacturing process. A design rule set specifies certain geometric and connectivity restrictions to ensure sufficient margins to account for variability in semiconductor manufacturing processes, so as to ensure that most of the parts work correctly.

The most basic design rules are shown in the diagram on the right. The first are single-layer rules. A width rule specifies the minimum width of any shape in the design. A spacing rule specifies the minimum distance between two adjacent objects. These rules exist for each layer of the semiconductor manufacturing process, with the lowest layers having the smallest rules (typically 100 nm as of 2007) and the highest metal layers having larger rules (perhaps 400 nm as of 2007).

A two-layer rule specifies a relationship that must exist between two layers. For example, an enclosure rule might specify that an object of one type, such as a contact or via, must be covered, with some additional margin, by a metal layer. A typical value as of 2007 might be about 10 nm.

There are many other rule types not illustrated here. A minimum area rule is just what the name implies. Antenna rules are complex rules that check ratios of areas of every layer of a net for configurations that can result in problems when intermediate layers are etched. Many other such rules exist and are explained in detail in the documentation provided by the semiconductor manufacturer.

Academic design rules are often specified in terms of a scalable parameter, λ, so that all geometric tolerances in a design may be defined as integer multiples of λ. This simplifies the migration of existing chip layouts to newer processes. Industrial rules are more highly optimized and only approximate uniform scaling. 
Design rule sets have become increasingly more complex with each subsequent generation of semiconductor process.
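A λ-based single-layer check (width and spacing rules as integer multiples of λ) can be sketched as follows. The λ value and the 3λ rules are hypothetical, in the spirit of the academic rule sets mentioned above:

```python
# Sketch of lambda-based single-layer DRC rules: widths and spacings
# are integer multiples of a scalable parameter lambda.
# LAMBDA and the 3-lambda rules below are hypothetical.
LAMBDA = 0.05   # um

RULES = {"metal1": {"min_width": 3 * LAMBDA, "min_space": 3 * LAMBDA}}

def check_wire(layer, width, space_to_neighbor):
    """Return the list of single-layer rules this wire violates."""
    r = RULES[layer]
    errors = []
    if width < r["min_width"]:
        errors.append("width")
    if space_to_neighbor < r["min_space"]:
        errors.append("spacing")
    return errors

# A 0.10 um wire is narrower than the 3-lambda (0.15 um) minimum width,
# but its 0.20 um spacing to the neighbor is legal:
e = check_wire("metal1", width=0.10, space_to_neighbor=0.20)
```

Migrating to a smaller process would, in this academic model, just mean changing LAMBDA.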
 
 

TLU+ files (TLUPlus)

In Apollo and Astro technology there is a linear capacitance model, where the net capacitance is calculated in terms of capacitance per square user unit of conducting and via layers specified in the Milkyway technology file (or .tf file). To get higher extraction accuracy and still get the runtime benefit, a Table Look-Up model or table, which contains wire capacitance at different spacings and widths, is precalculated and stored in the Milkyway technology file. TLU internally calls capGen, which is normally bundled with Astro and Apollo, to create this table. The Astro and Apollo Linear Parasitic Extraction (LPE) will look up appropriate wire capacitances from the table during the extraction.

The grdgenxo command, which is normally bundled with Star-RCXT, is a more accurate engine to create the table than capGen. After processing and attaching the grdgenxo-generated capacitance table to the Milkyway database, the Astro LPE/TLUPlus will be able to extract the net capacitances using the same extraction engine but a different CapTable compared to LPE/TLU.
    1. R, C parasitics of metal per unit length.
    2. These R, C parasitics are used for calculating net delays.
    3. If TLU+ files are not given, then these values are obtained from the .ITF file.
    4. For loading TLU+ files, we have to load three files.
    5. Those are the Max TLU+, Min TLU+, and MAP files.
    6. The MAP file maps the layer and via names between the .ITF file and the .tf file.
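How per-unit-length R and C values (the kind of data these tables hold) turn into a net delay can be sketched with a single lumped-RC estimate. The parasitic values below are illustrative, not from a real TLU+ file:

```python
# Sketch: net delay from per-unit-length R and C, using a simple
# lumped-RC (0.69*R*C) estimate. r_per_um and c_per_um are
# illustrative values, not from any real TLU+ table.
def net_delay_ps(length_um, r_per_um=0.5, c_per_um=0.2):
    """Delay of a wire modeled as one lumped RC segment, in ps."""
    r = r_per_um * length_um      # total resistance, ohms
    c = c_per_um * length_um      # total capacitance, fF
    return 0.69 * r * c / 1000.0  # ohm*fF gives fs; /1000 -> ps

# 100 um of wire: 50 ohm and 20 fF, about 0.69 ps of delay.
d = net_delay_ps(100.0)
```

Because both R and C scale with length, the delay grows quadratically with wire length, which is why long nets get buffered.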

Minimum pulse width violation: pulse width check

The minimum pulse width check is important for clocks, for the proper functioning of sequential circuits. This check ensures that the width of the clock pulse stays above a minimum value.

Why should the pulse width of a signal shrink? This is due to the unequal rise and fall delays of the combinational cells. Imagine a clock entering a buffer. If the rise delay of the buffer is more than the fall delay, the output clock pulse will be narrower than the input. See the following figure, which illustrates this. Now think of what happens to the same clock signal when it passes through a series of buffers of the same type: the width of the clock pulse keeps decreasing, and at the point where the accumulated shrinkage exceeds the clock pulse width, the pulse gets absorbed. This is known as pulse absorption. So it is important to perform the minimum pulse width check.
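The shrinkage through a buffer chain can be sketched numerically: each buffer whose rise delay exceeds its fall delay narrows a positive pulse by the difference. The delay values are illustrative:

```python
# Sketch of pulse shrinkage through a chain of identical buffers.
# Each stage narrows a positive pulse by (t_rise - t_fall).
# t_rise and t_fall are illustrative delays in ns.
def pulse_width_after(width_ns, n_buffers, t_rise=0.12, t_fall=0.10):
    shrink_per_stage = t_rise - t_fall
    w = width_ns - n_buffers * shrink_per_stage
    return max(w, 0.0)            # 0.0 means the pulse was absorbed

w = pulse_width_after(0.5, n_buffers=10)    # narrower, still alive
gone = pulse_width_after(0.5, n_buffers=40) # pulse absorption
```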

How to constrain the design for pulse width checks in PrimeTime?

Set the variable timing_enable_pulse_clock_constraints to true to enable the pulse width checks. Define the minimum and maximum required clock pulse widths with the commands set_pulse_clock_min_width and set_pulse_clock_max_width. The results can be seen after timing analysis with the command report_constraints. If needed, we can ignore the pulse width checks at particular sequential cells or for certain clocks with the command remove_min_pulse_width.

How to fix these violations if they are seen? We need to change to clock tree cells that have less difference between rise and fall delays. It is always best to choose clock tree cells with minimum delay variation at the beginning of your project itself; then last-minute fixes like these can be avoided.