(Features, 23 Aug 2010)
By Sunit Bansal, Freescale Semiconductor Inc.
Routing congestion has traditionally been a major bottleneck during place-and-route, timing and physical design closure in sub-90nm designs. Aggressive targets for a competitive die size, together with ever-increasing requirements on maximum device operating frequency, make place-and-route design closure more challenging. Moreover, the smaller number of interconnect mask layers available in some process nodes poses stringent routing challenges for the designer. This paper proposes a novel approach to alleviate congestion hot spots by first estimating the local pin density of standard cells and subsequently modifying physical attributes of the identified culprit cells based on timing criticality. Finally, an incremental placement is performed so that the changes take effect and congestion hot spots are reduced. The proposed pin-density calculator has been implemented in a 90nm chip with four metal interconnect layers, and 7.2 percent of routing tracks are reclaimed in the process.
Congestion is a measure of the amount of space available for laying out the interconnect segments that implement logical connections. Ideally, the area provided should be just sufficient for making these connections. Too much or too little space is not desirable. A design becomes un-routable if too many logical connections and wires must fit in a small area with limited interconnect resources.
Existing algorithms that tackle congestion-related issues focus on congestion-aware placement optimization during the physical synthesis stage. There is a provision for reordering scan segments to re-align the connectivity based on the placement information present in the design database. Some algorithms model a larger module area in order to spread the placement of the constituent standard cells. Typically, designers have to deal with the twin objectives of meeting timing and accommodating all the glue logic in a minimal area to save on silicon and the associated fabrication costs. Meeting these two objectives simultaneously constitutes a vicious circle, as timing criticality and congestion are directly dependent on each other. An area susceptible to congestion issues generally consists of digital logic that has strict timing requirements. Thus, there is a distinct trade-off in meeting either objective. If congestion is optimized, timing takes a hit, and if timing is optimized, local hot spots appear in the design where the area utilization is high.
In this paper, a novel algorithm is proposed to tackle routing congestion early, in the placement or physical synthesis stage of the physical design cycle, so that design convergence is seamless and stiff placement utilization targets are achieved. This algorithm has been implemented and used in most of the in-house SoCs in 90nm and 0.25um technologies. The set-up is fully automated and is readily portable to any design in any technology node.
DESCRIPTION
The flowchart in Figure 1 captures the essence of the proposed methodology. First, a rough first-cut placement is performed, which gives the designer a clear picture of the timing and congestion criticalities present. Next, using the proposed window-based algorithm, the standard cell pin density of the sea-of-gates region of the design is calculated. The high-density windows are extracted and their constituent cells are analyzed. The culprit cells are identified, and special attributes such as effective cell area and cell orientation are applied to them, taking care that timing through these cells is not degraded.
 Figure 1. Flowchart.
Window Based Algorithm
The algorithm needs the following inputs:
a. Library Exchange Format (LEF) file of the standard cells in a particular technology: This file contains the macro definition of each logical cell, including layout information such as pin position, orientation and shape. It also contains the shape and orientation of the power supply pins. The interconnect routing at the standard cell level is modeled in the form of blockages. Additionally, the file contains physical parameters such as cell area, cell dimensions and aspect ratio. This file can be dumped directly from the GDSII of each standard cell. Figure 2 shows the layout of a typical standard cell.
b. Window size: The smaller the window, the more culprit cells can be identified and the more effectively congestion can be removed using the proposed algorithm. The trade-off, however, is that a smaller window involves more computation and hence a longer run time. Depending on the size of the design, the user can judiciously choose a value.
c. Coordinates of the floorplan and a placed physical database in the form of a DEF (Design Exchange Format) file.
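As a rough illustration of how the pin geometries and cell dimensions might be pulled out of a macro LEF file, the sketch below reads a deliberately simplified subset of LEF (`MACRO`, `SIZE`, `PIN`, `RECT` and `END` statements only). The function name `parse_lef_pins`, the returned dictionary layout and the `NAND2` sample macro are assumptions made for this example. A production flow would normally rely on the LEF reader of the place-and-route tool instead.

```python
def parse_lef_pins(text):
    """Extract macro sizes and pin rectangles from simplified LEF text.

    Returns {macro: {"size": (w, h), "pins": {pin: [(x1, y1, x2, y2), ...]}}}.
    Only MACRO, SIZE, PIN, RECT and END statements are interpreted. RECTs in
    OBS (blockage) sections are skipped because no pin is open at that point.
    """
    macros = {}
    macro = pin = None
    for line in text.splitlines():
        tokens = line.replace(";", " ").split()
        if not tokens:
            continue
        keyword = tokens[0]
        if keyword == "MACRO":
            macro = tokens[1]
            macros[macro] = {"size": None, "pins": {}}
        elif macro is None:
            continue  # ignore everything outside a MACRO block
        elif keyword == "SIZE":
            # SIZE <width> BY <height>
            macros[macro]["size"] = (float(tokens[1]), float(tokens[3]))
        elif keyword == "PIN":
            pin = tokens[1]
            macros[macro]["pins"][pin] = []
        elif keyword == "RECT" and pin is not None:
            macros[macro]["pins"][pin].append(tuple(map(float, tokens[1:5])))
        elif keyword == "END" and len(tokens) > 1:
            if tokens[1] == pin:
                pin = None
            elif tokens[1] == macro:
                macro = None
    return macros


# Hypothetical macro, written for illustration only.
sample = """
MACRO NAND2
  SIZE 1.2 BY 2.8 ;
  PIN A
    PORT
      LAYER M1 ;
      RECT 0.1 0.5 0.3 0.9 ;
    END
  END A
  OBS
    LAYER M1 ;
    RECT 0.0 0.0 1.2 0.2 ;
  END
END NAND2
"""
lib = parse_lef_pins(sample)
```

Note that this sketch treats power and ground pins like any other pin. Whether they should count towards pin density is a design decision the user has to make.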
Figure 2: Layout of a standard cell.
COMPUTATION OF PIN DENSITY
Figure 3: Pin geometries and orientation in a standard cell.
In the macro LEF file of a standard cell, each pin geometry is described as a set of rectangles. The rectangles shown in Figure 3 depict the various pin geometries in a standard cell.
For each rectangle Ri, the dimensions li and wi are computed. Suppose n rectangles constitute a pin. The effective pin area is then

$$A_{pin} = \sum_{i=1}^{n} l_i \, w_i$$
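Suppose there are m pins in total in a standard cell. The total pin area of the cell is obtained by summing the effective pin areas of all its pins:

$$A_{pins} = \sum_{j=1}^{m} A_{pin,j}$$

where $A_{pin,j}$ is the effective area of the j-th pin, i.e. the sum of the areas of its rectangles.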
For each standard cell, the cell dimensions L and W are known from the macro LEF file.
In some macro LEF files, the cell area is hard-coded.
Cell area is given by

$$A_{cell} = L \times W$$
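Pin density for a standard cell is calculated as the ratio of its total pin area to its cell area:

$$\rho_{cell} = \frac{A_{pins}}{A_{cell}} = \frac{\sum_{j=1}^{m} A_{pin,j}}{L \times W}$$

where the numerator is the total area of all m pins in the cell and L × W is the cell area.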
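The per-cell computation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Rect` class, the function names and the example cell with its dimensions are made up for this sketch, and overlapping rectangles of the same pin are simply summed rather than merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle, as in a LEF `RECT x1 y1 x2 y2` statement."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        return abs(self.x2 - self.x1) * abs(self.y2 - self.y1)


def pin_area(rects):
    """Effective area of one pin: sum of l_i * w_i over its rectangles."""
    return sum(r.area for r in rects)


def cell_pin_density(pins, length, width):
    """Total pin area of a cell divided by the cell area L * W.

    `pins` maps a pin name to the list of rectangles that make up that pin.
    """
    total = sum(pin_area(rects) for rects in pins.values())
    return total / (length * width)


# Hypothetical two-input gate; the numbers are made up for illustration.
nand2 = {
    "A": [Rect(0.0, 0.0, 0.5, 0.5)],
    "B": [Rect(1.0, 0.0, 1.5, 0.5)],
    "Y": [Rect(2.0, 0.0, 2.5, 0.5), Rect(2.0, 0.5, 2.5, 1.5)],
}
density = cell_pin_density(nand2, length=4.0, width=2.5)
```

Here the three pins contribute 0.25, 0.25 and 0.75 area units, so the cell has a total pin area of 1.25 over a cell area of 10, i.e. a pin density of 0.125.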
The dimensions of a window are decided depending on the size of the design, the shape of the sea-of-gates logic area, the technology node and the accuracy and run-time requirements. Suppose the dimensions of a window are a x b.
The area of a window is given by

$$A_{win} = a \times b$$
The coordinates of the floorplan (x1, y1, x2, y2) are estimated from the DEF file.
The coordinates of each window are generated by using the following algorithm:

The coordinates of each generated window are given by (x,y,x’,y’). The pin density of a window is computed from the total area of all pin geometries present in the standard cells within that window, relative to the window area.
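One straightforward way to generate such windows is a row-major tiling of the floorplan rectangle. The sketch below is an assumption about how this step could look, not a reproduction of the original listing: the function name `generate_windows` and the decision to clip windows at the right and top edges to the floorplan boundary are choices made for this illustration.

```python
def generate_windows(x1, y1, x2, y2, a, b):
    """Tile the floorplan (x1, y1)-(x2, y2) with windows of size a x b.

    Returns a list of (x, y, x', y') tuples in row-major order. Windows in
    the last column or row are clipped to the floorplan boundary.
    """
    if a <= 0 or b <= 0:
        raise ValueError("window dimensions must be positive")
    windows = []
    y = y1
    while y < y2:
        y_hi = min(y + b, y2)
        x = x1
        while x < x2:
            x_hi = min(x + a, x2)
            windows.append((x, y, x_hi, y_hi))
            x = x_hi
        y = y_hi
    return windows


# A 100 x 50 floorplan tiled with 40 x 25 windows gives 3 columns x 2 rows.
windows = generate_windows(0, 0, 100, 50, a=40, b=25)
```

Because edge windows are clipped, their area can be smaller than a x b. A user who needs a strictly fixed window area could instead choose a and b so that they divide the floorplan dimensions evenly.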
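Mathematically, assume that there are K standard cells in a window with area Awin, and that there are N windows of fixed size (a x b). The pin density of a window is then

$$\pi = \frac{1}{A_{win}} \sum_{c=1}^{K} A_{pins,c}$$

where $A_{pins,c}$ is the total pin area of the c-th standard cell in the window.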
The pin densities of the windows, π1, π2, …, πN, are arranged in decreasing order. The top k windows with the highest pin densities are then evaluated, where k is a design- and technology-dependent parameter.
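The window-level density and the top-k ranking can be sketched as follows. The sketch makes one assumption that the text does not settle: a cell is assigned to the window containing its placed lower-left corner, and cells straddling a window boundary are not split. The function names and the placement data are hypothetical.

```python
import math
from collections import defaultdict


def window_pin_densities(cells, origin, a, b):
    """Return {(col, row): pin density} for windows of size a x b.

    `cells` is an iterable of (x, y, pin_area) tuples, where (x, y) is the
    placed lower-left corner of a cell and pin_area is its total pin area.
    Assumption: a cell belongs to the window that contains its lower-left
    corner; cells straddling a boundary are not split.
    """
    x0, y0 = origin
    pin_area_sum = defaultdict(float)
    for x, y, pin_area in cells:
        key = (math.floor((x - x0) / a), math.floor((y - y0) / b))
        pin_area_sum[key] += pin_area
    window_area = a * b
    return {key: total / window_area for key, total in pin_area_sum.items()}


def top_k_windows(densities, k):
    """Windows sorted by decreasing pin density, truncated to the first k."""
    ranked = sorted(densities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up placement data for illustration.
cells = [
    (1.0, 1.0, 2.0),
    (3.0, 2.0, 4.0),
    (12.0, 1.0, 1.0),
    (11.0, 14.0, 8.0),
]
densities = window_pin_densities(cells, origin=(0.0, 0.0), a=10.0, b=10.0)
hotspots = top_k_windows(densities, k=2)
```

With this data, window (1, 1) has the highest density (0.08), followed by window (0, 0) with 0.06. Windows that contain no cells do not appear in the result at all, which is equivalent to a density of zero.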
Timing through the various pins of the constituent cells in these k windows is checked and reported. Let the slack numbers be σ1, σ2, σ3, …
Let ψ be the limit for the available timing slack. Timing is said to be non-critical if σ1 < ψ, σ2 < ψ, σ3 < ψ and so on. If these criteria are met, there are three possible approaches to relieve congestion:
1. Re-place the standard cells present in the k windows, keeping a placement clearance of δ between adjacent cells.
2. Detour or spread the nets through the constituent cells to minimize the density of nets passing over a limited area.
3. Avoid standard cells with high pin densities by excluding them during the logic synthesis stage of the design itself.
RESULTS
The proposed algorithm was implemented on a 90nm SoC with four interconnect layers for routing. The design had 0.5 million gates. Routability in a design is estimated based on the availability of grid cells, or GCells. For signal routing, most CAD tools used in the industry are track-based, i.e. the design is divided into segments, or tracks, for each metal layer, and metal interconnects are drawn on these tracks for connectivity. A collection of a fixed number of tracks for each metal layer constitutes a GCell. If, in a particular GCell at a particular location in the sea-of-gates area, the number of required tracks exceeds the number of available tracks, routing congestion is said to prevail there. A larger number of lightly utilized GCells indicates less congestion. Figures 4a and 4b compare the congested GCells before and after the implementation of the algorithm.
In both Figures 4a and 4b, the ranges (1-6), (7-13), etc. denote the number of extra tracks needed in a GCell to satisfy the demand for routing resources within that GCell. Figure 4a shows an overall need for 2.45 percent extra routing resources. After implementing the proposed algorithm, the number of congested GCells is reduced, and only 0.11 percent of extra routing resources, in the form of routing tracks, are needed.
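The GCell bookkeeping described above can be illustrated with a small sketch. How the place-and-route tool actually derives its percentage is not specified here, so the definition used below (total extra tracks divided by total available tracks) is an assumption, as are the function name and the sample data.

```python
def congestion_summary(gcells):
    """Summarise track overflow over a list of GCells.

    `gcells` is a list of (required_tracks, available_tracks) pairs.
    Returns (number of congested GCells, extra tracks as a percentage of
    all available tracks).
    """
    overflow = [max(0, req - avail) for req, avail in gcells]
    congested = sum(1 for extra in overflow if extra > 0)
    total_available = sum(avail for _, avail in gcells)
    extra_percent = 100.0 * sum(overflow) / total_available
    return congested, extra_percent


# Made-up data: four GCells with 10 available tracks each.
gcells = [(8, 10), (12, 10), (10, 10), (15, 10)]
congested, extra_percent = congestion_summary(gcells)
```

In this example two GCells are congested, needing 2 and 5 extra tracks respectively, which amounts to 17.5 percent of the 40 available tracks.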
Figure 5: Timing slack distribution before and after implementation of the pin-density calculation algorithm.
Figure 5 compares the slack numbers obtained from timing analysis before and after the implementation of the algorithm. It can thus be concluded that timing degradation after implementation is less than 2 percent.
To conclude, the methodology is now being implemented in several other designs in 90nm, 180nm and 250nm technologies and has been made part of the routing closure flow. The results show a significant improvement in routability, and the proposed flow should help the place-and-route designer to close routing and achieve higher utilization with less cycle time.