
Robust Timing Closure in Scan Shift Using Sequential Gates

(Features, 02 Mar 2011 )
By Amol Agarwal and Abhishek Mahajan, Freescale Semiconductor

All modern SOCs use scan structures to detect manufacturing faults in the design. Scan chains, built for testing, connect the sequential elements of the chip in serial order. Because there is no combinational logic between the scan elements, these chains are prone to hold failures. Moreover, in sub-90nm technologies, OCV (on-chip variation) has a large impact on timing margins. So unless the design is signed off for timing across multiple corners, hold failures are very likely, especially in hold-critical paths such as scan chains. These hold failures can make the chip unusable in real applications (even though the chip may be fully operational in the functional scenario). If found on silicon, they lead to yield loss and hence to large revenue loss for design companies. So we need to design a robust scan structure to tackle these problems.

In this article, we start with a quick revision of the timing basics of flops and latches. In the next section, we discuss scan chains and the timing closure challenges associated with them. We then explain how latches and flops can be used in scan chains to create a robust scan structure that is immune to timing failures in sub-90nm technologies. Finally, we cover the best available solution for meeting timing requirements for every combination of sequential elements in a scan chain.

QUICK RECAP OF SETUP/HOLD TIMING
Flip-flops and latches are the two basic building blocks of a sequential circuit. A flip-flop changes its state at the active edge (positive or negative) of the applied clock pulse and simply retains its output when there is no active clock edge. A latch, on the other hand, is a level-sensitive device: it continuously samples its input and changes its output accordingly while its enable signal is at the active level (high or low). A flip-flop has a master-slave configuration, with two latches in cascade working on opposite active levels, so the area of a flip-flop is almost double that of a latch.

In a synchronous design, we need to ensure that the outputs of flops and latches do not go metastable. This is ensured by meeting the setup and hold checks in the design.


Figure 1

For a pair of flops in single-cycle operation, 1-1 is the hold check while 1-3 is the setup check (Figure 1). We need to make sure that data launched by flop1 reaches flop2 in time to be captured at the next active edge. At the same time, we need to make sure that data launched by flop1 is not captured by flop2 on the same active edge.


Figure 2

If the second flop is negative edge triggered, the setup check is 1-2 while the hold check is on the previous negative edge (Figure 2). This means that data launched by flop1 should not be captured by the previous falling edge of flop2. In practice, such a capture cannot happen unless the clock skew is more than half a cycle.

Thus, in positive-positive or negative-negative flop pairs, the setup check is by default one cycle and the hold check is zero cycle, while in positive-negative or negative-positive flop pairs, the setup check is by default half a cycle and the hold check is half a cycle backwards. We will set aside the timing checks for latches for the time being.
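The default checks summarised above can be captured in a few lines. The following is a minimal sketch, assuming both flops sit on the same clock with a 50% duty cycle; the function name `default_checks` and the "pos"/"neg" labels are illustrative and not taken from any tool.

```python
from fractions import Fraction


def default_checks(launch_edge, capture_edge):
    """Return (setup, hold) check distances in clock cycles.

    setup: distance from the launch edge to the capture edge that must
           receive the data.
    hold:  distance from the launch edge to the capture edge that must
           NOT receive it (a negative value means an earlier edge).
    Assumes a single clock with a 50% duty cycle (an idealisation).
    """
    for edge in (launch_edge, capture_edge):
        if edge not in ("pos", "neg"):
            raise ValueError(f"unknown edge type: {edge!r}")
    if launch_edge == capture_edge:
        # pos-pos or neg-neg: one-cycle setup, zero-cycle hold
        return Fraction(1), Fraction(0)
    # pos-neg or neg-pos: half-cycle setup, hold half a cycle backwards
    return Fraction(1, 2), Fraction(-1, 2)
```

For example, `default_checks("pos", "pos")` gives a one-cycle setup and a zero-cycle hold, while `default_checks("pos", "neg")` gives half a cycle for setup and minus half a cycle for hold.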

SCAN CHAINS
Scan chains are used in SOCs for testing. All registers of the design are connected in serial order, stimulus is applied from outside the chip, and the output is observed by shifting out these chains to detect any stuck-at or transition fault. Modern SOCs are quite complex and have multiple clock domains in a single chip. When scan stitching a design after logic synthesis, care is generally taken to stitch flops with the same clock structure into the same scan chain. But because only a limited number of scan input/output ports are available at the top level, mixing registers from different clock domains is inevitable. Scan chains of unbalanced length are not a good idea either, because they increase overall test time. This scan structure therefore leads to timing closure problems in later stages of the design.

Since scan shifting is done at a slower frequency and there is minimal logic, if any, between flop pairs, setup closure is not a problem. However, these paths are very hold critical, both because of the minimal logic and because of the skew between each pair of flops. Since flops from different domains are mixed in a scan chain, as discussed above, there are many cases with large skew between launch and capture flops. Many marginal hold violations can also pop up during the late stages of design due to noise effects, which leads to hold buffering in an otherwise stable, closed design and can destabilize it.

Worse still, our derate margins may not be sufficient, and we may see hold failures only on silicon. This can happen if the uncommon clock path is long and the actual variation on silicon is higher than the estimated variation. As we go further into sub-90nm CMOS technologies, variation effects are becoming more and more dominant and can result in many hold violations on silicon. Any hold failure in the scan shift path has severe consequences. Detecting the failing chain on silicon requires a lot of debugging and time, and the situation worsens when the design also has scan compression logic. Even after the failing chain is detected, it has to be blocked, which reduces test coverage.

In short, a hold failure in a scan chain is very risky, and the design must be robust enough to tolerate these uncertainties.

There are methodologies, such as scan chain reordering, that rearrange the scan chains according to the spatial location of the registers. These techniques are quite handy and designers should explore them, but as discussed above, there are cases where a scan chain crossing between two clock domains is unavoidable.

A better way to solve this problem is to act proactively and take care of these issues in the logic synthesis stage itself, where the scan chains are built. All flops driven from the same clock gating logic should be stitched together, and at the end of each such group of flops a lockup latch can be inserted to avoid any hold failure from the last flop of this domain to the first flop of the next clock domain.

Let us understand this concept through the example shown in Figure 3.


Figure 3

If the clock period is 50ns and the skew is 5ns, we have to insert hold buffers worth 5ns plus the derate margin between flop3 and flop4 at later stages of the design. As discussed above, because of OCV in sub-90nm designs, our standard derates may not be sufficient once the uncommon clock path grows beyond certain limits. For example, a variation of only 5ps per clock buffer (over and above the derated value) on a capture path with 10 extra clock buffers leads to a 50ps violation. So even this padding may not be sufficient, because with OCV the effective skew can be more than 5ns.
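The arithmetic in this example can be sketched as follows. This is only an illustration under simplifying assumptions: skew is taken as capture clock latency minus launch clock latency, all values are in picoseconds, and the function and parameter names are hypothetical.

```python
def required_padding_ps(skew_ps, margin_ps, data_delay_ps=0, hold_ps=0):
    """Delay that hold buffers must add for the zero-cycle hold check to pass.

    skew_ps:       capture clock latency minus launch clock latency
    margin_ps:     derate margin applied on top of the nominal skew
    data_delay_ps: delay already present between launch and capture flop
    hold_ps:       hold time of the capture flop
    """
    shortfall = skew_ps + margin_ps + hold_ps - data_delay_ps
    return max(0, shortfall)


def unmodelled_skew_ps(variation_per_buffer_ps, extra_buffers):
    """Extra skew from per-buffer variation beyond the derated value."""
    return variation_per_buffer_ps * extra_buffers


# Figures from the text: 5ns skew, 5ps per buffer, 10 extra capture buffers.
nominal = required_padding_ps(skew_ps=5000, margin_ps=0)    # 5000 ps of buffers
extra = unmodelled_skew_ps(5, 10)                           # 50 ps
# A path padded for the nominal 5ns skew still fails by the unmodelled amount.
residual = required_padding_ps(5000 + extra, 0, data_delay_ps=nominal)  # 50 ps
```

The derate margin is left at zero here because the text does not give a value for it; whatever margin is chosen, any variation beyond it shows up directly as a residual violation.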

The solution to the above problem is to insert a lockup latch at the output of flop3, with the lockup latch having the same clock latency as flop3.


Figure 4

As we can see from the waveform in Figure 4, inserting a lockup latch between flop3 and flop4 breaks the timing path into two stages.
1. From flop3 to lockup latch
The hold check is from 1-1. It is still a zero cycle check, but it is much more relaxed and easy to meet because there is no skew. The default setup check is from 1-2.
2. From lockup latch to flop4
The hold check is from 2-1. This is the major advantage of, and the motivation for, inserting a lockup latch. The hold check is shifted half a cycle backwards, so we have sufficient margin even if the clock skew is as large as half of the shift clock period. This ensures that there will be no hold violation in this case.

The setup check is from 2-3. The latch is transparent during 2-3, and any data that passes through it during this phase can be captured by flop4 as long as it arrives before edge 3 (minus the setup time of the flop). We can observe that the setup check from flop3 to the lockup latch can be relaxed as well. 1-2 is the default check, but the latch is transparent during the whole half cycle, so in the ideal case the setup check can shift towards 3. (This concept is called latch borrowing.)

Another important point is that the lockup latch should be clocked by the launching flop's clock and not by the capturing flop's clock. As we saw above, the hold check from flop3 to the latch is still 1-1 (a zero cycle check), so we gain nothing if the lockup latch shares its clock with the capturing flop. Ideally, the launch flop and the lockup latch should both be driven by the same clock buffer in the clock tree.
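The effect of the lockup latch, and of choosing its clock, can be sketched with an idealised hold-slack model. The assumptions here are ours: a negative-level latch between two positive-edge flops, a 50% duty cycle, zero cell delays unless specified, and skew defined as capture latency minus launch latency. All function and parameter names are illustrative.

```python
def hold_slack_direct(skew_ns, data_delay_ns=0.0, hold_ns=0.0):
    """flop3 -> flop4 with no lockup element: zero-cycle hold check."""
    return data_delay_ns - hold_ns - skew_ns


def hold_slack_with_lockup(period_ns, skew_ns, latch_clock="launch",
                           data_delay_ns=0.0, hold_ns=0.0):
    """Return (flop3 -> latch, latch -> flop4) hold slacks in ns.

    latch_clock: "launch" if the latch shares flop3's clock,
                 "capture" if it shares flop4's clock.
    """
    if latch_clock == "launch":
        seg1_skew, seg2_skew = 0.0, skew_ns   # skew sits after the latch
    elif latch_clock == "capture":
        seg1_skew, seg2_skew = skew_ns, 0.0   # skew sits before the latch
    else:
        raise ValueError("latch_clock must be 'launch' or 'capture'")
    # Segment 1 is still a zero-cycle check against the latch closing edge.
    flop_to_latch = data_delay_ns - hold_ns - seg1_skew
    # Segment 2: the latch only opens half a period after the launch edge,
    # so new data cannot reach flop4 before then.
    latch_to_flop = period_ns / 2 + data_delay_ns - hold_ns - seg2_skew
    return flop_to_latch, latch_to_flop
```

With the 50ns period and 5ns skew of the example, `hold_slack_direct(5.0)` is -5.0, `hold_slack_with_lockup(50.0, 5.0)` is `(0.0, 20.0)`, and clocking the latch from the capture side gives `(-5.0, 25.0)`: the violation simply moves to the flop3-to-latch segment, which is why the latch must share the launch clock.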

The above example shows that a latch is an effective way of fixing hold in scan shift paths. Some might argue that we could also insert hold buffers or delay cells to fix these violations. However, a quick look at the areas of a hold buffer, a delay cell and a latch suggests that a hold buffer is appropriate for fixing small hold violations, but once the violation is somewhat larger, a latch has the advantage in both area and delay over buffers. With delay cells there is always the risk of large variation from one operating condition to another, so these cells should be used selectively and carefully. A latch, on the other hand, always guarantees a half cycle delay, independent of operating conditions.

In the last section, we consider the various cases to find the most suitable candidate for fixing hold when there is large clock skew between the launch and capture flops in a scan chain.

DIFFERENT CASES
Case 1: Between positive and positive edge triggered flops
We covered this case in the example above; a negative level latch can be used.

Case 2: Between negative and negative edge triggered flops
By the same reasoning as above, a positive level latch is the suitable candidate.

Case 3: Between negative and positive edge triggered flops
Hold is already quite relaxed here, because the hold check is half a cycle backwards. No lockup element is required.

Case 4: Between positive edge and negative edge triggered flops
This is a very interesting case. It is not a problem from a timing point of view, but it is an illegal connection in scan shifting. In ATPG the clock is considered to be a return-to-zero waveform (after shifting is complete, the clock is low), so if we allow this type of crossing, every such positive-negative pair will hold the same value after each clock pulse during scan shifting. This leads to a drop in test coverage because the flops are not all independently controllable. Such a situation should therefore be avoided while stitching, but sometimes it is unavoidable because of compression logic or hard macros.

We can insert a negative level lockup latch between the positive and negative flops. This solves the ATPG problem but introduces a timing problem, because the hold check is again a zero cycle check both from the flop to the lockup latch and from the latch to the negative edge flop.

Another solution is to insert a dummy flop, working on either the positive or the negative edge of the clock, between these flops. Note that after shifting, the dummy flop will still hold the same value as the first flop or the second flop, depending on whether we have made it negative edge or positive edge triggered. This does not cause any problem, because it is not a functional flop and we do not use it to capture data in any pattern. If we insert a positive edge dummy flop, the clock latency of the launch flop and of the dummy flop should be the same, because that stage is a zero cycle hold check, while the stage from the dummy flop to the next flop is a half cycle hold check. Similarly, if we insert a negative edge dummy flop, the latency of the capture flop and of the dummy flop should be the same.
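The controllability problem of Case 4 and the effect of the dummy flop can be checked with a small behavioural model. This is a simplified sketch under our own assumptions: ideal flops with no timing, one return-to-zero pulse per shifted bit (rising edge, then falling edge), and the scan-in value held stable through the pulse. The function names are illustrative.

```python
from itertools import product


def shift(chain, bits):
    """Shift `bits` into a chain of 'pos'/'neg' edge-triggered flops.

    chain is ordered from scan-in to scan-out. Returns the flop values
    after the last clock pulse.
    """
    state = [0] * len(chain)
    for bit in bits:
        for edge in ("pos", "neg"):      # return-to-zero pulse: rise, fall
            before = list(state)         # values seen just before this edge
            for i, kind in enumerate(chain):
                if kind == edge:
                    state[i] = bit if i == 0 else before[i - 1]
    return state


def reachable(chain, pulses):
    """All final states reachable by shifting any `pulses`-bit sequence."""
    return {tuple(shift(chain, bits))
            for bits in product((0, 1), repeat=pulses)}


# Direct pos -> neg crossing: the two flops always end up equal.
direct = reachable(["pos", "neg"], 2)
# Dummy positive-edge flop in between: first and last flop are independent.
with_dummy = {(s[0], s[-1]) for s in reachable(["pos", "pos", "neg"], 2)}
```

In this model `direct` contains only `(0, 0)` and `(1, 1)`, while `with_dummy` contains all four combinations of the first and last flop. With a positive-edge dummy, the dummy always matches the negative-edge flop that follows it, which is harmless because the dummy is never used to capture data.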

This completes all four cases that can exist between flops in a design, but sometimes these cases are not so obvious. For example, caution is needed when scan stitching a design that contains hard macros, which are prestitched. Often we do not have the netlist, SPEF or timing constraints for these hard macros, so it is advisable to insert a lockup latch before them in our design, in case the hard macro owner has missed it. Another example is burn-in mode, where the scan chains of the design are concatenated in order to toggle all the flops at the same time. Here too, the last element of one chain and the first element of the next chain may form a timing critical path or an invalid positive to negative crossing. Such scenarios should ideally be taken care of in the RTL itself, because the designer knows the order of the scan elements best when concatenating chains. If this has not been done, it is good practice to insert an appropriate lockup latch at the end of each chain.
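The four cases can be folded into a simple check that walks a stitched (or concatenated) chain and reports the boundaries that need attention. This is a rough sketch, not a tool flow: the `ScanCell` record, the clock labels and the rule that same-clock, same-edge neighbours need no lockup element are our assumptions for illustration.

```python
from typing import NamedTuple


class ScanCell(NamedTuple):
    name: str
    clock: str   # clock domain label (hypothetical naming)
    edge: str    # "pos" or "neg"


def boundary_fix(launch, capture):
    """Suggest a lockup element for one launch -> capture boundary."""
    if launch.edge == "pos" and capture.edge == "neg":
        return "dummy flop (illegal pos-to-neg crossing)"         # Case 4
    if launch.edge == "neg" and capture.edge == "pos":
        return None                                               # Case 3
    if launch.clock == capture.clock:
        return None                  # assumed: small skew within a domain
    if launch.edge == "pos":
        return "negative level lockup latch on launch clock"      # Case 1
    return "positive level lockup latch on launch clock"          # Case 2


def review_chain(cells):
    """Return (launch name, capture name, suggestion) for flagged boundaries."""
    findings = []
    for launch, capture in zip(cells, cells[1:]):
        fix = boundary_fix(launch, capture)
        if fix is not None:
            findings.append((launch.name, capture.name, fix))
    return findings


chain = [
    ScanCell("f1", "clkA", "pos"),
    ScanCell("f2", "clkA", "pos"),
    ScanCell("f3", "clkB", "pos"),
    ScanCell("f4", "clkB", "neg"),
    ScanCell("f5", "clkA", "pos"),
]
report = review_chain(chain)
```

For this sample chain the report flags the f2-to-f3 domain crossing (negative level lockup latch) and the f3-to-f4 positive to negative crossing (dummy flop), and leaves the negative to positive boundary at f4-to-f5 alone. The same walk can be run across the joins of concatenated burn-in chains.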

Using the above techniques and guidelines, a designer can ensure a robust scan structure in the chip. In case of a setup failure the design can still operate at a lower frequency, but in case of a critical hold failure the behavior of the logic is unpredictable. A hold failure in scan shift is therefore very critical and can result in large coverage loss during testing. So we need a robust scan structure that addresses the potential scan shift failures discussed above. An appropriate lockup element is an excellent solution to such issues because it guarantees a half cycle delay, independent of operating conditions.
