Verilog basics

1) Verilog code to swap the contents of two registers, with and without a temporary register. With a temporary register:

Read More

5 stage pipeline

RISC: a sequence of simple operations. CISC: a small number of complex instructions for the same task, with instructions of variable length that take a variable number of cycles to complete.

Read More

k-map

Follow these steps to find the minterm solution of a K-map:

Read More

Floorplan basics

Floorplanning a chip or block is an important task of PD in which location, size and shape of soft modules, and placement of hard macros are decided. Floorplanning sometimes can also include I/O pad, pin placement, bump assignment, bus planning, power planning and more.

Read More

RF vs SRAM

A register file (RF) is a large-signal array. An SRAM is a small-signal array. SRAM uses differential bit-line signalling and is hence generally faster than register files.

Read More

Negative net delay cases

In some cases the tool reports a negative net/cell delay. The delay of a net/cell is the time it takes for the output signal to reach its threshold point after the input signal reaches its threshold point.
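
The definition above can be sketched numerically. This is a minimal illustration, assuming ideal linear ramps and a 50% threshold; the function names and all timing values are hypothetical and are not taken from any tool. It shows how a slow input transition combined with a fast output transition can yield a negative delay.

```python
def threshold_crossing_time(ramp_start, ramp_time, threshold=0.5):
    """Time at which a linear 0-100% ramp crosses the given threshold fraction."""
    return ramp_start + threshold * ramp_time


def stage_delay(in_start, in_ramp, out_start, out_ramp, threshold=0.5):
    """Delay = output threshold crossing minus input threshold crossing."""
    t_in = threshold_crossing_time(in_start, in_ramp, threshold)
    t_out = threshold_crossing_time(out_start, out_ramp, threshold)
    return t_out - t_in


# Slow input (1.0 ns ramp): the output starts switching at 0.3 ns, before
# the input has reached its own 50% point at 0.5 ns.
slow_input_delay = stage_delay(in_start=0.0, in_ramp=1.0, out_start=0.3, out_ramp=0.1)

# Fast input (0.1 ns ramp): the same output timing gives a positive delay.
fast_input_delay = stage_delay(in_start=0.0, in_ramp=0.1, out_start=0.3, out_ramp=0.1)

print(f"slow input: {slow_input_delay:+.2f} ns, fast input: {fast_input_delay:+.2f} ns")
```

With these assumed numbers the slow-input case comes out negative simply because the output crosses its threshold earlier than the input does.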

Read More

Constraints

In this article we discuss the different constraints we apply to a block.

Read More

AOCV

In this article we will see how the tool calculates stage counts and distances in AOCV for launch and capture timing paths and derives the corresponding AOCV derate values.

Read More

Scan Chain

Scan chains are the elements in scan-based designs that are used to shift in and shift out test data. A scan chain is formed by a number of flops connected back to back in a chain, with the output of one flop connected to the scan input of the next. The input of the first flop is connected to an input pin of the chip (called scan-in) from where scan data is fed. The output of the last flop is connected to an output pin of the chip (called scan-out), which is used to take the shifted data out.
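
The shifting behaviour described above can be modelled in a few lines. This is a behavioural sketch only, assuming a four-flop chain held in a Python list; the helper names `shift_once` and `shift_bits` are hypothetical.

```python
def shift_once(chain, scan_in):
    """One shift clock: every flop takes its predecessor's value.

    chain[0] is the flop fed by scan-in, chain[-1] drives scan-out.
    Returns the new chain state and the bit seen on scan-out.
    """
    scan_out = chain[-1]
    return [scan_in] + chain[:-1], scan_out


def shift_bits(chain, bits):
    """Shift a sequence of bits in, collecting what appears on scan-out."""
    observed = []
    for bit in bits:
        chain, scan_out = shift_once(chain, bit)
        observed.append(scan_out)
    return chain, observed


# Load a pattern into an initially cleared 4-flop chain.
loaded_chain, _ = shift_bits([0, 0, 0, 0], [1, 0, 1, 1])

# Unload it by shifting in zeros; the pattern comes out in the order it went in.
_, unloaded = shift_bits(loaded_chain, [0, 0, 0, 0])
print(loaded_chain, unloaded)
```

The first bit shifted in ends up in the last flop, so it is also the first bit to appear on scan-out during unload.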

Read More

Timing Models

For full flat chip timing analysis we need to read in the gate-level netlist along with SPEF/SDF, timing libraries and constraints. With this approach, designers must wait until all blocks are complete before performing full-chip timing.

Read More

Clock gating checks

A clock gating check is a constraint, either applied explicitly or inferred automatically by the tool, that ensures the clock propagates through the gating cell without any glitch.

Read More

Timing Budgeting

In hierarchical design flows, chip-level timing constraints must be mapped correctly to corresponding block-level constraints.

Read More

Signoff Methodology

STA can be run for many different scenarios. The three main variables that determine a scenario are:

  • Parasitic corners (RC interconnect corners and operating conditions used for parasitic extraction).
  • Operating mode.
  • PVT corner.
Read More

NLDM vs CCS

The cell timing models are intended to provide accurate timing for various instances of the cell in the design environment. The timing models are normally obtained from detailed circuit simulations of the cell to model the actual scenario of the cell operation. The timing models are specified for each timing arc of the cell.

Read More

Interconnect delay

Net delay is the difference between the time a signal is first applied to the net and the time it reaches the other devices connected to that net.

Read More

RC Corners

RC variation is also considered as a corner for setup and hold checks. RC variation can arise from the fabrication process, because the width of a metal layer can vary from the desired value.

Read More

When to specify a false path between different clock domains?

We need to understand whether the clock domains are related or independent of each other. It depends on whether there are any data paths that start from one clock domain and end in the other clock domain. If there are no such paths, you can safely conclude that the two clock domains are independent of each other. This means that there is no timing path that starts from one clock domain and ends in the other clock domain.

Read More

Via Pillar

Via pillar is a new technology that aims to reduce via resistance and increase electromigration robustness for enhanced performance.

Read More

Time borrowing for latch based designs

For flops, data arriving later than the capture clock edge causes a setup violation, whereas a latch remains transparent for the entire duration of the active clock phase, relaxing the arrive-before-edge criterion.
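
A minimal sketch of the borrowing arithmetic, assuming the commonly used limit that the maximum borrowable time is the active pulse width minus the latch setup time. The function name and all numbers are hypothetical, and real tools apply further adjustments (uncertainty, derates) that are ignored here.

```python
def latch_time_borrow(arrival, opening_edge, pulse_width, setup):
    """Return (borrowed_time, slack_after_borrow) for data arriving at a latch.

    Data arriving before the opening edge borrows nothing. Data arriving
    while the latch is transparent borrows (arrival - opening_edge), up to
    an assumed maximum of (pulse_width - setup).
    """
    max_borrow = pulse_width - setup
    needed = max(0.0, arrival - opening_edge)
    return min(needed, max_borrow), max_borrow - needed


# Illustrative numbers in ns: latch opens at 5.0, stays open 2.5, setup is 0.1.
borrowed, slack = latch_time_borrow(arrival=5.8, opening_edge=5.0, pulse_width=2.5, setup=0.1)
late_borrowed, late_slack = latch_time_borrow(arrival=7.6, opening_edge=5.0, pulse_width=2.5, setup=0.1)
print(borrowed, slack, late_borrowed, late_slack)
```

In the first case the path simply borrows 0.8 ns from the next stage; in the second the data arrives after the borrowing window has closed, so the slack is negative and the path fails.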

Read More

Constraint management for source synchronous designs

A source-synchronous interface outputs the clock in addition to the data constrained with it. This clock port is used to sample data at the receiver that connects to the interface. There are different categories of source-synchronous interfaces.

Read More

What is virtual clock and when to use it?

A virtual clock is a clock that exists but is not associated with any pin or port of the design. It is used as a reference in timing analysis to specify the input and output delays relative to a clock.

Read More

Understanding check_timing

clock_expected: This warning indicates a missing signal of type clock, often termed a clock phase, on a clock pin. If there is any active, i.e. non-disabled, sequential arc from or to the clock pin and the clock pin does not get a clock signal, then this warning is flagged.

Read More

Recommendations for fixing hold violations during PnR

Pre-CTS stage

Hold timing analysis can be performed early in the flow to identify hard macros which have large hold time requirements. Identifying these situations early allows planning for enough space to insert the required buffers and delay cells to fix them.

Read More

Metal ECO flow

A metal-only ECO is carried out by changing only the metal interconnects in the design. Metal-only ECOs are very common in today’s semiconductor industry as they avoid a complete silicon re-spin. Sometimes a minor change to the design is needed, for example because of a bug in the design or a customer demand. A metal-only ECO enables the design to be re-fabricated for only a few layers. It is very cost-effective, since a complete silicon re-spin may require around 100 layer masks to be manufactured. Metal-only ECOs enable the older masks to be reused for most of the layers. Only the layers with changes in them need to be manufactured again, which is usually 2 to 4 in the case of metal-only ECOs.

Read More

PPA tips and tricks

Performance, power and area (PPA) are the key metrics for evaluating any design on a given technology node. In this article we will go over various place-and-route techniques to achieve these key metrics.

Read More

Interconnect RC

Wire delay comes from two sources. One is the intrinsic, speed-of-light delay. The second is the lossy nature of on-chip wires; because the resistance of such wires is very high, the wires form RC circuits. Speed of light delay is proportional to the length of the wire, while RC delay increases with the square of the wire length. Thus, for long wires, RC delay dominates. For shorter wires, speed of light delay still dominates. However, such delay over a short wire is still relatively small compared to gate delay.
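
The linear versus quadratic scaling can be checked with a small calculation. This is a rough sketch under stated assumptions: the RC term uses the Elmore approximation for a distributed line (0.5·r·c·L²), and the propagation velocity and per-millimetre resistance and capacitance are made-up illustrative values, not data for any real process.

```python
def flight_delay_ps(length_mm, velocity_mm_per_ps=0.15):
    """Speed-of-light (time-of-flight) delay: proportional to length."""
    return length_mm / velocity_mm_per_ps


def rc_delay_ps(length_mm, r_ohm_per_mm=1000.0, c_ff_per_mm=200.0):
    """Elmore delay of a distributed RC wire: proportional to length squared.

    ohm * fF = 1e-15 s = 1e-3 ps, hence the final scale factor.
    """
    return 0.5 * r_ohm_per_mm * c_ff_per_mm * length_mm ** 2 * 1e-3


for length in (0.01, 0.1, 1.0, 2.0):
    print(f"{length:5.2f} mm  flight={flight_delay_ps(length):8.3f} ps"
          f"  rc={rc_delay_ps(length):8.3f} ps")
```

With these assumed values the two terms cross over at a wire length of roughly 0.07 mm; below that the time-of-flight term is larger, and above it the RC term grows quickly.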

Read More

DFT Modes

As the technology nodes are shrinking consistently, the probability of the occurrence of faults is also increasing which makes DFT an indispensable function for modern sub-micron SoCs.

Read More

MIMCAP and MOMCAP

MIM (Metal-Insulator-Metal) and MOM (Metal-Oxide-Metal) capacitors are both metal-to-metal capacitors.

Read More

How to perform timing check between asynchronous clock domains

If two clock domains are asynchronous and you have applied set_false_path between these two clocks, no timing checks can be performed. Also, if you have defined a clock group with asynchronous clocks using the set_clock_groups command with the -asynchronous option, by default the tool cannot perform a timing check. But if you use the -allow_paths option with the set_clock_groups command, timing checks can be performed.

Read More

Gate vs interconnect delay

Problem Statement: One path has 100% gate delay, second path has 50% gate delay and 50% interconnect delay, third path has 100% interconnect delay. These 3 different paths converged at a voltage and temperature. What would happen to each path if voltage and temperature changed? Which would be fast and which would be slow?

Read More

Clock Jitter

By definition, clock jitter is the deviation of a clock edge from its ideal position in time. Simply speaking, it is the inability of a clock source to produce a clock with clean edges. As the clock edge can arrive within a range, the difference between two successive clock edges will determine the instantaneous period for that cycle. So, clock jitter is of importance while talking about timing analysis. There are many causes of jitter including PLL loop noise, power supply ripples, thermal noise, crosstalk between signals etc. Let us elaborate the concept of clock jitter with the help of an example:
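
A small numeric sketch of the idea, assuming a nominal 10 ns clock and a handful of made-up edge offsets; the function name and values are hypothetical.

```python
def instantaneous_periods(nominal_period, edge_offsets):
    """Periods between successive edges when edge k lands at k*T + offset[k]."""
    edges = [k * nominal_period + offset for k, offset in enumerate(edge_offsets)]
    return [later - earlier for earlier, later in zip(edges, edges[1:])]


# Each offset is the deviation (in ns) of one clock edge from its ideal position.
periods = instantaneous_periods(10.0, [0.0, 0.1, -0.1, 0.05])
shortest_period = min(periods)
print([round(p, 3) for p in periods], round(shortest_period, 3))
```

The shortest instantaneous period is the one that matters for setup analysis, since it leaves the least time for data to propagate within that cycle.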

Read More

Clock Domain Crossing

What is Metastability?

Any discussion of clock domain crossing (CDC) should start with a basic understanding of metastability and synchronization. In layman’s terms, metastability refers to an unstable intermediate state, where the slightest disturbance will cause a resolution to a stable state. When applied to flip-flops in digital circuits, it means a state where the flip-flop’s output may not have settled to the final expected value.

Read More

Combating Congestion

Reasons for congestion

There can be multitude of reasons for congestion, with some reasons having a direct and others having an indirect impact. Let’s examine some.

Read More

Multi-bit flops and MIMCAP

Benefits of Multibit flops:

  • Area reduction because of shared transistors and transistor level optimized layout: Area of Multibit cell is less than two single bit cells because of transistor level optimization of cell layout, which includes shared logic, power-supply and substrate-well.
  • Total length of clock tree is reduced: This results in reduction of clock-tree buffers and clock-tree power. Clock-tree buffer level reduction improves overall balanced design skew.
  • Power reduction.
  • Better skew
Read More

Latency vs skew

Ways to achieve better latency (high latency means more power dissipation):

  • A good placement of flip-flops will reduce latency, i.e., the sink pins of a clock are not placed far away, so that latency is reduced.
  • Constrain the clock net routing to the upper metal layers, which are less resistive, using the TopPreferredLayer and BottomPreferredLayer constructs.
  • Using a mesh, it is possible to achieve lower insertion delays.
  • An H-tree reduces insertion delay: the combination of larger drivers and low-RC routing layers reduces the non-common-path clock insertion delay, potentially increasing the performance.
Read More

Design Rule Checks (DRC)

Here we discuss the various types of design rule check (DRC) violations, their causes, and how to fix them at lower technology nodes, in both block-level and full-chip implementation, while meeting the design rules of the latest technology standards.

Read More

Latch vs Flip Flop

Latch

  • A latch is faster (no need to wait for a clock edge) but less predictable (more prone to race conditions).
  • A latch uses less area (because it has fewer gates).
  • A latch is fast (a longer combinational path can be compensated by shorter paths in the subsequent logic stages). That is why, for high performance, circuit designers are turning to latch-based design.
  • For ASICs with large skew, latches have substantial benefits for reducing the clock period.
Read More

OCV, AOCV, POCV Part 2

One of the primary challenges is variation in manufacturing parameters, namely random and systematic variations. To model these parameter variations, a few engineers came up with the on-chip variation (OCV) model. The concept of OCV was first introduced in technology nodes above 90nm. The fundamental idea behind OCV is to apply global derates on the whole design irrespective of the type of cells, its individual variation or its slew-load conditions. But this simple concept became ineffective in lower technology nodes. Unfortunately, global derates make the design too optimistic for shorter paths and too pessimistic for longer paths. Subsequently, expected results are not accurate and reliable enough, which affects the performance of the chip.

Read More

Congestion

If the number of routing tracks available in a particular area is less than the number of routing tracks required, the area is said to be congested.
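
The definition amounts to a simple overflow calculation per region. The sketch below assumes the layout is divided into named regions, each with a required and an available track count; the region names and numbers are hypothetical.

```python
def routing_overflow(required_tracks, available_tracks):
    """Tracks demanded beyond what the region can supply (0 if it fits)."""
    return max(0, required_tracks - available_tracks)


def congested_regions(regions):
    """Map each congested region name to its overflow.

    regions: dict of name -> (required_tracks, available_tracks)
    """
    return {
        name: routing_overflow(required, available)
        for name, (required, available) in regions.items()
        if required > available
    }


regions = {
    "gcell_0_0": (12, 10),  # needs 2 more tracks than exist: congested
    "gcell_0_1": (8, 10),   # spare capacity
    "gcell_1_0": (10, 10),  # exactly full, not over
}
hot_spots = congested_regions(regions)
print(hot_spots)
```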

Read More

Physical DRC

  • Wire-to-wire spacing (min spacing)
  • Min width of wires
  • Via-to-via spacing
  • Notch avoidance

Read More

DPT and color conflict

Multiple patterning lithography (MPL) techniques have been used to extend 193nm lithography to the 22nm/14nm nodes, and possibly further, due to the delay of extreme ultraviolet (EUV) lithography and electron beam lithography (EBL). Generally speaking, MPL consists of double patterning lithography (DPL) and triple patterning lithography (TPL).

Read More

Antenna Effects

The accumulation of charge on isolated nodes of an integrated circuit during its processing is known as the antenna effect. This effect is also known as plasma-induced damage. The accumulated charge discharges through the thin gate oxide of the transistor, which may damage the transistor and degrade its performance.

Read More

Deep dive into multisource CTS

A Multisource Clock Tree System (MCTS) represents a novel clock distribution technology that fills the gap between a conventional clock tree and a clock mesh. A clock mesh delivers the best possible clock frequency, skew and OCV results, whereas a conventional clock tree delivers the lowest power consumption and the easiest flow.

Read More

Difference between CTS, MultiSource CTS and Mesh

There are four key differences between conventional CTS, multisource CTS, and clock mesh: shared path, mesh fabric, design complexity, and timing analysis. Each subsequent section discusses each of the three clock distribution methods with respect to these key differences.

Read More

Introduction to SDC!

The Synopsys Design Constraints (SDC) format is used to specify the design intent, including timing, power and area constraints for a design. This format is used by different EDA tools to synthesize and analyse a design. SDC is based on the tool command language (Tcl).

Read More

Lockup Latch

Why do we need a lockup latch, and how do we benefit from it? A lockup latch is nothing more than a transparent latch used intelligently in places where clock skew is very large and meeting hold timing is a challenge due to a large uncommon clock path. That is why lockup latches are used to connect two flops in a scan chain that have excessive clock skew or uncommon clock paths, as the probability of hold failure is high in such cases.

Read More

Understanding various steps in CTS

The clock opt engine starts with an initialization step, which checks for prerequisites and placement overlaps before building the clock tree network. The quality of the clock tree is highly dependent on the quality of the input given to the tool. For example, a very tight transition target may lead to heavy buffering, whereas a relaxed transition target may lead to substantial violations at signoff. Hence, it is important to understand the flow inputs that are passed to the tool.

Read More

Skew groups

The feature of balancing more than one group of clock pins is similar to the capability of local skew or useful skew in clock tree synthesis. For example, you can group clock sinks that are not related to timing critical paths in a skew group and relax the target skew goals of the skew group. You can also give separate goals of target early delay to different skew groups and achieve the similar effect of useful skew to improve the slack of critical timing paths.

Read More

Inverter vs Buffer based clock tree

A buffer is nothing but two inverters connected back to back. Does it make any difference whether CTS is done using buffers or inverters? What are the pros and cons, and what factors would a backend design engineer consider while building a clock tree? These questions will be answered here.

Read More

Halo vs blockage

Blockages are specific locations where the placement of cells is prevented or blocked. They act as guidelines for placing standard cells in the design. Blockages are of the following types:

Read More

Introduction to Mesh, H-Tree and Multi-Tap Clock

Usually with regular CTS, the clock tree is balanced primarily considering a slow corner. When timed in a fast corner, the delays of different cell sizes or cell types may scale differently from one another and differently from the RC delay of the connecting wires, leading to skew. This may make setup and/or hold timing closure harder.

Read More

Part2 on ICG cells

PD compilers do not balance ICG cells because they are not synchronization points on the clock tree. ICG cells are intermediate points on the clock tree, so their clock arrival times cannot be balanced. CTS balances the flip-flops located downstream.

Read More

Handling Power in EDA Flow

DC low power flow: The size_only attribute is set on all inserted isolation cells, level shifters, and enable registers to prevent them from being optimized away. This ensures that the isolation and level-shifting functions between power domains are maintained throughout the flow.

Read More

Power Reduction Methods - Part1

As power becomes increasingly significant in advanced technologies, more power reduction design methods are being explored. There are several different RTL and gate-level design strategies for reducing power. Clock gating is one of them, and the most popular. Another is dynamic voltage and frequency scaling, but it is less popular as it is difficult to design and implement.

Read More

Power Intent Concepts

In the UPF language, a power domain is a group of elements in the design that share a common set of power supply needs.

Read More

Low Power Design Strategies

Power consumption has become a very important factor in recent process nodes. In this article we will explore/discuss the increasing challenges of power consumption and various design strategies to reduce power consumption.

Read More

PPA comparison of 8kx32 vs two 4kx32 SRAMs

Let us first compare the power of the two designs. For the 8kx32 SRAM, there will be one read clock and one write clock. But if we divide it into two, we need more clocks, and we might see more power dissipated in the second design than in the first.

Read More

Unconstrained Paths

Unconstrained paths are paths that have no timing constraints specified for them, such as set_input_delay, create_clock, etc.

Read More

MultiCycle Paths!

If no multicycle path is defined, then the default setup check happens one clock cycle after launch and the hold check happens at the same clock edge as launch. This looks something like below:
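
The default check edges can be sketched as a small calculation. This is a simplified model, assuming launch and capture on the same clock with the launch edge at time 0, and following the usual SDC convention for set_multicycle_path (a setup multiplier N moves the capture edge to N periods and the hold check to N-1 periods, and a hold multiplier M moves the hold check back by M periods). The function name is hypothetical.

```python
def check_edges(period, setup_multiplier=1, hold_multiplier=0):
    """Return (setup_capture_edge, hold_capture_edge) relative to launch at t=0."""
    setup_edge = setup_multiplier * period
    hold_edge = (setup_multiplier - 1 - hold_multiplier) * period
    return setup_edge, hold_edge


period = 10.0  # ns, illustrative

default_edges = check_edges(period)                                      # no multicycle
setup_only = check_edges(period, setup_multiplier=2)                     # setup 2 only
setup_and_hold = check_edges(period, setup_multiplier=2, hold_multiplier=1)
print(default_edges, setup_only, setup_and_hold)
```

With no multicycle defined, the setup check lands one period after launch and the hold check on the launch edge itself, matching the default behaviour described above.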

Read More

Different Setup and Hold fix methods!

Ways to fix setup and hold timing violations

Setup time is defined as the minimum amount of time before the clock’s active edge by which the data must be stable for it to be latched correctly. Any violation in this required time causes incorrect data to be captured and is known as a setup violation.

Read More

Flat vs Hier Designs!

Introduction

The disadvantages of flat runs are longer run times, higher memory requirements, and the limitation of EDA tools in handling designs greater than a certain gate count. The design under discussion had two blocks. So effectively we had two blocks and a chip_top to pay attention to. Figure 1 is the floorplan in such a case. As you can see, there are two huge partitions in the design. It can also be seen that there is a huge channel in between the blocks. Also, there are certain design requirements, such as that signals from either partition should not cross over the other partition.

Read More