Lecture 05a - ⏳ Clock Skew Scheduling Under Process Variations

📝 Abstract

The main topic of the lecture is clock skew scheduling under process variations. The lecture discusses various techniques and methods for optimizing clock skew to improve circuit performance or minimize timing failures.

The lecture begins with an overview of the problem and background of clock skew scheduling. It then explains the concept of clock skew and the difference between zero skew and useful skew designs. The importance of meeting timing constraints, such as setup time and hold time, is discussed, along with the potential problems that can occur if these constraints are violated.

The lecture presents various approaches to clock skew scheduling, such as traditional scheduling, yield-driven scheduling, and minimum cost-to-time ratio formulation. It also examines various methods for finding the optimal clock period and the corresponding skew schedule, including linear programming and the use of the Bellman-Ford algorithm.

The lecture goes on to discuss primitive solutions and their shortcomings, such as pre-allocating timing margins and using the Least Center Error Square (LCES) problem formulation. The lecture also introduces more advanced techniques such as slack maximization (EVEN) and prop-based methods that distribute slack along the most timing-critical cycle based on Gaussian models. The drawbacks of these methods are highlighted, particularly their assumptions about gate delay distributions.

Finally, statistical static timing analysis (SSTA) and the use of statistical methods to account for process variations are discussed. The concept of the most critical cycle is introduced, and the lecture provides experimental results to demonstrate the effectiveness of various clock skew scheduling techniques.

🔑 Keywords

Static timing analysis, STA 静态时序分析
Statistical STA 统计静态时序分析
Clock skew 时钟偏差/偏斜
Zero skew design 零偏差设计
- Critical paths 关键路径
- Negative slack 负时序裕量
Useful skew design 有效偏差设计
- Critical cycles 关键环
- Negative cycles 负环
Clock skew scheduling ⏳ (CSS) 时钟偏差安排/规划
Yield-driven CSS 产品率驱动时钟偏差安排

🗺️ Overview

Background
Problem formulation
Traditional clock skew scheduling ⏳
Yield-driven clock skew scheduling ⏳
Minimum cost-to-time ratio formulation

Sequential Logic

Local data path

Sequential Logic (cont'd)

Graph

Clock Skew

$T_{skew} (i, f) = t_{i} - t_{f}$ , where
- $t_{i}$ : clock signal delay at the initial register
- $t_{f}$ : clock signal delay at the final register

Timing Constraint

Setup time constraint: $T_{skew} (i, f) \leq T_{CP} - D_{i f} - T_{setup} = u_{i f}$ . If this constraint is violated, a cycle time violation (zero clocking) occurs.
Hold time constraint: $T_{skew} (i, f) \geq T_{hold} - d_{i f} = l_{i f}$ . If this constraint is violated, a race condition (double clocking) occurs.
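As a small illustration of the two constraints, the permissible skew range $[l_{i f}, u_{i f}]$ of one local data path can be computed directly. This is only a sketch: the function name and the numbers are made up for the example.

```python
def feasible_skew_region(t_cp, d_max, d_min, t_setup=0.0, t_hold=0.0):
    """Return (l_if, u_if), the permissible range of T_skew(i, f).

    d_max and d_min play the roles of D_if and d_if (max./min. path delay).
    """
    upper = t_cp - d_max - t_setup  # setup bound u_if
    lower = t_hold - d_min          # hold bound l_if
    return lower, upper


# Illustrative numbers only.
lower, upper = feasible_skew_region(
    t_cp=10.0, d_max=8.0, d_min=3.0, t_setup=0.5, t_hold=0.5
)
print(lower, upper)
```

Any skew chosen inside this interval satisfies both constraints for this path; zero skew is just one such choice when the interval contains 0.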

Zero skew vs. Useful skew

Zero skew ( $t_{i} = t_{f}$ ) : Relatively easy to implement.
Useful skew. Improve:
- The performance of the circuit by permitting a higher maximum clock frequency, or
- The safety margins of the clock skew within the permissible ranges.
Max./min. path delays are obtained from static timing analysis (STA).

Timing Constraint Graph

Create a graph by
- replacing the hold time constraint with an h-edge with cost $- (T_{hold} - d_{ij})$ from $FF_{i}$ to $FF_{j}$ , and
- replacing the setup time constraint with an s-edge with cost $T_{CP} - D_{ij} - T_{setup}$ from $FF_{j}$ to $FF_{i}$ .
Two sets of constraints follow from the definition of clock skew:
- The sums of skews along any two paths with the same starting and ending flip-flops must be equal;
- The sum of clock skews around every cycle must be zero.

Timing Constraint Graph (TCG)

Example circuit

Timing Constraint Graph (TCG)

Assume $T_{setup} = T_{hold}$ = 0

A clock period $T_{CP}$ is feasible if and only if the corresponding graph contains no negative-cost cycles.

TCG

Minimize Clock Period

Linear programming (LP) formulation

$\begin{array}{ll} \text{minimize} & T_{CP} \\ \text{subject to} & l_{ij} \leq T_{i} - T_{j} \leq u_{ij} \end{array}$

where $FF_{i}$ and $FF_{j}$ are sequentially adjacent.
The above constraints form a so-called system of difference constraints (see Introduction to Algorithms, MIT):
👉 Note: it is easy to check whether a feasible solution exists by detecting a negative cycle, for example with the Bellman-Ford algorithm.

Basic Bellman-Ford Algorithm

function BellmanFord(list vertices, list edges, vertex source)
    // Step 1: initialize graph
    for each vertex v in vertices:
        if v is source then u[v] := 0
        else u[v] := inf
        predecessor[v] := null

    // Step 2: relax edges repeatedly
    for k from 1 to size(vertices)-1:
        for each edge (i, j) with weight d[i,j] in edges:
            if u[j] > u[i] + d[i,j]:
                u[j] := u[i] + d[i,j]
                predecessor[j] := i

    // Step 3: check for negative-weight cycles
    for each edge (i, j) with weight d[i,j] in edges:
        if u[j] > u[i] + d[i,j]:
            error "Graph contains a negative-weight cycle"
    return u[], predecessor[]
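For feasibility checking only the negative-cycle test matters, so a runnable variant can be quite short. The sketch below is an assumption-laden adaptation rather than the pseudocode above verbatim: it starts every vertex at distance 0 (equivalent to adding a virtual source with zero-weight edges to all vertices), so no explicit source node is needed, and it stops early when a pass changes nothing.

```python
def has_negative_cycle(num_vertices, edges):
    """edges: list of (i, j, weight). Return True if a negative cycle exists."""
    # Distance 0 everywhere acts like a virtual source connected to all vertices.
    dist = [0.0] * num_vertices
    for _ in range(num_vertices):
        changed = False
        for i, j, w in edges:
            if dist[i] + w < dist[j]:
                dist[j] = dist[i] + w
                changed = True
        if not changed:
            return False  # converged, hence no negative cycle
    # Still relaxing after num_vertices passes: a negative cycle must exist.
    return True


negative = has_negative_cycle(3, [(0, 1, 1.0), (1, 2, -1.0), (2, 0, -1.0)])
balanced = has_negative_cycle(3, [(0, 1, 1.0), (1, 2, -1.0), (2, 0, 0.0)])
print(negative, balanced)
```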

Problems with Bellman-Ford Algorithm

The algorithm was originally designed for finding shortest paths.
Detecting a negative cycle is just a by-product of the algorithm.
The algorithm is simple, but it...
- detects a negative cycle only at the end.
- has to compute all d[i,j].
- restarts from the initialization u[i] := inf on every call.
- requires the input graph to have a source node.

Many improvements have been proposed.

Minimize clock period (I)

Fast algorithm for solving the LP:
- Use binary search to find the minimum clock period.
- In each iteration, the Bellman-Ford algorithm is called to check whether the timing constraint graph contains a negative-weight cycle.
👉 Note: the Bellman-Ford algorithm was originally designed to find shortest paths in a graph.
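The binary search can be sketched as follows. The data layout (`paths` as tuples of register indices and min./max. delays), the search interval and the tolerance are assumptions made for this example; the edge costs follow the h-edge and s-edge definitions of the timing constraint graph.

```python
def has_negative_cycle(n, edges):
    """Bellman-Ford style check with all distances initialised to 0."""
    dist = [0.0] * n
    for _ in range(n):
        changed = False
        for i, j, w in edges:
            if dist[i] + w < dist[j]:
                dist[j] = dist[i] + w
                changed = True
        if not changed:
            return False
    return True


def min_clock_period(n, paths, t_setup=0.0, t_hold=0.0, lo=0.0, hi=100.0, tol=1e-6):
    """paths: list of (i, j, d_min, d_max), one per local data path FF_i -> FF_j.

    Assumes lo is infeasible and hi is feasible.
    """
    def feasible(t_cp):
        edges = []
        for i, j, d_min, d_max in paths:
            edges.append((i, j, d_min - t_hold))          # h-edge: -(T_hold - d_ij)
            edges.append((j, i, t_cp - d_max - t_setup))  # s-edge
        return not has_negative_cycle(n, edges)

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Two registers in a loop; the zero-skew period would be 4 (the longest path).
paths = [(0, 1, 1.0, 4.0), (1, 0, 1.0, 2.0)]
t_cp = min_clock_period(2, paths)
print(round(t_cp, 3))
```

In this toy circuit the two setup edges form a cycle of cost $2 T_{CP} - 6$, so the search converges to a period of 3, below the zero-skew period of 4.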

Minimize clock period (II)

When the optimal clock period is found, the corresponding skew schedule is obtained at the same time.
However, many skew values lie on the bounds of their feasible ranges.

Timing uncertainty emerges under process variations

Yield-driven Clock Skew Scheduling

As process variations grow, timing-failure-induced yield loss becomes a significant problem.
Yield-driven clock skew scheduling therefore becomes important.
Its primary goal is to minimize the yield loss rather than the clock period.

Timing Yield Definition

A circuit is called functionally correct if all setup- and hold-time constraints are satisfied under a given set of deterministic process parameters.
Timing yield = (number of functionally correct samples) / (total number of samples) × 100%
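This definition suggests a simple Monte Carlo estimate. The sketch below assumes, purely for illustration, that each max./min. path delay is perturbed by an independent Gaussian with a common standard deviation `sigma`; real process variations are correlated, and all names here are hypothetical.

```python
import random


def timing_yield(paths, arrival, t_cp, sigma, samples=2000, seed=0,
                 t_setup=0.0, t_hold=0.0):
    """Estimate timing yield in percent.

    paths: list of (i, f, d_min, d_max) with nominal delays.
    arrival: clock arrival time per register, so T_skew(i, f) = arrival[i] - arrival[f].
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(samples):
        ok = True
        for i, f, d_min, d_max in paths:
            skew = arrival[i] - arrival[f]
            big_d = rng.gauss(d_max, sigma)    # sampled max. path delay
            small_d = rng.gauss(d_min, sigma)  # sampled min. path delay
            if skew > t_cp - big_d - t_setup or skew < t_hold - small_d:
                ok = False
                break
        correct += ok
    return 100.0 * correct / samples


paths = [(0, 1, 1.0, 4.0), (1, 0, 1.0, 2.0)]
useful = timing_yield(paths, [0.0, 0.75], t_cp=3.5, sigma=0.1)
zero = timing_yield(paths, [0.0, 0.0], t_cp=3.5, sigma=0.1)
print(useful, zero)
```

With a 3.5 clock period the zero-skew schedule fails almost every sample in this toy circuit, while a schedule centred in the feasible range passes most of them.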

Primitive solution (1)

Pre-allocate timing margins (usually equal to the maximum timing uncertainty) at both ends of each feasible skew region (FSR): $l_{ij} \leq s_{ij} \leq u_{ij} ⟹ l_{ij} + Δ d \leq s_{ij} \leq u_{ij} - Δ d$
Then perform clock period optimization.

Problems with this method

The maximum timing uncertainty is too pessimistic, so some performance is lost;
$Δ d$ is fixed; it does not consider data path delay differences between cycle edges.

📑 References (1)

"Clock skew optimization", IEEE Trans. Computers, 1990
"A graph-theoretic approach to clock skew optimization", ISCAS'94
"Cycle time and slack optimization for VLSI-chips", ICCAD'99
"Clock scheduling and clocktree construction for high performance Asics", ICCAD'03
"ExtensiveSlackBalance: an Approach to Make Front-end Tools Aware of Clock Skew Scheduling", DAC'06

Primitive solution (2)

Formulate as LCES (Least Center Error Square) problem
- A simple observation suggests that, to maximize slack, skew values should be chosen as close as possible to the middle points of their FSR's.
$l_{ij} + l m_{k} (u_{ij} - l_{ij}) \leq s_{ij} \leq u_{ij} - u m_{k} (u_{ij} - l_{ij})$

$\begin{array}{ll} \text{minimize} & \sum_{k} (0.5 - \min (l m_{k}, u m_{k}))^{2} \\ \text{subject to} & 0 \leq l m_{k} \leq 0.5 \\ & 0 \leq u m_{k} \leq 0.5 \end{array}$

📑 References (2)

Graph-based algorithm
- (J. L. Neves and E. G. Friedman, "Optimal Clock Skew Scheduling Tolerant to Process Variations", DAC'96)
Quadratic Programming method
- (I. S. Kourtev and E. G. Friedman, "Clock skew scheduling for improved reliability via quadratic programming", ICCAD'99)

Shortcoming: may drive some slacks to zero in order to minimize the total CES, which is not optimal for yield.

Primitive solution (3)

Incremental Slack Distribution
- (Xinjie Wei, Yici Cai and Xianlong Hong, "Clock skew scheduling under process variations", ISQED'06)
Advantage: checks all skew constraints
Disadvantage: does not take path delay differences into consideration

Minimum Mean Cycle Based

Even: solve the slack optimization problem using a minimum mean cycle formulation.
Prop: distribute slack along the most timing-critical cycle proportional to path delays
FP-Prop: use sensitizable-critical-path search algorithm for clock skew scheduling.

Slack Maximization (EVEN)

Slack Maximization Scheduling

$\begin{array}{ll} \text{maximize} & t \\ \text{subject to} & T_{j} - T_{i} \leq μ_{ij} - t \end{array}$
Equivalent to the so-called minimum mean cycle problem (MMC), where $t^{*} = \frac{1}{|C|} \sum_{(i, j) \in C} μ_{ij}$ and $C$ is the critical cycle (first negative cycle).
Can be solved efficiently by the above method.

Even - iterative slack optimization

Identify the circuit's most timing-critical cycle,
Distribute the slack along the cycle,
Freeze the clock skews on the cycle, and
Repeat the process iteratively.

Most timing-critical cycle

Identify the timing-critical cycle

Identify the circuit's most timing-critical cycle
Solve the minimum mean-weight cycle problem by
- Karp's algorithm
- A. Dasdan and R.K.Gupta, "Faster Maximum and Minimum Mean Cycle Algorithms for System-Performance", TCAD'98.
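A compact version of Karp's algorithm is sketched below. It uses the common variant that starts every vertex at weight 0 (a virtual source connected to all vertices), which is an implementation choice of this example; the graph and its weights are illustrative.

```python
import math


def min_mean_cycle(n, edges):
    """Karp's algorithm. edges: list of (i, j, weight). Returns the minimum cycle mean."""
    inf = math.inf
    # d[k][v] = minimum weight of a walk with exactly k edges ending at v.
    d = [[inf] * n for _ in range(n + 1)]
    d[0] = [0.0] * n
    for k in range(1, n + 1):
        for i, j, w in edges:
            if d[k - 1][i] + w < d[k][j]:
                d[k][j] = d[k - 1][i] + w
    best = inf
    for v in range(n):
        if d[n][v] == inf:
            continue  # no walk of n edges ends here
        worst = max((d[n][v] - d[k][v]) / (n - k)
                    for k in range(n) if d[k][v] < inf)
        best = min(best, worst)
    return best  # inf if the graph has no cycle


# Cycle 0->1->2->0 has mean 6/3 = 2, cycle 0->1->0 has mean 6/2 = 3.
edges = [(0, 1, 2.0), (1, 2, 3.0), (2, 0, 1.0), (1, 0, 4.0)]
t_star = min_mean_cycle(3, edges)
print(t_star)
```

When the edge weights are the $μ_{ij}$ of the timing constraint graph, the returned value plays the role of $t^{*}$, the slack that can be given evenly to every edge of the most timing-critical cycle.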

Distribute the slack

Distribute the slack evenly along the most timing-critical cycle.

Freeze the clock skews (I)

Replace the critical cycle with a super vertex.

Freeze the clock skews (II)

To determine the optimal slacks and skews for the rest of the graph, we replace the critical cycle with a super vertex.

Repeat the process (I)

Repeat the process (II)

Final result

$Skew_{12}$ = 0.75
$Skew_{23}$ = -0.25
$Skew_{31}$ = -0.5
$Slack_{12}$ = 1.75
$Slack_{23}$ = 1.75
$Slack_{31}$ = 1

where $Slack_{ij} = T_{CP} - D_{ij} - T_{setup} - Skew_{ij}$

Problems with Even

Even assumes that all variances are the same.
However, the timing uncertainty of a long combinational path is usually larger than that of a shorter path.
Therefore, the even slack distribution along timing-critical cycles performed by Even is not optimal for yield if data path delays along the cycles are different.

Prop-Based on Gaussian model (I)

Assuming there are $n$ gates with delay $N (μ, σ^{2})$ in a path, then this path delay is $N (n μ, n σ^{2})$
Distribute slack along the most timing-critical cycle, according to the square root of each edge's path delays (???).
To achieve this, update the weights of the s-edges and h-edges:
- s-edge: $T_{CP} - (D_{ij} + α \sqrt{D_{ij}} σ) - T_{setup}$
- h-edge: $- T_{hold} + (d_{ij} - α \sqrt{d_{ij}} σ)$
where $α$ ensures a minimum timing margin for each timing constraint.

Prop-Based on Gaussian model (II)

Given a specific clock period $T_{CP}$ , we gradually increase $α$ and use the Bellman-Ford algorithm to detect whether it is still feasible.
After finding the maximum $α$ , the edges along the most timing-critical cycle will have slacks equal to the pre-allocated timing margins.
Many edges in a circuit have sufficiently large slack. Therefore, we can perform proportional slack distribution only for the most timing-critical cycle. Assign the rest of skews using Even.

Problems with Prop

Assumes all gate delays have the same distribution.
Does not justify using the square root of the path delay as the timing margin.

FP-Prop (I)

: False path

FP-Prop (II)

If we do not consider false paths, some non-timing-critical cycles appear timing-critical. More slack is then distributed to these cycles, while the slack in the truly timing-critical cycles is insufficient. As a result, the overall timing yield decreases.

Problems with FP-Prop

Same problems as Prop

🧪 Experimental Results

📈 Statistical Method

Setup time constraint

$T_{skew} (i, f) \leq T_{CP} - \tilde{D}_{i f} - T_{setup}$
Hold time constraint

$T_{skew} (i, f) \geq T_{hold} - \tilde{d}_{i f}$

where $\tilde{D}_{i f}$ and $\tilde{d}_{i f}$ are random variables under process variations.

📈 Statistical TC Graph

After SSTA, each edge weight is represented as a pair of values (mean, variance).

Most Critical Cycle

Traditional criterion: minimum mean cycle $\min_{C \in \mathcal{C}} \frac{\sum_{(i, j) \in C} μ_{ij}}{|C|}$
New criterion: $\min_{C \in \mathcal{C}} \frac{\sum_{(i, j) \in C} μ_{ij}}{\sum_{(i, j) \in C} σ_{ij}}$

(We show the correctness later)

Slack Maximization (C-PROP)

Slack Maximization Scheduling

$\begin{array}{ll} \text{maximize} & t \\ \text{subject to} & T_{j} - T_{i} \leq μ_{ij} - σ_{ij} t \end{array}$
Equivalent to the minimum cost-to-time ratio problem (MCR), where:
- $t^{*} = \sum_{(i, j) \in C} μ_{ij} / \sum_{(i, j) \in C} σ_{ij}$
- $C$ : critical cycle (first negative cycle)

Probability Observation

Prob(timing failure) turns out to be an error function that depends solely on this ratio. It is therefore justified to use this ratio as the criticality criterion.
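One way to read this observation, assuming each edge weight is an independent Gaussian $N (μ_{ij}, σ_{ij}^{2})$ : if the constraint bound of an edge is set to $μ_{ij} - σ_{ij} t$ , the probability that the edge is satisfied is $\frac{1}{2} (1 + erf (t / \sqrt{2}))$ , which depends on $t$ only. The sketch below checks this numerically on a made-up two-edge cycle; the function names are illustrative.

```python
import math
from statistics import NormalDist


def pass_probability(mu, sigma, bound):
    """Pr{ W >= bound } for W ~ N(mu, sigma^2)."""
    return 1.0 - NormalDist(mu, sigma).cdf(bound)


def pass_probability_from_ratio(t):
    """Same probability written with the error function; depends on t only."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


# Illustrative critical cycle with two edges.
mus = [3.0, 1.0]
sigmas = [0.5, 1.5]
t_star = sum(mus) / sum(sigmas)                        # cost-to-time ratio
bounds = [m - s * t_star for m, s in zip(mus, sigmas)]
probs = [pass_probability(m, s, b) for m, s, b in zip(mus, sigmas, bounds)]
print(t_star, bounds, probs)
```

The bounds sum to zero around the cycle, as skews must, and both edges end up with the same pass probability even though their variances differ.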

Whole flow

After determining the clock arrival time at each vertex in the most critical cycle, the cycle is replaced with a super vertex $v^{'}$ .
In-edge $(u, v)$ from outside vertex $u$ to cycle member $v$ is replaced by an in-edge $(u, v^{'})$ with weight mean $μ (u, v) - T_{v}$ .
Out-edge $(v, u)$ is replaced by out-edge $(v^{'}, u)$ with weight mean $μ (v, u) + T_{v}$ . However, the variance of the edge weight is not changed, and parallel edges may remain.
Repeat the process iteratively until the graph is reduced to a single super vertex or no edges remain.
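The contraction step can be sketched as a pure function on an edge list. The tuple layout `(src, dst, mean, sigma)` and the vertex names are assumptions made for this example.

```python
def contract_cycle(edges, cycle_times, super_vertex):
    """Replace the frozen cycle by a super vertex.

    edges: list of (src, dst, mean, sigma).
    cycle_times: {vertex: T_v} for the vertices on the critical cycle.
    """
    new_edges = []
    for src, dst, mean, sigma in edges:
        src_in = src in cycle_times
        dst_in = dst in cycle_times
        if src_in and dst_in:
            continue  # edge inside the cycle: its skew is already frozen
        if dst_in:    # in-edge (u, v) -> (u, v') with mean mu(u, v) - T_v
            new_edges.append((src, super_vertex, mean - cycle_times[dst], sigma))
        elif src_in:  # out-edge (v, u) -> (v', u) with mean mu(v, u) + T_v
            new_edges.append((super_vertex, dst, mean + cycle_times[src], sigma))
        else:
            new_edges.append((src, dst, mean, sigma))
    return new_edges  # parallel edges are kept, variances are unchanged


edges = [("a", "b", 1.0, 0.1), ("b", "c", 2.0, 0.2),
         ("c", "b", 3.0, 0.3), ("c", "d", 4.0, 0.4)]
reduced = contract_cycle(edges, {"b": 0.0, "c": 0.5}, "s")
print(reduced)
```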

Data structure

Final result: $T_{1} = T_{1} + T_{s_{1}} + T_{s_{3}}$

Advantages of This Method

Justified by probability observation.
Fast algorithms exist for the minimum cost-to-time ratio problem.
Reduces to Even when all variances are equal.
When a variance tends to zero, it makes sense that only minimal slack is assigned to this variable, and hence others can be assigned more.

Results

📑 Main Reference

Jeng-Liang Tsai, Dong Hyum Baik, Charlie Chung-Ping Chen, and Kewal K. Saluja, "Yield-Driven, False-Path-Aware Clock Skew Scheduling", IEEE Design & Test of Computers, May-June 2005

Lecture 05b - ⏳ Clock Skew Scheduling Under Process Variations (2)

🗺️ Overview

A Review of CSS Issues
General Formulation
Yield-driven Clock Skew Scheduling
Numerical Results

Minimum Clock Period Problem

Linear programming (LP) formulation

$\begin{array}{ll} \text{minimize} & T_{CP} \\ \text{subject to} & l_{ij} \leq T_{i} - T_{j} \leq u_{ij} \end{array}$

where $FF_{i}$ and $FF_{j}$ are sequentially adjacent to each other.
The above constraints are called a system of difference constraints (see Introduction to Algorithms, MIT):
- Key: it is easy to check if a feasible solution exists by detecting negative cycles using the Bellman-Ford algorithm.

System of Difference Constraints

In some cases, you may need to do some transformations, e.g.
- $T_{i} \leq \min_{k} \{ T_{k} + a_{ik} \} \to T_{i} - T_{k} \leq a_{ik}, \forall k$
- $T_{i} \geq \max_{k} \{ T_{k} + b_{ik} \} \to b_{ik} \leq T_{i} - T_{k}, \forall k$

Slack Maximization (EVEN)

Slack Maximization Scheduling

$\begin{array}{ll} \text{maximize} & t \\ \text{subject to} & T_{j} - T_{i} \leq μ_{ij} - t \end{array}$ (👉 Note: $μ_{ij}$ is not equal to $μ_{ji}$ )
is equivalent to the so-called minimum mean cycle problem (MMC), where:
- $t^{*} = \sum_{(i, j) \in C} μ_{ij} / |C|$ ,
- $C$ : critical cycle (first negative cycle)
Can be efficiently solved by the parametric shortest path methods.

Slack Maximization (C-PROP)

Slack Maximization Scheduling

$\begin{array}{ll} \text{maximize} & t \\ \text{subject to} & T_{j} - T_{i} \leq μ_{ij} - σ_{ij} t \end{array}$ (we show the correctness later)
is equivalent to the minimum cost-to-time ratio problem (MCR), where:
- $t^{*} = \sum_{(i, j) \in C} μ_{ij} / \sum_{(i, j) \in C} σ_{ij}$ ,
- $C$ : critical cycle

General Formulation

General form: $\begin{array}{ll} \text{maximize} & g (t) \\ \text{subject to} & T_{i} - T_{j} \leq f_{ij} (t), \forall (i, j) \in E \end{array}$ where $f_{ij} (t)$ is a linear function that represents the various problems defined above.

Problem	$g (t)$	$f_{ij} (t)$ (setup)	$f_{ji} (t)$ (hold)
Min. CP	$- t$	$t - D_{ij} - T_{setup}$	$- T_{hold} + d_{ij}$
EVEN	$t$	$T_{CP} - D_{ij} - T_{setup} - t$	$- T_{hold} + d_{ij} - t$
C-PROP	$t$	$T_{CP} - D_{ij} - T_{setup} - σ_{ij} t$	$- T_{hold} + d_{ij} - σ_{ij} t$

General Formulation (cont'd)

In fact, $g (t)$ and $f_{ij} (t)$ are not necessarily linear functions. Any monotonic decreasing function will do.
Theorem: if $g (t)$ and $f_{ij} (t)$ are monotonic decreasing functions for all $i$ and $j$ , then there is a unique solution to the problem (proof given later).
Question 1: Does this generalization have any application?
Question 2: What if $g (t)$ and $f_{ij} (t)$ are convex but not monotone?

🔕 Non-Gaussian Distribution

At 65 nm and below, the path delay is likely to have a non-Gaussian distribution:

👉 Note: central limit theorem does not apply because
- random variables are correlated (why?)
- delays are non-negative

Timing Yield Maximization

Formulation:
- $\max \{ \min \{ \Pr \{ T_{j} - T_{i} \leq \tilde{W}_{ij} \} \} \}$
- This is not exactly the timing yield, but it is a reasonable objective.
It is equivalent to:

$\begin{array}{ll} \text{maximize} & t \\ \text{subject to} & T_{i} - T_{j} \leq T_{CP} - F_{ji}^{- 1} (t) \\ & T_{j} - T_{i} \leq F_{ij}^{- 1} (1 - t) \end{array}$

where $F_{ij} (\cdot)$ is the CDF of $\tilde{W}_{ij}$ .
Luckily, any CDF must be a monotonic increasing function.

📈 Statistical Interpretations of C-PROP

Reduces to C-PROP when $\tilde{W}_{ij}$ is Gaussian, or more precisely when

$F_{ij} (x) = K ((x - μ_{ij}) / σ_{ij})$
EVEN: identical distribution up to shifting

$F_{ij} (x) = H (x - μ_{ij})$

Not necessarily worse than C-PROP

⚖️ Comparison

Three Solving Methods in General

Binary search based
- Local convergence is slow.
Cycle based
- Idea: if a solution is infeasible, there exists a negative cycle which can always be "zero-out" with minimum effort (proof of optimality)
Path based
- Idea: if a solution is feasible, there exists a (shortest) path from where we can always improve the solution.

Parametric Shortest Path Algorithms

Lawler's algorithm (binary search)
Howard's algorithm (based on cycle cancellation)
Hybrid method
Improved Howard's algorithm
Input:
- Interval [tmin, tmax] that includes t*
- Tol: tolerance
- G(V, E): timing graph
Output:
- Optimal t* and its corresponding critical cycle C

Lawler's Algorithm

@startuml
 while ((tmax - tmin) > tol)
    : t := (tmin + tmax) / 2;
    if (a neg. cycle C under t exists) then
       : tmax := t;
    else
       : tmin := t;
    endif
 endwhile
 : t* := t;
@enduml
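Lawler's bisection can be sketched in Python as follows. The oracle is a plain Bellman-Ford negative-cycle check, and the edge cost is passed in as a function `weight(data, t)`; both the interface and the example graph are assumptions of this sketch. It returns `tmin`, the last value known to be feasible, rather than the final midpoint.

```python
def has_negative_cycle(n, edges):
    """Bellman-Ford style check with all distances initialised to 0."""
    dist = [0.0] * n
    for _ in range(n):
        changed = False
        for i, j, w in edges:
            if dist[i] + w < dist[j]:
                dist[j] = dist[i] + w
                changed = True
        if not changed:
            return False
    return True


def lawler(n, edges, weight, tmin, tmax, tol=1e-9):
    """edges: list of (i, j, data); weight(data, t) evaluates f_ij(t).

    Assumes tmin is feasible and tmax is infeasible.
    """
    while tmax - tmin > tol:
        t = (tmin + tmax) / 2
        if has_negative_cycle(n, [(i, j, weight(d, t)) for i, j, d in edges]):
            tmax = t  # t is too large
        else:
            tmin = t  # t is still feasible
    return tmin


# EVEN on a small graph: f_ij(t) = mu_ij - t; the minimum cycle mean is 2.
edges = [(0, 1, 2.0), (1, 2, 3.0), (2, 0, 1.0), (1, 0, 4.0)]
t_star = lawler(3, edges, lambda mu, t: mu - t, tmin=0.0, tmax=10.0)
print(round(t_star, 6))
```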

Howard's Algorithm

@startuml
 : t := tmax;
 while (a neg. cycle C under t exists)
    : find t' such that
      sum{(i,j) in C | fij(t')} = 0;
    : t := t';
 endwhile
 : t* := t;
@enduml
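For linear edge functions $f_{ij} (t) = μ_{ij} - σ_{ij} t$ (EVEN with $σ_{ij} = 1$ , C-PROP otherwise), the step "find t' such that the cycle sum is zero" has the closed form $t' = \sum μ_{ij} / \sum σ_{ij}$ . The sketch below uses that assumption, together with a Bellman-Ford variant that returns the offending cycle; the small tolerance `eps` guards against rounding noise and is an implementation choice of this example.

```python
def find_negative_cycle(n, edges, eps=1e-12):
    """edges: list of (i, j, w). Return edge indices of a negative cycle, or None."""
    dist = [0.0] * n
    pred = [None] * n  # index of the edge that last relaxed each vertex
    x = None
    for _ in range(n):
        x = None
        for e, (i, j, w) in enumerate(edges):
            if dist[i] + w < dist[j] - eps:
                dist[j] = dist[i] + w
                pred[j] = e
                x = j
        if x is None:
            return None
    for _ in range(n):  # step back n times to land on the cycle
        x = edges[pred[x]][0]
    cycle, v = [], x
    while True:
        e = pred[v]
        cycle.append(e)
        v = edges[e][0]
        if v == x:
            return cycle


def howard(n, edges, tmax):
    """edges: list of (i, j, mu, sigma) with f_ij(t) = mu - sigma * t."""
    t = tmax
    while True:
        weighted = [(i, j, mu - sigma * t) for i, j, mu, sigma in edges]
        cycle = find_negative_cycle(n, weighted)
        if cycle is None:
            return t
        # Zero out the cycle: sum(mu) - t' * sum(sigma) = 0.
        t = sum(edges[e][2] for e in cycle) / sum(edges[e][3] for e in cycle)


even = howard(3, [(0, 1, 2.0, 1.0), (1, 2, 3.0, 1.0),
                  (2, 0, 1.0, 1.0), (1, 0, 4.0, 1.0)], tmax=10.0)
cprop = howard(3, [(0, 1, 2.0, 1.0), (1, 2, 3.0, 2.0),
                   (2, 0, 1.0, 1.0), (1, 0, 4.0, 0.5)], tmax=10.0)
print(even, cprop)
```

Each iteration strictly lowers `t` to the ratio of some cycle, so the loop stops after finitely many cycles have been cancelled.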

Hybrid Method

@startuml
 while ((tmax - tmin) > tol)
    : t := (tmin + tmax) / 2;
    if (a neg. cycle C under t exists) then
       : find t' such that
         sum{(i,j) in C | fij(t')} = 0;
       : t := t';
       : tmax := t;
    else
       : tmin := t;
    endif
 endwhile
 : t* := t;
@enduml

Improved Howard's Algorithm

@startuml
 : t := (tmin + tmax) / 2;
 while (no neg. cycle under t)
    : tmin := t;
    : t := (tmin + tmax) / 2;
 endwhile
 while (a neg. cycle C under t exists)
    : find t' such that
      sum{(i,j) in C | fij(t')} = 0;
    : t := t';
 endwhile
 : t* := t;
@enduml


⏳ Clock Skew Scheduling for Unimodal Distributed Delay Models

@luk036

2022-10-26

Useful Skew Design: Why and Why not?

Bad 👎:

Needs more engineer training.
Balanced clock-trees are harder to build.
Don't know how to handle process variation, multi-corner multi-mode, ..., etc.

Good 👍:

If you do it right,

spend less time struggling with timing, or
get better chip performance or yield.

What can modern STA tools do today?

Manually assign clock arrival times to registers (all zeros by default)
Grouping: Non-critical parts can be grouped as a single unit. In other words, there is no need for full-chip optimization.
Takes care of multi-cycle paths, slew rate, clock-gating, false paths etc. All we need are the reported slacks.
Provide 3-sigma statistics for slacks/path delays (POCV).
However, the full probability density function and correlation information are not available.

Unimodality

In statistics, a unimodal probability distribution or unimodal distribution is a probability distribution with a single peak.
In continuous distributions, unimodality can be defined through the behavior of the cumulative distribution function (cdf). If the cdf is convex for $x < m$ and concave for $x > m$ , then the distribution is unimodal, $m$ being the mode.
Examples
- Normal distribution
- Log-normal distribution
- Log-logistic distribution
- Weibull distribution

Quantile function

The quantile function $z_{p}$ of a distribution is the inverse of the cumulative distribution function $Φ^{- 1} (p)$ .
Closed-form expressions for some unimodal distributions:
- Normal: $μ + σ \sqrt{2} erf^{- 1} (2 p - 1)$
- Log-normal: $exp (μ + σ \sqrt{2} erf^{- 1} (2 p - 1))$
- Log-logistic: $α (\frac{p}{1 - p})^{1/ β}$
- Weibull: $λ (- ln (1 - p))^{1/ k}$
For log-normal distribution:
- mode: $exp (μ - σ^{2})$
- CDF at mode: $1/2 (1 + erf (- σ / \sqrt{2}))$
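These quantile functions are easy to evaluate with the Python standard library. Since `math` has no inverse error function, the sketch below uses `statistics.NormalDist().inv_cdf(p)`, which equals $\sqrt{2} erf^{- 1} (2 p - 1)$ ; the function names are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sigma 1


def normal_quantile(p, mu, sigma):
    return mu + sigma * STD_NORMAL.inv_cdf(p)


def lognormal_quantile(p, mu, sigma):
    return math.exp(normal_quantile(p, mu, sigma))


def loglogistic_quantile(p, alpha, beta):
    return alpha * (p / (1 - p)) ** (1 / beta)


def weibull_quantile(p, lam, k):
    return lam * (-math.log(1 - p)) ** (1 / k)


def lognormal_cdf_at_mode(sigma):
    """CDF of a log-normal distribution evaluated at its mode exp(mu - sigma^2)."""
    return 0.5 * (1 + math.erf(-sigma / math.sqrt(2)))


for p in (0.1, 0.5, 0.9):
    print(p, normal_quantile(p, 5.0, 0.5), lognormal_quantile(p, 1.0, 0.2),
          loglogistic_quantile(p, 2.0, 3.0), weibull_quantile(p, 2.0, 1.5))
```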

Normal vs. Log-normal Delay Model

Normal/Gaussian:

Convertible to a linear network optimization problem.
Supported over the whole real line. Negative delays are possible.
Symmetric, obviously not adaptable to the 3-sigma results.

Log-normal:

Non-linear, but still can be solved efficiently with network optimization.
Supported only on the positive side.
Non-symmetric, may be able to fit into the 3-sigma results. (???)

Setup- and Hold-time Constraints

Let $T_{skew} (i, f) = t_{i} - t_{f}$ , where
- $t_{i}$ : clock signal delay at the initial register
- $t_{f}$ : clock signal delay at the final register
- Assume that in zero-skew design, i.e. $T_{skew} (i, f) = 0$ , the reported setup- and hold-time slacks are $S_{i f}$ and $H_{i f}$ respectively.
Then, in useful skew design:
- $T_{skew} (i, f) \leq S_{i f} ⟹ t_{i} - t_{f} \leq S_{i f}$
- $T_{skew} (i, f) \geq - H_{i f} ⟹ t_{f} - t_{i} \leq H_{i f}$
In principle, $H_{i f}$ and $T_{CP} - S_{i f}$ represent the minimum- and maximum-path delays, and should always be greater than zero.
Let $D_{i f} = T_{CP} - S_{i f}$

Yield-driven Optimization

Max-Min Formulation:
- $\max \{ \min \{ \Pr \{ t_{j} - t_{i} \leq \tilde{W}_{ij} \} \} \}$ ,
- No need for correlation information between paths.
- Not exactly the timing yield objective but reasonable.
Equivalent to:

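$\begin{array}{ll} \text{maximize} & β \\ \text{subject to} & \Pr \{ t_{i} - t_{j} \leq T_{CP} - D_{ij} \} \geq β \\ & \Pr \{ t_{j} - t_{i} \leq H_{ij} \} \geq β \end{array}$

$\begin{array}{ll} \text{maximize} & β \\ \text{subject to} & t_{i} - t_{j} \leq T_{CP} - Φ_{D_{ij}}^{- 1} (β) \\ & t_{j} - t_{i} \leq Φ_{H_{ij}}^{- 1} (1 - β) \end{array}$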

Yield-driven Optimization (cont'd)

In general, Lawler's algorithm (binary search) can be used.
Depending on the distribution, there are several other ways to solve the problem.

Gaussian Delay Model

Reduce to:

$\begin{array}{ll} \text{maximize} & β \\ \text{subject to} & t_{i} - t_{j} \leq T_{CP} - (μ_{ij}^{D} + σ_{ij}^{D} \sqrt{2} erf^{- 1} (2 β - 1)) \\ & t_{j} - t_{i} \leq μ_{ij}^{H} + σ_{ij}^{H} \sqrt{2} erf^{- 1} (2 (1 - β) - 1) \end{array}$

Linearization. Since $erf^{- 1} (\cdot)$ is anti-symmetric and monotonic, we have:

$\begin{array}{ll} \text{maximize} & β^{'} \\ \text{subject to} & t_{i} - t_{j} \leq T_{CP} - μ_{ij}^{D} - σ_{ij}^{D} β^{'} \\ & t_{j} - t_{i} \leq μ_{ij}^{H} - σ_{ij}^{H} β^{'} \end{array}$

This is equivalent to the (linear) minimum cost-to-time ratio problem.
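The change of variable behind the linearization is $β^{'} = \sqrt{2} erf^{- 1} (2 β - 1)$ , i.e. the standard normal quantile of $β$ . A short numerical check, with made-up means and standard deviations and illustrative function names, confirms that the linear bounds agree with the quantile-based ones and shows how to map an optimal $β^{'}$ back to $β$ .

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta_prime(beta):
    """beta' = sqrt(2) * erfinv(2*beta - 1), the standard normal quantile of beta."""
    return STD_NORMAL.inv_cdf(beta)


def beta_from_beta_prime(bp):
    """Inverse map: recover the probability level from beta'."""
    return STD_NORMAL.cdf(bp)


def gaussian_bounds(bp, t_cp, mu_d, sigma_d, mu_h, sigma_h):
    """Right-hand sides of the linearized setup and hold constraints."""
    return t_cp - mu_d - sigma_d * bp, mu_h - sigma_h * bp


beta = 0.9
bp = beta_prime(beta)
setup_bound, hold_bound = gaussian_bounds(bp, t_cp=8.0, mu_d=5.0, sigma_d=0.5,
                                          mu_h=2.0, sigma_h=0.3)
print(bp, setup_bound, hold_bound, beta_from_beta_prime(bp))
```

Because the map from $β$ to $β^{'}$ is monotonic increasing, maximizing $β^{'}$ and maximizing $β$ select the same schedule.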
However, actual path delay distributions are non-Gaussian.

Log-normal Delay Model

Reduce to:

$\begin{array}{ll} \text{maximize} & β \\ \text{subject to} & t_{i} - t_{j} \leq T_{CP} - exp (μ_{ij}^{D} + σ_{ij}^{D} \sqrt{2} erf^{- 1} (2 β - 1)) \\ & t_{j} - t_{i} \leq exp (μ_{ij}^{H} + σ_{ij}^{H} \sqrt{2} erf^{- 1} (2 (1 - β) - 1)) \end{array}$

Since $erf^{- 1} (\cdot)$ is anti-symmetric and monotonic, we have:

$\begin{array}{ll} \text{maximize} & β^{'} \\ \text{subject to} & t_{i} - t_{j} \leq T_{CP} - exp (μ_{ij}^{D} + σ_{ij}^{D} β^{'}) \\ & t_{j} - t_{i} \leq exp (μ_{ij}^{H} - σ_{ij}^{H} β^{'}) \end{array}$

This bypasses evaluating the error function. The problem is non-linear and non-convex, but it can still be solved efficiently, for example by binary search on $β^{'}$ .

Weibull Delay Model

Reduce to:

$\begin{array}{ll} \text{maximize} & β \\ \text{subject to} & t_{i} - t_{j} \leq T_{CP} - λ_{ij}^{D} (- ln (1 - β))^{1/ k_{ij}^{D}} \\ & t_{j} - t_{i} \leq λ_{ij}^{H} (- ln (β))^{1/ k_{ij}^{H}} \end{array}$
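The two right-hand sides are cheap to evaluate, and both shrink as $β$ grows, which is what makes a binary search on $β$ applicable. The sketch below only evaluates the bounds; the parameter values and the function name are made up for illustration.

```python
import math


def weibull_bounds(beta, t_cp, lam_d, k_d, lam_h, k_h):
    """Right-hand sides of the setup and hold constraints for Weibull delays."""
    setup_bound = t_cp - lam_d * (-math.log(1 - beta)) ** (1 / k_d)
    hold_bound = lam_h * (-math.log(beta)) ** (1 / k_h)
    return setup_bound, hold_bound


params = dict(t_cp=10.0, lam_d=6.0, k_d=4.0, lam_h=2.0, k_h=3.0)
loose = weibull_bounds(0.5, **params)
tight = weibull_bounds(0.9, **params)
print(loose, tight)
```

A higher target probability tightens both constraints, so feasibility is monotone in $β$ and the largest feasible $β$ can be bracketed by bisection.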

Algorithms for Design-for-Manufacturability