Network Optimization: Quick Start

📝 Abstract

This lecture serves as an introductory guide to the algorithms used to solve network optimization problems. It covers several important concepts and techniques for both beginners and advanced users. The lecture begins by explaining how to explore the locality and associativity of a network, solve discrete optimization problems, and gain insight into critical parts of the network's cut and cycle. It then delves into basic concepts such as nodes, edges, orientation, the node-edge incidence matrix, and the boundary operator. It then explains the flow and potential of a network. It also examines feasibility problems and provides examples such as clock skew scheduling and delay padding. The lecture concludes with guidelines for algorithm developers and average users, suggesting special handling of multi-edges and techniques for finding negative cycles and cuts. Overall, the lecture provides a quick start guide to network optimization, covering important algorithms and concepts needed to tackle such problems.

📖 Introduction

Why and why not

👍 Algorithms are available for common network problems (Python: networkx, C++: Boost Graph Library (BGL)):
- Explore the locality of the network.
- Explore associativity (things can be added up in any order)
👍 Be able to solve discrete problems optimally (e.g. matching/assignment problems)
👍 Bonus: gives you insight into the most critical parts of the network (critical cut/cycle)
👎 The theory is hard to understand.
👎 Algorithms are hard to understand (some algorithms do not allow users to have an input flow in reverse directions, but create edges internally for the reverse flows).
👎 There are too many algorithms available. You have to choose them wisely.

Flow and Potential

Cut
Current
Flow $x$
Sum of $x_{ij}$ around a node = 0

flow

Cycle/Path
Voltage
Tension $y$
Sum of $y_{ij}$ around a cycle = 0

potential

If you don't know more...

For the min-cost linear flow problem, the best guess is to use the "network simplex algorithm".
For the min-cost linear potential problem: formulate it as a dual (flow) problem.
For the parametric potential problem (single parameter), the best guess is to use Howard's algorithm.
All these algorithms are based on the idea of finding a "negative cycle".
You can apply the same principle to nonlinear problems.

For dual problems...

Dual problems can be solved by applying the same principle.
Finding negative cycles is replaced by finding negative "cuts", which is more difficult...
...unless your network is a planar graph.

Guidelines for the average users

Look for specialized algorithms for specialized problems. For example, for bipartite maximum cardinality matching, use the Hopcroft-Karp matching algorithm.
Avoid creating edges with infinite costs. Delete them or reformulate your problem.

Guidelines for algorithm developers

Make "negative cycles" as orthogonal to each other as possible.
Reuse previous solutions as a new starting point for finding negative cycles.

💡 Essential Concepts

Basic elements of a network

Definition (network)

A network is a collection of finite-dimensional vector spaces, which includes nodes and edges/arcs:

$V = \{v_{1}, v_{2}, \dots, v_{N}\}$, where $|V| = N$
$E = \{e_{1}, e_{2}, e_{3}, \dots, e_{M}\}$, where $|E| = M$

which satisfies 2 requirements:

The boundary of each edge is comprised of the union of nodes
The intersection of any edges is either empty or the boundary node of both edges.

Network

By this definition, a network can contain self-loops and multi-edges.
A graph structure encodes the neighborhood information of nodes and edges.
Note that Python's NetworkX requires special handling of multi-edges.
The most efficient graph representation is an adjacency list.
The concept of a graph can be generalized to a complex: nodes, edges, faces, ...
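
A minimal sketch of an adjacency-list representation that keeps self-loops and multi-edges distinct by giving every edge its own id. The class name `Network` and its methods are hypothetical, chosen only for illustration; they are not the API of NetworkX or BGL.

```python
from collections import defaultdict


class Network:
    """Adjacency-list network; parallel edges and self-loops are allowed."""

    def __init__(self):
        self.edges = []                      # edge id -> (source, target)
        self.out_edges = defaultdict(list)   # node -> ids of edges leaving it
        self.in_edges = defaultdict(list)    # node -> ids of edges entering it

    def add_edge(self, source, target):
        """Add an oriented edge and return its id."""
        eid = len(self.edges)
        self.edges.append((source, target))
        self.out_edges[source].append(eid)
        self.in_edges[target].append(eid)
        return eid

    def neighbors(self, node):
        """Targets of the edges leaving `node` (repeated for multi-edges)."""
        return [self.edges[eid][1] for eid in self.out_edges[node]]


net = Network()
net.add_edge("a", "b")
net.add_edge("a", "b")   # a second, parallel edge
net.add_edge("b", "b")   # a self-loop
print(net.neighbors("a"))
```

Storing edge ids rather than bare neighbor names is what lets per-edge data (cost, capacity, flow) stay unambiguous when two edges share the same endpoints.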

Types of graphs

Bipartite graphs, trees, planar graphs, st-graphs, complete graphs.

Orientation

Definition (Orientation)

An orientation of an edge is an ordering of its boundary nodes $(s, t)$ , where

$s$ is called a source/initial node
$t$ is called a target/terminal node

👉 Note: orientation != direction

Definition (Coherent)

Two orientations are called coherent if they are the same.

Node-edge Incidence Matrix (connect to algebra!)

Definition (Incidence Matrix)

An $N \times M$ matrix $A^{T}$ is a node-edge incidence matrix with entries:

$A(i, j) = \begin{cases} +1 & \text{if } e_{i} \text{ is coherent with the orientation of node } v_{j}, \\ -1 & \text{if } e_{i} \text{ is not coherent with the orientation of node } v_{j}, \\ 0 & \text{otherwise.} \end{cases}$

Example:

$A^{T} = \begin{bmatrix} 0 & -1 & 1 & 1 & 0 \\ 1 & 1 & 0 & -1 & -1 \\ -1 & 0 & -1 & 0 & 1 \end{bmatrix}$
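
As a rough illustration, a node-edge incidence matrix can be assembled from an edge list. The sign convention here (-1 at the source node, +1 at the target node) and the 3-node, 5-edge network are assumptions made for this sketch.

```python
def incidence_matrix(num_nodes, edges):
    """Return A^T as a list of N rows and M columns.

    Assumed convention: -1 at the source node of an edge, +1 at its target.
    A self-loop contributes -1 + 1 = 0 to its single node.
    """
    a_t = [[0] * len(edges) for _ in range(num_nodes)]
    for e, (source, target) in enumerate(edges):
        a_t[source][e] -= 1
        a_t[target][e] += 1
    return a_t


# Hypothetical network: nodes 0, 1, 2 and five oriented edges (source, target).
edges = [(2, 1), (0, 1), (2, 0), (1, 0), (1, 2)]
A_T = incidence_matrix(3, edges)
for row in A_T:
    print(row)
```

Every column sums to zero because each edge leaves exactly one node and enters exactly one node.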

Chain

Definition (Chain $τ$ )

An edge/node chain $τ$ is an $M$ / $N$ -tuple of scalars that assigns a coefficient to each edge/node, where $M$ / $N$ is the number of distinct edges/nodes in the network.

Remark (II)

A chain may be viewed as an (oriented) indicator vector representing a set of edges/nodes.

Example (II)

$[0, 0, 1, 1, 1]$ , $[0, 0, 1, - 1, 1]$

Discrete Boundary Operator

Definition (Boundary operator)

The boundary operator $\partial = A^{T}$ .

Definition (Cycle)

A chain is said to be a cycle if it is in the null-space of the boundary operator, i.e. $A^{T} τ = 0$ .
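
The cycle test $A^{T} τ = 0$ can be checked without forming the matrix. This is a sketch under an assumed sign convention (-1 at the source, +1 at the target) on a hypothetical 3-node, 5-edge network; the function names are illustrative.

```python
def boundary(num_nodes, edges, chain):
    """Apply the boundary operator A^T to an edge chain."""
    result = [0] * num_nodes
    for (source, target), coeff in zip(edges, chain):
        result[source] -= coeff
        result[target] += coeff
    return result


def is_cycle(num_nodes, edges, chain):
    """A chain is a cycle when its boundary vanishes at every node."""
    return all(value == 0 for value in boundary(num_nodes, edges, chain))


# Hypothetical oriented edges (source, target) on nodes 0, 1, 2.
edges = [(2, 1), (0, 1), (2, 0), (1, 0), (1, 2)]
print(boundary(3, edges, [0, 0, 1, 1, 1]))    # non-zero boundary: not a cycle
print(is_cycle(3, edges, [0, 0, 1, -1, 1]))   # signs follow the orientation
```

The coefficient -1 in the second chain means the fourth edge is traversed against its orientation, which is why an oriented indicator vector is needed.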

Definition (Boundary)

A chain $β$ is said to be a boundary of $τ$ if it is in the range of the boundary operator.

Co-boundary Operator $d$

Definition (Co-boundary operator)

The co-boundary (or differential) operator $d = \partial^{*} = (A^{T})^{*} = A$

👉 Note

The dimension of the null space of $A$ equals the number of connected components of the graph.
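
One way to see this: a potential that is constant on each connected component has zero difference across every edge. The sketch below finds the components with a small union-find and checks that each component indicator is mapped to zero by $A$. The sign convention $y_{e} = u_{target} - u_{source}$ and the example network are assumptions.

```python
def components(num_nodes, edges):
    """Label each node with a representative of its connected component."""
    parent = list(range(num_nodes))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for source, target in edges:
        parent[find(source)] = find(target)
    return [find(v) for v in range(num_nodes)]


def coboundary(edges, potential):
    """Apply A to a node chain: y_e = u[target] - u[source] (assumed sign)."""
    return [potential[t] - potential[s] for s, t in edges]


# Hypothetical network with two components: {0, 1, 2} and {3, 4}.
edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
labels = components(5, edges)
roots = sorted(set(labels))
# One indicator vector per component; each lies in the null space of A.
indicators = [[1 if lab == r else 0 for lab in labels] for r in roots]
print(len(roots), [coboundary(edges, c) for c in indicators])
```

The check only exhibits one null vector per component; that these vectors span the whole null space is the statement of the note, not something the code proves.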

Discrete Stokes' Theorem

Let $τ_{i} = \begin{cases} 1 & \text{if } e_{i} \in S, \\ 0 & \text{otherwise.} \end{cases}$
Conventional (integration): $\int_{S} d ω = \oint_{\partial S} ω$
Discrete (pairing): $[τ, A ω] = [A^{T} τ, ω]$

Fundamental Theorem of Calculus

Conventional (integration): $\int_{a}^{b} f (t) d t = F (b) - F (a)$
Discrete (pairing): $[τ_{1}, A c^{0}] = [A^{T} τ_{1}, c^{0}]$

stokes

Divergence and Flow

Definition (Divergence)

$div x = A^{T} x$

Definition (Flow)

$x$ is called a flow if $\sum div x = 0$ , where all negative entries of (div $x$ ) are called sources and positive entries are called sinks.

Definition (Circulation)

A flow is called a circulation if it has no source or sink. In other words, $div x = 0$

Tension and Potential

Definition (Tension)

A tension (in co-domain) $y$ is a differential of a potential $u$ , i.e. $y = A u$ .

Theorem (Tellegen's)

Flow and tension are bi-orthogonal: for any circulation $x$ (i.e. $A^{T} x = 0$ ) and any tension $y = A u$ , we have $x^{T} y = 0$ .

Proof

$0 = [A^{T} x, u] = (A^{T} x)^{T} u = x^{T} (A u) = x^{T} y$
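
A small numerical check of this orthogonality, assuming the sign convention $y_{e} = u_{target} - u_{source}$. The network, the circulation `x` and the potential `u` are made-up values for illustration.

```python
def divergence(num_nodes, edges, x):
    """div x = A^T x with -1 at an edge's source and +1 at its target."""
    div = [0] * num_nodes
    for (source, target), flow in zip(edges, x):
        div[source] -= flow
        div[target] += flow
    return div


def tension(edges, u):
    """y = A u, i.e. y_e = u[target] - u[source]."""
    return [u[t] - u[s] for s, t in edges]


edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
x = [1, 1, 3, 2]     # a circulation: divergence is zero at every node
u = [5, -1, 7]       # an arbitrary potential
y = tension(edges, u)
inner = sum(xe * ye for xe, ye in zip(x, y))
print(divergence(3, edges, x), y, inner)
```

The inner product vanishes for any choice of `u`, as long as `x` has zero divergence.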

Path

A path indicator vector $τ$ of $P$ is defined by $τ_{i} = \begin{cases} 1 & \text{if } e_{i} \in P, \\ 0 & \text{otherwise.} \end{cases}$

Theorem

[total tension $y$ on $P$ ] = [total potential on the boundary of $P$ ].

Proof

$y^{T} τ = (A u)^{T} τ = u^{T} (A^{T} τ) = u^{T} (\partial P)$ .
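
The identity can be checked on a toy path. This sketch assumes $y_{e} = u_{target} - u_{source}$ and uses invented potentials; only the two end nodes of the path survive in $\partial P$.

```python
def tension(edges, u):
    """y = A u, i.e. y_e = u[target] - u[source] (assumed sign convention)."""
    return [u[t] - u[s] for s, t in edges]


def boundary(num_nodes, edges, chain):
    """Apply A^T to an edge chain."""
    result = [0] * num_nodes
    for (source, target), coeff in zip(edges, chain):
        result[source] -= coeff
        result[target] += coeff
    return result


edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
u = [4, 9, 2, 7]        # hypothetical node potentials
tau = [1, 1, 1, 0]      # path P: 0 -> 1 -> 2 -> 3
y = tension(edges, u)
total_tension = sum(ye * te for ye, te in zip(y, tau))
dP = boundary(4, edges, tau)
boundary_potential = sum(ui * bi for ui, bi in zip(u, dP))
print(total_tension, dP, boundary_potential)
```

The intermediate potentials cancel in a telescoping sum, which is the discrete analogue of the fundamental theorem of calculus.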

Cut

Let $S$ and $S^{'}$ be two node sets, where $S^{'}$ is the complement of $S$ , i.e. $V - S$ . A cut $Q$ is an edge set, denoted by $[S, S^{'}]^{-}$ . A cut indicator vector $q$ (oriented) of $Q$ is defined as $A c$ , where $c_{i} = \begin{cases} 1 & \text{if } v_{i} \in S, \\ 0 & \text{otherwise.} \end{cases}$

Theorem (Stokes' theorem!)

[Total divergence of $x$ on $S$ ] = [total $x$ across $Q$ ].

Proof

$(div x)^{T} c = (A^{T} x)^{T} c = x^{T} (A c) = x^{T} q$ .
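
A numerical sanity check of this identity on a made-up network, assuming -1 at an edge's source and +1 at its target. The set `S`, the edge list and the flow values are illustrative only.

```python
def divergence(num_nodes, edges, x):
    """div x = A^T x."""
    div = [0] * num_nodes
    for (source, target), flow in zip(edges, x):
        div[source] -= flow
        div[target] += flow
    return div


def cut_vector(edges, c):
    """q = A c: +1 for edges entering S, -1 for edges leaving S, else 0."""
    return [c[t] - c[s] for s, t in edges]


edges = [(0, 1), (1, 2), (2, 3), (0, 3), (3, 1)]
x = [2, 3, 1, 4, 1]
c = [1, 1, 0, 0]                       # indicator of S = {0, 1}
div = divergence(4, edges, x)
q = cut_vector(edges, c)
lhs = sum(d * ci for d, ci in zip(div, c))    # total divergence on S
rhs = sum(xe * qe for xe, qe in zip(x, q))    # total flow across Q
print(div, q, lhs, rhs)
```

Edges with both ends inside `S` (or both outside) get a zero entry in `q`, so only the flow crossing the cut contributes.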

Examples

cut

Feasibility Problems

Feasible Flow/Potential Problem

Feasible Flow Problem

Find a flow $x$ such that: $c^{-} \leq x \leq c^{+}, A^{T} x = b, b (V) = 0.$
Can be solved using:
- Painted network algorithm
- If no feasible solution, return a "negative cut".

Feasible Potential Problem:

Find a potential $u$ such that: $d^{-} \leq y \leq d^{+}, \; A \cdot u = y.$
Can be solved using:
- Bellman-Ford algorithm
- If no feasible solution, return a "negative cycle".

Examples

Genome-scale reaction network (primal)

$A$ : Stoichiometric matrix $S$
$x$ : reactions between metabolites/proteins
$c^{-} \leq x \leq c^{+}$ : constraints on reaction rates

Timing constraints (co-domain)

$A^{T}$ : incidence matrix of timing constraint graph
$u$ : arrival time of clock
$y$ : clock skew
$d^{-} \leq y \leq d^{+}$ : setup- and hold-time constraints

Feasibility Flow Problem

Theorem (feasibility flow)

The problem has a feasible solution if and only if $b (S) \leq c^{+} (Q)$ for all cuts $Q = [S, S^{'}]$ where $c^{+} (Q)$ = upper capacity [1, p. 56].

Proof (necessity: a feasible flow implies the cut condition)

Let $q = A \cdot k$ be a cut vector (oriented) of $Q$ . Then

$c^{-} \leq x \leq c^{+}$
$q^{T} x \leq c^{+} (Q)$
$(A \cdot k)^{T} x \leq c^{+} (Q)$
$k^{T} A^{T} x \leq c^{+} (Q)$
$k^{T} b \leq c^{+} (Q)$
$b (S) \leq c^{+} (Q)$

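Similarly, for the feasibility potential problem, let $τ$ be the indicator vector of a cycle $P$ , so that $\partial P = A^{T} τ = 0$ . Then

$d^{-} \leq y \leq d^{+}$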
$τ^{T} y \leq d^{+} (P)$
$τ^{T} (A \cdot u) \leq d^{+} (P)$
$(A^{T} τ)^{T} u \leq d^{+} (P)$
$(\partial P)^{T} u \leq d^{+} (P)$
$0 \leq d^{+} (P)$

Remarks

The converse direction (the cut condition implies a feasible flow) is proved constructively, by exhibiting an algorithm that obtains the feasible solution.
$d^{+}$ could be $\infty$ or zero, etc.
$d^{-}$ could be $- \infty$ or zero, etc.
$c^{+}$ could be $\infty$ or zero, etc.
$c^{-}$ could be $- \infty$ or zero, etc.

Note: most tools require $c^{-}$ to be zero, so that the solution flow $x$ is always non-negative.

Convert to the elementary problem

By splitting every edge into two, the feasibility flow problem can reduce to an elementary one:
- Find a flow $x$ such that
  
  $c \leq x, A_{1}^{T} x = b_{1}, b_{1} (V_{1}) = 0.$
  
  where $A_{1}$ is the incidence matrix of the modified network.

Original:

original

Modified:

modified

Convert to the elementary problem

By adding a reverse edge for every edge, the feasibility potential problem can reduce to an elementary one:
- Find a potential $u$ such that
  
  $y_{2} \leq d, A_{2} u = y_{2}$
  
  where $A_{2}$ is the incidence matrix of the modified network.
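
The reverse-edge construction can be sketched as follows, assuming the tension on edge $(i, j)$ is $y_{ij} = u_{j} - u_{i}$. A lower bound $d^{-} \leq u_{j} - u_{i}$ is rewritten as $u_{i} - u_{j} \leq -d^{-}$ on the reverse edge. The function name and the tuple layout are hypothetical.

```python
def to_elementary(constraints):
    """Turn two-sided bounds into one-sided edges (i, j, d): u[j] - u[i] <= d.

    Each constraint (i, j, lo, hi) means lo <= u[j] - u[i] <= hi.
    The upper bound stays on edge (i, j); the lower bound becomes the
    upper bound -lo on the reverse edge (j, i).
    Infinite bounds produce no edge at all.
    """
    inf = float("inf")
    edges = []
    for i, j, lo, hi in constraints:
        if hi < inf:
            edges.append((i, j, hi))
        if lo > -inf:
            edges.append((j, i, -lo))
    return edges


constraints = [(0, 1, 2, 5), (1, 2, float("-inf"), 3)]
print(to_elementary(constraints))
```

Skipping infinite bounds, rather than storing an edge with infinite weight, follows the earlier guideline about avoiding edges with infinite costs.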

Original:

original2

Modified:

modified2


Basic Bellman-Ford Algorithm

function BellmanFord(list vertices, list edges, vertex source)
   // Step 1: initialize graph
   for each vertex i in vertices:
       if i is source then u[i] := 0
       else u[i] := inf
       predecessor[i] := null

   // Step 2: relax edges repeatedly
   for k from 1 to size(vertices)-1:   // k only counts passes; it must not shadow the edge endpoint i
       for each edge (i, j) with weight d in edges:
           if u[j] > u[i] + d[i,j]:
               u[j] := u[i] + d[i,j]
               predecessor[j] := i

   // Step 3: check for negative-weight cycles
   for each edge (i, j) with weight d in edges:
       if u[j] > u[i] + d[i,j]:
           error "Graph contains a negative-weight cycle"
   return u[], predecessor[]
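
A runnable sketch of the same idea in Python, adapted to the feasibility potential problem: it either returns a potential satisfying every constraint $u_{j} - u_{i} \leq d_{ij}$ or the nodes of a negative cycle. Two choices here are assumptions that differ from the pseudocode above: all potentials start at 0 (equivalent to a virtual source joined to every node by a zero-weight edge), and the function returns the cycle instead of raising an error.

```python
def bellman_ford(num_nodes, edges):
    """Solve u[j] - u[i] <= d for every edge (i, j, d).

    Returns (u, None) if feasible, or (None, cycle) where `cycle` lists the
    nodes of a negative cycle in traversal order.
    """
    u = [0] * num_nodes
    pred = [None] * num_nodes
    last = None
    for _ in range(num_nodes):
        last = None
        for i, j, d in edges:
            if u[i] + d < u[j]:        # constraint violated: relax it
                u[j] = u[i] + d
                pred[j] = i
                last = j
        if last is None:               # a full pass without updates
            return u, None
    # Still updating after N passes: step back N times to land on the cycle.
    v = last
    for _ in range(num_nodes):
        v = pred[v]
    cycle = [v]
    w = pred[v]
    while w != v:
        cycle.append(w)
        w = pred[w]
    cycle.reverse()
    return None, cycle


feasible = [(0, 1, 2), (1, 0, -1), (1, 2, 3)]
u, _ = bellman_ford(3, feasible)
infeasible = [(0, 1, 1), (1, 2, -3), (2, 0, 1)]   # cycle weight is -1
_, cycle = bellman_ford(3, infeasible)
print(u, cycle)
```

Returning the cycle itself, not just a yes/no answer, is what makes the result usable as a certificate of infeasibility.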

Example 1 : Clock skew scheduling ⏳

Goal: intentionally assign an arrival time $u_{i}$ to each register so that the setup and hold time constraints are satisfied.
Note: the clock skew $s_{ij} = u_{i} - u_{j}$ is more important than the arrival time $u$ itself, because the clock runs periodically.
In the early stages, fixing the timing violation could be done as soon as a negative cycle is detected. A complete timing analysis is unnecessary at this stage.

Example 2 : Delay padding + clock skew scheduling ⏳

Goal: intentionally "insert" a delay $p$ so that the setup and hold time constraints are satisfied.
Note that a delay can be "inserted" by replacing a fast transistor with a slower one.
Traditional problem formulation: Find $p$ and $u$ such that

$y \leq d + p, A u = y, p \geq 0$
Note 1: Inserting delays into some local paths may not be allowed.
Note 2: The problem can be reduced to the standard form by modifying the network (timing constraint graph)

Four possible ways to insert delay

No delay:

no_delay

$p_{s} = p_{h}$ :

same_delay

Independent:

independent

$p_{s} \geq p_{h}$ :

setup_greater

Remarks (III)

If there exists a negative cycle, the timing cannot be fixed using this technique alone.
Additional constraints, such as $p_{s} \leq p_{m a x}$ , can be imposed.

Parametric Problems

Parametric Potential Problem (PPP)

Consider a parametric potential problem: $\begin{array}{ll} \text{maximize} & β \\ \text{subject to} & y \leq d(β), \; A \cdot u = y \end{array}$ where $d (β)$ is a monotonically decreasing function.
If $d (β)$ is a linear function $(m - s β)$ where $s$ is non-negative, the problem reduces to the well-known minimum cost-to-time ratio problem.
If $s$ = constant, it further reduces to the minimum mean cycle problem.

Note: Parametric flow problem can be defined similarly.

Examples (III)

$d (β)$ is linear $(m - s β)$ :
- Optimal clock period scheduling problem
- Slack maximization problem
- Yield-driven clock skew scheduling ⏳ (Gaussian)
$d (β)$ is non-linear:
- Yield-driven clock skew scheduling ⏳ (non-Gaussian)
- Multi-domain clock skew scheduling ⏳

Examples (IV)

Lawler's algorithm (binary search based)
Howard's algorithm (cycle cancellation)
Young's algorithm (path based)
Burns' algorithm (path based)
- for clock period optimization problem (all elements of $s$ are either 0 or 1)
Several hybrid methods have also been proposed
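
A binary-search sketch in the spirit of Lawler's approach for the linear case $d (β) = m - s β$ : each trial value of $β$ is tested with a Bellman-Ford negative-cycle check. The function names, the bracketing interval and the tolerances are assumptions of this sketch, not details of the published algorithm.

```python
def has_negative_cycle(num_nodes, edges):
    """Bellman-Ford check; edges are (i, j, weight) triples."""
    u = [0.0] * num_nodes
    for _ in range(num_nodes):
        changed = False
        for i, j, w in edges:
            if u[i] + w < u[j] - 1e-12:   # small slack against round-off
                u[j] = u[i] + w
                changed = True
        if not changed:
            return False
    return True


def max_parameter(num_nodes, edges, lo, hi, tol=1e-9):
    """Largest beta in [lo, hi] with no negative cycle under m - s * beta.

    edges are (i, j, m, s) with s >= 0. Assumes beta = lo is feasible
    and beta = hi is infeasible.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        weighted = [(i, j, m - s * mid) for i, j, m, s in edges]
        if has_negative_cycle(num_nodes, weighted):
            hi = mid
        else:
            lo = mid
    return lo


# With s = 1 on every edge this is the minimum mean cycle problem.
# Cycle 0 -> 1 -> 0 has mean 1.5; cycle 0 -> 1 -> 2 -> 0 has mean 3.
edges = [(0, 1, 2, 1), (1, 2, 3, 1), (2, 0, 4, 1), (1, 0, 1, 1)]
beta = max_parameter(3, edges, lo=0.0, hi=10.0)
print(round(beta, 6))
```

Each bisection step solves one feasibility problem from scratch, which is simple but wasteful; reusing the previous potentials is the kind of improvement the cycle-based methods exploit.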

Remarks (IV)

Need to solve feasibility problems many times.
Data structures, such as Fibonacci heap or spanning tree/forest, can be used to improve efficiency
For multi-parameter problems, the ellipsoid method can be used.
Example 1: yield-driven clock skew scheduling ⏳ (c.f. lecture 5)

Example 2: yield-driven delay padding

The problem can be reduced to the standard form by modifying the underlying constraint graph.

Four possible ways to insert delay

No delay:

no_delay_s

$p_{s} = p_{h}$ :

same_delay_s

Independent:

independent_s

$p_{s} \geq p_{h}$ :

setup_greater_s

Min-cost Flow/Potential Problem

Elementary Optimal Problems

Elementary Flow Problem:

$\begin{array}{ll} \min & d^{T} x + p \\ \text{s. t.} & c \leq x, \; A^{T} x = b, \; b(V) = 0 \end{array}$
Elementary Potential Problem: $\begin{array}{ll} \max & b^{T} u - (c^{T} y + q) \\ \text{s. t.} & y \leq d, \; A u = y \end{array}$

Elementary Optimal Problems (Cont'd)

The problems are dual to each other if $p + q = - c^{T} d, \; (x - c)^{T} (d - y) = 0, \; c \leq x, \; y \leq d$
Since $b^{T} u = (A^{T} x)^{T} u = x^{T} A u = x^{T} y,$ $[\min] - [\max] = (d^{T} x + p) - (b^{T} u - [c^{T} y + q])$ = $d^{T} x + c^{T} y - x^{T} y + p + q = (x - c)^{T} (d - y) \geq 0$
$[\min] = [\max]$ when equality holds.
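
The duality gap identity can be verified numerically. All numbers below are invented for illustration; $p$ is set to 0 and $q$ is chosen so that $p + q = - c^{T} d$ . The sign convention $y_{e} = u_{target} - u_{source}$ is an assumption.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


edges = [(0, 1), (1, 2), (0, 2)]
c = [1, 0, 2]        # lower bounds on the flow
d = [4, 3, 5]        # flow costs, also upper bounds on the tension
x = [2, 1, 3]        # a flow with c <= x
u = [0, 3, 4]        # a potential whose tension satisfies y <= d

# b = A^T x and y = A u
b = [0, 0, 0]
for (s, t), flow in zip(edges, x):
    b[s] -= flow
    b[t] += flow
y = [u[t] - u[s] for s, t in edges]

p = 0
q = -dot(c, d) - p
primal = dot(d, x) + p
dual = dot(b, u) - (dot(c, y) + q)
gap = primal - dual
slack = dot([xi - ci for xi, ci in zip(x, c)],
            [di - yi for di, yi in zip(d, y)])
print(primal, dual, gap, slack)
```

The gap is positive here because no edge has $x_{e} = c_{e}$ or $y_{e} = d_{e}$ ; it closes exactly when complementary slackness holds on every edge.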

Remark (V)

We can formulate a linear problem in primal or dual form, depending on which solution method is more appropriate:
- Incremental improvement of feasible solutions
- Design variables are in the integral domain:
  - The max-flow problem (i.e. $d = [- 1, - 1, \dots, - 1]^{T}$ ) may be better solved by the dual method.

Linear Optimal Problems

Optimal Flow Problem: $\begin{array}{ll} \min & d^{T} x + p \\ \text{s. t.} & c^{-} \leq x \leq c^{+}, \; A^{T} x = b, \; b(V) = 0 \end{array}$
Optimal Potential Problem: $\begin{array}{ll} \max & b^{T} u - (c^{T} y + q) \\ \text{s. t.} & d^{-} \leq y \leq d^{+}, \; A u = y \end{array}$

Linear Optimal Problems (II)

By modifying the network:

The problem can be reduced to the elementary case [pp.275-276]

piece of cake

Piece-wise linear convex cost can be reduced to this linear problem [p.239,p.260]

The problem has been extensively studied and has numerous applications.

Remark (VI)

We can transform the cost function to be non-negative by reversing the orientation of the negative cost edges.
Then reduce the problem to the elementary case (or should we???)

Algorithms for Optimal Flow Problems

Successive shortest path algorithm
Cycle cancellation method
- Iteratively insert additional minimal flows according to a negative cycle of the residual network until no negative cycles are found.
Scaling method

For Special Cases

Max-flow problem ( $d = - [1, \dots, 1]$ )
- Ford-Fulkerson algorithm: iteratively insert additional minimal flows according to an augmenting path of the residual network, until no augmenting paths of the residual network are found.
- Pre-flow Push-Relabel algorithm (dual method???)
Matching problems ( $[c^{-}, c^{+}] = [0, 1]$ )
- Edmonds' blossom algorithm
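
The augmenting-path idea behind the Ford-Fulkerson algorithm can be sketched as below. This version picks shortest augmenting paths by breadth-first search, which is the Edmonds-Karp variant, not the generic method. It assumes non-negative capacities, no self-loops, and distinct source and sink; the data layout is a choice made for this sketch.

```python
from collections import deque


def max_flow(num_nodes, edges, s, t):
    """edges: (i, j, capacity) triples. Returns the maximum flow value."""
    # Residual arcs: [target, residual capacity, index of the reverse arc].
    graph = [[] for _ in range(num_nodes)]
    for i, j, cap in edges:
        graph[i].append([j, cap, len(graph[j])])
        graph[j].append([i, 0, len(graph[i]) - 1])
    total = 0
    while True:
        # Breadth-first search for an augmenting path in the residual network.
        pred = {s: None}
        queue = deque([s])
        while queue and t not in pred:
            v = queue.popleft()
            for k, (w, residual, _) in enumerate(graph[v]):
                if residual > 0 and w not in pred:
                    pred[w] = (v, k)
                    queue.append(w)
        if t not in pred:
            return total               # no augmenting path is left
        # The step size is the smallest residual capacity along the path.
        bottleneck = float("inf")
        w = t
        while pred[w] is not None:
            v, k = pred[w]
            bottleneck = min(bottleneck, graph[v][k][1])
            w = v
        # Push the flow and update the reverse arcs.
        w = t
        while pred[w] is not None:
            v, k = pred[w]
            graph[v][k][1] -= bottleneck
            graph[w][graph[v][k][2]][1] += bottleneck
            w = v
        total += bottleneck


edges = [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]
print(max_flow(4, edges, 0, 3))
```

The reverse arcs are created internally, which is the behaviour noted in the introduction: the user supplies only forward edges.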

Min-Cost Flow Problem (MCFP)

Problem Formulation: $\begin{array}{ll} \min & d^{T} x \\ \text{s. t.} & 0 \leq x \leq c, \; A^{T} x = b, \; b(V) = 0 \end{array}$
Algorithm idea: descent method: given a feasible $x_{0}$ , find a better solution $x_{1} = x_{0} + α p$ , where $α$ is positive.

General Descent Method

Input: $f (x)$ , initial $x$
Output: optimal $x^{*}$
while not converged,
1. Choose descent direction $p$ ;
2. Choose the step size $α$ ;
3. $x := x + α p$ ;
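
The three steps can be sketched with the simplest possible choices: the negative gradient as the direction and a fixed step size. The test function, the step size and the stopping rule are assumptions made for illustration.

```python
def descend(grad, x, step=0.1, tol=1e-8, max_iter=10_000):
    """Generic descent loop with p = -grad f(x) and a constant step size."""
    for _ in range(max_iter):
        p = [-g for g in grad(x)]                 # 1. descent direction
        if sum(pi * pi for pi in p) ** 0.5 < tol:
            break                                 # converged: gradient ~ 0
        alpha = step                              # 2. step size
        x = [xi + alpha * pi for xi, pi in zip(x, p)]   # 3. update
    return x


def grad(x):
    """Gradient of f(x) = (x0 - 1)^2 + 2 * (x1 + 2)^2."""
    return [2 * (x[0] - 1), 4 * (x[1] + 2)]


x_opt = descend(grad, [0.0, 0.0])
print([round(v, 6) for v in x_opt])
```

Only steps 1 and 2 change between methods; in the network setting the direction becomes a negative cycle and the step size is set by the tightest capacity.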

Some Common Descent Directions

Gradient descent: $p = - \nabla f (x)^{T}$
Steepest descent:
- $\Delta x_{nsd} = \arg\min \{ \nabla f (x)^{T} v \mid \| v \| = 1 \}$ (normalized)
- $\Delta x_{sd} = \| \nabla f (x) \| \, \Delta x_{nsd}$ (un-normalized)
Newton's method: $p = - \nabla^{2} f (x)^{- 1} \nabla f (x)$
For convex problems, a descent direction must satisfy $\nabla f (x)^{T} p < 0$ .

Note: Here, there is a natural way to choose $p$ !

Min-Cost Flow Problem (II)

Let $x_{1} = x_{0} + α p$ , then we have: $\begin{array}{lll} \min & d^{T} x_{0} + α d^{T} p & \Rightarrow d^{T} p < 0 \\ \text{s. t.} & - x_{0} \leq α p \leq c - x_{0} & \Rightarrow \text{residual graph} \\ & A^{T} p = 0 & \Rightarrow p \text{ is a cycle!} \end{array}$
In other words, choose $p$ to be a negative cycle!
- Simple negative cycle, or
- Minimum mean cycle

Primal Method for MCFP

Input: $G (V, E), [c^{-}, c^{+}], d$
Output: optimal $x^{*}$
Initialize a feasible $x$ and certain data structure
while a negative cycle $p$ found in $G (x)$ ,
1. Choose a step size $α$ ;
2. If $α$ is unbounded, return UNBOUNDED;
3. If $α = 0$ , break;
4. $x := x + α p$ ;
5. Update corresponding data structures
return OPTIMAL
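
A compact cycle-cancelling sketch of this loop, restricted to a simplified setting chosen for illustration: a circulation ( $b = 0$ ) with bounds $0 \leq x \leq c$ , finite integer capacities and costs, so that $x = 0$ is a feasible start and the UNBOUNDED case cannot occur. Negative cycles are found with Bellman-Ford; the function names and data layout are hypothetical.

```python
def find_negative_cycle(num_nodes, arcs):
    """arcs: (i, j, cost, key). Returns the keys on a negative cycle or None."""
    dist = [0] * num_nodes
    pred = [None] * num_nodes          # pred[j] = (i, key) of the last relaxation
    last = None
    for _ in range(num_nodes):
        last = None
        for i, j, cost, key in arcs:
            if dist[i] + cost < dist[j]:
                dist[j] = dist[i] + cost
                pred[j] = (i, key)
                last = j
        if last is None:
            return None
    v = last
    for _ in range(num_nodes):         # step back until we are on the cycle
        v = pred[v][0]
    cycle, w = [], v
    while True:
        i, key = pred[w]
        cycle.append(key)
        w = i
        if w == v:
            return cycle


def min_cost_circulation(num_nodes, edges):
    """edges: (i, j, capacity, cost); minimizes cost with A^T x = 0."""
    x = [0] * len(edges)               # x = 0 is feasible for a circulation
    while True:
        # Residual graph G(x): forward arc if x < c, backward arc if x > 0.
        arcs = []
        for e, (i, j, cap, cost) in enumerate(edges):
            if x[e] < cap:
                arcs.append((i, j, cost, (e, +1)))
            if x[e] > 0:
                arcs.append((j, i, -cost, (e, -1)))
        cycle = find_negative_cycle(num_nodes, arcs)
        if cycle is None:
            return x                   # no negative cycle: x is optimal
        # Step size: push until one constraint becomes tight.
        alpha = min(edges[e][2] - x[e] if sign > 0 else x[e]
                    for e, sign in cycle)
        for e, sign in cycle:
            x[e] += sign * alpha


edges = [(0, 1, 4, 1), (1, 2, 3, 1), (2, 0, 5, -4), (0, 2, 2, 1)]
x = min_cost_circulation(3, edges)
cost = sum(xe * edge[3] for xe, edge in zip(x, edges))
print(x, cost)
```

Rebuilding the residual graph on every pass keeps the sketch short; a practical implementation would update it incrementally, as step 5 of the method suggests.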

Remarks (VI)

In the while-loop condition, a negative cycle can be found using the Bellman-Ford algorithm.
In the cycle cancelling algorithm, $p$ is:
- a simple negative cycle, or
- a minimum mean cycle
A heap or other data structures are used for finding negative cycles efficiently.
Usually $α$ is chosen such that one constraint is tight.

Min-Cost Potential Problem (MCPP)

Problem Formulation: $\begin{array}{ll} \min & c^{T} y \\ \text{s. t.} & y \leq d, \; A u = y \end{array}$ where $c$ is assumed to be non-negative.
Algorithm: given an initial feasible $u_{0}$ , find a better solution $u_{1} = u_{0} + βq$ , where $β$ is positive: $\begin{array}{lll} \min & c^{T} y_{0} + c^{T} y & \Rightarrow c^{T} y < 0 \\ \text{s. t.} & y \leq d - A u_{0} & \Rightarrow \text{residual graph} \\ & β A q = y & \Rightarrow q \text{ is a "cut"!} \end{array}$

Method for MCPP

Input: $G (V, E), c, d$
Output: optimal $u^{*}$
Initialize a feasible $u$ and certain data structure
while a negative cut $q$ found in $G (u)$ ,
1. Choose a step size $β$ ;
2. If $β$ is unbounded, return UNBOUNDED;
3. If $β = 0$ , break;
4. $u := u + βq$ ;
5. Update corresponding data structures
return OPTIMAL

Remarks (VII)

Usually $β$ is chosen such that one constraint is tight.
The min-cost potential problem is the dual of the min-cost flow problem, so an algorithm for one can be used to solve the other.
In the network simplex method, $q$ is chosen from a spanning tree data structure (for linear problems only)

Algorithms for Design-for-Manufacturability