Layer and Activation Functions

The basic pipeline in OMLT's source code is:

\[\begin{align*} \mathbf z^{(0)} \xrightarrow[\text{Constraints}]{\text{Layer 1}} \hat{\mathbf z}^{(1)} \xrightarrow[\text{Activations}]{\text{Layer 1}} \mathbf z^{(1)} \xrightarrow[\text{Constraints}]{\text{Layer 2}} \hat{\mathbf z}^{(2)} \xrightarrow[\text{Activations}]{\text{Layer 2}} \mathbf z^{(2)} \xrightarrow[\text{Constraints}]{\text{Layer 3}}\cdots \end{align*}\]

where \(\mathbf z^{(0)}\) is the output of the InputLayer, \(\hat{\mathbf z}^{(l)}\) is the pre-activation output of the \(l\)-th layer, and \(\mathbf z^{(l)}\) is the post-activation output of the \(l\)-th layer.
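
These layer and activation functions are normally not called directly; a formulation object walks the network and adds each layer's constraints and activations to a Pyomo block. A minimal usage sketch, assuming a NetworkDefinition object named network_definition has already been built (e.g. with one of the importers in omlt.io):

    import pyomo.environ as pyo
    from omlt import OmltBlock
    from omlt.neuralnet import FullSpaceNNFormulation

    # network_definition is assumed to be an omlt.neuralnet.NetworkDefinition,
    # e.g. obtained from one of the importers in omlt.io.
    m = pyo.ConcreteModel()
    m.nn = OmltBlock()

    # build_formulation walks the layers and, for each one, adds the layer
    # constraints (producing z_hat) followed by the activation constraints
    # (producing z), matching the pipeline above.
    m.nn.build_formulation(FullSpaceNNFormulation(network_definition))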

Layer Functions

Since OMLT builds layer and activation constraints at the layer level, we omit the layer index and use the following notation to describe the \(l\)-th layer:

\[\begin{split}\begin{align*} F_{in} &:= \text{number of input features}\\ F_{out} &:= \text{number of output features}\\ x_i &:= \text{the $i$-th input, $0\le i<F_{in}$}\\ y_j &:= \text{the $j$-th output, $0\le j<F_{out}$}\\ w_{ij} &:= \text{weight from $x_i$ to $y_j$, $0\le i<F_{in}, 0\le j<F_{out}$}\\ b_j &:= \text{bias for $y_j$, $0\le j<F_{out}$} \end{align*}\end{split}\]
omlt.neuralnet.layers.full_space.full_space_conv2d_layer(net_block, net, layer_block, layer)[source]

Add the full-space formulation of the 2-D convolutional layer to the block.

A 2-D convolutional layer applies cross-correlation kernels to a 2-D input. Specifically, the input is convolved by sliding the kernels along the input vertically and horizontally. At each location, the preactivation is computed as the dot product of the kernel weights and the input plus a bias term.
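
As a numeric illustration of what the constraint computes (not the library's internal code), the pre-activation value at one output location is the dot product of the kernel with the input window it covers, plus the bias. A small NumPy sketch with hypothetical data:

    import numpy as np

    # Hypothetical data: one 3x3 kernel, a 5x5 single-channel input, stride 1.
    kernel = np.arange(9.0).reshape(3, 3)
    bias = 0.5
    x = np.random.rand(5, 5)

    def conv2d_preactivation(x, kernel, bias, row, col):
        """Pre-activation z_hat at output position (row, col): dot product of
        the kernel with the input window it currently covers, plus the bias."""
        kh, kw = kernel.shape
        window = x[row:row + kh, col:col + kw]
        return float(np.sum(kernel * window) + bias)

    z_hat = conv2d_preactivation(x, kernel, bias, row=1, col=2)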

omlt.neuralnet.layers.full_space.full_space_dense_layer(net_block, net, layer_block, layer)[source]

Add the full-space formulation of the dense layer to the block:

\[\begin{align*} y_j = \sum\limits_{i=0}^{F_{in}-1}w_{ij}x_i+b_j, && \forall 0\le j<F_{out} \end{align*}\]
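
A standalone Pyomo sketch of these constraints with hypothetical weights, biases, and input bounds (this mirrors the math above, not the OMLT-internal code):

    import pyomo.environ as pyo

    # Hypothetical 2-input, 3-output dense layer.
    F_in, F_out = 2, 3
    w = {(0, 0): 1.0, (1, 0): -2.0, (0, 1): 0.5, (1, 1): 1.5, (0, 2): -1.0, (1, 2): 2.0}
    b = {0: 0.0, 1: 1.0, 2: -0.5}

    m = pyo.ConcreteModel()
    m.x = pyo.Var(range(F_in), bounds=(-1, 1))
    m.y = pyo.Var(range(F_out))

    # y_j = sum_i w_ij * x_i + b_j for every output feature j
    m.dense = pyo.Constraint(
        range(F_out),
        rule=lambda m, j: m.y[j] == sum(w[i, j] * m.x[i] for i in range(F_in)) + b[j],
    )
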
omlt.neuralnet.layers.full_space.full_space_gnn_layer(net_block, net, layer_block, layer)[source]

We additionally introduce the following notation to describe the GNN layer:

\[\begin{split}\begin{align*} N &:= \text{the number of nodes in the graph}\\ u &:= \text{the node index of $x_i$, $u=\lfloor iN/F_{in}\rfloor$}\\ v &:= \text{the node index of $y_j$, $v=\lfloor jN/F_{out}\rfloor$}\\ A_{u,v} &:= \text{the edge between nodes $u$ and $v$}\\ l_i &:= \text{the lower bound of $x_i$}\\ u_i &:= \text{the upper bound of $x_i$}\\ \end{align*}\end{split}\]

Add the full-space formulation of the GNN layer to the block:

\[\begin{split}\begin{align*} y_j &= \sum_{i=0}^{F_{in}-1} w_{ij} \bar x_{i,v} + b_j, && \forall 0\le j<F_{out} \\ \bar x_{i,v} &= A_{u,v} x_{i}, && \forall 0\le i<F_{in},~ 0\le v<N \end{align*}\end{split}\]

The big-M formulation for \(\bar x_{i,v}\) is given by:

\[\begin{split}\begin{align*} x_i-u_i(1-A_{u,v})\le &~ \bar x_{i,v} \le x_i-l_i(1-A_{u,v})\\ l_iA_{u,v}\le & ~\bar x_{i,v}\le u_iA_{u,v}\\ \end{align*}\end{split}\]

The bounds of \(\bar x_{i,v}\) are defined by:

\[\begin{split}\begin{align*} \left[\bar l_{i,v},\bar u_{i,v}\right]= \begin{cases} [0,0], & \text{$A_{u,v}$ is fixed to $0$}\\ [l_i,u_i], & \text{$A_{u,v}$ is fixed to $1$}\\ [\min(0,l_i),\max(0,u_i)], & \text{$A_{u,v}$ is not fixed} \end{cases} \end{align*}\end{split}\]
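
A Pyomo sketch of the big-M linearization of \(\bar x_{i,v}=A_{u,v}x_i\) for a single feature/node pair, with hypothetical bounds and a free binary \(A_{u,v}\) (when \(A_{u,v}\) is fixed, only the bound cases above are needed):

    import pyomo.environ as pyo

    l_i, u_i = -1.0, 2.0  # hypothetical bounds on x_i

    m = pyo.ConcreteModel()
    m.x = pyo.Var(bounds=(l_i, u_i))                      # x_i
    m.A = pyo.Var(domain=pyo.Binary)                      # A_{u,v}, not fixed here
    m.xbar = pyo.Var(bounds=(min(0, l_i), max(0, u_i)))   # bar x_{i,v}

    # x_i - u_i (1 - A_{u,v}) <= bar x_{i,v} <= x_i - l_i (1 - A_{u,v})
    m.c1 = pyo.Constraint(expr=m.xbar >= m.x - u_i * (1 - m.A))
    m.c2 = pyo.Constraint(expr=m.xbar <= m.x - l_i * (1 - m.A))
    # l_i A_{u,v} <= bar x_{i,v} <= u_i A_{u,v}
    m.c3 = pyo.Constraint(expr=m.xbar >= l_i * m.A)
    m.c4 = pyo.Constraint(expr=m.xbar <= u_i * m.A)
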
omlt.neuralnet.layers.full_space.full_space_maxpool2d_layer(net_block, net, layer_block, layer)[source]

Add the big-M max pooling formulation to the block.

\[\begin{split}\begin{align*} \hat{z}_i &\leq w\cdot x_{i}^{l} + \sum_{k=1}^{d} M_{i}^{l,k} q_{i}^{k} && \forall i \in N,\ \forall l \in \{1,\dots,d\} \\ \hat{z}_i &\geq w\cdot x_{i}^{l} && \forall i \in N,\ \forall l \in \{1,\dots,d\} \\ (x_{i},\hat{z}_i,q_{i}) &\in [L_{i},U_{i}] \times \mathbb{R} \times \Delta^{d} && \forall i \in N \\ q_{i} &\in \{0,1\}^{d} && \forall i \in N \\ M_{i}^{l,k} &= w\cdot \max\{L_{i}^{l} - L_{i}^{k},\ L_{i}^{l} - U_{i}^{k},\ U_{i}^{l} - L_{i}^{k},\ U_{i}^{l} - U_{i}^{k}\} && \forall i \in N,\ \forall l,k \in \{1,\dots,d\} \end{align*}\end{split}\]

where \(w\) is the convolution kernel of the preceding convolutional layer; \(d\) is the number of features in each of the \(N\) max pooling windows; \(x_{i}\) is the set of \(d\) features in the \(i\)-th max pooling window; \(\Delta^{d}\) is the \(d\)-dimensional simplex; and \([L_{i},U_{i}]\) are the bounds on \(x_{i}\).

NOTE This formulation is adapted from the Anderson et al. (2020) formulation, section 5.1, with the following changes:

  • OMLT presently does not support biases on convolutional layers. Bias terms from the original formulation are removed.

  • The original formulation takes the maximum of \(w^{l}\cdot x + b^{l}\), varying the weights \(w\) and biases \(b\) while keeping the input \(x\) constant. Since a convolutional layer convolves fixed weights and biases with varying portions of the feature map, this formulation instead takes the maximum of \(w\cdot x^{l} + b\).

  • Due to the above two changes, the calculation of the big-M coefficients \(M_{i}^{l,k}\) (denoted \(N^{l,k}\) in the original formulation) changes accordingly; a small numeric sketch follows this list.
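
A numeric sketch of the big-M coefficient computation in the last line of the formulation above, with hypothetical bounds for one window of \(d=3\) features and a scalar kernel weight for illustration:

    from itertools import product

    w = 1.0                  # kernel weight (scalar here for illustration)
    L = [-1.0, 0.0, -0.5]    # lower bounds L_i^1 .. L_i^d
    U = [2.0, 1.5, 1.0]      # upper bounds U_i^1 .. U_i^d
    d = len(L)

    # M_i^{l,k} = w * max(L^l - L^k, L^l - U^k, U^l - L^k, U^l - U^k)
    M = {
        (l, k): w * max(L[l] - L[k], L[l] - U[k], U[l] - L[k], U[l] - U[k])
        for l, k in product(range(d), repeat=2)
    }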

omlt.neuralnet.layers.reduced_space.reduced_space_dense_layer(net_block, net, layer_block, layer, activation)[source]

Add the reduced-space formulation of the dense layer to the block:

\[\begin{align*} \hat z_j &= \sum\limits_{i=0}^{F_{in}-1}w_{ij}x_i+b_j, && \forall 0\le j<F_{out} \end{align*}\]
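
A reduced-space formulation avoids introducing new variables and equality constraints for the intermediate values; instead, the pre-activations are written directly in terms of the layer inputs. A standalone sketch of the idea using Pyomo Expression components (hypothetical data, not the OMLT-internal code):

    import pyomo.environ as pyo

    F_in, F_out = 2, 2
    w = {(0, 0): 1.0, (1, 0): -1.0, (0, 1): 0.5, (1, 1): 2.0}
    b = {0: 0.0, 1: 1.0}

    m = pyo.ConcreteModel()
    m.x = pyo.Var(range(F_in), bounds=(-1, 1))

    # z_hat_j is an Expression of the inputs: no new variables or equality
    # constraints are added, which is what "reduced space" refers to.
    m.zhat = pyo.Expression(
        range(F_out),
        rule=lambda m, j: sum(w[i, j] * m.x[i] for i in range(F_in)) + b[j],
    )
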
omlt.neuralnet.layers.partition_based.default_partition_split_func(w, n)[source]

Default function for partitioning weights in \(w\) into \(n\) partitions.

Weights in \(w\) are sorted and partitioned evenly.
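
A sketch of even partitioning by sorted weight, assuming NumPy (the returned index arrays play the role of the sets \(S_k\) used below):

    import numpy as np

    def split_sorted_evenly(w, n):
        """Sort the weight indexes by weight value and split them into
        (at most) n partitions of near-equal size."""
        order = np.argsort(w)             # indexes sorted by weight value
        n = min(n, len(order))            # never request more partitions than weights
        return np.array_split(order, n)   # index arrays S_0, ..., S_{n-1}

    partitions = split_sorted_evenly(np.array([0.3, -1.2, 0.7, 0.0]), n=2)
    # -> [array([1, 3]), array([0, 2])]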

omlt.neuralnet.layers.partition_based.partition_based_dense_relu_layer(net_block, net, layer_block, layer, split_func)[source]

Partition-based ReLU activation formulation.

Generates the constraints for the ReLU activation function:

\[\begin{align*} y_j = \max\left(0,\sum\limits_{i=0}^{F_{in}-1}w_{ij}x_i+b_j\right), && \forall 0\le j<F_{out} \end{align*}\]

We additionally introduce the following notation to describe this formulation:

\[\begin{split}\begin{align*} n &:= \text{the number of partitions}\\ S_k &:= \text{indexes of the $k$-th partition satisfying:} \\ & \quad\quad \bigcup\limits_{k=0}^{n-1} S_k=\{0,1,\dots,F_{in}-1\},~S_{k_1}\cap S_{k_2}=\emptyset, ~\forall k_1\neq k_2\\ \sigma &:= \text{binary indicator of whether the ReLU output is clipped to zero, i.e.,}\\ & \quad\quad y_j= \begin{cases} 0, & \sigma=1\\ \sum\limits_{i=0}^{F_{in}-1}w_{ij}x_i+b_j, & \sigma=0 \end{cases}\\ p_k &:=\text{auxiliary variable representing the $k$-th partition, i.e., $\sum\limits_{i\in S_k}w_{ij}x_i$}\\ l_k &:=\text{the lower bound of $\sum\limits_{i\in S_k}w_{ij}x_i$}\\ u_k &:=\text{the upper bound of $\sum\limits_{i\in S_k}w_{ij}x_i$} \end{align*}\end{split}\]

The partition-based formulation for \(y_j\) is given by:

\[\begin{split}\begin{align*} & y_j=\sum\limits_{k=0}^{n-1}p_k+(1-\sigma)b_j\\ & \sum\limits_{k=0}^{n-1}\left(\sum\limits_{i\in S_k}w_{ij}x_i-p_k\right)+\sigma b_j\le 0\\ & \sum\limits_{k=0}^{n-1}p_k+(1-\sigma)b_j\ge 0\\ & \sigma l_k\le \sum\limits_{i\in S_k}w_{ij}x_i-p_k\le \sigma u_k,~0\le k<n\\ & (1-\sigma)l_k\le p_k\le (1-\sigma)u_k,~0\le k<n \end{align*}\end{split}\]
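
A standalone Pyomo sketch of this formulation for a single output neuron, with hypothetical weights, a fixed two-set partition, and interval bounds derived from the input bounds (in OMLT the partition comes from split_func):

    import pyomo.environ as pyo

    # Hypothetical single-output ReLU neuron with 4 inputs and 2 partitions.
    F_in = 4
    w = [1.0, -2.0, 0.5, 1.5]
    b = -0.25
    S = [[0, 1], [2, 3]]            # index sets S_0, S_1
    xl, xu = -1.0, 1.0              # input bounds

    # Interval bounds l_k, u_k on sum_{i in S_k} w_i x_i
    l = [sum(min(w[i] * xl, w[i] * xu) for i in Sk) for Sk in S]
    u = [sum(max(w[i] * xl, w[i] * xu) for i in Sk) for Sk in S]

    m = pyo.ConcreteModel()
    m.x = pyo.Var(range(F_in), bounds=(xl, xu))
    m.y = pyo.Var()
    m.p = pyo.Var(range(len(S)))            # partition variables p_k
    m.sigma = pyo.Var(domain=pyo.Binary)    # 1 -> output clipped to zero

    K = range(len(S))
    m.out = pyo.Constraint(expr=m.y == sum(m.p[k] for k in K) + (1 - m.sigma) * b)
    m.neg = pyo.Constraint(
        expr=sum(sum(w[i] * m.x[i] for i in S[k]) - m.p[k] for k in K) + m.sigma * b <= 0
    )
    m.pos = pyo.Constraint(expr=sum(m.p[k] for k in K) + (1 - m.sigma) * b >= 0)
    m.split_lo = pyo.Constraint(K, rule=lambda m, k: m.sigma * l[k] <= sum(w[i] * m.x[i] for i in S[k]) - m.p[k])
    m.split_hi = pyo.Constraint(K, rule=lambda m, k: sum(w[i] * m.x[i] for i in S[k]) - m.p[k] <= m.sigma * u[k])
    m.p_lo = pyo.Constraint(K, rule=lambda m, k: (1 - m.sigma) * l[k] <= m.p[k])
    m.p_hi = pyo.Constraint(K, rule=lambda m, k: m.p[k] <= (1 - m.sigma) * u[k])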

Activation Functions

Since all activation functions are applied element-wise, we only consider how to formulate the activation function for a single neuron, where \(x\) denotes the pre-activation variable and \(y\) denotes the post-activation variable.

omlt.neuralnet.activations.linear.linear_activation_constraint(net_block, net, layer_block, layer, add_constraint=True)[source]

Linear activation constraint generator.

Generates the constraints for the linear activation function:

\[\begin{align*} y=x \end{align*}\]
omlt.neuralnet.activations.linear.linear_activation_function(zhat)[source]
class omlt.neuralnet.activations.relu.ComplementarityReLUActivation(transform=None)[source]

Bases: object

Complementarity-based ReLU activation formulation.

Generates the constraints for the ReLU activation function:

\[\begin{align*} y=\max(0,x) \end{align*}\]

The complementarity-based formulation is given by:

\[\begin{align*} 0\le y \perp (y-x)\ge 0 \end{align*}\]

which means that:

\[\begin{split}\begin{align*} y\ge 0\\ y(y-x)=0\\ y-x\ge 0 \end{align*}\end{split}\]
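
A single-neuron sketch using Pyomo's MPEC package (this appears to be what the transform argument controls: which MPEC transformation is applied before solving); the transformation name below is one of Pyomo's built-in options, used here as an example:

    import pyomo.environ as pyo
    from pyomo.mpec import Complementarity, complements

    m = pyo.ConcreteModel()
    m.x = pyo.Var(bounds=(-2, 2))   # pre-activation
    m.y = pyo.Var()                 # post-activation

    # 0 <= y  complements  (y - x) >= 0
    m.relu = Complementarity(expr=complements(m.y >= 0, m.y - m.x >= 0))

    # Reformulate the complementarity condition so a standard NLP solver can
    # handle it, e.g. with Pyomo's "mpec.simple_nonlinear" transformation.
    pyo.TransformationFactory("mpec.simple_nonlinear").apply_to(m)
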
omlt.neuralnet.activations.relu.bigm_relu_activation_constraint(net_block, net, layer_block, layer)[source]

Big-M ReLU activation formulation.

Generates the constraints for the ReLU activation function:

\[\begin{align*} y=\max(0,x) \end{align*}\]

We additionally introduce the following notation to describe this formulation:

\[\begin{split}\begin{align*} \sigma &:= \text{binary indicator of whether $y=x$, $\sigma\in\{0,1\}$}\\ l &:= \text{the lower bound of $x$}\\ u &:= \text{the upper bound of $x$}\\ \end{align*}\end{split}\]

The big-M formulation is given by:

\[\begin{split}\begin{align*} y&\ge 0\\ y&\ge x\\ y&\le x-(1-\sigma)l\\ y&\le \sigma u \end{align*}\end{split}\]

The lower bound of \(y\) is \(\max(0,l)\), and the upper bound of \(y\) is \(\max(0,u)\).
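
A single-neuron Pyomo sketch with hypothetical pre-activation bounds \(l=-2\) and \(u=3\):

    import pyomo.environ as pyo

    l, u = -2.0, 3.0  # hypothetical bounds on the pre-activation x

    m = pyo.ConcreteModel()
    m.x = pyo.Var(bounds=(l, u))
    m.y = pyo.Var(bounds=(max(0, l), max(0, u)))
    m.sigma = pyo.Var(domain=pyo.Binary)   # 1 when y = x, 0 when y = 0

    m.c1 = pyo.Constraint(expr=m.y >= 0)
    m.c2 = pyo.Constraint(expr=m.y >= m.x)
    m.c3 = pyo.Constraint(expr=m.y <= m.x - (1 - m.sigma) * l)
    m.c4 = pyo.Constraint(expr=m.y <= m.sigma * u)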

omlt.neuralnet.activations.smooth.sigmoid_activation_constraint(net_block, net, layer_block, layer)[source]

Sigmoid activation constraint generator.

omlt.neuralnet.activations.smooth.sigmoid_activation_function(x)[source]

Applies the sigmoid function:

\[\begin{align*} y=\frac{1}{1+\exp(-x)} \end{align*}\]
omlt.neuralnet.activations.smooth.smooth_monotonic_activation_constraint(net_block, net, layer_block, layer, fcn)[source]

Activation constraint generator for a smooth monotonic function.

Generates the constraints for the activation function \(f\) if it is smooth and monotonic:

\[\begin{align*} y=f(x) \end{align*}\]
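
A single-neuron sketch using the sigmoid as the smooth function \(f\); the softplus and tanh constraints below follow the same pattern with a different expression:

    import pyomo.environ as pyo

    m = pyo.ConcreteModel()
    m.x = pyo.Var(bounds=(-4, 4))   # pre-activation
    m.y = pyo.Var(bounds=(0, 1))    # post-activation; the sigmoid maps into (0, 1)

    # y = f(x) with f the sigmoid.
    m.act = pyo.Constraint(expr=m.y == 1 / (1 + pyo.exp(-m.x)))
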
omlt.neuralnet.activations.smooth.softplus_activation_constraint(net_block, net, layer_block, layer)[source]

Softplus activation constraint generator.

omlt.neuralnet.activations.smooth.softplus_activation_function(x)[source]

Applies the softplus function:

\[\begin{align*} y=\log(\exp(x)+1) \end{align*}\]
omlt.neuralnet.activations.smooth.tanh_activation_constraint(net_block, net, layer_block, layer)[source]

Tanh activation constraint generator.

omlt.neuralnet.activations.smooth.tanh_activation_function(x)[source]

Applies the tanh function:

\[\begin{align*} y=\tanh(x) \end{align*}\]