Suppose we have a two-layer network. We give names to its inputs and outputs, and we describe each of the two layers by its state, that is, its connection weights together with a bias value. We will also use σ as the activation function.
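To make the setup concrete, here is a minimal sketch of the forward pass of such a network in plain Python. The names `x`, `W1`, `b1`, `W2`, `b2`, and `y` are assumptions chosen for illustration, as are the layer sizes. The sketch also assumes that σ is the logistic sigmoid and that it is applied after both layers; any other activation function could be substituted.

```python
import math

def sigma(z):
    """Activation function; the logistic sigmoid is an assumption here."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, b, v):
    """Apply one layer, sigma(W v + b), where W is a list of weight rows."""
    return [
        sigma(sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i)
        for row, b_i in zip(W, b)
    ]

def forward(x, W1, b1, W2, b2):
    """Two-layer forward pass: y = sigma(W2 sigma(W1 x + b1) + b2)."""
    h = layer(W1, b1, x)      # output of the first layer
    return layer(W2, b2, h)   # output of the second layer

# Hypothetical sizes: 2 inputs, 3 hidden units, 1 output.
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.3, -0.1, 0.2]]
b2 = [0.05]

y = forward([1.0, -2.0], W1, b1, W2, b2)
print(y)  # a list with one value strictly between 0 and 1
```

Each layer is fully described by its pair of weights and bias, so the whole network is determined by the two pairs `(W1, b1)` and `(W2, b2)`.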