In classification networks that use ReLU in the hidden layers, a softmax function is typically applied to the last layer. It converts the raw output scores (logits) into probabilities for each class, with all values between 0 and 1 and summing to 1:
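A minimal sketch of softmax using NumPy (the function name and example logits are illustrative, not from the original text):

```python
import numpy as np

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    # Normalize so the outputs form a probability distribution
    return exps / np.sum(exps)

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)        # each entry lies in (0, 1)
print(probs.sum())  # the entries sum to 1
```

The max-subtraction step does not change the result (softmax is shift-invariant) but prevents overflow when logits are large.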