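ROC curve - Plots true positive rate against false positive rate at every classification threshold; use it to choose the decision threshold that fits your decision criteria
AUC - Area under the ROC curve; an aggregate measure of performance across all possible classification thresholds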
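A minimal pure-Python sketch of both ideas, assuming binary labels (0/1) and higher scores meaning "more likely positive". The function names, the toy data, and the use of Youden's J (TPR minus FPR) as the decision criterion are illustrative assumptions; in practice the criterion depends on the relative cost of false positives and false negatives.

```python
def roc_points(y_true, scores):
    """Return (threshold, fpr, tpr) triples, highest threshold first."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # A threshold above every score predicts nothing positive: the (0, 0) corner.
    points = [(float("inf"), 0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((thr, fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def best_threshold(points):
    """Pick the threshold maximising Youden's J = TPR - FPR (an assumed criterion)."""
    return max(points, key=lambda p: p[2] - p[1])[0]


# Toy data, made up for illustration.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
print("AUC:", auc(points))
print("Chosen threshold:", best_threshold(points))
```

AUC summarises the whole curve in one number, while the threshold choice picks a single operating point on it; a different criterion (for example, a cap on the false positive rate) would select a different point on the same curve.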
Three common failure modes for gradient descent
Gradients can vanish - Use ReLU instead of sigmoid/tanh
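A toy sketch of why this helps, under simplifying assumptions: a chain of identical layers, all weights equal to 1, and the same pre-activation value at every layer, so the backpropagated gradient is just the local derivative raised to the depth. The names and values below are hypothetical and chosen only for illustration.

```python
import math


def sigmoid_grad(z):
    """Derivative of the sigmoid; never larger than 0.25."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def chained_grad(local_grad, z, depth):
    """Product of `depth` identical local derivatives (simplified backprop chain)."""
    return local_grad(z) ** depth


depth = 20  # assumed number of layers
z = 1.0     # assumed pre-activation at every layer

sig = chained_grad(sigmoid_grad, z, depth)
rel = chained_grad(relu_grad, z, depth)
print("sigmoid chain:", sig)
print("ReLU chain:   ", rel)
```

Because the sigmoid derivative is at most 0.25, the product shrinks geometrically with depth, whereas the ReLU derivative stays at 1 for active units. Note that the ReLU derivative is 0 for negative inputs, so units that are never active pass no gradient at all.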