Activation Function Visualizer
Explore the shapes and derivatives of common activation functions used in deep learning: how each function transforms input values in the forward pass, and how its derivative shapes gradient flow in the backward pass.

What this calculator is doing
This tool generates a plot comparing several common activation functions used in neural networks (listed below).
It can also show their derivatives (∂f/∂x), which determine how gradients flow during backpropagation.
The plot shows how each function transforms input values over a configurable range, helping practitioners
choose an activation based on its saturation, linearity, and gradient-flow characteristics.
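As an illustration of what such a plot contains, here is a minimal sketch, assuming numpy and matplotlib are available; the tanh example, the input range, and the central-difference step are illustrative choices rather than the tool's actual implementation.

```python
import numpy as np
import matplotlib.pyplot as plt

# Configurable input range, mirroring the tool's x-axis setting (values are illustrative).
x = np.linspace(-5.0, 5.0, 500)

def numerical_derivative(f, x, h=1e-4):
    """Central-difference estimate of df/dx, used here in place of analytic derivatives."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

f = np.tanh                      # example activation; any elementwise f(array) -> array works
y = f(x)                         # forward pass: how the activation transforms inputs
dy = numerical_derivative(f, x)  # backward-pass ingredient: df/dx

plt.plot(x, y, label="tanh(x)")
plt.plot(x, dy, "--", label="d/dx tanh(x)")
plt.axhline(0.0, color="gray", linewidth=0.5)
plt.xlabel("x")
plt.ylabel("f(x)")
plt.legend()
plt.title("Activation and derivative over a configurable input range")
plt.show()
```

Swapping `f` for any of the functions listed below produces the corresponding pair of curves.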
- ReLU (Rectified Linear Unit): A common default for hidden layers in deep networks; it is cheap to compute, and because its gradient is 1 for positive inputs it helps mitigate vanishing gradients.
- Sigmoid: Maps inputs to (0, 1), making it useful as an output activation for binary classification or wherever a probability-like value is needed.
- Tanh: Maps inputs to (−1, 1); preferred over sigmoid for hidden layers when zero-centered outputs are desired.
- Leaky ReLU (configurable α): Helps prevent dying neurons by allowing a small gradient when inputs are negative.
- ELU (Exponential Linear Unit) (configurable α): Smooth for negative inputs, where it saturates to −α; the negative outputs push mean activations closer to zero, which can speed up convergence.
- Swish (configurable β): Defined as x·sigmoid(βx); smooth and non-monotonic, it performs well in deep models, and β can be fixed or learned for extra flexibility.
- GELU (Gaussian Error Linear Unit): Weights each input by the standard Gaussian CDF, x·Φ(x); its smooth, probabilistic nonlinearity is the default in many transformer-based models.
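For reference, the functions listed above could be defined as follows; this is a sketch assuming numpy and scipy are available, with illustrative default values for α and β rather than the tool's own settings.

```python
import numpy as np
from scipy.special import erf  # used for the exact GELU

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def leaky_relu(x, alpha=0.01):
    # Small slope alpha for negative inputs keeps a nonzero gradient there.
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # Smoothly saturates to -alpha for large negative inputs.
    # np.minimum avoids overflow warnings from exp on large positive x.
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))

def swish(x, beta=1.0):
    # x * sigmoid(beta * x); beta may be fixed or learned in practice.
    return x * sigmoid(beta * x)

def gelu(x):
    # Exact form: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))
```

Analytic derivatives can be written alongside each definition, or estimated numerically as in the sketch above.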
Disclaimer: These calculators are provided for informational purposes only. Always verify your designs against relevant engineering standards and consult a qualified professional. We do not take responsibility for any errors or damages resulting from the use of these calculations.