blog
-
Understanding Floating Point Precision, BF16, and Why Deep Learning Training Still Works
An explanation of floating point spacing, BF16 precision limitations, and how mixed precision training enables modern deep learning despite low-precision formats.
-
Conv → BatchNorm → ReLU: Why This Order is Standard in CNNs
An explanation of why the Conv → BatchNorm → ReLU ordering is standard in CNNs, grounded in mathematical reasoning and empirical results from modern deep learning architectures.
-
Understanding Gradients in a Linear Layer: Why dL/dW = g xᵀ
A step-by-step derivation of the weight gradient in a linear layer, explaining why the 3D Jacobian tensor collapses into a simple outer product during backpropagation.
-
2D Convolution in Image Processing – A Complete Summary
A complete guide to 2D convolution covering kernels, padding, stride, output size, and why modern CNNs prefer 3×3 filters.
-
Optimizing Claude Token Usage: A Practical Guide to Context Management
Learn how Claude uses context tokens and how to optimize them using compacting, task isolation, and tool management.
-
Bias-Variance Tradeoff: From Theory to Ensemble Methods
A deep dive into the bias-variance decomposition, decision trees, bagging, boosting, and XGBoost – with clear math and intuition.
-
Quantization in CNNs: From FP32 Training to INT8 Deployment
A practical walkthrough of how CNN weights go from 32-bit floating point to 8-bit integers – and why it barely hurts accuracy.
-
PCA and ZCA Whitening: A Comprehensive Study Guide
Understanding whitening transforms – from eigenvalue decomposition to PCA and ZCA whitening – with clear math and intuition.
-
Depthwise and Pointwise Convolutions: A Practical Guide for Edge AI
How depthwise and pointwise convolutions enable efficient CNN deployment on edge devices.