Optimizing TensorFlow models with Quantization Techniques

Deep learning models are great at solving extremely complex tasks, but this superpower comes at a cost. Because of their large number of parameters, these models typically have a large memory footprint and are slow at inference (when making predictions). Slow, heavy models are unwelcome when it comes to deployment. As we…