Quantization:

A method for representing a model's weights at lower precision, in order to make the model more usable and reduce its resource consumption.

Quantize and de-quantize a tensor.

A method of mapping a large set of values to a smaller set of values.

What we can quantize in neural networks.

Advantages of quantization

Linear mapping:

Maps a higher-precision range to a lower-precision range.

$$ r = s(q-z) $$
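
The formula above can be tried out with a small sketch. It assumes the usual reading of the symbols, which the notes do not spell out: r is the real (high-precision) value, q is the quantized integer, s is the scale, and z is the zero point. Inverting the formula gives q = round(r / s) + z, clamped to the integer range. The function names, the signed 8-bit range, and the min/max way of choosing s and z are illustrative assumptions, not something taken from these notes.

```python
def choose_params(r_min, r_max, q_min=-128, q_max=127):
    """Pick scale s and zero point z so [r_min, r_max] maps onto [q_min, q_max].

    Assumption: simple min/max calibration with a signed 8-bit target range.
    """
    # Extend the range to include 0.0 so that real zero is exactly representable.
    r_min, r_max = min(r_min, 0.0), max(r_max, 0.0)
    if r_max == r_min:
        # Degenerate case: every value is zero, so any scale works.
        return 1.0, 0
    s = (r_max - r_min) / (q_max - q_min)
    z = round(q_min - r_min / s)
    return s, z


def quantize(values, s, z, q_min=-128, q_max=127):
    """Map real values to integers: q = clamp(round(r / s) + z)."""
    return [max(q_min, min(q_max, round(r / s) + z)) for r in values]


def dequantize(q_values, s, z):
    """Map integers back to real values: r = s * (q - z)."""
    return [s * (q - z) for q in q_values]


# Hypothetical weights, used only to exercise the functions.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
s, z = choose_params(min(weights), max(weights))
q = quantize(weights, s, z)
r_hat = dequantize(q, s, z)

for r, qi, rh in zip(weights, q, r_hat):
    print(f"r={r:+.4f}  q={qi:+4d}  r_hat={rh:+.4f}  error={abs(r - rh):.5f}")
```

The de-quantized values are not identical to the originals: each one can differ by up to about half the scale s, which is the precision given up in exchange for the smaller representation.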