Apr 9, 2024
Hi Ratnakar. Dequantization simply works in the reverse direction to recover the data. Here is the typical formula to calculate the value:
dequantizedTensor = int4Tensor / roundedValue(totalNumberOfPositions / absmax(inputXTensor))
In the above example, totalNumberOfPositions = 16, since a 4-bit integer can take 2^4 = 16 distinct values.
However, some resolution is lost during this process, so the recovered weights may not exactly match the original values.
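To make the round trip concrete, here is a rough NumPy sketch that follows the formula above literally. The sample weights, the function names, and the forward step (multiply by the scale, then round) are my own assumptions for illustration, not taken from any particular library. NumPy has no int4 dtype, so the quantized values are held in int8 here.

```python
import numpy as np

TOTAL_POSITIONS = 16  # 2**4 levels, as in the formula above


def scale_factor(x):
    # roundedValue(totalNumberOfPositions / absmax(inputXTensor))
    # Assumes absmax is small enough that the rounded scale is not zero.
    return np.round(TOTAL_POSITIONS / np.max(np.abs(x)))


def quantize(x, scale):
    # Assumed forward step: multiply by the scale and round to integers.
    # Stored as int8 because NumPy has no 4-bit integer type.
    return np.round(x * scale).astype(np.int8)


def dequantize(q, scale):
    # Reverse step: divide the stored integers by the same scale.
    return q.astype(np.float32) / scale


# Hypothetical sample weights, chosen only for illustration.
weights = np.array([0.12, -0.47, 0.33, 0.5], dtype=np.float32)
scale = scale_factor(weights)
q = quantize(weights, scale)
restored = dequantize(q, scale)
error = np.abs(weights - restored)

print("scale:   ", scale)
print("quantized:", q)
print("restored: ", restored)
print("abs error:", error)
```

The restored values land on the nearest multiple of 1/scale, so each one can differ from the original by up to half a quantization step. That gap is the lost resolution mentioned above.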