Towards Accurate Post-training Quantization for Reparameterized Models
CoRR (2024)
Abstract
Model reparameterization is a widely accepted technique for improving
inference speed without compromising performance. However, current
Post-training Quantization (PTQ) methods often lead to significant accuracy
degradation when applied to reparameterized models. This is primarily caused by channel-specific and sample-specific outliers, which appear only in particular channels and for particular samples and distort the selection of quantization parameters. To
address this issue, we propose RepAPQ, a novel framework that preserves the accuracy of quantized reparameterized models. Unlike previous frameworks, which use Mean Squared Error (MSE) as the calibration objective, we adopt Mean Absolute Error (MAE) to mitigate the influence of outliers on the quantization parameters. Our framework comprises two main components: Quantization
Protecting Reparameterization and Across-block Calibration. For effective
calibration, Quantization Protecting Reparameterization merges the multiple branches into a single convolution followed by an affine layer. During training, the affine layer accelerates convergence and amplifies the convolution output to better accommodate samples containing outliers. Additionally,
Across-block Calibration uses the stage output as supervision, which addresses the gradient issue introduced by MAE and strengthens the correlation of quantization parameters across layers. Comprehensive experiments
demonstrate the effectiveness of RepAPQ across various models and tasks. Our
framework outperforms previous methods by approximately 1% for 8-bit PTQ and
2% for 6-bit PTQ, showcasing its superior performance. The code is available
at .
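
To make the Quantization Protecting Reparameterization step concrete, below is a minimal PyTorch sketch of RepVGG-style branch fusion followed by a learnable per-channel affine layer. It is a sketch under stated assumptions, not the paper's implementation: the helper names (`fuse_conv_bn`, `ChannelAffine`, `merge_branches`) are illustrative, and it assumes 3x3 / 1x1 / identity branches, each followed by BatchNorm, with matching input and output channels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_conv_bn(weight, bn):
    # Fold BatchNorm into the preceding conv:
    # w' = w * gamma / sqrt(var + eps), b' = beta - mean * gamma / sqrt(var + eps)
    std = (bn.running_var + bn.eps).sqrt()
    scale = bn.weight / std
    return weight * scale.view(-1, 1, 1, 1), bn.bias - bn.running_mean * scale

class ChannelAffine(nn.Module):
    # Per-channel scale and shift, initialized to identity. During calibration
    # it can amplify the fused conv's output for samples containing outliers.
    def __init__(self, channels):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(channels))
        self.shift = nn.Parameter(torch.zeros(channels))

    def forward(self, x):
        return x * self.scale.view(1, -1, 1, 1) + self.shift.view(1, -1, 1, 1)

@torch.no_grad()
def merge_branches(conv3, bn3, conv1, bn1, bn_id=None):
    # Collapse the 3x3 + 1x1 (+ identity) branches into one 3x3 conv,
    # then attach the affine layer after the fused convolution.
    w3, b3 = fuse_conv_bn(conv3.weight, bn3)
    w1, b1 = fuse_conv_bn(conv1.weight, bn1)
    w, b = w3 + F.pad(w1, [1, 1, 1, 1]), b3 + b1   # zero-pad 1x1 kernel to 3x3
    if bn_id is not None:                          # identity branch as a 3x3 kernel
        c = w.size(0)
        id_w = torch.zeros_like(w3)
        id_w[torch.arange(c), torch.arange(c), 1, 1] = 1.0
        wi, bi = fuse_conv_bn(id_w, bn_id)
        w, b = w + wi, b + bi
    fused = nn.Conv2d(w.size(1), w.size(0), kernel_size=3, padding=1)
    fused.weight.copy_(w)
    fused.bias.copy_(b)
    return nn.Sequential(fused, ChannelAffine(w.size(0)))
```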
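
The MAE objective and Across-block Calibration together imply a calibration loop like the following: rather than matching each layer independently under MSE, the quantized stage's output is matched to the full-precision stage's output under MAE, which is less sensitive to outlier activations. The stage granularity, optimizer choice, and hyperparameters here are assumptions, not the paper's settings.

```python
import torch

def calibrate_stage(fp_stage, q_stage, calib_batches, steps=500, lr=1e-4):
    # Tune the trainable calibration parameters of a quantized stage
    # (e.g. quantization step sizes and the affine scales/shifts) against
    # the full-precision stage output under an L1 (MAE) objective.
    params = [p for p in q_stage.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    for step in range(steps):
        x = calib_batches[step % len(calib_batches)]
        with torch.no_grad():
            target = fp_stage(x)                    # stage-level supervision
        loss = (q_stage(x) - target).abs().mean()   # MAE instead of MSE
        opt.zero_grad()
        loss.backward()
        opt.step()
```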