Abstract: Quantization, effective Neural Network architecture, and efficient accelerator hardware are three important design paradigms to maximize accuracy and efficiency. Mixed Precision Quantization ...