Q5_0 quantized model inference time is more than Q8_0

I am trying to deploy the model on the CPU of a Qualcomm QCS6490 board. I have quantized the model to Q8_0 and Q5_0. When running inference on audio clips of varying length (for example 9, 10, and 12 seconds), the Q5_0 model takes longer than the Q8_0 model for every clip.



Audio_Duration_ms | Q8_0_Inference_Time_ms | Q5_0_Inference_Time_ms
-- | -- | --
6180 | 7404.38 | 9039.27
6240 | 7443.3 | 9275.39
6800 | 7387.46 | 8923.46
7200 | 7923.13 | 9377.34
8340 | 7477.39 | 8953.79
8760 | 7790.3 | 9420.6
10580 | 7691.98 | 9557.05
10680 | 7791.84 | 9290.73
17700 | 8143.61 | 9556.05
19980 | 8083.97 | 9774.57
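
To put a number on the gap, here is a minimal sketch that computes the Q5_0 / Q8_0 time ratio for each row of the table above. The names `ROWS` and `slowdown_ratios` are made up for illustration and are not part of any library; the values are copied from the table.

```python
# Rows copied from the table above:
# (audio_duration_ms, q8_0_inference_ms, q5_0_inference_ms)
ROWS = [
    (6180, 7404.38, 9039.27),
    (6240, 7443.30, 9275.39),
    (6800, 7387.46, 8923.46),
    (7200, 7923.13, 9377.34),
    (8340, 7477.39, 8953.79),
    (8760, 7790.30, 9420.60),
    (10580, 7691.98, 9557.05),
    (10680, 7791.84, 9290.73),
    (17700, 8143.61, 9556.05),
    (19980, 8083.97, 9774.57),
]


def slowdown_ratios(rows):
    """Return the Q5_0 / Q8_0 inference-time ratio for each row."""
    return [q5_ms / q8_ms for _, q8_ms, q5_ms in rows]


if __name__ == "__main__":
    ratios = slowdown_ratios(ROWS)
    for (duration_ms, _, _), ratio in zip(ROWS, ratios):
        print(f"{duration_ms:>6} ms audio: Q5_0 / Q8_0 = {ratio:.3f}")
    print(f"mean ratio: {sum(ratios) / len(ratios):.3f}")
```

In these measurements Q5_0 is slower than Q8_0 on every clip, and both inference times change only slightly as the audio duration goes from about 6 s to about 20 s.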



Q5_0 quantized model inference time is more than Q8_0 #3491
