QuantLinear
Bases: QuantBase
Int8-quantized linear layer with optional ReLU activation.
Uses asymmetric fake-quantization for activations and symmetric per-output-channel fake-quantization for weights.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `in_features` | `int` | Number of input features. | *required* |
| `out_features` | `int` | Number of output features. | *required* |
| `bias` | `bool` | Whether to include an additive bias. | `True` |
| `act_func` | `str \| None` | Either `"relu"` or `None` (no activation). | `None` |
| `ema_constant` | `float` | EMA smoothing factor for activation observers. | `0.01` |
| `device` | `str` | Torch device for the underlying parameters. | `'cpu'` |
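The two fake-quantization schemes named above can be sketched in a few lines. This is an illustrative NumPy sketch of the general technique (asymmetric min/max quantization for activations, symmetric per-output-channel quantization for weights), not the layer's actual implementation; function names here are hypothetical.

```python
import numpy as np

def fake_quant_asymmetric(x, num_bits=8):
    # Asymmetric: a zero-point shifts the integer grid so it spans
    # [x.min(), x.max()] exactly (illustrative, not the library's code).
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = np.round(qmin - x.min() / scale)
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale  # dequantize back to float

def fake_quant_symmetric_per_channel(w, num_bits=8):
    # Symmetric: one scale per output channel (row), zero-point fixed at 0.
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))    # stand-in for an activation batch
w = rng.normal(size=(16, 8))   # stand-in for a weight matrix
x_q = fake_quant_asymmetric(x)
w_q = fake_quant_symmetric_per_channel(w)
```

Because fake quantization rounds and clips in float, `x_q` and `w_q` keep their original shapes and dtypes; the round-trip error per element is bounded by half the corresponding scale.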