QuantLinear

Bases: QuantBase

Int8-quantized linear layer with optional ReLU activation.

Uses asymmetric fake-quantization for activations and symmetric per-output-channel fake-quantization for weights.
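To make the two schemes concrete, here is a minimal pure-Python sketch of int8 fake-quantization (illustrative only, not this library's implementation, which operates on torch tensors): asymmetric quantization maps an observed `[lo, hi]` activation range onto `[0, 255]` with a zero point, while symmetric per-output-channel quantization gives each weight row its own scale over `[-127, 127]`.

```python
def fake_quant_asymmetric(x, lo, hi, n_bits=8):
    """Asymmetric fake-quantization: map [lo, hi] onto [0, 2**n_bits - 1],
    round, clamp, then map back to floats. Used for activations."""
    qmax = 2 ** n_bits - 1
    scale = (hi - lo) / qmax
    zero_point = round(-lo / scale)
    q = [min(max(round(v / scale) + zero_point, 0), qmax) for v in x]
    return [(qi - zero_point) * scale for qi in q]

def fake_quant_symmetric_per_channel(w_rows, n_bits=8):
    """Symmetric fake-quantization with one scale per output channel
    (one scale per row of the weight matrix). Used for weights."""
    qmax = 2 ** (n_bits - 1) - 1  # 127 for int8
    out = []
    for row in w_rows:
        scale = max(abs(v) for v in row) / qmax or 1.0  # avoid div-by-zero
        out.append([round(v / scale) * scale for v in row])
    return out
```

Fake-quantization rounds values through the integer grid but returns floats, so the rounding error is visible during training while the layer still runs in floating point.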

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `in_features` | `int` | Number of input features. | *required* |
| `out_features` | `int` | Number of output features. | *required* |
| `bias` | `bool` | Whether to include an additive bias. | `True` |
| `act_func` | `str \| None` | Either `"relu"` or `None`. | `None` |
| `ema_constant` | `float` | EMA smoothing factor for the activation observers. | `0.01` |
| `device` | `str` | Torch device for the underlying parameters. | `'cpu'` |
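The `ema_constant` controls how quickly the activation observers' running range statistics react to new batches. A sketch of the assumed standard exponential-moving-average update (the library's exact rule may differ):

```python
def ema_update(running, observed, ema_constant=0.01):
    """Exponential moving average: new = (1 - c) * running + c * observed.
    The first observation initializes the statistic."""
    if running is None:
        return observed
    return (1.0 - ema_constant) * running + ema_constant * observed

# Track a running max over a stream of per-batch activation maxima:
running_max = None
for batch_max in [1.0, 2.0, 1.5]:
    running_max = ema_update(running_max, batch_max)
```

A small `ema_constant` (the default `0.01`) makes the observed range stable against outlier batches, at the cost of adapting slowly when the activation distribution shifts.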