QuantLinear

Bases: QuantBase

Int8-quantized linear layer with optional BatchNorm and ReLU.

Uses asymmetric fake-quantization for activations and symmetric per-output-channel fake-quantization for weights. When bn=True, a torch.nn.BatchNorm1d layer is applied after the linear layer during training; for schema export, the BatchNorm is folded into the weight and bias.
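To make the two quantization schemes concrete, here is a minimal NumPy sketch of asymmetric fake-quantization (as used for activations) and symmetric per-output-channel fake-quantization (as used for weights). It is not QuantLinear's own code: the function names, the 8-bit grids, and the min/max range handling are assumptions chosen for illustration, and the layer's actual observers and rounding may differ.

```python
import numpy as np


def fake_quant_asymmetric(x, x_min, x_max, num_bits=8):
    """Quantize-dequantize x on an affine grid (scale plus zero-point).

    Hypothetical helper: x_min and x_max stand in for the range an
    activation observer would track.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range to include zero so that 0.0 is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = max((x_max - x_min) / (qmax - qmin), 1e-12)
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale


def fake_quant_symmetric_per_channel(w, num_bits=8):
    """Quantize-dequantize a (out_features, in_features) weight matrix.

    Hypothetical helper: one scale per output channel (row), zero-point
    fixed at 0, integer range [-127, 127] for 8 bits.
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = np.max(np.abs(w), axis=1, keepdims=True)
    scale = np.maximum(max_abs, 1e-12) / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale


activations = np.array([-1.0, 0.0, 0.5, 3.0])
weights = np.array([[0.5, -1.0], [0.01, 0.02]])
print(fake_quant_asymmetric(activations, x_min=-1.0, x_max=3.0))
print(fake_quant_symmetric_per_channel(weights))
```

Giving each output channel its own scale keeps rows with small weights from being crushed by a single large row, which is the usual motivation for per-channel weight quantization.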

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `in_features` | `int` | Number of input features. | required |
| `out_features` | `int` | Number of output features. | required |
| `bias` | `bool` | Whether to include an additive bias. | `True` |
| `act_func` | `str \| None` | Either "relu" or None. | `None` |
| `ema_constant` | `float` | EMA smoothing factor for activation observers. | `0.01` |
| `device` | `str` | Torch device for the underlying parameters. | `'cpu'` |
| `bn` | `bool` | If True, append a BatchNorm1d layer and fold it into weights for schema export. | `False` |
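The BatchNorm folding mentioned for bn=True can be illustrated with the standard algebraic identity for a linear layer followed by BatchNorm in inference mode. The sketch below uses NumPy; the function name and argument names are hypothetical, and the library's own export code is not shown in this page, so treat this as an illustration of the identity rather than of the actual implementation. The eps default matches the torch.nn.BatchNorm1d default of 1e-5.

```python
import numpy as np


def fold_batchnorm(weight, bias, gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold inference-mode BatchNorm statistics into a linear layer.

    Hypothetical helper. weight has shape (out_features, in_features);
    all other arrays have shape (out_features,).
    """
    # Per-output-channel multiplier applied by BatchNorm.
    factor = gamma / np.sqrt(running_var + eps)
    folded_weight = weight * factor[:, None]
    folded_bias = (bias - running_mean) * factor + beta
    return folded_weight, folded_bias


# Sanity check: linear followed by BatchNorm equals the folded linear alone.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 4)), rng.normal(size=3)
gamma, beta = rng.normal(size=3), rng.normal(size=3)
mean, var = rng.normal(size=3), rng.uniform(0.5, 2.0, size=3)
x = rng.normal(size=(5, 4))

reference = gamma * ((x @ W.T + b) - mean) / np.sqrt(var + 1e-5) + beta
W_folded, b_folded = fold_batchnorm(W, b, gamma, beta, mean, var)
assert np.allclose(x @ W_folded.T + b_folded, reference)
```

Because the BatchNorm factor scales each output channel independently, folding composes naturally with per-output-channel weight quantization: each row of the folded weight simply receives its own scale.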