QuantLowerTriangularLinear

Bases: QuantBase

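Quantized version of LowerTriangularLinear. The weight matrix has out rows and in columns and is multiplied by a fixed mask whose layout depends on the two sizes: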

  • out <= in: the mask is the bottom out rows of the in x in lower-triangular matrix, so the triangular block sits flush against the right edge of the weight matrix and the leading in - out columns are fully dense.
  • out > in: the leading in rows form a square in x in lower-triangular block; the trailing out - in rows are fully dense.
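
The two cases above can be sketched as a small mask builder. This is only an illustration of the described layout, not the library's code: the helper name lower_triangular_mask is hypothetical, and the (out_features, in_features) weight shape is assumed from the usual torch.nn.Linear convention.

```python
import torch


def lower_triangular_mask(in_features: int, out_features: int) -> torch.Tensor:
    """Hypothetical helper: 0/1 mask of shape (out_features, in_features)."""
    # Square lower-triangular matrix of ones (diagonal included).
    tril = torch.tril(torch.ones(in_features, in_features))
    if out_features <= in_features:
        # out <= in: keep only the bottom `out_features` rows of the triangle.
        return tril[in_features - out_features:]
    # out > in: full triangle on top, fully dense rows underneath.
    dense = torch.ones(out_features - in_features, in_features)
    return torch.cat([tril, dense], dim=0)


wide = lower_triangular_mask(in_features=4, out_features=2)
tall = lower_triangular_mask(in_features=2, out_features=3)
```

Here wide has its zeros only in the upper-right corner of the last two columns, and tall ends with one all-ones row.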

Fake-quantization is applied to weight * mask, so masked-out entries remain exactly zero through quantization.
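
A minimal sketch of why masking before quantization keeps the zeros intact. It assumes a symmetric per-tensor quantizer (zero point of 0), which may differ from the quantizer the class actually uses; fake_quantize_symmetric and num_bits are hypothetical names introduced for illustration.

```python
import torch


def fake_quantize_symmetric(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Hypothetical symmetric fake-quantizer: quantize, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1
    # Per-tensor scale; the clamp avoids division by zero for an all-zero input.
    scale = x.abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return q * scale


torch.manual_seed(0)
weight = torch.randn(3, 3)
mask = torch.tril(torch.ones(3, 3))

# Mask first, then quantize: round(0 / scale) * scale is exactly 0.
w_q = fake_quantize_symmetric(weight * mask)
masked_out_stay_zero = bool(torch.all(w_q[mask == 0] == 0))
```

Quantizing first and masking afterwards would give the same zeros, but the scale would then be influenced by entries that never contribute to the output.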

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| in_features | int | Number of input features. | required |
| out_features | int | Number of output features. | required |
| bias | bool | Whether to include an additive bias. | True |
| act_func | str \| None | Either "relu" or None. | None |
| ema_constant | float | EMA smoothing factor for observers. | 0.01 |
| device | str | Torch device for the parameters. | 'cpu' |
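
To make the role of ema_constant concrete, here is a sketch of an exponential-moving-average range observer. The class EmaMinMaxObserver is hypothetical, and the update rule (ema_constant weights the newest batch) is an assumption that matches the convention of PyTorch's MovingAverageMinMaxObserver; the observers used by this class may differ.

```python
import torch


class EmaMinMaxObserver:
    """Hypothetical observer tracking a smoothed min/max of observed tensors."""

    def __init__(self, ema_constant: float = 0.01):
        self.ema_constant = ema_constant
        self.min_val = None
        self.max_val = None

    def update(self, x: torch.Tensor) -> None:
        cur_min, cur_max = x.min().item(), x.max().item()
        if self.min_val is None:
            # First batch initializes the running range directly.
            self.min_val, self.max_val = cur_min, cur_max
            return
        c = self.ema_constant
        # Assumed convention: small c means slow adaptation to new batches.
        self.min_val = (1 - c) * self.min_val + c * cur_min
        self.max_val = (1 - c) * self.max_val + c * cur_max


obs = EmaMinMaxObserver(ema_constant=0.5)
obs.update(torch.tensor([0.0, 2.0]))
obs.update(torch.tensor([-2.0, 4.0]))
```

Under this convention the default of 0.01 makes the tracked range move only 1% of the way toward each new batch's range.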