QuantUpperTriangularLinear

Bases: QuantBase

Quantized version of UpperTriangularLinear.

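  • out_features <= in_features: the weight mask is the top out_features rows of the in_features x in_features upper-triangular matrix.
  • out_features > in_features: the leading out_features - in_features rows are fully dense; the trailing in_features rows form a square upper-triangular block.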
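
The two shape cases above can be illustrated with a small pure-Python sketch. The helper name upper_triangular_mask is hypothetical and not part of the library, and the sketch assumes the diagonal is kept in the triangular part and that the mask has shape out_features x in_features.

```python
def upper_triangular_mask(in_features: int, out_features: int) -> list[list[int]]:
    """Build an out_features x in_features 0/1 mask (hypothetical helper)."""
    # Number of fully dense rows stacked on top; zero when out_features <= in_features.
    dense_rows = max(out_features - in_features, 0)
    mask = []
    for row in range(out_features):
        if row < dense_rows:
            # Leading rows keep every input connection.
            mask.append([1] * in_features)
        else:
            # Row index inside the triangular part; keep entry (r, col) when col >= r.
            # Assumption: the diagonal is part of the upper triangle.
            r = row - dense_rows
            mask.append([1 if col >= r else 0 for col in range(in_features)])
    return mask


wide = upper_triangular_mask(in_features=4, out_features=2)
tall = upper_triangular_mask(in_features=2, out_features=3)
```

Here wide is the top two rows of a 4 x 4 upper-triangular matrix, and tall has one dense row followed by a 2 x 2 upper-triangular block.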

Fake-quantization is applied to weight * mask, so masked-out entries remain exactly zero through quantization.
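
A minimal sketch of why masked-out entries survive quantization unchanged. It assumes a symmetric per-tensor quantizer with a zero-point of 0; the source does not state which scheme the class uses, and fake_quantize here is a hypothetical stand-in, not the library's function.

```python
def fake_quantize(values, num_bits=8):
    """Symmetric per-tensor fake quantization: quantize, then dequantize.

    Hypothetical stand-in; the quantizer used by the library may differ.
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for row in values for v in row)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [
        [max(-qmax, min(qmax, round(v / scale))) * scale for v in row]
        for row in values
    ]


weight = [[0.5, -1.2, 0.3], [0.7, 0.9, -0.4]]
mask = [[1, 1, 1], [0, 1, 1]]  # top 2 rows of a 3 x 3 upper-triangular matrix

# Mask first, then quantize: the masked entry is 0 before quantization,
# rounds to the integer 0, and dequantizes back to exactly 0.
masked = [[w * m for w, m in zip(w_row, m_row)] for w_row, m_row in zip(weight, mask)]
quantized = fake_quantize(masked)
```

Because the mask is applied before the quantizer, the zero at position (1, 0) is an input to the rounding step rather than something the quantizer has to reproduce from a nonzero weight.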

Parameters:

Name          Type        Description                           Default
in_features   int         Number of input features.             required
out_features  int         Number of output features.            required
bias          bool        Whether to include an additive bias.  True
act_func      str | None  Either "relu" or None.                None
ema_constant  float       EMA smoothing factor for observers.   0.01
device        str         Torch device for the parameters.      'cpu'