llmcompressor.pytorch.utils.sparsification_info.helpers

Functions

get_leaf_operations

get_leaf_operations(
    model: Module,
    operations_to_skip: Optional[List[Module]] = None,
    operations_to_unwrap: Optional[List[Module]] = None,
) -> List[torch.nn.Module]

Get the leaf operations in the model (those that do not have operations as children).

Parameters

  • model

    (Module) –

    the model to get the leaf operations from

  • operations_to_skip

    (Optional[List[Module]], default: None ) –

    a list of leaf operations that will be omitted when collecting the leaf operations. If None is passed, the Identity operation is skipped by default.

  • operations_to_unwrap

    (Optional[List[Module]], default: None ) –

    a list of operations that will be unwrapped when collecting the leaf operations. Unwrapping means that the module(s) wrapped by the operation (i.e. the operation's `module` attribute) are added directly to the list of leaf operations. If None is passed, the QuantWrapper operation is unwrapped by default.

Returns

  • List[Module]

    a list of the leaf operations

Source code in llmcompressor/pytorch/utils/sparsification_info/helpers.py
def get_leaf_operations(
    model: torch.nn.Module,
    operations_to_skip: Optional[List[torch.nn.Module]] = None,
    operations_to_unwrap: Optional[List[torch.nn.Module]] = None,
) -> List[torch.nn.Module]:
    """
    Get the leaf operations in the model
    (those that do not have operations as children)

    :param model: the model to get the leaf operations from
    :param operations_to_skip: a list of leaf operations that will be
        omitted when getting the leaf operations. If None passed, by
        default the Identity operation will be skipped
    :param operations_to_unwrap: a list of operations that will be unwrapped
        when getting the leaf operations. Unwrapping means that we directly
        add the module(s) that is/are wrapped by the operation (i.e. operation's
        `module` attribute) to the list
        of leaf operations. If None passed, by default the QuantWrapper
        operation will be unwrapped
    :return: a list of the leaf operations
    """
    if operations_to_skip is None:
        operations_to_skip = [Identity]

    if operations_to_unwrap is None:
        operations_to_unwrap = [QuantWrapper]

    leaf_operations = []
    children = list(model.children())

    if children == []:
        return model
    else:
        for child in children:
            if isinstance(child, tuple(operations_to_unwrap)):
                leaf_operations.append(child.module)
                continue
            try:
                leaf_operations.extend(get_leaf_operations(child))
            except TypeError:
                leaf_operations.append(get_leaf_operations(child))
    leaf_operations = [
        op for op in leaf_operations if not isinstance(op, tuple(operations_to_skip))
    ]
    return leaf_operations
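
A minimal usage sketch (not taken from the library docs): the model definition and the expected results in the comments are illustrative assumptions about how get_leaf_operations behaves on a small torch.nn.Sequential model.

import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import (
    get_leaf_operations,
)

# Illustrative model: two Linear layers with a ReLU in between.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 4),
)

# Default behaviour: Identity modules are skipped, QuantWrapper modules are unwrapped.
leaves = get_leaf_operations(model)
# Expected (assumption): [Linear(16, 32), ReLU(), Linear(32, 4)]

# Passing operations_to_skip overrides the default, so only ReLU is filtered out here.
leaves_no_act = get_leaf_operations(model, operations_to_skip=[torch.nn.ReLU])
# Expected (assumption): [Linear(16, 32), Linear(32, 4)]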

get_precision_information

get_precision_information(
    operation: Module,
) -> Union[None, int, QuantizationScheme]

Get information about the precision of the operation.

1) If the operation is quantized, returns the quantization scheme of the operation.
2) If the operation is not quantized, returns the number of bits of the operation's weights.
3) If the operation is not quantized and has no weights, returns None.

Parameters

  • operation

    (Module) –

    the operation to get the quantization scheme from

Returns

  • Union[None, int, QuantizationScheme]

    the quantization scheme of the operation, the number of bits of the operation's weights, or None if the operation is not quantized and has no weights

Source code in llmcompressor/pytorch/utils/sparsification_info/helpers.py
def get_precision_information(
    operation: torch.nn.Module,
) -> Union[None, int, "QuantizationScheme"]:  # noqa F821
    """
    Get the information about the precision of the operation.

    1)  If operation is quantized, returns the quantization
        scheme of the operation.
    2)  If operation is not quantized, returns the number of bits
        of the operation's weights.
    3)  If operation is not quantized and does not have weights,
        returns None.

    :param operation: the operation to get the quantization scheme from
    :return: the quantization scheme of the operation, the number of bits
        of the operation's weights, or None if the operation is not quantized
        and does not have a weight
    """

    if hasattr(operation, "quantization_scheme"):
        return getattr(operation, "quantization_scheme")
    elif hasattr(operation, "weight"):
        return _get_num_bits(operation.weight.dtype)
    else:
        return None
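
A minimal sketch of how the helper might behave on unquantized modules. The printed value for the Linear layer assumes _get_num_bits maps torch.float32 to 32; that mapping is an assumption, not something stated in these docs.

import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import (
    get_precision_information,
)

linear = torch.nn.Linear(8, 8)   # has a float32 weight, no quantization_scheme
print(get_precision_information(linear))  # expected (assumption): 32

relu = torch.nn.ReLU()           # no weight and no quantization_scheme
print(get_precision_information(relu))    # -> None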

is_quantized

is_quantized(operation: Module) -> bool

Check whether the operation is quantized (contains a quantization scheme).

Source code in llmcompressor/pytorch/utils/sparsification_info/helpers.py
def is_quantized(operation: torch.nn.Module) -> bool:
    """
    Check whether the operation is quantized (contains
    a quantization scheme)
    """
    return hasattr(operation, "quantization_scheme")
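
A minimal sketch illustrating that is_quantized is purely an attribute check; the scheme value attached below is a placeholder string, not a real QuantizationScheme.

import torch

from llmcompressor.pytorch.utils.sparsification_info.helpers import is_quantized

layer = torch.nn.Linear(4, 4)
print(is_quantized(layer))  # -> False: no quantization_scheme attribute

# Attaching a quantization_scheme attribute (here a placeholder) flips the check.
layer.quantization_scheme = "placeholder-scheme"
print(is_quantized(layer))  # -> True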