llmcompressor.pytorch.utils

Generic utilities and helper code for PyTorch

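Modules

helpers –

Utility and helper functions
sparsification –

Helper functions for retrieving information related to model sparsification
sparsification_info –

Classes

ModuleSparsificationInfo –

Helper class that provides information about the parameters of a PyTorch module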

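Functions

get_linear_layers –

Returns all linear layers in a module, keyed by layer name.
get_quantized_layers –

Returns the names and modules of the quantized layers in a module.
set_deterministic_seeds –

Manually seeds the numpy, random, and torch packages.
tensor_sparsity –

Computes the sparsity of a tensor, i.e. the fraction of elements that are zero.
tensors_module_forward –

Default function for calling into a model with data for a forward execution.
tensors_to_device –

Default function for putting a tensor or collection of tensors on the proper device.
tensors_to_precision –

Converts a tensor or collection of tensors to full (float 32) or half (float 16) precision.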

ModuleSparsificationInfo

ModuleSparsificationInfo(
    module: Module,
    state_dict: Optional[Dict[str, Tensor]] = None,
)

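Helper class that provides information about a PyTorch module's parameters and the amount of sparsification applied to them. Covers both pruning and quantization information.

Parameters

module
(Module) –

The PyTorch module to analyze
state_dict
(Optional[Dict[str, Tensor]], default: None ) –

Optional state_dict to analyze in place of the PyTorch model. Used when analyzing FSDP models, where the full weights may not be accessible

Attributes

params_quantized (int) –

Number of parameters across quantized layers
params_quantized_percent (float) –

Percentage of parameters that have been quantized
params_sparse (int) –

Total number of sparse (zero-valued) trainable parameters in the model
params_sparse_percent (float) –

Percentage of sparsified parameters in the model
params_total (int) –

Total number of trainable parameters in the model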

Source code in llmcompressor/pytorch/utils/sparsification.py

def __init__(
    self, module: Module, state_dict: Optional[Dict[str, torch.Tensor]] = None
):
    self.module = module

    if state_dict is not None:
        # when analyzing an FSDP model, the state_dict does not differentiate
        # between trainable and non-trainable parameters
        # (e.g. it can contain buffers); this means that
        # self.trainable_params may be overestimated
        self.trainable_params = state_dict
    else:
        if hasattr(module, "_hf_hook"):
            self.trainable_params = get_state_dict_offloaded_model(module)
        else:
            self.trainable_params = {
                k: v for k, v in self.module.named_parameters() if v.requires_grad
            }
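
The property implementations are not shown on this page. As a rough illustration of how the trainable-parameter filter above could feed totals such as params_total and params_sparse, here is a minimal sketch in plain PyTorch. The toy model and the counting logic are assumptions made for illustration; they are not the library's property code.

```python
import torch
from torch import nn

# Toy model: two linear layers, the second one frozen.
model = nn.Sequential(nn.Linear(4, 3, bias=False), nn.Linear(3, 2, bias=False))
for param in model[1].parameters():
    param.requires_grad = False

# Give the first layer known values, then zero half of them to simulate pruning.
with torch.no_grad():
    model[0].weight.fill_(1.0)
    model[0].weight[:, :2] = 0.0

# Same filter as in __init__ above: keep only trainable parameters.
trainable_params = {
    k: v for k, v in model.named_parameters() if v.requires_grad
}

# Hypothetical counting logic (an assumption, not the library's implementation).
params_total = sum(p.numel() for p in trainable_params.values())
params_sparse = sum(int((p == 0).sum()) for p in trainable_params.values())
params_sparse_percent = 100.0 * params_sparse / params_total

print(params_total, params_sparse, params_sparse_percent)
```

Because the second layer is frozen, only the first layer's 12 weights are counted, and half of them are zero.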

params_quantized `property`

params_quantized: int

Returns

int –

Number of parameters across quantized layers

params_quantized_percent `property`

params_quantized_percent: float

Returns

float –

Percentage of parameters that have been quantized

params_sparse `property`

params_sparse: int

Returns

int –

Total number of sparse (zero-valued) trainable parameters in the model

params_sparse_percent `property`

params_sparse_percent: float

Returns

float –

Percentage of sparsified parameters in the model

params_total `property`

params_total: int

Returns

int –

Total number of trainable parameters in the model

get_linear_layers

get_linear_layers(module: Module) -> Dict[str, Module]

Parameters

module
(Module) –

The module to grab all linear layers for

Returns

Dict[str, Module] –

A dictionary mapping layer names to all linear layers in the module

Source code in llmcompressor/pytorch/utils/helpers.py

def get_linear_layers(module: Module) -> Dict[str, Module]:
    """
    :param module: the module to grab all linear layers for
    :return: a list of all linear layers in the module
    """
    return {
        name: mod for name, mod in module.named_modules() if isinstance(mod, Linear)
    }
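
As a quick illustration of what the returned dictionary looks like, the sketch below reproduces the same comprehension in a standalone function (named linear_layers here, a hypothetical name chosen so the example runs without the library) and applies it to a small toy model.

```python
from typing import Dict

from torch import nn
from torch.nn import Linear, Module


def linear_layers(module: Module) -> Dict[str, Module]:
    # Same comprehension as get_linear_layers above, reproduced for a standalone run.
    return {
        name: mod for name, mod in module.named_modules() if isinstance(mod, Linear)
    }


model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
layers = linear_layers(model)

# Keys are the names produced by named_modules(); for a Sequential these are
# the positional indices of the submodules.
print(list(layers))
```

The ReLU at index 1 is skipped, so only the two Linear submodules appear in the result.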

get_quantized_layers

get_quantized_layers(
    module: Module,
) -> List[Tuple[str, Module]]

Parameters

module
(Module) –

The module to get the quantized layers from

Returns

List[Tuple[str, Module]] –

A list containing the names and modules of the quantized layers (Embedding, Linear, Conv2d, Conv3d)

Source code in llmcompressor/pytorch/utils/helpers.py

def get_quantized_layers(module: Module) -> List[Tuple[str, Module]]:
    """
    :param module: the module to get the quantized layers from
    :return: a list containing the names and modules of the quantized layers
        (Embedding, Linear, Conv2d, Conv3d)
    """

    quantized_layers = []
    for name, mod in module.named_modules():
        if hasattr(mod, "quantization_scheme"):
            weight_scheme = getattr(mod.quantization_scheme, "weights", None)
            if weight_scheme is not None and hasattr(mod, "weight"):
                quantized_layers.append((name, mod))

    return quantized_layers
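
The filter above only inspects two things: a quantization_scheme attribute whose weights field is not None, and the presence of a weight attribute on the module. The sketch below exercises that logic with a stand-in scheme object built from SimpleNamespace; the stand-in and its placeholder value are assumptions for illustration and do not represent the real scheme class.

```python
from types import SimpleNamespace

from torch import nn

model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# Hypothetical stand-in for a real quantization scheme: the filter only needs
# a `weights` attribute that is not None.
model[0].quantization_scheme = SimpleNamespace(weights="placeholder-weight-args")
# A scheme that does not quantize weights is skipped by the filter.
model[2].quantization_scheme = SimpleNamespace(weights=None)

# Same loop as get_quantized_layers above, inlined for a standalone run.
quantized = []
for name, mod in model.named_modules():
    if hasattr(mod, "quantization_scheme"):
        weight_scheme = getattr(mod.quantization_scheme, "weights", None)
        if weight_scheme is not None and hasattr(mod, "weight"):
            quantized.append((name, mod))

print([name for name, _ in quantized])
```

Only the first Linear layer is reported, since the second one carries a scheme without weight quantization.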

set_deterministic_seeds

set_deterministic_seeds(seed: int = 0)

Manually seeds the numpy, random, and torch packages. Also sets torch.backends.cudnn.deterministic to True

Parameters

seed
(int, default: 0 ) –

The manual seed to use. Defaults to 0

Source code in llmcompressor/pytorch/utils/helpers.py

def set_deterministic_seeds(seed: int = 0):
    """
    Manually seeds the numpy, random, and torch packages.
    Also sets torch.backends.cudnn.deterministic to True
    :param seed: the manual seed to use. Default is 0
    """
    numpy.random.seed(seed)
    random.seed(seed)
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
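
To show the effect of seeding, the sketch below wraps the same four statements in a local function (seed_everything, a hypothetical name used so the example runs without the library) and checks that re-seeding reproduces the same random draws from all three packages.

```python
import random

import numpy
import torch


def seed_everything(seed: int = 0) -> None:
    # Mirrors the body of set_deterministic_seeds shown above.
    numpy.random.seed(seed)
    random.seed(seed)
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True


seed_everything(0)
first = (random.random(), numpy.random.rand(), torch.rand(3))

seed_everything(0)
second = (random.random(), numpy.random.rand(), torch.rand(3))

# All three generators restart from the same state, so the draws match.
same = (
    first[0] == second[0]
    and first[1] == second[1]
    and torch.equal(first[2], second[2])
)
print(same)
```

Seeding makes the random streams repeatable. It does not by itself guarantee bit-identical results from every GPU kernel, so some workloads may need additional PyTorch determinism settings.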

tensor_sparsity

tensor_sparsity(
    tens: Tensor,
    dim: Union[
        None, int, List[int], Tuple[int, ...]
    ] = None,
) -> Tensor

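Parameters

tens
(Tensor) –

The tensor to calculate the sparsity for
dim
(Union[None, int, List[int], Tuple[int, ...]], default: None ) –

The dimension(s) to split the calculation over; for example, it can be split over batch, channels, or a combination of the two

Returns

Tensor –

The sparsity of the input tensor, i.e. the fraction of elements that are zero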

Source code in llmcompressor/pytorch/utils/helpers.py

def tensor_sparsity(
    tens: Tensor, dim: Union[None, int, List[int], Tuple[int, ...]] = None
) -> Tensor:
    """
    :param tens: the tensor to calculate the sparsity for
    :param dim: the dimension(s) to split the calculations over;
        ex, can split over batch, channels, or combos
    :return: the sparsity of the input tens, ie the fraction of numbers that are zero
    """
    if dim is None:
        zeros = (tens.cpu() == 0).sum()
        total = tens.numel()

        return zeros.float() / float(total)

    if isinstance(dim, int):
        dim = [dim]

    if max(dim) >= len(tens.shape):
        raise ValueError(
            "Unsupported dim given of {} in {} for tensor shape {}".format(
                max(dim), dim, tens.shape
            )
        )

    sum_dims = [ind for ind in range(len(tens.shape)) if ind not in dim]
    zeros = (tens == 0).sum(dim=sum_dims) if sum_dims else tens == 0
    total = numpy.prod(
        [tens.shape[ind] for ind in range(len(tens.shape)) if ind not in dim]
    )

    permute_order = sorted(
        ((d, len(dim) - i - 1) for i, d in enumerate(dim)), reverse=True
    )
    permute = [d[1] for d in permute_order]

    if permute != [i for i in range(len(permute))]:
        # need to permute to get desired dimensions at the front
        zeros = zeros.permute(*permute).contiguous()

    return zeros.float() / float(total)
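
A small worked example may help make the two main cases concrete. The sketch below uses plain PyTorch to reproduce the arithmetic for dim=None (one scalar for the whole tensor) and for dim=0 (one value per index along dimension 0); it inlines the relevant expressions rather than calling the library function.

```python
import torch

tens = torch.tensor(
    [
        [0.0, 1.0, 0.0, 2.0],
        [0.0, 0.0, 0.0, 3.0],
    ]
)

# dim=None case: a single scalar for the whole tensor (5 zeros out of 8).
overall = (tens == 0).sum().float() / float(tens.numel())

# dim=0 case: keep dimension 0 and sum the zero counts over the remaining
# dimension, then divide by the number of elements that were summed.
per_row = (tens == 0).sum(dim=[1]).float() / float(tens.shape[1])

print(overall)
print(per_row)
```

The first row has 2 zeros out of 4 and the second has 3 out of 4, so the per-row result differs from the overall fraction.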

tensors_module_forward

tensors_module_forward(
    tensors: Union[
        Tensor, Iterable[Tensor], Mapping[Any, Tensor]
    ],
    module: Module,
    check_feat_lab_inp: bool = True,
) -> Any

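Default function for calling into a model with data for a forward execution. Returns the model result. Note that if the input is an iterable, the features to pass into the model are taken to be at index 0, and the other indices are treated as labels.

Supported use cases: a single tensor, or an iterable whose first tensor is taken as the features to pass into the model

Parameters

tensors
(Union[Tensor, Iterable[Tensor], Mapping[Any, Tensor]]) –

The data to pass into the model; if an iterable, the features to pass into the model are taken to be at index 0, and the other indices are treated as labels
module
(Module) –

The module to pass the data into
check_feat_lab_inp
(bool, default: True ) –

If True, checks whether the incoming tensors look like they are made up of features and labels (i.e. a tuple or list with 2 items, the typical output of a data loader) and calls into the model with just the first element, assuming it holds the features. If False, no check is performed.

Returns

Any –

The result of calling into the model for a forward pass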

Source code in llmcompressor/pytorch/utils/helpers.py

def tensors_module_forward(
    tensors: Union[Tensor, Iterable[Tensor], Mapping[Any, Tensor]],
    module: Module,
    check_feat_lab_inp: bool = True,
) -> Any:
    """
    Default function for calling into a model with data for a forward execution.
    Returns the model result.
    Note, if an iterable the features to be passed into the model are considered
    to be at index 0 and other indices are for labels.

    Supported use cases: single tensor,
    iterable with first tensor taken as the features to pass into the model

    :param tensors: the data to be passed into the model, if an iterable the features
        to be passed into the model are considered to be at index 0 and other indices
        are for labels
    :param module: the module to pass the data into
    :param check_feat_lab_inp: True to check if the incoming tensors looks like
        it's made up of features and labels ie a tuple or list with 2 items
        (typical output from a data loader) and will call into the model with just
        the first element assuming it's the features False to not check
    :return: the result of calling into the model for a forward pass
    """
    if (
        (isinstance(tensors, tuple) or isinstance(tensors, List))
        and len(tensors) == 2
        and check_feat_lab_inp
    ):
        # assume if this is a list or tuple of 2 items that it is made up of
        # (features, labels) pass the features into a recursive call for the model
        return tensors_module_forward(tensors[0], module, check_feat_lab_inp=False)

    if isinstance(tensors, Tensor):
        return module(tensors)

    if isinstance(tensors, Mapping):
        return module(**tensors)

    if isinstance(tensors, Iterable):
        return module(*tensors)

    raise ValueError(
        "unrecognized type for data given of {}".format(tensors.__class__.__name__)
    )
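
The dispatch rules above can be sketched with a toy module. The Scale class below is hypothetical and exists only to make the different call styles visible; each call mirrors one branch of the function rather than invoking the library itself.

```python
import torch
from torch import nn


class Scale(nn.Module):
    # Hypothetical toy module: multiplies the input by an optional factor.
    def forward(self, x, factor=1.0):
        return x * factor


module = Scale()
features = torch.ones(2)
labels = torch.zeros(2)

# A single tensor is passed straight through: module(tensors)
out_tensor = module(features)

# A mapping is unpacked as keyword arguments: module(**tensors)
out_mapping = module(**{"x": features, "factor": 3.0})

# A 2-item tuple is treated as (features, labels) when check_feat_lab_inp=True,
# so only the first element reaches the module.
batch = (features, labels)
out_batch = module(batch[0])

# With the check disabled, the same tuple is unpacked positionally:
# module(*batch), i.e. x=features and factor=labels.
out_unpacked = module(*batch)

print(out_tensor, out_mapping, out_batch, out_unpacked)
```

The last two calls show why check_feat_lab_inp matters: the same 2-item tuple produces different results depending on whether its second element is dropped as labels or forwarded as an argument.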

tensors_to_device

tensors_to_device(
    tensors: Union[
        Tensor, Iterable[Tensor], Dict[Any, Tensor]
    ],
    device: str,
) -> Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]

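Default function for putting a tensor or collection of tensors on the proper device. Returns the tensor references after they have been placed on the proper device.

Supported use cases:
- single tensor
- dictionary of single tensors
- dictionary of iterables of tensors
- dictionary of dictionaries of tensors
- iterable of single tensors
- iterable of iterables of tensors
- iterable of dictionaries of tensors

Parameters

tensors
(Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]) –

The tensor or collection of tensors to put onto a device
device
(str) –

The string representing the device to put the tensors on, e.g. 'cpu', 'cuda', 'cuda:1'

Returns

Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]] –

The tensor or collection of tensors after being placed on the device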

Source code in llmcompressor/pytorch/utils/helpers.py

def tensors_to_device(
    tensors: Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]], device: str
) -> Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]:
    """
    Default function for putting a tensor or collection of tensors to the proper device.
    Returns the tensor references after being placed on the proper device.

    Supported use cases:
        - single tensor
        - Dictionary of single tensors
        - Dictionary of iterable of tensors
        - Dictionary of dictionary of tensors
        - Iterable of single tensors
        - Iterable of iterable of tensors
        - Iterable of dictionary of tensors

    :param tensors: the tensors or collection of tensors to put onto a device
    :param device: the string representing the device to put the tensors on,
        ex: 'cpu', 'cuda', 'cuda:1'
    :return: the tensors or collection of tensors after being placed on the device
    """
    if isinstance(tensors, Tensor):
        return tensors.to(device)

    if isinstance(tensors, OrderedDict):
        return OrderedDict(
            [(key, tensors_to_device(tens, device)) for key, tens in tensors.items()]
        )

    if isinstance(tensors, Mapping):
        return {key: tensors_to_device(tens, device) for key, tens in tensors.items()}

    if isinstance(tensors, tuple):
        return tuple(tensors_to_device(tens, device) for tens in tensors)

    if isinstance(tensors, Iterable):
        return [tensors_to_device(tens, device) for tens in tensors]

    raise ValueError(
        "unrecognized type for tensors given of {}".format(tensors.__class__.__name__)
    )
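
One detail worth noting in the recursion above is how container types are preserved: mappings come back as dicts, tuples stay tuples, and any other iterable becomes a list. The sketch below is a condensed restatement of that recursion (the function name to_device is hypothetical and the OrderedDict branch is omitted) applied to a nested batch on the CPU.

```python
from collections.abc import Iterable, Mapping

import torch
from torch import Tensor


def to_device(tensors, device: str):
    # Condensed restatement of the recursion above (OrderedDict branch omitted).
    if isinstance(tensors, Tensor):
        return tensors.to(device)
    if isinstance(tensors, Mapping):
        return {key: to_device(value, device) for key, value in tensors.items()}
    if isinstance(tensors, tuple):
        return tuple(to_device(value, device) for value in tensors)
    if isinstance(tensors, Iterable):
        return [to_device(value, device) for value in tensors]
    raise ValueError(f"unrecognized type: {type(tensors).__name__}")


batch = {
    "input_ids": torch.zeros(2, 3),
    "extras": (torch.ones(1), [torch.ones(2)]),
}
moved = to_device(batch, "cpu")

print(type(moved["extras"]), type(moved["extras"][1]))
print(moved["input_ids"].device)
```

The Tensor check comes first because a tensor is itself iterable; checking Iterable earlier would split it into rows.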

tensors_to_precision

tensors_to_precision(
    tensors: Union[
        Tensor, Iterable[Tensor], Dict[Any, Tensor]
    ],
    full_precision: bool,
) -> Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]

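Parameters

tensors
(Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]) –

The tensors to change the precision of
full_precision
(bool) –

True for full precision (float 32), False for half precision (float 16)

Returns

Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]] –

The tensors converted to the desired precision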

Source code in llmcompressor/pytorch/utils/helpers.py

def tensors_to_precision(
    tensors: Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]], full_precision: bool
) -> Union[Tensor, Iterable[Tensor], Dict[Any, Tensor]]:
    """
    :param tensors: the tensors to change the precision of
    :param full_precision: True for full precision (float 32) and
        False for half (float 16)
    :return: the tensors converted to the desired precision
    """
    if isinstance(tensors, Tensor):
        return tensors.float() if full_precision else tensors.half()

    if isinstance(tensors, Mapping):
        return {
            key: tensors_to_precision(tens, full_precision)
            for key, tens in tensors.items()
        }

    if isinstance(tensors, tuple):
        return tuple(tensors_to_precision(tens, full_precision) for tens in tensors)

    if isinstance(tensors, Iterable):
        return [tensors_to_precision(tens, full_precision) for tens in tensors]

    raise ValueError(
        "unrecognized type for tensors given of {}".format(tensors.__class__.__name__)
    )
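
For a single tensor the conversion reduces to .half() or .float(). The sketch below, written in plain PyTorch with illustrative values, shows the resulting dtypes and that a round trip through half precision is generally lossy.

```python
import torch

full = torch.tensor([0.1, 1.0, 1024.5], dtype=torch.float32)

half = full.half()       # what full_precision=False does to a single tensor
restored = half.float()  # what full_precision=True does to a single tensor

print(half.dtype, restored.dtype)

# float16 cannot represent 0.1 or 1024.5 exactly, so the round trip changes them.
print(torch.equal(full, restored))
```

Values that float16 represents exactly, such as 1.0, survive the round trip unchanged; the others are rounded when cast down and stay rounded when cast back up.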
