llmcompressor.pipelines.sequential.helpers
Classes
- Subgraph – Dataclass specifying an executable subgraph of a model graph
Functions
- dispatch_for_sequential – Dispatch a model for sequential calibration using the sequential pipeline
- get_sequential_targets – Infer sequential targets from the list of modifiers and dataset arguments
- trace_subgraphs – Trace a model to produce subgraphs, where each sequential target belongs to exactly one subgraph
SequentialTracer
Bases: HFTracer
A tracer specialized for the given model. The resulting tracer will not trace inside of sequential targets, nor inside any modules which are not call graph ancestors of sequential targets.
Tracing within sequential targets is unnecessary, and tracing within offloaded modules may result in meta tensors being added to the model graph.
Parameters
- ancestors (Set[Module]) – modules which are ancestors of sequential targets
- offloaded (Set[Module]) – modules which have offloaded parameters and should not be traced
Source code in llmcompressor/pipelines/sequential/helpers.py
Subgraph dataclass
Subgraph(
graph: Graph,
input_names: Set[str],
consumed_names: Set[str],
_code: Optional[PythonCode] = None,
)
Dataclass specifying an executable subgraph of a model graph
Parameters
- graph (Graph) – subgraph of the model graph
- input_names (Set[str]) – argument names of the compiled forward function
- consumed_names (Set[str]) – argument names which are not used by any subsequent subgraphs and can therefore be removed from the intermediates cache
Methods
- forward – Execute the operations within the subgraph
forward
Execute the operations within the subgraph
Parameters
- *args – positional inputs to the subgraph forward function
- **kwargs – keyword inputs to the subgraph forward function
Returns
- Dict[str, Any] – dictionary mapping output node names to their computed values
Source code in llmcompressor/pipelines/sequential/helpers.py
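The fields above are enough to sketch how a sequential pipeline can drive a list of Subgraphs with an intermediates cache. The helper below is illustrative only; in particular, the exact call convention of Subgraph.forward (assumed here to take the owning model plus keyword inputs named by input_names) is an assumption, not confirmed by this page.

```python
from typing import Any, Dict, List

from llmcompressor.pipelines.sequential.helpers import Subgraph


def run_subgraphs(model, subgraphs: List[Subgraph], model_inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Illustrative sketch: execute subgraphs in order with an intermediates cache."""
    cache: Dict[str, Any] = dict(model_inputs)

    for subgraph in subgraphs:
        # gather only the values this subgraph declares as inputs
        inputs = {name: cache[name] for name in subgraph.input_names}
        outputs = subgraph.forward(model, **inputs)  # assumed call convention
        cache.update(outputs)

        # values no later subgraph reads can be evicted to save memory
        for name in subgraph.consumed_names:
            cache.pop(name, None)

    return cache
```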
dispatch_for_sequential
Dispatch a model for sequential calibration using the sequential pipeline. The model will be offloaded to the CPU and dispatched to CUDA/XPU devices, if available. Removes any existing hooks.
Parameters
- model (PreTrainedModel) – model to dispatch
Returns
- PreTrainedModel – dispatched model
Source code in llmcompressor/pipelines/sequential/helpers.py
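A short usage sketch, assuming only the import path shown on this page and a standard transformers checkpoint:

```python
from transformers import AutoModelForCausalLM

from llmcompressor.pipelines.sequential.helpers import dispatch_for_sequential

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Parameters are offloaded to CPU and dispatched onto CUDA/XPU devices when
# available; any pre-existing hooks on the model are removed first.
model = dispatch_for_sequential(model)
```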
find_target_nodes
Find all nodes whose execution is equivalent to executing the target modules. Note that these nodes are guaranteed to be treated as leaf nodes by SequentialTracer
Parameters
- graph (GraphModule) – graph containing target nodes
- targets (Set[Module]) – modules whose nodes are being searched for
Returns
- Set[Node] – set of all nodes which call the target modules
Source code in llmcompressor/pipelines/sequential/helpers.py
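Illustratively, "nodes which call the target modules" are the call_module nodes of the traced graph whose targets resolve to one of the given modules. The snippet below shows the idea using standard torch.fx APIs; it is not necessarily the library's exact implementation.

```python
from typing import Set

from torch.fx import GraphModule, Node
from torch.nn import Module


def call_module_nodes(graph: GraphModule, targets: Set[Module]) -> Set[Node]:
    # a call_module node stores the qualified name of the submodule it invokes
    return {
        node
        for node in graph.graph.nodes
        if node.op == "call_module" and graph.get_submodule(node.target) in targets
    }
```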
get_sequential_ancestors
Find modules which are call graph ancestors of the given sequential targets
Parameters
- model (Module) – model containing sequential targets
- targets (Set[Module]) – sequential targets to find ancestors of
Returns
- Set[Module] – call graph ancestors of sequential targets
Source code in llmcompressor/pipelines/sequential/helpers.py
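For typical transformer models the call graph follows the module tree, so "call graph ancestors" can be pictured as the modules whose subtree contains a sequential target. The containment-based sketch below illustrates that idea; it is a simplification, not the library's implementation.

```python
from typing import Set

from torch.nn import Module


def ancestors_of(model: Module, targets: Set[Module]) -> Set[Module]:
    def contains_target(module: Module) -> bool:
        return any(
            child in targets or contains_target(child) for child in module.children()
        )

    # a module is an ancestor if some strict descendant is a sequential target
    return {module for module in model.modules() if contains_target(module)}
```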
get_sequential_targets
get_sequential_targets(
modifiers: List[Modifier],
model: PreTrainedModel,
args: DatasetArguments,
) -> List[str]
Infer sequential targets from the list of modifiers and dataset arguments
Parameters
- model (PreTrainedModel) – model being calibrated
- modifiers (List[Modifier]) – list of modifiers applied during calibration
- dataset_args – dataset arguments passed in by the user
Returns
- List[str] – list of sequential targets
Source code in llmcompressor/pipelines/sequential/helpers.py
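A hedged usage sketch follows. The import paths for GPTQModifier and DatasetArguments are assumptions based on the wider llmcompressor package layout and are not confirmed by this page.

```python
from transformers import AutoModelForCausalLM

from llmcompressor.args import DatasetArguments  # assumed import path
from llmcompressor.modifiers.quantization import GPTQModifier  # assumed import path
from llmcompressor.pipelines.sequential.helpers import get_sequential_targets

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
modifiers = [GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])]

# With no explicit sequential_targets on the modifiers or in the dataset
# arguments, the inferred targets typically fall back to the model's
# decoder-layer class name, e.g. ["LlamaDecoderLayer"].
sequential_targets = get_sequential_targets(modifiers, model, DatasetArguments())
```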
graph_is_well_formed
A graph is well formed if and only if nodeA in nodeB.users <=> nodeB in nodeA.all_input_nodes
Parameters
- graph (Graph) – graph being checked
Returns
- bool – True if the graph is well formed, False otherwise
Source code in llmcompressor/pipelines/sequential/helpers.py
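The invariant can be checked directly against torch.fx node metadata. The sketch below expresses the same condition; the library's actual check may differ in details (for example, comparing occurrence counts).

```python
from torch.fx import Graph


def is_well_formed(graph: Graph) -> bool:
    for node in graph.nodes:
        # every user of this node must list it among its inputs
        if any(node not in user.all_input_nodes for user in node.users):
            return False
        # every input of this node must list it among its users
        if any(node not in input_node.users for input_node in node.all_input_nodes):
            return False
    return True
```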
match_modules
Find modules whose names match the patterns given by target_names
Parameters
- model (Module) – model containing submodules to find
- target_names (List[str]) – target patterns to find
Returns
- Set[Module] – all submodules matching target_names
Source code in llmcompressor/pipelines/sequential/helpers.py
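Sequential target patterns are typically module class names such as "LlamaDecoderLayer". A minimal sketch of class-name matching is shown below; the library's full pattern semantics are not spelled out on this page.

```python
from typing import List, Set

from torch.nn import Module


def modules_matching(model: Module, target_names: List[str]) -> Set[Module]:
    # match submodules by class name, e.g. "LlamaDecoderLayer"
    return {
        module
        for module in model.modules()
        if module.__class__.__name__ in target_names
    }
```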
partition_graph
Convert each partition into a Subgraph. Each Subgraph returns a dictionary mapping output node names to their computed values. Note that the consumed_names attribute of each Subgraph remains empty, to be populated later by trace_consumed_names.
Parameters
- model (Module) – model which owns the produced Subgraphs
- partitions (List[List[Node]]) – list of partitions, where each partition is a list of nodes belonging to that partition
Returns
- List[Subgraph] – list of subgraphs in order of execution
Source code in llmcompressor/pipelines/sequential/helpers.py
populate_concrete_args
Creates concrete args which, unlike the equivalent function provided by transformers.utils.fx, creates default values for variadic arguments, which are needed by some models.
Parameters
- model (Module) – model being traced
- sample_input (Dict) – values used to symbolically trace the model. All arguments of the model.forward function which are not in sample_input are considered concrete args
Returns
- Dict – dictionary mapping concrete argument names to their default values
Source code in llmcompressor/pipelines/sequential/helpers.py
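A sketch of the idea, assuming nothing beyond the description above: any forward parameter absent from the sample input becomes a concrete argument, and variadic parameters receive empty defaults so models that declare *args/**kwargs can still be traced. This is illustrative, not the library's exact implementation.

```python
import inspect
from typing import Any, Dict

from torch.nn import Module


def build_concrete_args(model: Module, sample_input: Dict[str, Any]) -> Dict[str, Any]:
    concrete_args: Dict[str, Any] = {}
    for name, param in inspect.signature(model.forward).parameters.items():
        if name in sample_input:
            continue  # provided symbolically, not concrete
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            concrete_args[name] = tuple()  # default for *args
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            concrete_args[name] = dict()  # default for **kwargs
        elif param.default is not inspect.Parameter.empty:
            concrete_args[name] = param.default
        else:
            concrete_args[name] = None  # required arg with no default or sample value
    return concrete_args
```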
topological_partition
Partition the graph into partitions such that each target belongs to exactly one partition and executing each partition depends only on intermediate values produced by executing the partitions before it.
Parameters
- graph (GraphModule) – graph being partitioned
- targets (Set[Module]) – target modules which will be assigned to disjoint partitions
Returns
- List[List[Node]] – list of partitions, where each partition is a list of nodes belonging to that partition
Source code in llmcompressor/pipelines/sequential/helpers.py
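Because a torch.fx Graph stores its nodes in a valid topological (execution) order, a simplified version of this partitioning can close a partition immediately after it absorbs one target node: every target then lands in exactly one partition, and each partition only depends on values produced earlier. The real helper is more careful about where non-target nodes are placed; the sketch below only illustrates the invariant.

```python
from typing import List, Set

from torch.fx import GraphModule, Node
from torch.nn import Module


def simple_partition(graph: GraphModule, targets: Set[Module]) -> List[List[Node]]:
    partitions: List[List[Node]] = [[]]
    for node in graph.graph.nodes:  # nodes are already in topological order
        partitions[-1].append(node)
        is_target = (
            node.op == "call_module" and graph.get_submodule(node.target) in targets
        )
        if is_target:
            partitions.append([])  # start a new partition after each target
    return [partition for partition in partitions if partition]
```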
trace_consumed_names
Populate the consumed_names attribute of each Subgraph according to when inputs are last used in order to vacate the intermediates cache and save memory
Parameters
- subgraphs (List[Subgraph]) – list of subgraphs with empty consumed_names attributes
Source code in llmcompressor/pipelines/sequential/helpers.py
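The last consumer of each input name can be found with a single reverse pass over the subgraphs, using only the fields documented on this page; a sketch:

```python
from typing import List, Set

from llmcompressor.pipelines.sequential.helpers import Subgraph


def fill_consumed_names(subgraphs: List[Subgraph]) -> None:
    seen: Set[str] = set()
    for subgraph in reversed(subgraphs):
        for name in subgraph.input_names:
            if name not in seen:
                # this subgraph is the last one to read `name`, so the value can
                # be evicted from the intermediates cache after it runs
                subgraph.consumed_names.add(name)
                seen.add(name)
```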
trace_subgraphs
trace_subgraphs(
model: PreTrainedModel,
sample_input: Dict[str, Any],
sequential_targets: List[str],
ignore: List[str],
) -> List[Subgraph]
Trace a model to produce subgraphs, where each sequential target belongs to exactly one subgraph and where executing each subgraph in order is equivalent to executing the original model
Parameters
- model (PreTrainedModel) – model being traced
- sample_input (Dict[str, Any]) – inputs whose values will change during execution, but whose len, bool, and contains values are assumed constant across batches
- sequential_targets (List[str]) – list of patterns matching sequential targets
- ignore (List[str]) – function and method names to skip during tracing
Returns
- List[Subgraph] – list of subgraphs in order of execution
Source code in llmcompressor/pipelines/sequential/helpers.py
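An end-to-end sketch tying the helpers on this page together. The model checkpoint, the choice of sample input, and the "LlamaDecoderLayer" target are example assumptions rather than requirements.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor.pipelines.sequential.helpers import (
    dispatch_for_sequential,
    trace_subgraphs,
)

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
model = dispatch_for_sequential(AutoModelForCausalLM.from_pretrained(model_id))
tokenizer = AutoTokenizer.from_pretrained(model_id)

sample_input = dict(tokenizer("Hello, world", return_tensors="pt"))
subgraphs = trace_subgraphs(
    model,
    sample_input,
    sequential_targets=["LlamaDecoderLayer"],
    ignore=[],
)

# roughly one subgraph per decoder layer, plus the pre- and post-processing graphs
print(f"traced {len(subgraphs)} subgraphs")
```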