# Quantizing Multimodal Audio Models
https://github.com/user-attachments/assets/6732c60b-1ebe-4bed-b409-c16c4415dff5
Audio provided by Daniel Galvez et al. under a Creative Commons license
```
<|startoftranscript|> <|en|>
...
<|transcribe|> <|notimestamps|>
that's where you have a lot of windows in the south no actually that's passive solar
and passive solar is something that was developed and designed in the 1960s and 70s
and it was a great thing for what it was at the time but it's not a passive house
```
This directory contains example scripts for quantizing a variety of audio language models using GPTQ.
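The snippet below is a condensed sketch of the flow those scripts follow: load a Whisper checkpoint, build a small calibration set, apply GPTQ through `oneshot`, and save the compressed weights. The model ID, dataset and config names, sample counts, and the `oneshot` import path (which has moved between llm-compressor releases) are illustrative assumptions; consult the scripts themselves for exact, tested values.

```python
import torch
from datasets import load_dataset
from transformers import WhisperForConditionalGeneration, WhisperProcessor

from llmcompressor import oneshot  # assumption: older releases import from llmcompressor.transformers
from llmcompressor.modifiers.quantization import GPTQModifier

MODEL_ID = "openai/whisper-large-v2"  # illustrative checkpoint

model = WhisperForConditionalGeneration.from_pretrained(MODEL_ID, torch_dtype="auto")
processor = WhisperProcessor.from_pretrained(MODEL_ID)

# Small calibration set of (audio, transcript) pairs; the dataset and config
# names are assumptions -- any speech dataset with audio arrays and transcripts works.
ds = load_dataset("MLCommons/peoples_speech", "test", split="test[:512]")


def process(example):
    inputs = processor(
        audio=example["audio"]["array"],
        sampling_rate=example["audio"]["sampling_rate"],
        text=example["text"],
        return_tensors="pt",
    )
    # Match the model dtype and feed the transcript as decoder input ids.
    inputs["input_features"] = inputs["input_features"].to(model.dtype)
    inputs["decoder_input_ids"] = inputs.pop("labels")
    return inputs


ds = ds.map(process, remove_columns=ds.column_names)

# 4-bit weight-only GPTQ recipe (see the sections below for customization).
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
    data_collator=lambda batch: {k: torch.tensor(v) for k, v in batch[0].items()},
)

# Save the compressed checkpoint alongside its processor.
SAVE_DIR = "whisper-large-v2-W4A16"
model.save_pretrained(SAVE_DIR, save_compressed=True)
processor.save_pretrained(SAVE_DIR)
```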
## Compressing Your Own Model
To use your own multimodal model, start from an existing example and change the `model_id` to match your own model stub:
```python
from transformers import AutoModelForCausalLM

# Load the model in its original precision; quantization is applied later.
model_id = "path/to/your/model"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
```
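Calibration also needs the model's processor (feature extractor plus tokenizer). A minimal sketch, assuming your checkpoint ships a processor config that `AutoProcessor` can resolve:

```python
from transformers import AutoProcessor

# Loads the matching feature extractor and tokenizer for the checkpoint above.
processor = AutoProcessor.from_pretrained(model_id)
```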
## Customizing GPTQModifier Parameters
The `GPTQModifier` is the modifier responsible for performing quantization of the model weights. For more information on quantizing with different weight schemes, see the `quantization_` examples in the examples folder.
```python
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = [
    GPTQModifier(
        targets="Linear",
        scheme="W4A16",
        sequential_targets=["WhisperEncoderLayer", "WhisperDecoderLayer"],
        ignore=["lm_head"],
    )
]
```
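To quantize with a different weight scheme, only the `scheme` argument needs to change. For example, a sketch of the same recipe using the `W8A8` preset (8-bit weights and 8-bit activations) in place of `W4A16`; the surrounding arguments are unchanged:

```python
recipe = [
    GPTQModifier(
        targets="Linear",
        scheme="W8A8",  # 8-bit weights and activations instead of 4-bit weight-only
        sequential_targets=["WhisperEncoderLayer", "WhisperDecoderLayer"],
        ignore=["lm_head"],
    )
]
```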
### Sequential Targets
Sequential targets are the modules which determine the granularity of error propagation and activation offloading when performing forward passes of the model. These are typically the "transformer blocks" of the model, also referred to as "layers" within llm-compressor.
Choosing sequential targets with higher granularity (for example, `"Linear"` instead of `"LlamaDecoderLayer"`) results in fewer Hessians being allocated at the same time, decreasing the memory requirements of compression. This may also improve the recovered accuracy of the model, as compression error is propagated at a finer granularity. However, higher-granularity sequential targets may also increase compression time, as more time is spent offloading and onloading activations.
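As a sketch of the trade-off above, the same recipe can list `"Linear"` as the sequential target so that error is propagated (and activations are offloaded) per linear layer rather than per transformer block:

```python
recipe = [
    GPTQModifier(
        targets="Linear",
        scheme="W4A16",
        # Finer granularity: fewer Hessians held in memory at once and error
        # propagated per Linear, at the cost of more activation on/offloading.
        sequential_targets=["Linear"],
        ignore=["lm_head"],
    )
]
```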
## Adding Your Own SmoothQuant Mappings
For a guide on adding SmoothQuant mappings for your dataset, see the SmoothQuant Guide.
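As a rough sketch of what a custom mapping can look like, the example below pairs projection layers with the norm that feeds them. The regexes are assumptions based on Hugging Face's Whisper module names (`q_proj`, `self_attn_layer_norm`, `fc1`, `final_layer_norm`), and `smoothing_strength` is an illustrative value; follow the guide for the exact format your model needs.

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

# Each mapping lists the layers to balance and the activation source to smooth.
# Module regexes below are assumptions based on Hugging Face Whisper naming.
whisper_mappings = [
    [["re:.*q_proj", "re:.*k_proj", "re:.*v_proj"], "re:.*self_attn_layer_norm"],
    [["re:.*fc1"], "re:.*final_layer_norm"],
]

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8, mappings=whisper_mappings),
    GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]
```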
## Adding Your Own Data Collator
Most examples use a generic `data_collator` which correctly collates data for most multimodal datasets. If you find that your model needs custom data collation (as is the case with pixtral), you can modify this function to reflect those model-specific requirements.
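For reference, the generic collator amounts to a sketch like the following (assuming a calibration batch size of one); model-specific handling, such as squeezing extra batch dimensions or casting dtypes, belongs inside this function:

```python
import torch

def data_collator(batch):
    # The calibration dataloader passes one processed sample at a time.
    assert len(batch) == 1
    # Convert each processed field (input_features, decoder_input_ids, ...)
    # back into a tensor; add model-specific adjustments here as needed.
    return {key: torch.tensor(value) for key, value in batch[0].items()}
```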
The audio is provided under the Creative Commons Attribution license:
https://creativecommons.org/licenses/by/4.0/legalcode
```bibtex
@article{DBLP:journals/corr/abs-2111-09344,
author = {Daniel Galvez and
Greg Diamos and
Juan Ciro and
Juan Felipe Cer{\'{o}}n and
Keith Achorn and
Anjali Gopi and
David Kanter and
Maximilian Lam and
Mark Mazumder and
Vijay Janapa Reddi},
title = {The People's Speech: {A} Large-Scale Diverse English Speech Recognition
Dataset for Commercial Usage},
journal = {CoRR},
volume = {abs/2111.09344},
year = {2021},
url = {https://arxiv.org/abs/2111.09344},
eprinttype = {arXiv},
eprint = {2111.09344},
timestamp = {Mon, 22 Nov 2021 16:44:07 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-2111-09344.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```