vllm_gaudi.extension.utils ¶
FP8Matmul ¶
Bases: Module
Source code in vllm_gaudi/extension/utils.py
ModuleFusedSDPA ¶
Bases: Module
Source code in vllm_gaudi/extension/utils.py
__init__ ¶
forward ¶
forward(
    query,
    key,
    value,
    attn_mask,
    dropout_p,
    is_causal,
    scale,
    softmax_mode,
    recompute_mode,
    valid_sequence_lengths,
    padding_side="left",
    window_size=None,
)
Source code in vllm_gaudi/extension/utils.py
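The first several parameters of `forward` (`query`, `key`, `value`, `attn_mask`, `is_causal`, `scale`) share their names with PyTorch's `torch.nn.functional.scaled_dot_product_attention`. Assuming they carry the same meaning here, the pure-Python sketch below shows the computation those arguments describe for a single attention head. It is not the Gaudi implementation: `reference_sdpa` is a hypothetical helper written for illustration, it assumes the PyTorch conventions (a boolean mask where `True` means "attend", and a default scale of `1/sqrt(head_dim)`), and it does not model `dropout_p`, `softmax_mode`, `recompute_mode`, `valid_sequence_lengths`, `padding_side`, or `window_size`.

```python
import math


def reference_sdpa(query, key, value, attn_mask=None, is_causal=False, scale=None):
    """Hypothetical single-head reference for scaled dot-product attention.

    query: list of L rows, each of length D
    key:   list of S rows, each of length D
    value: list of S rows, each of length Dv
    attn_mask: optional L x S list of bools, True = position may be attended
               (assumed convention, borrowed from PyTorch).
    Assumes every query row has at least one allowed key position.
    """
    head_dim = len(query[0])
    if scale is None:
        # Assumed default, matching PyTorch: 1 / sqrt(head_dim).
        scale = 1.0 / math.sqrt(head_dim)

    output = []
    for i, q_row in enumerate(query):
        scores = []
        for j, k_row in enumerate(key):
            allowed = True
            if is_causal and j > i:
                allowed = False  # a query may not look at later keys
            if attn_mask is not None and not attn_mask[i][j]:
                allowed = False
            if allowed:
                scores.append(scale * sum(a * b for a, b in zip(q_row, k_row)))
            else:
                scores.append(-math.inf)

        # Numerically stable softmax over the key axis.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]

        # Weighted sum of the value rows.
        value_dim = len(value[0])
        output.append(
            [sum(w * v_row[d] for w, v_row in zip(weights, value)) for d in range(value_dim)]
        )
    return output


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
causal_out = reference_sdpa(q, k, v, is_causal=True)
```

With `is_causal=True`, the first query can only see the first key, so `causal_out[0]` equals the first value row; the second query blends both value rows.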
VLLMFP8KVCache ¶
Bases: VLLMKVCache
Source code in vllm_gaudi/extension/utils.py
__init__ ¶
dequant_output ¶
fetch_from_cache ¶
Source code in vllm_gaudi/extension/utils.py
forward ¶
VLLMKVCache ¶
Bases: Module