llmcompressor.modeling.qwen3_next_moe

CalibrationQwen3NextSparseMoeBlock

CalibrationQwen3NextSparseMoeBlock(
    original: Qwen3NextSparseMoeBlock,
    config: Qwen3NextConfig,
    calibrate_all_experts: bool = True,
)

Bases: MoECalibrationModule

Calibration version of Qwen3NextSparseMoeBlock that sends all tokens to all experts.
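A minimal sketch of what "sends all tokens to all experts" could look like, assuming that every expert is run on every token (so each expert sees calibration data) while only the router's top-k experts contribute to the output. That second part is an assumption; this page does not state it. `moe_forward`, the scalar tokens, and the toy experts are hypothetical, written in pure Python, and are not the library's implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(tokens, router_logits, experts, top_k,
                norm_topk_prob=True, calibrate_all_experts=True):
    """Toy MoE over scalar tokens (hypothetical helper, not llmcompressor code).

    Returns (outputs, seen), where seen[e] lists the token indices
    that expert e was actually run on.
    """
    seen = [[] for _ in experts]
    outputs = []
    for t, (x, logits) in enumerate(zip(tokens, router_logits)):
        probs = softmax(logits)
        # Pick the top_k experts for this token.
        top = sorted(range(len(experts)), key=lambda e: probs[e], reverse=True)[:top_k]
        weights = {e: probs[e] for e in top}
        if norm_topk_prob:
            norm = sum(weights.values())
            weights = {e: w / norm for e, w in weights.items()}

        y = 0.0
        for e, expert in enumerate(experts):
            # Assumption: in calibration mode every expert runs on every token,
            # but only routed experts are mixed into the output.
            if calibrate_all_experts or e in weights:
                out = expert(x)
                seen[e].append(t)
                if e in weights:
                    y += weights[e] * out
        outputs.append(y)
    return outputs, seen


# Four toy experts: expert k multiplies its input by (k + 1).
experts = [lambda x, k=k: (k + 1) * x for k in range(4)]
tokens = [1.0, 2.0, 3.0]
router_logits = [
    [2.0, 1.0, 0.0, -1.0],
    [0.0, 0.0, 3.0, 1.0],
    [-1.0, 2.0, 0.5, 0.0],
]

routed_out, routed_seen = moe_forward(
    tokens, router_logits, experts, top_k=2, calibrate_all_experts=False
)
calib_out, calib_seen = moe_forward(
    tokens, router_logits, experts, top_k=2, calibrate_all_experts=True
)
```

In this sketch the two modes produce the same outputs; the difference is only in which experts observe which tokens. With plain routing some experts see a single token, whereas in calibration mode every expert sees all three.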

Source code in llmcompressor/modeling/qwen3_next_moe.py
def __init__(
    self,
    original: Qwen3NextSparseMoeBlock,
    config: Qwen3NextConfig,
    calibrate_all_experts: bool = True,
):
    super().__init__()
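    # Routing hyperparameters copied from the model config
    self.num_experts = config.num_experts
    self.top_k = config.num_experts_per_tok
    self.norm_topk_prob = config.norm_topk_prob

    # When True, every expert is run on every token during calibration
    self.calibrate_all_experts = calibrate_all_experts

    # Router and routed experts, reused from the original block (not copied)
    self.gate = original.gate
    self.experts = original.experts

    # Shared expert and its gate, also reused from the original block
    self.shared_expert = original.shared_expert
    self.shared_expert_gate = original.shared_expert_gate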
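
The constructor above assigns the original block's submodules directly, so the calibration block holds references to the same objects rather than copies. The sketch below illustrates that wiring with stand-in objects; `ToyCalibrationBlock` and the `SimpleNamespace` placeholders are hypothetical and only mirror the attribute assignments, without depending on PyTorch or Transformers.

```python
from types import SimpleNamespace


class ToyCalibrationBlock:
    """Hypothetical stand-in that mirrors the attribute wiring of __init__."""

    def __init__(self, original, config, calibrate_all_experts=True):
        self.num_experts = config.num_experts
        self.top_k = config.num_experts_per_tok
        self.norm_topk_prob = config.norm_topk_prob
        self.calibrate_all_experts = calibrate_all_experts
        # Plain assignment: these are references to the original's objects.
        self.gate = original.gate
        self.experts = original.experts
        self.shared_expert = original.shared_expert
        self.shared_expert_gate = original.shared_expert_gate


# Placeholder objects standing in for the real submodules and config.
original = SimpleNamespace(
    gate=object(),
    experts=[object() for _ in range(4)],
    shared_expert=object(),
    shared_expert_gate=object(),
)
config = SimpleNamespace(num_experts=4, num_experts_per_tok=2, norm_topk_prob=True)

block = ToyCalibrationBlock(original, config)
shares_experts = block.experts is original.experts
shares_gate = block.gate is original.gate
```

Because the objects are shared, anything that modifies the experts through the calibration block (for example, quantizing their weights) would also be visible through the original block in this sketch.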