speculators.models

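Modules

eagle –

Speculators implementation providing a unified implementation
eagle3 –
mlp –

Classes

Eagle3SpeculatorConfig –

Configuration for the EAGLE-3 speculator with vocabulary mapping.
EagleSpeculator –

A SpeculatorModel implementation for the EAGLE and HASS variants of speculative decoding
EagleSpeculatorConfig –

A SpeculatorModelConfig implementation to be used with the EagleSpeculator
MLPSpeculatorConfig –

Todo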

Eagle3SpeculatorConfig

Eagle3SpeculatorConfig(**kwargs)

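Bases: SpeculatorModelConfig

Configuration for the EAGLE-3 speculator with vocabulary mapping.

EAGLE-3 features vocabulary mapping between the draft (32K) and target (128K) vocabularies, enabling cross-tokenizer speculation.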
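As a rough illustration of what mapping between a smaller draft vocabulary and a larger target vocabulary can look like, the sketch below uses a plain lookup table. The table contents, the tiny sizes, and the names `DRAFT_TO_TARGET`, `draft_to_target_ids`, and `target_mask` are hypothetical and are not taken from the speculators source; the library may store the mapping differently.

```python
# Hypothetical sketch: the draft model predicts IDs in a small vocabulary,
# and a lookup table translates them into the verifier's (target) vocabulary.
DRAFT_VOCAB_SIZE = 4    # stands in for the 32K draft vocabulary
TARGET_VOCAB_SIZE = 16  # stands in for the 128K target vocabulary

# DRAFT_TO_TARGET[i] is the target-vocabulary ID of draft token i (made-up values).
DRAFT_TO_TARGET = [0, 3, 7, 12]


def draft_to_target_ids(draft_ids: list[int]) -> list[int]:
    """Translate draft-vocabulary token IDs into target-vocabulary IDs."""
    return [DRAFT_TO_TARGET[i] for i in draft_ids]


def target_mask() -> list[bool]:
    """Mark which target tokens the draft model is able to propose."""
    reachable = set(DRAFT_TO_TARGET)
    return [t in reachable for t in range(TARGET_VOCAB_SIZE)]


print(draft_to_target_ids([2, 0, 3]))  # [7, 0, 12]
```

Target tokens outside the mask can never be proposed by the draft model, so the verifier has to produce them itself.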

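Parameters

transformer_layer_config
–

Configuration for the transformer decoder layer
draft_vocab_size
–

Size of the draft model vocabulary for speculation
norm_before_residual
–

Apply hidden_norm before storing the residual

Methods

serialize_transformer_config –

Serialize transformer config to dict.
validate_transformer_config –

Validate and convert transformer config.

Attributes

target_vocab_size (int) –

Get target vocabulary size from transformer config.

Source code in speculators/config.py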

def __init__(self, **kwargs):
    # initialize the Pydantic arguments first to set all valid fields
    PydanticClassRegistryMixin.__init__(self, **kwargs)

    # reset kwargs handled by Pydantic so PretrainedConfig doesn't override
    for field in self.__class__.model_fields:
        kwargs[field] = getattr(self, field)

    # initialize the Hugging Face PretrainedConfig arguments for the model
    PretrainedConfig.__init__(self, **kwargs)

    # ensure we always update the transformers version
    self.transformers_version = version("transformers")

target_vocab_size `property`

target_vocab_size: int

Get target vocabulary size from transformer config.

serialize_transformer_config

serialize_transformer_config(
    value: PretrainedConfig,
) -> dict

Serialize transformer config to dict.

Source code in speculators/models/eagle3/config.py

@field_serializer("transformer_layer_config")
def serialize_transformer_config(self, value: PretrainedConfig) -> dict:
    """Serialize transformer config to dict."""
    return value.to_diff_dict()

validate_transformer_config `classmethod`

validate_transformer_config(value: Any) -> PretrainedConfig

Validate and convert transformer config.

Source code in speculators/models/eagle3/config.py

@field_validator("transformer_layer_config", mode="before")
@classmethod
def validate_transformer_config(cls, value: Any) -> PretrainedConfig:
    """Validate and convert transformer config."""
    if isinstance(value, dict):
        config_class: type[PretrainedConfig] = LlamaConfig
        if "model_type" in value:
            config_class = AutoConfig.for_model(
                model_type=value["model_type"]
            ).__class__
        return config_class(**value)
    return value
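
The validator above picks a config class from the `model_type` key and falls back to a Llama config when the key is absent. The same dispatch pattern can be sketched without transformers; `CONFIG_REGISTRY`, `ToyLlamaConfig`, `ToyMistralConfig`, and `validate_config` below are hypothetical stand-ins for `AutoConfig` and the real config classes.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyLlamaConfig:
    hidden_size: int = 4096
    model_type: str = "llama"


@dataclass
class ToyMistralConfig:
    hidden_size: int = 4096
    model_type: str = "mistral"


# Hypothetical registry playing the role of AutoConfig's model_type lookup.
CONFIG_REGISTRY: dict[str, type] = {
    "llama": ToyLlamaConfig,
    "mistral": ToyMistralConfig,
}


def validate_config(value: Any) -> Any:
    """Convert a dict into a config object; pass existing objects through."""
    if isinstance(value, dict):
        # Default to the Llama-style class when no model_type is given.
        config_class = CONFIG_REGISTRY[value.get("model_type", "llama")]
        return config_class(**value)
    return value


print(validate_config({"hidden_size": 8}))
```

In this sketch an unknown `model_type` raises a `KeyError`; how the real `AutoConfig.for_model` reports that case is not covered here.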

EagleSpeculator

EagleSpeculator(
    config: EagleSpeculatorConfig,
    verifier: str
    | PathLike
    | PreTrainedModel
    | None = None,
    verifier_attachment_mode: Literal[
        "detached", "full", "train_only"
    ]
    | None = None,
)

Bases: SpeculatorModel

A SpeculatorModel implementation for the EAGLE and HASS variants of speculative decoding:
- Eagle / Eagle v1: https://arxiv.org/abs/2401.15077
- Eagle v2: https://arxiv.org/abs/2406.16858
- HASS: https://arxiv.org/abs/2408.15766

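Architecture Overview: The EAGLE speculator consists of:
1. Input embedding layer (shared with the verifier)
2. Optional embedding layer normalization
3. Fusion layer: concatenates the input embeddings and the verifier hidden states, then projects them to a latent space of hidden_size
4. A single transformer decoder layer for candidate token generation
5. Optional pre-LM-head layer normalization
6. Language model head (shared with the verifier)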
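
The fusion layer is the part that distinguishes this architecture, so a small numeric sketch may help. It uses plain Python lists instead of tensors, and the weights are made-up values chosen so the result is easy to check by hand; in the library the projection is an `nn.Linear(2 * hidden_size, hidden_size)`, as the constructor source on this page shows. The `fuse` function is a hypothetical name.

```python
from typing import Optional


def fuse(
    embedding: list[float],
    hidden: list[float],
    weight: list[list[float]],
    bias: Optional[list[float]] = None,
) -> list[float]:
    """Concatenate two hidden_size vectors and project back to hidden_size."""
    combined = embedding + hidden  # list concatenation: length 2 * hidden_size
    projected = [sum(w * x for w, x in zip(row, combined)) for row in weight]
    if bias is not None:
        projected = [p + b for p, b in zip(projected, bias)]
    return projected


# hidden_size = 2, so the weight matrix has shape (2, 4).
embedding = [1.0, 2.0]
hidden = [3.0, 4.0]
weight = [
    [1.0, 0.0, 1.0, 0.0],  # output 0 = embedding[0] + hidden[0]
    [0.0, 1.0, 0.0, 1.0],  # output 1 = embedding[1] + hidden[1]
]
print(fuse(embedding, hidden, weight))  # [4.0, 6.0]
```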

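Speculative Decoding Process:
1. The verifier model processes the input and generates hidden states
2. The EAGLE speculator uses these hidden states and the input embeddings to predict the next tokens
3. Multiple candidate tokens are generated in parallel using token proposal methods
4. The verifier validates the candidates against a probability threshold and accepts or rejects them
5. The process iterates for multi-token speculation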
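
The draft-then-verify loop described here can be sketched with toy stand-ins for both models. This is a simplified greedy variant: a candidate is accepted only when it equals the verifier's own next token, rather than by a probability threshold, and the draft proposes its tokens one after another rather than in parallel. `toy_verifier_next`, `toy_draft_next`, and `speculative_step` are hypothetical names, not part of the speculators API.

```python
def toy_verifier_next(tokens: list[int]) -> int:
    """Stand-in for the verifier: next token is (last token + 1) mod 10."""
    return (tokens[-1] + 1) % 10


def toy_draft_next(tokens: list[int]) -> int:
    """Stand-in for the draft model: agrees with the verifier except after a 4."""
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10


def speculative_step(tokens: list[int], num_candidates: int = 3) -> list[int]:
    """Propose candidates with the draft model, then keep the verified prefix."""
    # Draft phase: propose num_candidates tokens autoregressively.
    draft = list(tokens)
    for _ in range(num_candidates):
        draft.append(toy_draft_next(draft))
    candidates = draft[len(tokens):]

    # Verify phase: accept candidates while they match the verifier.
    accepted = list(tokens)
    for candidate in candidates:
        expected = toy_verifier_next(accepted)
        if candidate != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(candidate)
    return accepted


print(speculative_step([1]))  # [1, 2, 3, 4]
print(speculative_step([3]))  # [3, 4, 5]
```

In the second call the draft diverges after token 4, so only one candidate is accepted and the verifier's own token replaces the first rejected one. A real verifier would score all candidates in a single forward pass rather than token by token.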

Example

from speculators import SpeculatorsConfig, VerifierConfig
from speculators.models import EagleSpeculator, EagleSpeculatorConfig
from speculators.proposals import GreedyTokenProposalConfig
from transformers import AutoConfig

config = EagleSpeculatorConfig(
    transformer_layer_config=AutoConfig.from_pretrained("meta-llama/Llama-3.1-8B-Instruct"),
    speculators_config=SpeculatorsConfig(
        algorithm="eagle",
        proposal_methods=[
            GreedyTokenProposalConfig(),
        ],
        default_proposal_method="greedy",
        verifier=VerifierConfig(
            name_or_path="meta-llama/Llama-3.1-8B-Instruct",
            architectures=["LlamaForCausalLM"],
        ),
    ),
)
# The verifier may be a Hugging Face model identifier, a local path,
# or a PreTrainedModel instance.
verifier = "meta-llama/Llama-3.1-8B-Instruct"
speculator = EagleSpeculator(
    config, verifier=verifier, verifier_attachment_mode="full"
)

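Initializes an EAGLE speculator architecture with configurable components based on the provided configuration. The model starts with the verifier-dependent layers (embed_tokens, rotary_emb, lm_head) set to None until a verifier is attached.

Parameters

config
(EagleSpeculatorConfig) –

Configuration object specifying model architecture, layer settings, and speculative decoding parameters. Must be an instance of EagleSpeculatorConfig containing the transformer layer configuration and EAGLE-specific settings.
verifier
(str | PathLike | PreTrainedModel | None, default: None ) –

Optional verifier model to attach for speculative decoding. Can be a path to a model directory, a Hugging Face model identifier, or a PreTrainedModel instance. If None, a verifier must be attached later via attach_verifier() before using the model.
verifier_attachment_mode
(Literal['detached', 'full', 'train_only'] | None, default: None ) –

Mode for verifier attachment. "detached" prevents attachment even if a verifier is provided. "full" enables complete integration for both training and generation. "train_only" attaches only the components needed for training, optimizing memory usage.

Methods

attach_verifier –

Attach a verifier model to the EagleSpeculator for speculative decoding.
detach_verifier –

Removes the reference to the attached verifier model and frees up the associated memory.
forward –

Execute the forward pass for speculative token generation.

Source code in speculators/models/eagle.py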

def __init__(
    self,
    config: EagleSpeculatorConfig,
    verifier: str | os.PathLike | PreTrainedModel | None = None,
    verifier_attachment_mode: Literal["detached", "full", "train_only"]
    | None = None,
):
    """
    Initializes an EAGLE speculator architecture with configurable components based
    on the provided configuration. The model starts with verifier-dependent layers
    (embed_tokens, rotary_emb, lm_head) set to None until a verifier is attached.

    :param config: Configuration object specifying model architecture, layer
        settings, and speculative decoding parameters. Must be an instance of
        EagleSpeculatorConfig containing transformer layer configuration and
        EAGLE-specific settings.
    :param verifier: Optional verifier model to attach for speculative decoding.
        Can be a path to a model directory, Hugging Face model identifier, or
        PreTrainedModel instance. If None, must be attached later via
        attach_verifier() before using the model.
    :param verifier_attachment_mode: Mode for verifier attachment. "detached"
        prevents attachment even if verifier is provided. "full" enables
        complete integration for both training and generation. "train_only"
        attaches only components needed for training, optimizing memory usage.
    """
    if not isinstance(config, EagleSpeculatorConfig):
        raise ValueError(
            "config must be an instance of EagleSpeculatorConfig, "
            f"got {type(config)} instead."
        )

    # Initialize model parameters from config
    self.vocab_size = config.transformer_layer_config.vocab_size
    self.hidden_size = config.transformer_layer_config.hidden_size
    self.padding_idx = config.transformer_layer_config.pad_token_id

    # Set layers pulled from the verifier to None until attach is called
    self.embed_tokens: nn.Embedding | None = None
    self.rotary_emb: nn.Module | None = None
    self.lm_head: nn.Linear | None = None

    # Delayed initialization to ensure everything needed for attach_verifier is set
    super().__init__(
        config=config,
        verifier=verifier,
        verifier_attachment_mode=verifier_attachment_mode,
    )

    self._decoder_class, self._layernorm_class = self._import_model_classes()
    # Initialize layers based on the configuration
    self.embedding_layernorm: nn.Module | None = self._create_layernorm()
    self.fusion_fc: nn.Linear = nn.Linear(
        2 * self.hidden_size,
        self.hidden_size,
        bias=config.fusion_bias,
    )
    self.transformer: nn.Module = self._create_transformer_layer()
    self.pre_lm_head_layernorm: nn.Module | None = self._create_layernorm()

    self.post_init()  # type: ignore[attr-defined]

attach_verifier

attach_verifier(
    verifier: str | PathLike | PreTrainedModel,
    mode: Literal["full", "train_only"] | None = None,
)

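Attach a verifier model to the EagleSpeculator for speculative decoding. Utilizes the verifier's embed_tokens, rotary_emb, and lm_head layers for the speculator's forward pass and generation methods. Additionally, for `generate`, it uses the verifier's hidden states to generate speculative token predictions.

If mode is "full", the verifier is fully integrated for use with both the `generate` and `forward` methods.

If mode is "train_only", only the verifier layers required for a forward pass are attached, allowing for better resource utilization during training. `generate` will not be available until a full verifier is attached.

Example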

# Load and attach a verifier
verifier = EagleSpeculator(...)

# For generation
speculator.attach_verifier(verifier)
outputs = speculator.generate(input_ids)
speculator.detach_verifier()

# For training
speculator.attach_verifier(verifier, mode="train_only")
outputs = speculator(input_ids, hidden_states)
speculator.detach_verifier()

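Parameters

verifier
(str | PathLike | PreTrainedModel) –

The verifier model to attach. This can be a path to a local model directory, a Hugging Face model identifier, or an instance of PreTrainedModel. If a path or identifier is provided, the model will be loaded automatically. If an instance is provided, it will be used directly.
mode
(Literal['full', 'train_only'] | None, default: None ) –

The mode for attaching the verifier. Can be "full" or "train_only". If None, defaults to "full". In "train_only" mode, only the layers required for a forward pass are attached, and the speculator cannot perform generation until a full verifier is attached.

Returns

–

The PreTrainedModel instance for the verifier that was attached.

Source code in speculators/models/eagle.py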

def attach_verifier(
    self,
    verifier: str | os.PathLike | PreTrainedModel,
    mode: Literal["full", "train_only"] | None = None,
):
    """
    Attach a verifier model to the EagleSpeculator for speculative decoding.
    Utilizes the verifier's embed_tokens, rotary_emb, and lm_head layers
    for the speculator's forward pass and generation methods.
    Additionally, for `generate`, it uses the verifier's hidden states
    to generate speculative token predictions.

    If mode is "full", the verifier is fully integrated for use with
    both `generate` and `forward` methods.

    If mode is "train_only", only the verifier's layers required for a forward pass
    are attached, allowing for better resource utilization during training.
    `generate` will not be available until a full verifier is attached.

    Example:
        ```python
        # Load and attach a verifier
        verifier = EagleSpeculator(...)

        # For generation
        speculator.attach_verifier(verifier)
        outputs = speculator.generate(input_ids)
        speculator.detach_verifier()

        # For training
        speculator.attach_verifier(verifier, mode="train_only")
        outputs = speculator(input_ids, hidden_states)
        speculator.detach_verifier()
        ```

    :param verifier: The verifier model to attach. This can be a path to a local
        model directory, a Hugging Face model identifier, or an instance of
        PreTrainedModel. If a path or identifier is provided, the model will be
        loaded automatically. If an instance is provided, it will be used directly.
    :param mode: The mode for attaching the verifier. Can be "full" or "train_only".
        If None, defaults to "full". In "train_only" mode, only the layers
        required for a forward pass are attached, and the speculator cannot
        perform generation until a full verifier is attached.
    :return: The PreTrainedModel instance for the verifier that was attached.
    """
    super().attach_verifier(verifier=verifier, mode=mode)

    if self.verifier_attachment_mode == "train_only":
        verifier_model = self.resolve_verifier(verifier)
    elif self.verifier_attachment_mode == "full":
        verifier_model = cast("PreTrainedModel", self.verifier)
    else:
        return

    if hasattr(verifier_model, "model"):
        self.embed_tokens = verifier_model.model.embed_tokens  # type: ignore[assignment,union-attr]
        self.rotary_emb = verifier_model.model.rotary_emb  # type: ignore[assignment,union-attr]
    else:
        # Bare model structure
        self.embed_tokens = verifier_model.embed_tokens  # type: ignore[assignment,attr-defined]
        self.rotary_emb = verifier_model.rotary_emb  # type: ignore[assignment,attr-defined]

    # lm_head is always at the top level of the verifier
    self.lm_head = verifier_model.lm_head  # type: ignore[assignment,attr-defined]
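
The branch on whether the verifier has a `model` attribute handles two layouts: a causal-LM wrapper that keeps the embedding and rotary layers under `.model`, and a bare model that exposes them directly. A toy version of that lookup, using `SimpleNamespace` objects as hypothetical stand-ins for real verifier models (`find_shared_layers` is likewise a made-up helper):

```python
from types import SimpleNamespace


def find_shared_layers(verifier):
    """Return (embed_tokens, rotary_emb, lm_head) for either model layout."""
    # Wrapped models nest the backbone under `.model`; bare models do not.
    backbone = verifier.model if hasattr(verifier, "model") else verifier
    # lm_head is read from the top level in both cases, as in the source above.
    return backbone.embed_tokens, backbone.rotary_emb, verifier.lm_head


wrapped = SimpleNamespace(
    model=SimpleNamespace(embed_tokens="embed", rotary_emb="rope"),
    lm_head="head",
)
bare = SimpleNamespace(embed_tokens="embed", rotary_emb="rope", lm_head="head")

print(find_shared_layers(wrapped))  # ('embed', 'rope', 'head')
print(find_shared_layers(bare))     # ('embed', 'rope', 'head')
```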

detach_verifier

detach_verifier()

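Removes the reference to the attached verifier model and frees up the associated memory. After calling this method, the speculator will not be able to perform forward passes or generation until a new verifier is attached.

Source code in speculators/models/eagle.py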

def detach_verifier(self):
    """
    Removes the reference to the attached verifier model and frees up the
    associated memory. After calling this method, the speculator will not
    be able to perform forward passes or generation until a new verifier
    is attached.
    """
    super().detach_verifier()

    del self.embed_tokens
    self.embed_tokens = None
    del self.rotary_emb
    self.rotary_emb = None
    del self.lm_head
    self.lm_head = None
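
The attach/detach lifecycle, together with the guard that `forward` applies to unattached layers, can be illustrated with a toy class. `ToySpeculator` is hypothetical and only mirrors the pattern of resetting the shared layers to None; it does not load or free a real model.

```python
class ToySpeculator:
    """Minimal stand-in showing verifier-dependent layers that start as None."""

    def __init__(self):
        self.embed_tokens = None
        self.lm_head = None

    def attach_verifier(self, verifier: dict) -> None:
        # Borrow the layers from the verifier instead of owning copies.
        self.embed_tokens = verifier["embed_tokens"]
        self.lm_head = verifier["lm_head"]

    def detach_verifier(self) -> None:
        # Drop the references so the verifier can be garbage collected.
        self.embed_tokens = None
        self.lm_head = None

    def forward(self, token_id: int) -> int:
        if self.embed_tokens is None or self.lm_head is None:
            raise ValueError("Call attach_verifier before using forward.")
        return self.lm_head(self.embed_tokens(token_id))


speculator = ToySpeculator()
speculator.attach_verifier(
    {"embed_tokens": lambda t: t * 2, "lm_head": lambda h: h + 1}
)
print(speculator.forward(3))  # 7
speculator.detach_verifier()
```

After `detach_verifier()` a further call to `forward` raises `ValueError` again until a verifier is re-attached.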

forward

forward(
    input_ids: LongTensor,
    hidden_states: FloatTensor,
    attention_mask: Tensor | None = None,
    position_ids: LongTensor | None = None,
    past_key_values: tuple[tuple[FloatTensor]]
    | None = None,
    use_cache: bool | None = None,
    output_attentions: bool | None = None,
    output_hidden_states: bool | None = None,
    return_dict: bool | None = None,
) -> torch.FloatTensor | CausalLMOutputWithPast

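Execute the forward pass for speculative token generation.

Processes input tokens and verifier hidden states through the EAGLE architecture to generate candidate tokens for speculative decoding. The method combines input embeddings with verifier hidden states via a fusion layer, processes them through a transformer decoder layer, and produces logits for next-token prediction.

Parameters

input_ids
(LongTensor) –

Token IDs for the current input sequence. Shape: (batch_size, sequence_length). These represent the tokens that will be converted to embeddings and combined with the verifier hidden states.
hidden_states
(FloatTensor) –

Hidden state representations from the verifier model corresponding to the input sequence. Shape: (batch_size, sequence_length, hidden_size). These capture the verifier's understanding of the context.
attention_mask
(Tensor | None, default: None ) –

Optional attention mask to avoid attending to padding tokens. Shape: (batch_size, sequence_length) for 2D or (batch_size, 1, sequence_length, sequence_length) for a 4D causal mask.
position_ids
(LongTensor | None, default: None ) –

Optional position indices for tokens in the sequence. Shape: (batch_size, sequence_length). If None, they are auto-generated based on the sequence length and past key values.
past_key_values
(tuple[tuple[FloatTensor]] | None, default: None ) –

Optional cached key-value states from previous forward passes for efficient generation. Tuple of layer key-value pairs.
use_cache
(bool | None, default: None ) –

Whether to return key-value states for caching in subsequent forward passes. Useful for autoregressive generation efficiency.
output_attentions
(bool | None, default: None ) –

Whether to return attention weights from the transformer layer. Used for analysis and visualization.
output_hidden_states
(bool | None, default: None ) –

Whether to return hidden states from the transformer layer. Currently not implemented in this model.
return_dict
(bool | None, default: None ) –

Whether to return a structured CausalLMOutputWithPast instead of raw logits. If None, uses the config.use_return_dict default.

Returns

FloatTensor | CausalLMOutputWithPast –

Either a raw logits tensor of shape (batch_size, sequence_length, vocab_size) if return_dict=False, or a CausalLMOutputWithPast containing logits, past key values, and optional attention weights.

Raises

ValueError –

If the verifier components (embed_tokens, rotary_emb, lm_head) are not attached. Call attach_verifier() before using forward().

Source code in speculators/models/eagle.py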

def forward(
    self,
    input_ids: torch.LongTensor,
    hidden_states: torch.FloatTensor,
    attention_mask: torch.Tensor | None = None,
    position_ids: torch.LongTensor | None = None,
    past_key_values: tuple[tuple[torch.FloatTensor]] | None = None,
    use_cache: bool | None = None,
    output_attentions: bool | None = None,
    output_hidden_states: bool | None = None,  # noqa: ARG002
    return_dict: bool | None = None,
) -> torch.FloatTensor | CausalLMOutputWithPast:
    """
    Execute the forward pass for speculative token generation.

    Processes input tokens and verifier hidden states through the EAGLE architecture
    to generate candidate tokens for speculative decoding. The method combines input
    embeddings with verifier hidden states via a fusion layer, processes them
    through a transformer decoder layer, and produces logits for next token
    prediction.

    :param input_ids: Token IDs for the current input sequence. Shape: (batch_size,
        sequence_length). These represent the tokens that will be converted to
        embeddings and combined with verifier hidden states.
    :param hidden_states: Hidden state representations from the verifier model
        corresponding to the input sequence. Shape: (batch_size, sequence_length,
        hidden_size). These capture the verifier's understanding of the context.
    :param attention_mask: Optional attention mask to avoid attending to padding
        tokens. Shape: (batch_size, sequence_length) for 2D or (batch_size, 1,
        sequence_length, sequence_length) for 4D causal mask.
    :param position_ids: Optional position indices for tokens in the sequence.
        Shape: (batch_size, sequence_length). If None, auto-generated based on
        sequence length and past key values.
    :param past_key_values: Optional cached key-value states from previous forward
        passes for efficient generation. Tuple of layer key-value pairs.
    :param use_cache: Whether to return key-value states for caching in subsequent
        forward passes. Useful for autoregressive generation efficiency.
    :param output_attentions: Whether to return attention weights from the
        transformer layer. Used for analysis and visualization.
    :param output_hidden_states: Whether to return hidden states from the
        transformer layer. Currently not implemented in this model.
    :param return_dict: Whether to return structured CausalLMOutputWithPast instead
        of raw logits. If None, uses config.use_return_dict default.
    :return: Either raw logits tensor (batch_size, sequence_length, vocab_size) if
        return_dict=False, or CausalLMOutputWithPast containing logits, past key
        values, and optional attention weights.
    :raises ValueError: If verifier components (embed_tokens, rotary_emb, lm_head)
        are not attached. Call attach_verifier() before using forward().
    """
    if self.embed_tokens is None or self.rotary_emb is None or self.lm_head is None:
        raise ValueError(
            "Verifier model layers not initialized. "
            "Call `attach_verifier` to set up the model before using forward."
        )

    return_dict = (
        return_dict if return_dict is not None else self.config.use_return_dict
    )

    inputs_embeds = self.embed_tokens(input_ids)
    if self.embedding_layernorm is not None:
        inputs_embeds = self.embedding_layernorm(inputs_embeds)

    hidden_states = self.fusion_fc(
        torch.cat([inputs_embeds, hidden_states], dim=-1)
    )
    hidden_states, attention_mask, position_ids = self._prepare_decoder_inputs(
        hidden_states, attention_mask, position_ids, past_key_values
    )

    cos, sin = self.rotary_emb(hidden_states, position_ids)
    layer_outputs = self.transformer(
        hidden_states,
        attention_mask=attention_mask,
        position_ids=position_ids,
        past_key_value=past_key_values[0] if past_key_values else None,
        output_attentions=output_attentions,
        use_cache=use_cache,
        position_embeddings=(cos, sin),
    )
    hidden_states = layer_outputs[0]

    if self.pre_lm_head_layernorm is not None:
        hidden_states = self.pre_lm_head_layernorm(hidden_states)

    logits = self.lm_head(hidden_states)

    if not return_dict:
        return logits

    return CausalLMOutputWithPast(
        logits=logits,
        past_key_values=layer_outputs[1] if use_cache else None,
        hidden_states=None,
        attentions=None,
    )
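
The `position_ids` parameter is documented above as auto-generated from the sequence length and past key values when it is None. The library does this inside `_prepare_decoder_inputs`, whose source is not shown on this page, so the following is only a sketch of the usual convention: new positions continue from the number of tokens already held in the cache. `default_position_ids` is a hypothetical helper, and plain lists stand in for tensors.

```python
def default_position_ids(
    batch_size: int, sequence_length: int, past_length: int = 0
) -> list[list[int]]:
    """Positions for new tokens, continuing after `past_length` cached tokens."""
    positions = list(range(past_length, past_length + sequence_length))
    # Every sequence in the batch receives the same positions.
    return [list(positions) for _ in range(batch_size)]


print(default_position_ids(batch_size=2, sequence_length=3))
# [[0, 1, 2], [0, 1, 2]]
print(default_position_ids(batch_size=1, sequence_length=1, past_length=5))
# [[5]]
```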

EagleSpeculatorConfig

EagleSpeculatorConfig(**kwargs)

Bases: SpeculatorModelConfig

A SpeculatorModelConfig implementation to be used with the EagleSpeculator for the EAGLE and HASS variants of speculative decoding:
- Eagle / Eagle v1: https://arxiv.org/abs/2401.15077
- Eagle v2: https://arxiv.org/abs/2406.16858
- HASS: https://arxiv.org/abs/2408.15766

Model Configurations:
- EAGLE1: layernorms=False, fusion_bias=False
- EAGLE2: layernorms=False, fusion_bias=False
- HASS: layernorms=False, fusion_bias=True
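
The three variants are distinguished here only by these two flags, so the settings can be captured as a small lookup table. The `VARIANT_FLAGS` dict and `flags_for` helper are hypothetical conveniences, not part of the library; the resulting keyword arguments would be passed to EagleSpeculatorConfig alongside its other fields.

```python
# Flag values copied from the model configurations listed above;
# the dict itself is illustrative.
VARIANT_FLAGS = {
    "eagle1": {"layernorms": False, "fusion_bias": False},
    "eagle2": {"layernorms": False, "fusion_bias": False},
    "hass": {"layernorms": False, "fusion_bias": True},
}


def flags_for(variant: str) -> dict[str, bool]:
    """Return a copy of the flags for a named variant (case-insensitive)."""
    try:
        return dict(VARIANT_FLAGS[variant.lower()])
    except KeyError:
        raise ValueError(f"unknown variant: {variant!r}") from None


print(flags_for("HASS"))  # {'layernorms': False, 'fusion_bias': True}
```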

Example

from speculators import SpeculatorsConfig, VerifierConfig
from speculators.models import EagleSpeculatorConfig
from speculators.proposals import GreedyTokenProposalConfig
from transformers import AutoConfig

config = EagleSpeculatorConfig(
    transformer_layer_config=AutoConfig.from_pretrained("meta-llama/Llama-3.1-8B-Instruct"),
    speculators_config=SpeculatorsConfig(
        algorithm="eagle",
        proposal_methods=[
            GreedyTokenProposalConfig(),
        ],
        default_proposal_method="greedy",
        verifier=VerifierConfig(
            name_or_path="meta-llama/Llama-3.1-8B-Instruct",
            architectures=["LlamaForCausalLM"],
        ),
    ),
)

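Methods

check_add_architectures –

Automatically adds the transformer layer architecture to the architectures list if it's not already present.
serialize_transformer_layer_config –

Serialize the transformer_layer_config to a dictionary for JSON storage.
validate_transformer_layer_config –

Validate and convert transformer_layer_config to a PretrainedConfig instance.

Source code in speculators/config.py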

def __init__(self, **kwargs):
    # initialize the Pydantic arguments first to set all valid fields
    PydanticClassRegistryMixin.__init__(self, **kwargs)

    # reset kwargs handled by Pydantic so PretrainedConfig doesn't override
    for field in self.__class__.model_fields:
        kwargs[field] = getattr(self, field)

    # initialize the Hugging Face PretrainedConfig arguments for the model
    PretrainedConfig.__init__(self, **kwargs)

    # ensure we always update the transformers version
    self.transformers_version = version("transformers")

check_add_architectures

check_add_architectures() -> Self

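Automatically adds the transformer layer architecture to the architectures list if it's not already present.

Returns

Self –

The validated configuration instance with updated architectures

Source code in speculators/models/eagle.py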

@model_validator(mode="after")
def check_add_architectures(self) -> Self:
    """
    Automatically adds the transformer layer architecture to the
    architectures list if it's not already present.

    :return: The validated configuration instance with updated architectures
    """
    if (
        self.transformer_layer_architecture != "auto"
        and self.transformer_layer_architecture not in self.architectures
    ):
        self.architectures.append(self.transformer_layer_architecture)

    return self
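
The validator's effect is easiest to see on a plain list. The sketch below reproduces only the append-if-missing logic, outside of Pydantic; `add_architecture` is a hypothetical helper, and the architecture names are example strings.

```python
def add_architecture(architectures: list[str], layer_architecture: str) -> list[str]:
    """Append the layer architecture unless it is "auto" or already listed."""
    if layer_architecture != "auto" and layer_architecture not in architectures:
        architectures.append(layer_architecture)
    return architectures


archs = ["EagleSpeculator"]
add_architecture(archs, "LlamaDecoderLayer")
add_architecture(archs, "LlamaDecoderLayer")  # second call is a no-op
add_architecture(archs, "auto")               # "auto" is never added
print(archs)  # ['EagleSpeculator', 'LlamaDecoderLayer']
```

Because the check makes the operation idempotent, running the validator repeatedly (for example on every re-validation) does not grow the list.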

serialize_transformer_layer_config

serialize_transformer_layer_config(
    value: PretrainedConfig,
) -> dict

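Serialize the transformer_layer_config to a dictionary for JSON storage.

Converts the PretrainedConfig object to its dictionary representation using to_diff_dict() so that only non-default values are included.

Parameters

value
(PretrainedConfig) –

The PretrainedConfig instance to serialize

Returns

dict –

Dictionary representation of the transformer layer configuration

Source code in speculators/models/eagle.py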

@field_serializer("transformer_layer_config")
def serialize_transformer_layer_config(self, value: PretrainedConfig) -> dict:
    """
    Serialize the transformer_layer_config to a dictionary for JSON storage.

    Converts the PretrainedConfig object to its dictionary representation
    using to_diff_dict() to only include non-default values.

    :param value: The PretrainedConfig instance to serialize
    :return: Dictionary representation of the transformer layer configuration
    """
    return value.to_diff_dict()
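
To make the idea of keeping only non-default values concrete, here is a small pure-Python sketch of a diff-style serializer. It is not how transformers implements `to_diff_dict()`; `ToyConfig` and the standalone `to_diff_dict` function below are hypothetical and only show the principle of comparing each field against a default-constructed instance.

```python
from dataclasses import asdict, dataclass


@dataclass
class ToyConfig:
    hidden_size: int = 4096
    num_attention_heads: int = 32
    vocab_size: int = 32000


def to_diff_dict(config: ToyConfig) -> dict:
    """Keep only the fields whose values differ from the class defaults."""
    defaults = asdict(ToyConfig())
    return {k: v for k, v in asdict(config).items() if defaults[k] != v}


print(to_diff_dict(ToyConfig(hidden_size=2048)))  # {'hidden_size': 2048}
```

Storing only the differences keeps the saved JSON short and lets library defaults evolve without being frozen into every checkpoint.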

validate_transformer_layer_config `classmethod`

validate_transformer_layer_config(
    value: Any,
) -> PretrainedConfig

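Validate and convert transformer_layer_config to a PretrainedConfig instance.

Accepts either a dictionary that can be converted to a PretrainedConfig or an existing PretrainedConfig instance.

Parameters

value
(Any) –

The value to validate (dict or PretrainedConfig)

Returns

PretrainedConfig –

A validated PretrainedConfig instance

Raises

ValueError –

If the value cannot be converted to a PretrainedConfig

Source code in speculators/models/eagle.py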

@field_validator("transformer_layer_config", mode="before")
@classmethod
def validate_transformer_layer_config(cls, value: Any) -> PretrainedConfig:
    """
    Validate and convert transformer_layer_config to a PretrainedConfig instance.

    Accepts either a dictionary that can be converted to a PretrainedConfig
    or an existing PretrainedConfig instance.

    :param value: The value to validate (dict or PretrainedConfig)
    :return: A validated PretrainedConfig instance
    :raises ValueError: If the value cannot be converted to a PretrainedConfig
    """
    if isinstance(value, dict):
        return AutoConfig.for_model(**value)
    if isinstance(value, PretrainedConfig):
        return value

    raise ValueError(
        "transformer_layer_config must be a PretrainedConfig instance or a "
        "dictionary that can be converted to a PretrainedConfig."
    )

MLPSpeculatorConfig

MLPSpeculatorConfig(**kwargs)

Bases: SpeculatorModelConfig

Todo

Source code in speculators/config.py

def __init__(self, **kwargs):
    # initialize the Pydantic arguments first to set all valid fields
    PydanticClassRegistryMixin.__init__(self, **kwargs)

    # reset kwargs handled by Pydantic so PretrainedConfig doesn't override
    for field in self.__class__.model_fields:
        kwargs[field] = getattr(self, field)

    # initialize the Hugging Face PretrainedConfig arguments for the model
    PretrainedConfig.__init__(self, **kwargs)

    # ensure we always update the transformers version
    self.transformers_version = version("transformers")
