Test File Structure and Style Guide¶
To keep the project maintainable and sustainable, we encourage contributors to submit test code (unit, system, or end-to-end tests) together with their code changes. This document outlines the guidelines for organizing and naming test files.
Test Types¶
Unit Tests and System Tests¶
For unit and system tests, we strongly recommend placing test files in the same directory structure as the source code under test, using the test_*.py naming convention.
End-to-End (E2E) Tests for Models¶
End-to-end tests verify the complete functionality of a system or component. For our project, the E2E tests for the different Omni models are organized into two subdirectories:

- tests/e2e/offline_inference/: tests for offline inference mode (e.g., Qwen3Omni offline inference)
- tests/e2e/online_serving/: tests for online serving scenarios (e.g., API server tests)
Example: the test file for vllm_omni/entrypoints/omni_llm.py should live at tests/entrypoints/test_omni_llm.py.
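Such a unit test should stay mock-based so it runs without loading model weights. Below is a minimal sketch; the OmniLLM constructor signature and the engine-delegation behavior it asserts are illustrative assumptions, not the real API, so adapt it to the actual class:

# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""tests/entrypoints/test_omni_llm.py -- mock-based unit-test sketch."""
from unittest import mock


def test_generate_delegates_to_engine() -> None:
    """generate() should forward prompts to the underlying engine."""
    from vllm_omni.entrypoints.omni_llm import OmniLLM

    # Stand-in for the heavyweight engine so no model weights are loaded
    engine = mock.Mock()
    engine.generate.return_value = ["dummy output"]

    # Assumption: OmniLLM accepts an injected engine (adjust to the real class)
    llm = OmniLLM(engine=engine)

    assert llm.generate(["hi"]) == ["dummy output"]
    engine.generate.assert_called_once_with(["hi"])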
Test Directory Structure¶
The ideal directory structure mirrors the organization of the source code:
vllm_omni/                                      tests/
├── config/                           →  ├── config/
│   └── model.py                         │   └── test_model.py
│                                        │
├── core/                             →  ├── core/
│   └── sched/                           │   └── sched/                       # Maps to core/sched/
│       ├── omni_ar_scheduler.py         │       ├── test_omni_ar_scheduler.py
│       ├── omni_generation_scheduler.py │       ├── test_omni_generation_scheduler.py
│       └── output.py                    │       └── test_output.py
│                                        │
├── diffusion/                        →  ├── diffusion/
│   ├── diffusion_engine.py              │   ├── test_diffusion_engine.py
│   ├── omni_diffusion.py                │   ├── test_omni_diffusion.py
│   ├── attention/                       │   ├── attention/                   # Maps to diffusion/attention/
│   │   └── backends/                    │   │   └── test_*.py
│   ├── models/                          │   ├── models/                      # Maps to diffusion/models/
│   │   ├── qwen_image/                  │   │   ├── qwen_image/
│   │   │   └── ...                      │   │   │   └── test_*.py
│   │   └── z_image/                     │   │   └── z_image/
│   │       └── ...                      │   │       └── test_*.py
│   └── worker/                          │   └── worker/                      # Maps to diffusion/worker/
│       └── ...                          │       └── test_*.py
│                                        │
├── distributed/                      →  ├── distributed/
│   └── ...                              │   └── test_*.py
│                                        │
├── engine/                           →  ├── engine/
│   ├── processor.py                     │   ├── test_processor.py
│   └── output_processor.py              │   └── test_output_processor.py
│                                        │
├── entrypoints/                      →  ├── entrypoints/
│   ├── omni_llm.py                      │   ├── test_omni_llm.py             # UT: OmniLLM core logic (mocked)
│   ├── omni_stage.py                    │   ├── test_omni_stage.py           # UT: OmniStage logic
│   ├── omni.py                          │   ├── test_omni.py                 # E2E: Omni class (offline inference)
│   ├── async_omni.py                    │   ├── test_async_omni.py           # E2E: AsyncOmni class
│   ├── cli/                             │   ├── cli/                         # Maps to entrypoints/cli/
│   │   └── ...                          │   │   └── test_*.py
│   └── openai/                          │   └── openai/                      # Maps to entrypoints/openai/
│       ├── api_server.py                │       ├── test_api_server.py       # E2E: API server (online serving)
│       └── serving_chat.py              │       └── test_serving_chat.py
│                                        │
├── inputs/                           →  ├── inputs/
│   ├── data.py                          │   ├── test_data.py
│   ├── parse.py                         │   ├── test_parse.py
│   └── preprocess.py                    │   └── test_preprocess.py
│                                        │
├── model_executor/                   →  ├── model_executor/
│   ├── layers/                          │   ├── layers/
│   │   └── mrope.py                     │   │   └── test_mrope.py
│   ├── model_loader/                    │   ├── model_loader/
│   │   └── weight_utils.py              │   │   └── test_weight_utils.py
│   ├── models/                          │   ├── models/
│   │   ├── qwen2_5_omni/                │   │   ├── qwen2_5_omni/
│   │   │   ├── qwen2_5_omni_thinker.py  │   │   │   ├── test_qwen2_5_omni_thinker.py    # UT
│   │   │   ├── qwen2_5_omni_talker.py   │   │   │   ├── test_qwen2_5_omni_talker.py     # UT
│   │   │   └── qwen2_5_omni_token2wav.py│   │   │   └── test_qwen2_5_omni_token2wav.py  # UT
│   │   └── qwen3_omni/                  │   │   └── qwen3_omni/
│   │       └── ...                      │   │       └── test_*.py
│   ├── stage_configs/                   │   ├── stage_configs/               # Configuration tests (if needed)
│   │   └── ...                          │   │   └── test_*.py
│   └── stage_input_processors/          │   └── stage_input_processors/
│       └── ...                          │       └── test_*.py
│                                        │
├── sample/                           →  ├── sample/
│   └── ...                              │   └── test_*.py
│                                        │
├── utils/                            →  ├── utils/
│   └── platform_utils.py                │   └── test_platform_utils.py
│                                        │
├── worker/                           →  ├── worker/
│   ├── gpu_ar_worker.py                 │   ├── test_gpu_ar_worker.py
│   ├── gpu_generation_worker.py         │   ├── test_gpu_generation_worker.py
│   ├── gpu_model_runner.py              │   ├── test_gpu_model_runner.py
│   └── npu/                             │   └── npu/                         # Maps to worker/npu/
│       └── ...                          │       └── test_*.py
│                                        │
└── e2e/                              →  └── e2e/                             # End-to-end scenarios (no 1:1 source mirror)
                                             ├── online_serving/              # Full-stack online serving flows
                                             │   └── (empty for now)
                                             └── offline_inference/           # Full offline inference flows
                                                 ├── test_qwen2_5_omni.py     # Moved from multi_stages/
                                                 ├── test_qwen3_omni.py       # Moved from multi_stages_h100/
                                                 ├── test_t2i_model.py        # Moved from single_stage/
                                                 └── stage_configs/           # Shared stage configs
                                                     ├── qwen2_5_omni_ci.yaml
                                                     └── qwen3_omni_ci.yaml
Naming Conventions¶
- Unit/system tests: use the test_<module_name>.py format
  - Example: omni_llm.py → test_omni_llm.py
- E2E tests: place them in tests/e2e/offline_inference/ or tests/e2e/online_serving/ and give them descriptive names
  - Examples: tests/e2e/offline_inference/test_qwen3_omni.py, tests/e2e/offline_inference/test_diffusion_model.py
Best Practices¶
- Mirror the source structure: the test directory tree should mirror the structure of the source code
- Test type indicators: use comments to mark the test type (UT for unit tests, E2E for end-to-end tests)
- Shared resources: place shared test configuration (e.g., CI configs) in the appropriate subdirectory
- Consistent naming: follow the test_*.py naming convention across all test files
Test Code Requirements¶
Coding Style¶
- File header: add the SPDX license header to every test file
- Imports: do not modify sys.path manually; use standard imports
- Test type distinctions:
  - Unit tests: keep them mock-based
  - Model E2E tests: consider using OmniRunner uniformly and avoid decorators
- Documentation: add a docstring to every test function
- Environment variables: set them uniformly in conftest.py or at the top of the file
- Type annotations: annotate every test function parameter
- Resources: use pytest markers to declare the compute resources a test requires (see the skeleton below)
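Putting these rules together, a test file skeleton might look like the following sketch. The gpu_mem_high marker mirrors the offline template below; custom markers like it must be registered in the pytest configuration, and the assertion body is a placeholder:

# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""Skeleton illustrating the coding-style rules above."""
import os

import pytest

# Environment variables: set once at the top of the file (or in conftest.py)
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"


@pytest.mark.gpu_mem_high  # Resources: declare required compute via a marker
@pytest.mark.parametrize("prompt", ["hello", "world"])
def test_prompt_roundtrip(prompt: str) -> None:  # annotate every parameter
    """Each test function carries a docstring stating what it verifies."""
    assert isinstance(prompt, str)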
Templates¶
E2E - Online Serving¶
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""
Online E2E smoke test for an omni model (video, text, audio → audio).
"""
import os
from pathlib import Path

import openai
import pytest

# Optional: set the process start method for workers
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"
models = ["{your model name}"]  # Edit here to load your model
stage_configs = [str(Path(__file__).parent / "stage_configs" / "{your model yaml}")]  # Edit here to load your model yaml

# Parameter combinations of model and stage config
test_params = [(model, stage_config) for model in models for stage_config in stage_configs]


# OmniServer: used to start the vllm-omni server (a sketch appears after this template)
class OmniServer:
    ...
@pytest.fixture
def omni_server(request: pytest.FixtureRequest):
    model, stage_config_path = request.param
    with OmniServer(model, ["--stage-configs-path", stage_config_path]) as server:
        yield server
# Fixtures and helpers that build the request message
@pytest.fixture(scope="session")
def base64_encoded_video() -> str:
    ...


def dummy_messages_from_video_data(video_data_url: str, content_text: str = "") -> list[dict]:
    ...
@pytest.mark.parametrize("omni_server", test_params, indirect=True)
def test_video_to_audio(
    client: openai.OpenAI,
    omni_server,
    base64_encoded_video: str,
) -> None:
    """Online inference: video input, text and audio output."""
    # Build the message payload
    video_data_url = f"data:video/mp4;base64,{base64_encoded_video}"
    messages = dummy_messages_from_video_data(video_data_url)
    # Send the request
    chat_completion = client.chat.completions.create(
        model=omni_server.model,
        messages=messages,
    )
    # Verify the text output
    text_choice = chat_completion.choices[0]
    assert text_choice.finish_reason == "length"
    # Verify the audio output
    audio_choice = chat_completion.choices[1]
    audio_message = audio_choice.message
    if hasattr(audio_message, "audio") and audio_message.audio:
        assert audio_message.audio.data is not None
        assert len(audio_message.audio.data) > 0
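The OmniServer helper is elided in the template; in practice it usually lives in a shared conftest.py. A minimal subprocess-based sketch follows. The module path, CLI flags, and /health endpoint are assumptions modeled on vLLM's OpenAI-compatible server, so mirror whatever the real helper does:

import subprocess
import sys
import time
import urllib.request


class OmniServer:
    """Launches the API server in a subprocess and waits until it is healthy."""

    def __init__(self, model: str, extra_args: list[str], port: int = 8000) -> None:
        self.model = model
        self.url = f"http://localhost:{port}"
        # Assumed launch command; adjust to the project's actual entrypoint
        self._cmd = [
            sys.executable, "-m", "vllm_omni.entrypoints.openai.api_server",
            "--model", model, "--port", str(port), *extra_args,
        ]
        self._proc: subprocess.Popen | None = None

    def __enter__(self) -> "OmniServer":
        self._proc = subprocess.Popen(self._cmd)
        deadline = time.time() + 300
        while time.time() < deadline:
            try:
                # Assumed health endpoint, as exposed by vLLM's API server
                urllib.request.urlopen(f"{self.url}/health", timeout=5)
                return self
            except OSError:
                time.sleep(2)
        raise RuntimeError("API server did not become healthy in time")

    def __exit__(self, *exc) -> None:
        assert self._proc is not None
        self._proc.terminate()
        self._proc.wait(timeout=60)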
E2E - Offline Inference¶
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""
Offline E2E smoke test for an omni model (video → audio).
"""
import os
from pathlib import Path

import pytest
from vllm.assets.video import VideoAsset

from ..multi_stages.conftest import OmniRunner

# Optional: set the process start method for workers
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"

models = ["{your model name}"]  # Edit here to load your model
stage_configs = [str(Path(__file__).parent / "stage_configs" / "{your model yaml}")]  # Edit here to load your model yaml

# Create parameter combinations of model and stage config
test_params = [(model, stage_config) for model in models for stage_config in stage_configs]
# Function name: test_{input_modality}_to_{output_modality}
# Modality candidates: text, image, audio, video, mixed_modalities
@pytest.mark.gpu_mem_high  # requires a high-memory GPU node
@pytest.mark.parametrize("test_config", test_params)
def test_video_to_audio(omni_runner: type[OmniRunner], test_config: tuple[str, str]) -> None:
    """Offline inference: video input, audio output."""
    model, stage_config_path = test_config
    with omni_runner(model, seed=42, stage_configs_path=stage_config_path) as runner:
        # Prepare inputs
        video = VideoAsset(name="sample", num_frames=4).np_ndarrays
        outputs = runner.generate_multimodal(
            prompts="Describe this video briefly.",
            videos=video,
        )
        # Minimal assertions: got outputs and at least one audio result
        assert outputs
        has_audio = any(o.final_output_type == "audio" for o in outputs)
        assert has_audio
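The omni_runner fixture referenced above is typically provided by a shared conftest.py. A minimal sketch, reusing the OmniRunner import from the template and matching the type[OmniRunner] annotation, could be:

import pytest

from ..multi_stages.conftest import OmniRunner  # same import as the template


@pytest.fixture
def omni_runner() -> type[OmniRunner]:
    """Expose the OmniRunner class so each test constructs its own instance."""
    return OmniRunner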
Checklist Before Submitting Test Files¶
- The file is saved in an appropriate location and has a clear name.
- The coding style meets the requirements above.
- For E2E model tests, make sure the test is configured under the ./buildkite/ folder.