InternVL3.5 Usage Guide
InternVL3.5 is a vision-language model developed by Shanghai AI Laboratory. This guide explains how to deploy InternVL3.5 with vLLM and provides a few simple API usage examples.
Installing vLLM
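The source does not pin a vLLM version or install method; a typical installation (a sketch, assuming a Linux environment with a CUDA-capable GPU and a recent Python) looks like:

```shell
# Create an isolated environment (optional but recommended).
python -m venv vllm-env
source vllm-env/bin/activate

# Install vLLM from PyPI; pick a version known to support InternVL3.5.
pip install vllm
```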
Launching InternVL3.5 with vLLM
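The source omits the launch command. A minimal sketch using vLLM's OpenAI-compatible server follows; the model ID `OpenGVLab/InternVL3_5-8B` is an assumption here, so substitute the exact checkpoint you intend to serve:

```shell
# Serve InternVL3.5 with an OpenAI-compatible API on port 8000.
# --trust-remote-code is needed because the model ships custom code.
vllm serve OpenGVLab/InternVL3_5-8B \
    --trust-remote-code \
    --port 8000
```

Once the server is up, the examples below talk to it at `http://0.0.0.0:8000/v1`.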
API Usage Examples
Text-only Chat
from openai import OpenAI

client = OpenAI(api_key='', base_url='http://0.0.0.0:8000/v1')
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'user',
        'content': [{
            'type': 'text',
            'text': '9.11 and 9.8, which is greater?',
        }],
    }],
    temperature=0.6,
    top_p=0.95,
)
print(response.choices[0].message.content)
Image Chat
Single Image
from openai import OpenAI

client = OpenAI(api_key='', base_url='http://0.0.0.0:8000/v1')
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'user',
        'content': [{
            'type': 'text',
            'text': 'Describe the image.',
        }, {
            'type': 'image_url',
            'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'},
        }],
    }],
    temperature=0.0,
)
print(response.choices[0].message.content)
Multiple Images
from openai import OpenAI

client = OpenAI(api_key='', base_url='http://0.0.0.0:8000/v1')
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'user',
        'content': [{
            'type': 'text',
            'text': 'Describe these two images.',
        }, {
            'type': 'image_url',
            'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg'},
        }, {
            'type': 'image_url',
            'image_url': {'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/det.jpg'},
        }],
    }],
    temperature=0.0,
)
print(response.choices[0].message.content)
Thinking Mode
To enable thinking mode, set the system prompt to our thinking-mode system prompt. When thinking mode is enabled, we recommend setting temperature to 0.6 to mitigate undesired repetition.
from openai import OpenAI

client = OpenAI(api_key='', base_url='http://0.0.0.0:8000/v1')
model_name = client.models.list().data[0].id

THINKING_SYSTEM_PROMPT = """
You are an AI assistant that rigorously follows this response protocol:
1. First, conduct a detailed analysis of the question. Consider different angles, potential solutions, and reason through the problem step-by-step. Enclose this entire thinking process within <think> and </think> tags.
2. After the thinking section, provide a clear, concise, and direct answer to the user's question. Separate the answer from the think section with a newline.
Ensure that the thinking process is thorough but remains focused on the query. The final answer should be standalone and not reference the thinking section.
""".strip()

response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'system',
        'content': [{
            'type': 'text',
            'text': THINKING_SYSTEM_PROMPT,
        }],
    }, {
        'role': 'user',
        'content': [{
            'type': 'text',
            'text': '9.11 and 9.8, which is greater?',
        }],
    }],
    temperature=0.6,
    top_p=0.95,
)
print(response.choices[0].message.content)
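Under the protocol above, the model's reasoning arrives wrapped in `<think>...</think>` tags, followed by the standalone answer. A small helper (a sketch, not part of any official API; `split_thinking` is a name introduced here) can separate the two:

```python
import re


def split_thinking(text: str) -> tuple[str, str]:
    """Split a thinking-mode response into (reasoning, answer).

    Assumes the reasoning is enclosed in <think>...</think> tags as
    instructed by the system prompt; if no tags are found, the whole
    text is treated as the answer.
    """
    match = re.search(r'<think>(.*?)</think>', text, flags=re.DOTALL)
    if match is None:
        return '', text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


# Example with a hand-written response in the expected format.
demo = '<think>Compare the tenths digit: 8 > 1.</think>\n9.8 is greater.'
reasoning, answer = split_thinking(demo)
```

This keeps the reasoning available for logging while only the final answer is shown to end users.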