Image-To-Image
Source: https://github.com/vllm-project/vllm-omni/tree/main/examples/online_serving/image_to_image
This example shows how to deploy the Qwen-Image-Edit model with vLLM-Omni as an online image editing service.
For multi-image input editing, use Qwen-Image-Edit-2509 (QwenImageEditPlusPipeline) and send multiple images in the user message content.
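Starting the Server
Basic Launch
Multi-Image Editing (Qwen-Image-Edit-2509)
Launch with Parameters
Alternatively, use the startup script (run_server.sh).
The same script can serve Qwen-Image-Edit-2509.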
API Calls
Method 1: Using curl (Image Editing)
# Image editing
bash run_curl_image_edit.sh input.png "Convert this image to watercolor style"
# Or execute directly
IMG_B64=$(base64 -w0 input.png)
curl -s http://localhost:8092/v1/chat/completions \
-H "Content-Type: application/json" \
-d "{
\"messages\": [{
\"role\": \"user\",
\"content\": [
{\"type\": \"text\", \"text\": \"Convert this image to watercolor style\"},
{\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/png;base64,$IMG_B64\"}}
]
}],
\"extra_body\": {
\"height\": 1024,
\"width\": 1024,
\"num_inference_steps\": 50,
\"guidance_scale\": 7.5,
\"seed\": 42
}
}" | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2 | base64 -d > output.png
Method 2: Using the Python Client
python openai_chat_client.py --input input.png --prompt "Convert to oil painting style" --output output.png
# Multi-image editing (Qwen-Image-Edit-2509 server required)
python openai_chat_client.py --input input1.png input2.png --prompt "Combine these images into a single scene" --output output.png
Method 3: Using the Gradio Demo
Request Format
Image Editing (image_url format)
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Convert this image to watercolor style"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
      ]
    }
  ]
}
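As a minimal sketch of how a client might assemble this request body, the helper below base64-encodes raw image bytes into a data URL and wraps it in the `image_url` content format. The names `to_data_url` and `build_edit_request` are hypothetical and not part of the example scripts. The MIME type is assumed to be supplied by the caller rather than detected.

```python
import base64
import json


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{b64}"


def build_edit_request(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a chat-completions body with one text part and one image part."""
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": to_data_url(image_bytes, mime_type)}},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file read from disk.
    body = build_edit_request("Convert this image to watercolor style", b"placeholder-bytes")
    print(json.dumps(body, indent=2))
```

In practice the bytes would come from something like `Path("input.png").read_bytes()`, and the resulting dictionary would be sent as the JSON body of a POST to `/v1/chat/completions`.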
Image Editing (simplified image format)
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"text": "Convert this image to watercolor style"},
        {"image": "BASE64_IMAGE_DATA"}
      ]
    }
  ]
}
Image Editing with Parameters
Use extra_body to pass generation parameters:
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Convert to ink wash painting style"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
      ]
    }
  ],
  "extra_body": {
    "height": 1024,
    "width": 1024,
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "seed": 42
  }
}
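The sketch below shows one way to attach generation parameters to a request body, following the same pattern as openai_chat_client.py: only parameters that were actually set are included, and the `extra_body` key is omitted when nothing is set. The function names are hypothetical. Whether the server treats an omitted parameter exactly like its documented default is an assumption here.

```python
from typing import Any


def build_extra_body(**params: Any) -> dict[str, Any]:
    """Keep only the generation parameters that were actually set (not None)."""
    return {name: value for name, value in params.items() if value is not None}


def with_generation_params(payload: dict[str, Any], **params: Any) -> dict[str, Any]:
    """Return a copy of `payload` with an `extra_body` key when any parameter is set."""
    extra_body = build_extra_body(**params)
    result = dict(payload)  # shallow copy so the caller's payload is not mutated
    if extra_body:
        result["extra_body"] = extra_body
    return result


if __name__ == "__main__":
    base = {"messages": []}  # message content omitted for brevity
    print(with_generation_params(base, height=1024, width=1024, seed=None, guidance_scale=7.5))
```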
Multi-Image Editing (Qwen-Image-Edit-2509)
Provide multiple images in content (order matters):
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Combine these images into a single scene"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
      ]
    }
  ]
}
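Because image order matters, a client may want to build the content list from an explicitly ordered sequence. The following is a small sketch under that assumption. `build_multi_image_content` is a hypothetical name, and the data URLs are assumed to be already encoded.

```python
def build_multi_image_content(prompt: str, image_data_urls: list[str]) -> list[dict]:
    """Return a content list: the text part first, then images in the given order."""
    if not image_data_urls:
        raise ValueError("at least one image is required")
    content: list[dict] = [{"type": "text", "text": prompt}]
    for url in image_data_urls:
        # Appending in iteration order keeps the caller's image order intact.
        content.append({"type": "image_url", "image_url": {"url": url}})
    return content


if __name__ == "__main__":
    urls = ["data:image/png;base64,AAAA", "data:image/png;base64,BBBB"]
    for part in build_multi_image_content("Combine these images into a single scene", urls):
        print(part)
```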
Generation Parameters (extra_body)
| Parameter | Type | Default | Description |
|---|---|---|---|
| height | int | None | Output image height (pixels) |
| width | int | None | Output image width (pixels) |
| size | str | None | Output image size (e.g. "1024x1024") |
| num_inference_steps | int | 50 | Number of denoising steps |
| guidance_scale | float | 7.5 | CFG guidance scale |
| seed | int | None | Random seed (for reproducible results) |
| negative_prompt | str | None | Negative prompt |
| num_outputs_per_prompt | int | 1 | Number of images to generate |
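A client-side sanity check against this table can catch misspelled parameter names or wrong types before a request is sent. The sketch below is illustrative only: `check_extra_body` is a hypothetical helper, the name-to-type mapping is copied from the table, and the server may coerce types or accept keys that are not listed here.

```python
# Parameter names and types copied from the table above.
PARAM_TYPES: dict[str, type] = {
    "height": int,
    "width": int,
    "size": str,
    "num_inference_steps": int,
    "guidance_scale": float,
    "seed": int,
    "negative_prompt": str,
    "num_outputs_per_prompt": int,
}


def check_extra_body(extra_body: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems: list[str] = []
    for name, value in extra_body.items():
        expected = PARAM_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown parameter: {name}")
            continue
        if isinstance(value, bool):
            ok = False  # bool is a subclass of int in Python, so reject it explicitly
        elif expected is float:
            ok = isinstance(value, (int, float))  # assumption: ints are fine for float fields
        else:
            ok = isinstance(value, expected)
        if not ok:
            problems.append(f"{name}: expected {expected.__name__}, got {type(value).__name__}")
    return problems


if __name__ == "__main__":
    print(check_extra_body({"height": 1024, "steps": 50, "width": "1024"}))
```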
Response Format
{
  "id": "chatcmpl-xxx",
  "created": 1234567890,
  "model": "Qwen/Qwen-Image-Edit",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": [{
        "type": "image_url",
        "image_url": {
          "url": "data:image/png;base64,..."
        }
      }]
    },
    "finish_reason": "stop"
  }],
  "usage": {...}
}
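To turn such a response back into image bytes, a client can walk the content list of the first choice and decode each base64 data URL. The sketch below uses the hypothetical name `extract_images`. It assumes that any additional outputs would appear as further items in the same content list, which the response example does not confirm.

```python
import base64


def extract_images(response: dict) -> list[bytes]:
    """Decode every base64 data-URL image found in the first choice's content."""
    content = response["choices"][0]["message"]["content"]
    images: list[bytes] = []
    if not isinstance(content, list):
        return images  # e.g. a plain-text reply carries no image parts
    for part in content:
        if not isinstance(part, dict):
            continue
        url = (part.get("image_url") or {}).get("url", "")
        if url.startswith("data:image") and "," in url:
            # Everything after the first comma is the base64 payload.
            images.append(base64.b64decode(url.split(",", 1)[1]))
    return images


if __name__ == "__main__":
    fake = {
        "choices": [
            {"message": {"content": [{"type": "image_url", "image_url": {"url": "data:image/png;base64,YWJj"}}]}}
        ]
    }
    print(len(extract_images(fake)), "image(s) decoded")
```

Each returned item can be written to disk with `Path("output.png").write_bytes(...)`, as openai_chat_client.py does for the first image.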
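Common Editing Instruction Examples
| Instruction | Description |
|---|---|
| Convert this image to watercolor style | Style transfer |
| Convert the image to black and white | Desaturation |
| Enhance the color saturation | Color adjustment |
| Convert to cartoon style | Cartoonization |
| Add vintage filter effect | Filter effect |
| Convert daytime to nighttime | Scene conversion |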
File Descriptions
| File | Description |
|---|---|
| run_server.sh | Server startup script |
| run_curl_image_edit.sh | curl image editing example |
| openai_chat_client.py | Python client |
| gradio_demo.py | Gradio interactive interface |
Example Materials
gradio_demo.py
#!/usr/bin/env python3
"""
Qwen-Image-Edit Gradio Demo for online serving.

Usage:
    python gradio_demo.py [--server http://localhost:8092] [--port 7861]
"""

import argparse
import base64
from io import BytesIO

import gradio as gr
import requests
from PIL import Image


def _pil_to_b64_png(img: Image.Image) -> str:
    """Encode a PIL image as a base64 PNG string."""
    buffer = BytesIO()
    img.save(buffer, format="PNG")
    return base64.b64encode(buffer.getvalue()).decode("utf-8")


def edit_image(
    input_image: Image.Image,
    extra_images: list[str] | None,
    prompt: str,
    steps: int,
    guidance_scale: float,
    seed: int | None,
    negative_prompt: str,
    server_url: str,
) -> Image.Image | None:
    """Edit an image using the chat completions API."""
    if input_image is None:
        raise gr.Error("Please upload an image first")

    images: list[Image.Image] = [input_image]
    if extra_images:
        for p in extra_images:
            try:
                images.append(Image.open(p).convert("RGB"))
            except Exception as e:
                raise gr.Error(f"Failed to open image: {p}. Error: {e}") from e

    # Build user message with text and image
    content: list[dict[str, object]] = [{"type": "text", "text": prompt}]
    for img in images:
        content.append({"type": "image_url", "image_url": {"url": f"data:image/png;base64,{_pil_to_b64_png(img)}"}})

    messages = [
        {
            "role": "user",
            "content": content,
        }
    ]

    # Build extra_body with generation parameters
    extra_body = {
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }
    if seed is not None and seed >= 0:
        extra_body["seed"] = seed
    if negative_prompt:
        extra_body["negative_prompt"] = negative_prompt

    # Build request payload
    payload = {"messages": messages, "extra_body": extra_body}

    try:
        response = requests.post(
            f"{server_url}/v1/chat/completions",
            headers={"Content-Type": "application/json"},
            json=payload,
            timeout=300,
        )
        response.raise_for_status()
        data = response.json()

        content = data["choices"][0]["message"]["content"]
        if isinstance(content, list) and len(content) > 0:
            image_url = content[0].get("image_url", {}).get("url", "")
            if image_url.startswith("data:image"):
                _, b64_data = image_url.split(",", 1)
                image_bytes = base64.b64decode(b64_data)
                return Image.open(BytesIO(image_bytes))

        return None

    except Exception as e:
        print(f"Error: {e}")
        raise gr.Error(f"Edit failed: {e}")


def create_demo(server_url: str):
    """Create Gradio demo interface."""
    with gr.Blocks(title="Qwen-Image-Edit Demo") as demo:
        gr.Markdown("# Qwen-Image-Edit Online Editing")
        gr.Markdown(
            "Upload an image and describe the editing effect you want. "
            "For multi-image editing, upload extra images (requires Qwen-Image-Edit-2509 server)."
        )

        with gr.Row():
            with gr.Column(scale=1):
                input_image = gr.Image(
                    label="Input Image",
                    type="pil",
                )
                extra_images = gr.File(
                    label="Additional Images (Optional)",
                    file_count="multiple",
                    type="filepath",
                )
                prompt = gr.Textbox(
                    label="Edit Instruction",
                    placeholder="Describe the editing effect you want...",
                    lines=2,
                )
                negative_prompt = gr.Textbox(
                    label="Negative Prompt",
                    placeholder="Describe what you don't want...",
                    lines=2,
                )
                with gr.Row():
                    steps = gr.Slider(
                        label="Inference Steps",
                        minimum=10,
                        maximum=100,
                        value=50,
                        step=5,
                    )
                    guidance_scale = gr.Slider(
                        label="Guidance Scale (CFG)",
                        minimum=1.0,
                        maximum=20.0,
                        value=7.5,
                        step=0.5,
                    )
                with gr.Row():
                    seed = gr.Number(
                        label="Random Seed (-1 for random)",
                        value=-1,
                        precision=0,
                    )
                edit_btn = gr.Button("Edit Image", variant="primary")

            with gr.Column(scale=1):
                output_image = gr.Image(
                    label="Edited Image",
                    type="pil",
                )

        # Examples
        gr.Examples(
            examples=[
                ["Convert this image to watercolor style"],
                ["Convert the image to black and white"],
                ["Enhance the color saturation"],
                ["Convert to cartoon style"],
                ["Add vintage filter effect"],
                ["Convert daytime to nighttime"],
                ["Convert to oil painting style"],
                ["Add dreamy blur effect"],
            ],
            inputs=[prompt],
        )

        def process_edit(img, imgs, p, st, g, se, n):
            # An empty seed field arrives as None; treat it like -1 (random).
            actual_seed = int(se) if se is not None and se >= 0 else None
            return edit_image(img, imgs, p, st, g, actual_seed, n, server_url)

        edit_btn.click(
            fn=process_edit,
            inputs=[input_image, extra_images, prompt, steps, guidance_scale, seed, negative_prompt],
            outputs=[output_image],
        )

    return demo


def main():
    parser = argparse.ArgumentParser(description="Qwen-Image-Edit Gradio Demo")
    parser.add_argument("--server", default="http://localhost:8092", help="Server URL")
    parser.add_argument("--port", type=int, default=7861, help="Gradio port")
    parser.add_argument("--share", action="store_true", help="Create public link")
    args = parser.parse_args()

    print(f"Connecting to server: {args.server}")
    demo = create_demo(args.server)
    demo.launch(server_port=args.port, share=args.share)


if __name__ == "__main__":
    main()
openai_chat_client.py
#!/usr/bin/env python3
"""
Qwen-Image-Edit OpenAI-compatible chat client for image editing.

Usage:
    python openai_chat_client.py --input qwen_image_output.png --prompt "Convert to watercolor style" --output output.png
    python openai_chat_client.py --input input.png --prompt "Convert to oil painting" --seed 42
    python openai_chat_client.py --input input1.png input2.png --prompt "Combine these images into a single scene"
"""

import argparse
import base64
from io import BytesIO
from pathlib import Path

import requests
from PIL import Image


def _encode_image_as_data_url(input_path: Path) -> str:
    image_bytes = input_path.read_bytes()
    try:
        img = Image.open(BytesIO(image_bytes))
        mime_type = f"image/{img.format.lower()}" if img.format else "image/png"
    except Exception:
        mime_type = "image/png"
    image_b64 = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{image_b64}"


def edit_image(
    input_image: str | Path | list[str | Path],
    prompt: str,
    server_url: str = "http://localhost:8092",
    height: int | None = None,
    width: int | None = None,
    steps: int | None = None,
    guidance_scale: float | None = None,
    seed: int | None = None,
    negative_prompt: str | None = None,
) -> bytes | None:
    """Edit an image using the chat completions API.

    Args:
        input_image: Path(s) to input image(s). For multi-image editing, pass multiple paths.
        prompt: Text description of the edit
        server_url: Server URL
        height: Output image height in pixels
        width: Output image width in pixels
        steps: Number of inference steps
        guidance_scale: CFG guidance scale
        seed: Random seed
        negative_prompt: Negative prompt

    Returns:
        Edited image bytes or None if failed
    """
    input_images = input_image if isinstance(input_image, list) else [input_image]
    input_paths = [Path(p) for p in input_images]
    for p in input_paths:
        if not p.exists():
            print(f"Error: Input image not found: {p}")
            return None

    # Build user message with text and image
    content: list[dict[str, object]] = [{"type": "text", "text": prompt}]
    for p in input_paths:
        content.append({"type": "image_url", "image_url": {"url": _encode_image_as_data_url(p)}})

    messages = [
        {
            "role": "user",
            "content": content,
        }
    ]

    # Build extra_body with generation parameters
    extra_body = {}
    # Forward the requested output size so height and width take effect.
    if height is not None:
        extra_body["height"] = height
    if width is not None:
        extra_body["width"] = width
    if steps is not None:
        extra_body["num_inference_steps"] = steps
    if guidance_scale is not None:
        extra_body["guidance_scale"] = guidance_scale
    if seed is not None:
        extra_body["seed"] = seed
    if negative_prompt:
        extra_body["negative_prompt"] = negative_prompt

    # Build request payload
    payload = {"messages": messages}
    if extra_body:
        payload["extra_body"] = extra_body

    # Send request
    try:
        response = requests.post(
            f"{server_url}/v1/chat/completions",
            headers={"Content-Type": "application/json"},
            json=payload,
            timeout=300,
        )
        response.raise_for_status()
        data = response.json()

        # Extract image from response
        content = data["choices"][0]["message"]["content"]
        if isinstance(content, list) and len(content) > 0:
            image_url = content[0].get("image_url", {}).get("url", "")
            if image_url.startswith("data:image"):
                _, b64_data = image_url.split(",", 1)
                return base64.b64decode(b64_data)

        print(f"Unexpected response format: {content}")
        return None

    except Exception as e:
        print(f"Error: {e}")
        return None


def main():
    parser = argparse.ArgumentParser(description="Qwen-Image-Edit chat client")
    parser.add_argument("--input", "-i", required=True, nargs="+", help="Input image path(s)")
    parser.add_argument("--prompt", "-p", required=True, help="Edit prompt")
    parser.add_argument("--output", "-o", default="output.png", help="Output file")
    parser.add_argument("--server", "-s", default="http://localhost:8092", help="Server URL")
    parser.add_argument("--height", type=int, default=1024, help="Output image height")
    parser.add_argument("--width", type=int, default=1024, help="Output image width")
    parser.add_argument("--steps", type=int, default=50, help="Inference steps")
    parser.add_argument("--guidance", type=float, default=7.5, help="Guidance scale")
    parser.add_argument("--seed", type=int, help="Random seed")
    parser.add_argument("--negative", help="Negative prompt")
    args = parser.parse_args()

    if len(args.input) == 1:
        print(f"Input: {args.input[0]}")
    else:
        print(f"Inputs ({len(args.input)}): {', '.join(args.input)}")
    print(f"Prompt: {args.prompt}")

    image_bytes = edit_image(
        input_image=args.input,
        prompt=args.prompt,
        server_url=args.server,
        height=args.height,
        width=args.width,
        steps=args.steps,
        guidance_scale=args.guidance,
        seed=args.seed,
        negative_prompt=args.negative,
    )

    if image_bytes:
        output_path = Path(args.output)
        output_path.write_bytes(image_bytes)
        print(f"Image saved to: {output_path}")
        print(f"Size: {len(image_bytes) / 1024:.1f} KB")
    else:
        print("Failed to edit image")
        # Exit with a non-zero status without relying on the interactive-only exit() helper.
        raise SystemExit(1)


if __name__ == "__main__":
    main()