Qwen2.5-VL-3B-Instruct
Qwen2.5-VL-3B-Instruct is a multimodal model that processes images and text together to perform visual reasoning, captioning, question answering, and structured output tasks. It integrates a vision encoder with an instruction-tuned language backbone to support complex visual understanding and interactive multimodal responses.
Examples
Ready-to-use code snippets for common workflows
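The requests in the examples that follow all share one shape: a `taskType` of `caption`, a unique `taskUUID`, the model identifier `runware:152@1`, a `prompt`, and an `inputs.image` URL. As a minimal sketch, a payload of that shape could be assembled in Python as shown here. The helper name `build_caption_task` is hypothetical, generating the identifier with `uuid.uuid4()` is an assumption, and transport and authentication are not described on this page, so they are left out.

```python
import json
import uuid
from typing import Optional

MODEL_ID = "runware:152@1"  # model identifier used in the examples on this page


def build_caption_task(prompt: str, image_url: str,
                       task_uuid: Optional[str] = None) -> dict:
    """Build a request body shaped like the caption examples (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "taskType": "caption",
        # Assumption: a freshly generated UUID v4 string is an acceptable task ID.
        "taskUUID": task_uuid or str(uuid.uuid4()),
        "model": MODEL_ID,
        "prompt": prompt,
        "inputs": {"image": image_url},
    }


task = build_caption_task(
    prompt="Write a concise, vivid caption in 1-2 sentences describing "
           "the main action, setting, and mood of this image.",
    image_url="https://assets.runware.ai/assets/inputs/7590c546-9819-4c3b-b3a7-3b16838e1222.jpg",
)
print(json.dumps(task, indent=2))
```

Keeping the payload construction in one function makes it easy to vary only the prompt and image while the remaining fields stay consistent with the documented examples.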
Taxidermy Workshop Curiosity Cabinet
In a dimly lit, sepia-toned laboratory filled with shelves of jars containing various specimens, two birds stand on a wooden desk. The desk is cluttered with tools, including a magnifying glass, a small knife, and a green and red object that resembles a toy car. A large glass dome encases a collection of feathers and other natural items. To the left, a book lies open, while to the right, a skull and other anatomical models are displayed.
Request:
{
  "taskType": "caption",
  "taskUUID": "74c7c126-bc3d-4e55-b8fa-36e20e2a4e62",
  "model": "runware:152@1",
  "prompt": "Write a precise 5-7 sentence caption describing the scene, highlighting the main subject, notable objects, materials, lighting, and overall atmosphere. Mention spatial relationships clearly and avoid speculation beyond what is visible.",
  "inputs": {
    "image": "https://assets.runware.ai/assets/inputs/465571c4-411b-476e-a975-6e8cc06c397d.jpg"
  }
}
Response:
{
  "taskType": "imageCaption",
  "taskUUID": "74c7c126-bc3d-4e55-b8fa-36e20e2a4e62",
  "text": "In a dimly lit, sepia-toned laboratory filled with shelves of jars containing various specimens, two birds stand on a wooden desk. The desk is cluttered with tools, including a magnifying glass, a small knife, and a green and red object that resembles a toy car. A large glass dome encases a collection of feathers and other natural items. To the left, a book lies open, while to the right, a skull and other anatomical models are displayed.",
  "cost": 0.0026
}

Harbor Market Eclipse Tableau
In this vibrant seaside town, a dramatic solar eclipse casts a radiant halo over the bustling market, where vendors sell fresh produce and seafood. Children play near a makeshift telescope, while adults tend to their stalls under the enchanting sky. The town's architecture features colorful awnings and signs in Arabic script, adding to the lively atmosphere. A large ship sails past the dock, and seagulls soar above, enhancing the scene's dynamic energy.
Request:
{
  "taskType": "caption",
  "taskUUID": "3e62120b-9210-4117-9231-9f6bdbfecc8a",
  "model": "runware:152@1",
  "prompt": "Write a precise, vivid caption for this image in 2-3 sentences. Identify the main setting, the unusual sky event, notable human activity, and at least five specific visual details without guessing hidden context.",
  "inputs": {
    "image": "https://assets.runware.ai/assets/inputs/13dcd4da-6c25-4422-b900-b3abab18ce24.jpg"
  }
}
Response:
{
  "taskType": "imageCaption",
  "taskUUID": "3e62120b-9210-4117-9231-9f6bdbfecc8a",
  "text": "In this vibrant seaside town, a dramatic solar eclipse casts a radiant halo over the bustling market, where vendors sell fresh produce and seafood. Children play near a makeshift telescope, while adults tend to their stalls under the enchanting sky. The town's architecture features colorful awnings and signs in Arabic script, adding to the lively atmosphere. A large ship sails past the dock, and seagulls soar above, enhancing the scene's dynamic energy.",
  "cost": 0.0026
}

Salt-Flat Breakfast Balloonists
Two individuals are enjoying a picnic on a vast, snowy plain under a clear sky. They are seated on patterned rugs, surrounded by a colorful hot air balloon festival. The balloons are vibrant, with hues of red, yellow, blue, and green, floating above them. The setting sun casts a warm glow over the scene, creating a serene and picturesque atmosphere.
Request:
{
  "taskType": "caption",
  "taskUUID": "e340cc77-2772-4f13-8297-1a7dffcf525c",
  "model": "runware:152@1",
  "prompt": "Write a vivid, accurate caption for this image in 2-3 sentences. Describe the main subjects, setting, notable objects, colors, lighting, and the overall mood. Avoid speculation beyond what is clearly visible.",
  "inputs": {
    "image": "https://assets.runware.ai/assets/inputs/02da3f55-7a4b-4064-a905-ca325e849a70.jpg"
  }
}
Response:
{
  "taskType": "imageCaption",
  "taskUUID": "e340cc77-2772-4f13-8297-1a7dffcf525c",
  "text": "Two individuals are enjoying a picnic on a vast, snowy plain under a clear sky. They are seated on patterned rugs, surrounded by a colorful hot air balloon festival. The balloons are vibrant, with hues of red, yellow, blue, and green, floating above them. The setting sun casts a warm glow over the scene, creating a serene and picturesque atmosphere.",
  "cost": 0.0019
}

Rooftop Pigeon Racing Finale
A lively rooftop gathering features two men flying colorful kites amidst a flock of pigeons, with a backdrop of traditional buildings and a minaret silhouetted against the setting sun.
Request:
{
  "taskType": "caption",
  "taskUUID": "66a9623a-dbd9-4db9-8975-592584b3d844",
  "model": "runware:152@1",
  "prompt": "Write a concise, vivid caption in 1-2 sentences describing the main action, setting, and mood of this image.",
  "inputs": {
    "image": "https://assets.runware.ai/assets/inputs/7590c546-9819-4c3b-b3a7-3b16838e1222.jpg"
  }
}
Response:
{
  "taskType": "imageCaption",
  "taskUUID": "66a9623a-dbd9-4db9-8975-592584b3d844",
  "text": "A lively rooftop gathering features two men flying colorful kites amidst a flock of pigeons, with a backdrop of traditional buildings and a minaret silhouetted against the setting sun.",
  "cost": 0.0013
}
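
Each response above echoes the `taskUUID` of its request and reports a `text` caption and a `cost`. The following is a minimal sketch, assuming responses arrive as parsed JSON objects with those fields, of pairing responses with their requests by `taskUUID` and totalling the cost of the four examples. The function name `match_responses` is hypothetical, the `text` field is omitted for brevity, and the page does not state the unit of `cost`.

```python
from decimal import Decimal

# taskUUID -> example title, taken from the four examples above
requests_by_uuid = {
    "74c7c126-bc3d-4e55-b8fa-36e20e2a4e62": "Taxidermy Workshop Curiosity Cabinet",
    "3e62120b-9210-4117-9231-9f6bdbfecc8a": "Harbor Market Eclipse Tableau",
    "e340cc77-2772-4f13-8297-1a7dffcf525c": "Salt-Flat Breakfast Balloonists",
    "66a9623a-dbd9-4db9-8975-592584b3d844": "Rooftop Pigeon Racing Finale",
}

# Responses reduced to the fields used here ("text" left out for brevity)
responses = [
    {"taskType": "imageCaption", "taskUUID": "74c7c126-bc3d-4e55-b8fa-36e20e2a4e62", "cost": 0.0026},
    {"taskType": "imageCaption", "taskUUID": "3e62120b-9210-4117-9231-9f6bdbfecc8a", "cost": 0.0026},
    {"taskType": "imageCaption", "taskUUID": "e340cc77-2772-4f13-8297-1a7dffcf525c", "cost": 0.0019},
    {"taskType": "imageCaption", "taskUUID": "66a9623a-dbd9-4db9-8975-592584b3d844", "cost": 0.0013},
]


def match_responses(requests_by_uuid, responses):
    """Return (title, cost) pairs; raise KeyError if a response has no request."""
    matched = []
    for response in responses:
        task_uuid = response["taskUUID"]
        if task_uuid not in requests_by_uuid:
            raise KeyError(f"no request with taskUUID {task_uuid}")
        # Convert through str so 0.0026 is summed as exactly 0.0026.
        cost = Decimal(str(response["cost"]))
        matched.append((requests_by_uuid[task_uuid], cost))
    return matched


matched = match_responses(requests_by_uuid, responses)
total = sum((cost for _, cost in matched), Decimal("0"))
for title, cost in matched:
    print(f"{title}: {cost}")
print(f"total: {total}")  # total: 0.0084
```

Summing with `Decimal` rather than `float` avoids binary rounding noise when many small per-task costs are accumulated.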