Skywork

Multimodal foundation models spanning video generation, reasoning, and agent systems

Skywork develops multimodal foundation models across video generation, audiovisual synthesis, vision-language reasoning, and agentic systems. Its SkyReels series focuses on cinematic video generation and editing, while Skywork's broader research spans language models, multimodal understanding, and open model releases. As a Runware provider, Skywork adds multimodal models for high-control video and reasoning workflows.

Models by Skywork

SkyReels V4

SkyReels V4 is a unified multimodal video foundation model for joint video-audio generation, inpainting, and editing. It accepts text, images, video clips, masks, and audio references as inputs, and produces cinematic output at up to 1080p, 32 FPS, and 15 seconds with synchronized audio. This makes it suitable for prompt-driven generation as well as guided editing workflows.
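
The output limits quoted above can be turned into a small pre-flight check before a request is sent. The sketch below is illustrative only: `ClipSpec`, `check_clip`, and `max_frames` are hypothetical names, not part of any Runware or Skywork API, and reading "1080p" as a maximum frame height of 1080 pixels is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass

# Limits taken from the SkyReels V4 description; all names here are hypothetical.
MAX_HEIGHT = 1080      # assumption: "1080p" means at most 1080 pixels of height
MAX_FPS = 32
MAX_DURATION_S = 15


@dataclass
class ClipSpec:
    """Hypothetical output settings for a clip; not an official request type."""
    height: int
    fps: int
    duration_s: float


def check_clip(spec: ClipSpec) -> list[str]:
    """Return a list of problems; an empty list means the spec is within the stated limits."""
    problems = []
    if not 0 < spec.height <= MAX_HEIGHT:
        problems.append(f"height {spec.height} is outside 1..{MAX_HEIGHT}")
    if not 0 < spec.fps <= MAX_FPS:
        problems.append(f"fps {spec.fps} is outside 1..{MAX_FPS}")
    if not 0 < spec.duration_s <= MAX_DURATION_S:
        problems.append(f"duration {spec.duration_s}s is outside 0..{MAX_DURATION_S}s")
    return problems


def max_frames() -> int:
    """Largest frame count implied by the stated limits (frames per second times seconds)."""
    return MAX_FPS * MAX_DURATION_S


if __name__ == "__main__":
    print(check_clip(ClipSpec(height=1080, fps=32, duration_s=15)))
    print(check_clip(ClipSpec(height=2160, fps=60, duration_s=20)))
    print(max_frames())
```

Checking locally like this only catches requests that exceed the published ceilings; the actual accepted resolutions, frame rates, and durations should be confirmed against the provider's model documentation.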