Kling V3 Omni Video
Unified multimodal Kling V3 video model for prompt generation, reference-image guidance, and reference-video editing workflows.
Model Overview
Kling V3 Omni Video combines text-to-video generation with optional image references and video-reference editing pathways. It supports multi-shot JSON prompts and optional native audio generation.
Best At
- Multimodal generation using text, images, and optional video references.
- Edit-like workflows that keep continuity from reference video material.
- Control over output mode, aspect ratio, and duration.
Limitations / Not Good At
- More controls increase setup complexity and validation needs.
- Multi-shot JSON requires careful duration planning.
- Complex multimodal prompts can require iterative tuning.
Ideal Use Cases
- Creative video pipelines needing both generation and editing behavior.
- Scene continuity tasks with reference media guidance.
- Rapid experimentation across prompt-only and reference-driven flows.
Input & Output Format
- Input: required prompt; optional start_image, end_image, reference_images, reference_video, mode, video_reference_type, aspect_ratio, duration, generate_audio, keep_original_sound, and multi_prompt.
- Output: generated video URI returned on response.
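As a rough illustration of the input list above, the sketch below collects the required prompt and any optional inputs into a plain dictionary. The parameter names come from this page; the build_payload helper and the idea of passing inputs as a flat dictionary are assumptions made for illustration, not a documented API.

```python
from typing import Any

# Optional parameter names as listed on this page.
OPTIONAL_KEYS = {
    "start_image", "end_image", "reference_images", "reference_video",
    "mode", "video_reference_type", "aspect_ratio", "duration",
    "generate_audio", "keep_original_sound", "multi_prompt",
}


def build_payload(prompt: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: gather the required prompt plus known optional inputs."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    unknown = set(options) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload: dict[str, Any] = {"prompt": prompt}
    # Skip options left as None so only explicitly set values are included.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

Rejecting unknown keys early is a design choice that catches typos such as aspect_ration before a request is ever sent.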
Performance Notes
- pro mode generally targets higher quality with higher cost.
- video_reference_type changes how reference video is interpreted in generation.
Prompt
String. Main text prompt for generation or editing behavior.
Multi-Shot Prompt (JSON)
String. Optional JSON shot array, for example [{"prompt":"...", "duration":3}].
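Because the multi-shot prompt is a string containing a JSON array, it can help to build it from structured data rather than by hand. The sketch below assumes only the two fields shown in the example above (prompt and duration); make_multi_prompt and total_duration are hypothetical helper names. How per-shot durations relate to the separate duration parameter is not stated on this page, so the total is computed only for your own planning.

```python
import json


def make_multi_prompt(shots: list[tuple[str, int]]) -> str:
    """Hypothetical helper: encode (prompt, seconds) pairs as a JSON shot array string."""
    items = []
    for text, seconds in shots:
        if not text.strip():
            raise ValueError("each shot needs a non-empty prompt")
        if seconds <= 0:
            raise ValueError("each shot needs a positive duration")
        items.append({"prompt": text, "duration": seconds})
    return json.dumps(items)


def total_duration(multi_prompt: str) -> int:
    """Sum the per-shot durations found in a JSON shot array string."""
    return sum(shot["duration"] for shot in json.loads(multi_prompt))
```

For example, make_multi_prompt([("A boat leaves the harbor", 3), ("The boat reaches open sea", 2)]) produces a two-shot array whose total_duration is 5.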
Start Image
String. Optional first frame reference image.
End Image
String. Optional final frame reference image. Requires start image.
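The dependency between the two frame inputs (an end image requires a start image) is easy to check before submitting a job. A minimal sketch, assuming both inputs are passed as optional strings; check_frame_references is a hypothetical name and not part of any documented API.

```python
from typing import Optional


def check_frame_references(start_image: Optional[str], end_image: Optional[str]) -> None:
    """Raise ValueError if an end image is given without a start image."""
    if end_image and not start_image:
        raise ValueError("end_image requires start_image")
```

Calling it with only a start image, with both images, or with neither passes silently; calling it with only an end image raises.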
Reference Images
String. One or more optional reference images for style, subject, or scene guidance.
Reference Video
String. Optional reference video for style guidance or base-video editing.
Mode
String. Generation quality mode.
Default: pro
Video Reference Type
String. Controls whether reference video acts as style guidance or editable base footage.
Default: feature
Aspect Ratio
String. Aspect ratio used when frame or video references do not override framing.
Default: 16:9
Duration
Number. Target video duration in seconds.
Default: 5
Generate Audio
Boolean. Generate native audio with output video.
Default: false
Keep Original Sound
Boolean. Keep sound from reference video when reference video is used.
Default: true
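The values shown with the parameters above (pro, feature, 16:9, 5, false, true) can be kept in one place and overridden per request. The sketch below treats those values as defaults, which is an assumption based on how they appear on this page; with_defaults is a hypothetical helper.

```python
from typing import Any

# Values shown next to each parameter on this page, assumed here to be defaults.
DEFAULTS: dict[str, Any] = {
    "mode": "pro",
    "video_reference_type": "feature",
    "aspect_ratio": "16:9",
    "duration": 5,
    "generate_audio": False,
    "keep_original_sound": True,
}


def with_defaults(overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a new settings dict where explicit overrides win over the assumed defaults."""
    return {**DEFAULTS, **overrides}
```

For instance, with_defaults({"duration": 10, "generate_audio": True}) keeps mode at pro and aspect_ratio at 16:9 while changing only the two overridden settings.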
Output
Inferred. Generated video URI.
Nodespell Team
Type
Node
Status
Official
Package
Nodespell AI
Category
AI / Video / Kwaivgi