Ovi
Ovi Text to Video is a model that generates videos from text prompts.
It allows for control over the video content through text descriptions, including specifying the desired audio characteristics.
Model Overview
The Ovi Text to Video model provides a unified paradigm for audio-video generation: a single text prompt describes both the visual content and the desired audio characteristics of the video.
Best At
- Generating videos that align with the provided text prompts.
- Creating videos with specified audio characteristics.
Limitations / Not Good At
- May produce videos with jitter, blur, or distortion if negative prompts are not used effectively.
- Struggles to render hands accurately.
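Since the artifacts listed above depend on how negative prompts are used, it can help to build the negative prompt programmatically. The following is a minimal sketch, not part of the node itself: `build_negative_prompt` is a hypothetical helper, the base terms are the negative prompt values shown on this page, and "extra fingers" is only an illustrative extra term whose effect on the model is not documented here.

```python
# Video negative prompt terms shown on this page.
DEFAULT_VIDEO_NEGATIVES = ["jitter", "bad hands", "blur", "distortion"]


def build_negative_prompt(extra_terms=(), base=DEFAULT_VIDEO_NEGATIVES):
    """Merge base and extra terms into one comma-separated string.

    Hypothetical helper: terms are lowercased, stripped, and de-duplicated
    while keeping their first-seen order.
    """
    seen = set()
    merged = []
    for term in list(base) + list(extra_terms):
        cleaned = term.strip().lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            merged.append(cleaned)
    return ", ".join(merged)


negative_prompt = build_negative_prompt(["Blur", "extra fingers"])
print(negative_prompt)  # jitter, bad hands, blur, distortion, extra fingers
```

The resulting string can be passed as the Negative Prompt input; the same helper works for the audio negative prompt if a different `base` list is supplied.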
Ideal Use Cases
- Creating short video clips for social media.
- Generating visual content for educational materials.
- Prototyping video scenes for storyboarding.
Input & Output Format
- Input: Text prompt (string), optional negative prompts, number of inference steps, audio negative prompt, seed, and resolution.
- Output: A video file (mp4) and the seed used for generation.
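The inputs listed above can be collected and checked before they are sent to the node. This is a minimal sketch under stated assumptions: `OviInputs` and its field names are hypothetical (this page does not document a programmatic API), and the default values are the ones displayed next to each input on this page.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Resolutions listed on this page for the Resolution input.
ALLOWED_RESOLUTIONS = (
    "512x992", "992x512", "960x512", "512x960", "720x720", "448x1120",
)


@dataclass
class OviInputs:
    """Hypothetical container for the node inputs; not an official API."""

    prompt: str
    negative_prompt: str = "jitter, bad hands, blur, distortion"
    audio_negative_prompt: str = "robotic, muffled, echo, distorted"
    num_inference_steps: int = 30
    seed: Optional[int] = None  # None: a random seed is chosen
    resolution: str = "992x512"

    def __post_init__(self) -> None:
        # Basic checks so that obviously invalid inputs fail early.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if self.num_inference_steps < 1:
            raise ValueError("num_inference_steps must be at least 1")

    def to_dict(self) -> dict:
        return asdict(self)


inputs = OviInputs(prompt="A street musician plays violin in the rain")
print(inputs.to_dict())
```

Because the output includes the seed used for generation, that value can be stored and passed back in as `seed` to reproduce a result.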
Performance Notes
- The number of inference steps affects both quality and generation time: more steps take longer to generate and typically improve quality, up to a point.
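One way to see how the step count trades off against generation time is to run the same prompt and seed at several step counts and time each run. The sketch below assumes a caller-supplied `generate` function that takes a dict of inputs; that function, the input key names, and `fake_generate` (a stand-in that only sleeps) are all hypothetical and do not correspond to a documented API.

```python
import time
from typing import Callable, Dict, List


def sweep_steps(
    generate: Callable[[Dict], object],
    base_inputs: Dict,
    step_counts: List[int],
) -> List[Dict]:
    """Run `generate` once per step count and record the wall-clock time."""
    results = []
    for steps in step_counts:
        # Copy the base inputs so only the step count changes between runs.
        run_inputs = dict(base_inputs, num_inference_steps=steps)
        start = time.perf_counter()
        generate(run_inputs)
        elapsed = time.perf_counter() - start
        results.append({"num_inference_steps": steps, "seconds": elapsed})
    return results


def fake_generate(run_inputs: Dict) -> None:
    # Stand-in for a real generation call: sleeps in proportion to the steps.
    time.sleep(0.001 * run_inputs["num_inference_steps"])


report = sweep_steps(
    fake_generate,
    {"prompt": "A campfire crackles at night", "seed": 42},
    [10, 20, 30],
)
for row in report:
    print(row["num_inference_steps"], round(row["seconds"], 4))
```

Keeping the seed fixed across the sweep means any visible quality difference between runs comes from the step count rather than from random variation.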
Prompt
String. The text prompt to guide video generation.
Resolution
String. Resolution of the generated video in WxH format. One of 512x992, 992x512, 960x512, 512x960, 720x720, or 448x1120. Default: 992x512.
Num Inference Steps
Number. The number of inference steps. Default: 30.
Audio Negative Prompt
String. Negative prompt for audio generation. Default: robotic, muffled, echo, distorted.
Negative Prompt
String. Negative prompt for video generation. Default: jitter, bad hands, blur, distortion.
Seed
Number. Random seed for reproducibility. If None, a random seed is chosen. Default: -1.
Output
Inferred
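Because the Resolution input accepts only a fixed set of values, a small helper can map a desired aspect ratio to the nearest supported option. This is a sketch only: `parse_resolution` and `closest_resolution` are hypothetical helpers, and the code assumes the first number in each value is the width and the second is the height.

```python
from typing import Tuple

# Resolutions listed on this page for the Resolution input.
ALLOWED_RESOLUTIONS = [
    "512x992", "992x512", "960x512", "512x960", "720x720", "448x1120",
]


def parse_resolution(value: str) -> Tuple[int, int]:
    """Split a 'WxH' string into (width, height); assumes width comes first."""
    width, height = value.lower().split("x")
    return int(width), int(height)


def closest_resolution(target_aspect: float) -> str:
    """Return the allowed resolution whose width/height ratio is nearest."""

    def distance(value: str) -> float:
        width, height = parse_resolution(value)
        return abs(width / height - target_aspect)

    return min(ALLOWED_RESOLUTIONS, key=distance)


print(closest_resolution(16 / 9))  # 960x512
print(closest_resolution(9 / 16))  # 512x960
print(closest_resolution(1.0))     # 720x720
```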
Type
Node
Status
Official
Package
Nodespell AI
Category
AI / Video / Character Ai