Zhipu AI Upgrades “Qingying AI” Model with Automatic Sound Effects

Topics: News. Author: CHINESEAIVIDEO

Zhipu AI has announced a significant upgrade to its “Qingying” AI video generation model, adding automatic, AI-generated sound effects to its video creation capabilities. Powered by the upgraded CogVideoX v1.5 model, Qingying can create 10-second videos at up to 4K resolution and 60 frames per second, with sound effects that match the video content. The upgrade gives creators across industries a single tool that produces video and audio together.

Core Features of Qingying AI Video

  1. Cinema-Grade Video Quality
    Qingying supports 10-second videos with up to 4K resolution and 60 FPS, delivering a cinematic experience with unmatched clarity and smooth motion.
  2. Flexible Aspect Ratios
    Breaking free from the limitations of fixed aspect ratios, Qingying allows users to generate videos in custom proportions, enabling seamless adaptation to various playback platforms and creative needs.
  3. Multi-Channel Video Outputs
    From a single input, either a text prompt or an image, users can generate four different videos at once, which makes it quicker to compare variations and pick the best result.
  4. Integrated AI Sound Effects
    Using the CogSound model, Qingying generates sound effects matched to the video content. This moves AI-generated video past its “silent era” and toward fully audiovisual output.
  5. User-Friendly Workflow
    Qingying simplifies the video creation process:

    • Script Generation: The GLM-4 Plus system creates detailed movie scripts from user-provided themes.
    • Visual Input: Users upload high-quality images generated by AI.
    • Prompt Integration: With customizable prompts, users can fine-tune video effects.
    • Video Output: Qingying produces professional-grade videos effortlessly.
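
The four steps above can be pictured as a simple pipeline. The sketch below is purely illustrative: every class, function, and field name in it is hypothetical, the stage bodies are stand-ins that only pass strings along, and none of it reflects Zhipu AI's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class VideoJob:
    """Hypothetical container for one pass through the four workflow steps."""
    theme: str
    script: str = ""
    image_path: str = ""
    prompt: str = ""
    outputs: list = field(default_factory=list)


def generate_script(job: VideoJob) -> VideoJob:
    # Stand-in for step 1: a script is produced from the user's theme.
    job.script = f"Script for theme: {job.theme}"
    return job


def attach_image(job: VideoJob, image_path: str) -> VideoJob:
    # Stand-in for step 2: the user supplies an AI-generated image.
    job.image_path = image_path
    return job


def add_prompt(job: VideoJob, prompt: str) -> VideoJob:
    # Stand-in for step 3: a custom prompt that fine-tunes the effects.
    job.prompt = prompt
    return job


def render(job: VideoJob, variants: int = 4) -> VideoJob:
    # Stand-in for step 4: the article says one input yields four videos.
    job.outputs = [f"video_{i + 1}.mp4" for i in range(variants)]
    return job


job = VideoJob(theme="city at night")
job = generate_script(job)
job = attach_image(job, "frame.png")
job = add_prompt(job, "slow pan, light rain")
job = render(job)
print(job.outputs)
```

Keeping each step as its own function mirrors the order in which the article lists them; a real integration would replace each stand-in with a call to the corresponding service.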

Technical Innovations Driving Qingying

CogVideoX v1.5 Model

CogVideoX v1.5 introduces a host of advancements:

  • Advanced Semantic Understanding: The model can comprehend complex prompts and generate videos with greater narrative and thematic depth.
  • Improved Aesthetic Quality: Videos exhibit enhanced realism, fluid motion, and aesthetic consistency.
  • Expanded Video Generation Capability: Supports 5- or 10-second videos at up to 768P resolution and 16 frames per second, in any aspect ratio.
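
To make these figures concrete, the sketch below works out frame counts and frame sizes. It assumes that the figure of 16 refers to 16 frames per second and that "768P" refers to the shorter side of the frame. The helper names and the rounding to even pixel counts are illustrative choices made here, not published specifications.

```python
from fractions import Fraction


def frame_count(fps: int, seconds: int) -> int:
    """Total number of frames in a clip of the given length."""
    return fps * seconds


def frame_size(short_side: int, aspect_w: int, aspect_h: int) -> tuple:
    """Width and height for an aspect ratio, keeping the shorter side fixed.

    The longer side is rounded to the nearest even number of pixels
    (an assumption made here; many video codecs prefer even dimensions).
    """
    ratio = Fraction(max(aspect_w, aspect_h), min(aspect_w, aspect_h))
    long_side = round(short_side * ratio / 2) * 2
    if aspect_w >= aspect_h:
        return (long_side, short_side)
    return (short_side, long_side)


for seconds in (5, 10):
    print(f"{seconds} s at 16 fps: {frame_count(16, seconds)} frames")
print(f"10 s at 60 fps: {frame_count(60, 10)} frames")
for w, h in ((16, 9), (9, 16), (4, 3), (1, 1)):
    print(f"{w}:{h} with a 768-pixel short side: {frame_size(768, w, h)}")
```

Under these assumptions a 10-second clip at 16 frames per second contains 160 frames, while the 60 FPS figure quoted for Qingying implies 600 frames for the same duration.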

CogSound Integration

  • CogSound generates audio effects that align with video content, ensuring seamless audio-visual synchronization.
  • Powered by latent-space diffusion models and cross-attention mechanisms, CogSound delivers precise, time-aligned soundscapes for a rich user experience.
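
The article does not describe CogSound's internals, so the following is only a generic sketch of cross-attention, the mechanism named above. It assumes that audio latent positions act as queries and per-frame video features act as keys and values; the vectors are toy data and the function names are made up for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(audio_queries, video_keys, video_values):
    """Scaled dot-product cross-attention.

    Each audio query is compared with every video key; the resulting
    weights mix the video values into one context vector per query.
    """
    dim = len(video_keys[0])
    out_dim = len(video_values[0])
    contexts = []
    for q in audio_queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in video_keys
        ]
        weights = softmax(scores)
        contexts.append([
            sum(w * v[j] for w, v in zip(weights, video_values))
            for j in range(out_dim)
        ])
    return contexts


# Toy data: two audio latent positions attending over three video frames.
audio_queries = [[1.0, 0.0], [0.0, 1.0]]
video_keys = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
video_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

contexts = cross_attention(audio_queries, video_keys, video_values)
for row in contexts:
    print([round(x, 3) for x in row])
```

Because each value vector here is one-hot, every output row shows directly how much weight an audio position placed on each video frame: the first position draws mostly on frame one and the second on frame two.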

State-of-the-Art Architecture

  • Expert Transformer modules and causal 3D convolutions ensure coherent and natural transitions in generated videos.
  • Adaptive layer normalization enhances video realism and continuity.
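
The article gives no layer-level detail, so the sketch below shows only the general ideas behind two of the named components. It illustrates the past-only padding that makes a convolution causal, shown on the time axis alone, and a layer normalization whose scale and shift are supplied by a conditioning signal. Function names and toy values are assumptions for illustration, not the CogVideoX implementation.

```python
import math


def causal_temporal_conv(frames, kernel):
    """1-D convolution along time with padding on the past side only.

    Output t is a weighted sum of frames t-(k-1) .. t, so it never
    depends on a future frame. A causal 3D convolution applies this
    rule on the time axis and ordinary padding on height and width.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(frames)
    return [
        sum(kernel[i] * padded[t + i] for i in range(k))
        for t in range(len(frames))
    ]


def adaptive_layer_norm(x, scale, shift, eps=1e-5):
    """Layer norm whose scale and shift come from a conditioning signal.

    In a conditioned model, `scale` and `shift` would be predicted from
    something like a timestep or text embedding rather than being fixed.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    normed = [(v - mean) / math.sqrt(var + eps) for v in x]
    return [n * (1.0 + s) + b for n, s, b in zip(normed, scale, shift)]


frames = [1.0, 2.0, 3.0, 4.0]
kernel = [0.25, 0.25, 0.5]  # the last weight applies to the current frame
before = causal_temporal_conv(frames, kernel)
after = causal_temporal_conv(frames[:3] + [100.0], kernel)
# Changing only the last frame leaves every earlier output untouched.
print(before[:3] == after[:3])

print(adaptive_layer_norm([1.0, 2.0, 3.0], scale=[0.0] * 3, shift=[0.5] * 3))
```

The causality check is the useful part of the demo: because earlier outputs never read later frames, a model built this way can treat a single image as the first frame of a video without special handling.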

Application Scenarios

  1. Advertising Production
    Create dynamic, professional-grade video advertisements tailored to diverse platforms.
  2. Short Video Content
    Streamline content creation for social media, empowering influencers and marketers with high-quality visuals.
  3. Creative Projects
    Design custom animations, memes, and video snippets that captivate audiences.
  4. Film Previsualization
    Generate visual concepts for movie scenes, facilitating the creative process for filmmakers.

Industry Impact and Vision

With Qingying, Zhipu AI has made significant strides in advancing AI video generation and has released the CogVideoX v1.5 model as open source, fostering collaboration and innovation within the global AI community. By integrating video and audio generation into a seamless pipeline, Qingying represents a major leap forward in multi-modal AI technologies.

A Game-Changer for Short-Form Video Production

  • Qingying democratizes access to high-quality video production, enabling creators to produce cinematic-quality content without specialized tools or expertise.
  • By automating key aspects of video creation, the model significantly reduces production time while maintaining high creative standards.

Future Prospects

Zhipu AI plans to expand its multi-modal offerings with the release of CogMusic, a music-generation model that complements its existing GLM-4-Voice system for human voice synthesis. Together, these innovations will enable Zhipu AI to cover all aspects of video and audio creation, further simplifying workflows for content creators.


Conclusion

The launch of the Qingying AI video system highlights Zhipu AI’s commitment to empowering users with cutting-edge AI solutions. With its groundbreaking features—such as 4K resolution, 60 FPS capability, and integrated sound effects—Qingying is set to transform industries ranging from advertising to entertainment and beyond.
