Google releases AI video generator Veo 3: It can generate video and sound effects at the same time

At this year’s I/O developer conference, Google announced Veo 3, its third-generation video generation model.

Veo 3 is reportedly positioned as a competitor to OpenAI’s Sora, and it can generate a video and its accompanying audio at the same time.

Google said that Veo 3 can not only generate high-quality videos from text and image prompts, but also add matching audio, such as character dialogue, birdsong, or street traffic noise, for a more realistic audio-visual experience.

Eli Collins, vice president of product at Google DeepMind, said: “From text and image cues to real-world physics and precise lip syncing, Veo 3 is excellent.”

Currently, the model is available primarily to US subscribers of the Google AI Ultra plan, which costs $249.99 per month.

Veo 3 will also be available to enterprise customers through Google’s Vertex AI platform.

Alongside Veo 3, Google released several other generative AI products, including the upgraded image generation model Imagen 4 and the filmmaking tool Flow.

Google also announced an update to Veo 2 that adds support for adding and removing objects in videos through text prompts.

Generative AI is being used more and more widely for image and video creation.

However, it is worth noting that Google’s history in the field of AI image generation has not been smooth sailing.

In 2024, Google faced widespread criticism after the image generation feature in Gemini produced historically inaccurate images, and the company was forced to pull the tool and later relaunch it.

Google co-founder Sergey Brin later admitted that the problem stemmed from “inadequate testing.”
