OpenAI, the developer of ChatGPT, on Thursday launched Sora, its first artificial intelligence (AI)-based text-to-video generation model. The company claims that it can create videos up to 60 seconds long, which is longer than what any competitor in the sector offers, including Google’s Lumiere, which was unveiled last month. Sora is currently available to red teamers (cybersecurity experts who extensively test software to help companies improve it) and to select content creators. The AI company also plans to include Coalition for Content Provenance and Authenticity (C2PA) metadata once the model is deployed in OpenAI products.

Interestingly, the video length OpenAI claims for its newly introduced AI video generator is more than 10 times what its competitors offer. Google’s Lumiere can generate videos that are 5 seconds long, while Runway AI and Pika 1.0 can generate videos that are 4 and 3 seconds long, respectively.

Sam Altman, CEO of OpenAI, shared several videos created by Sora on his X account, along with the prompts used to create them. The resulting videos appear highly detailed with smooth motion, something that other video generators on the market struggle with. According to the company, Sora can create complex scenes with multiple characters, multiple camera angles, specific types of motion, and precise details of the subject and background. This is possible because the text-to-video model draws on both the prompt and an understanding of “how these things exist in the real world.”

Sora is essentially a diffusion model that uses a transformer architecture similar to the GPT models. The data it consumes and generates is represented as units called patches, which are analogous to tokens in a text generation model. According to the company, videos and images are broken down into collections of these smaller units. Using this unified representation of visual data, OpenAI was able to train the video generation model on content with different durations, resolutions, and aspect ratios. In addition to converting text to video, Sora can also create videos from still images.
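To make the patch idea more concrete, here is a minimal Python sketch of how a video could be cut into small blocks that play the role of tokens. It is an illustration only and not OpenAI's implementation: the function name `patchify`, the patch sizes, and the choice to work directly on raw grayscale pixel values are all assumptions made for this example.

```python
from typing import List

# Hypothetical toy representation: video[t][y][x] holds one grayscale value.
Video = List[List[List[int]]]


def patchify(video: Video, pt: int, ph: int, pw: int) -> List[List[int]]:
    """Split a video into non-overlapping blocks spanning time and space.

    Each patch covers pt frames, ph rows, and pw columns, and is flattened
    into a single list, loosely analogous to one token in a text sequence.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                # Collect every pixel inside this block, frame by frame.
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# A toy 4-frame, 4x6 "video" whose pixel values are simply running indices.
video = [[[t * 24 + y * 6 + x for x in range(6)] for y in range(4)] for t in range(4)]
patches = patchify(video, pt=2, ph=2, pw=3)
print(len(patches), len(patches[0]))
```

Because the patch count depends only on the video's dimensions, a longer or wider clip simply yields a longer sequence of patches, which is one way to picture how a single model could handle different durations, resolutions, and aspect ratios.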

However, it is not without flaws. OpenAI acknowledges on its website that the current model has weaknesses: it may have difficulty accurately simulating the physics of complex scenes and may not understand specific instances of cause and effect. For example, a person may take a bite of a cookie, but later the cookie may show no trace of the bite.

To ensure that the AI tool isn’t used to create deepfakes or other harmful content, the company is building tools to help detect misleading content. It also plans to include C2PA metadata in the generated videos, having recently adopted the practice for its DALL-E 3 model. The company is also working with red teamers, especially domain experts in misinformation, hateful content, and bias, to improve the model.

Currently, only red teamers and a small group of visual artists, designers, and filmmakers have access to the model, so that they can provide feedback on it.
