Phenaki is a new technology that can generate realistic videos based on text descriptions. Developed by Google Research, it uses a combination of machine learning models to compress videos into tokens, translate text into tokens, and de-tokenize tokens into videos. Phenaki can create videos of varying lengths and qualities based on a series of text…

What is Phenaki?

Phenaki is a breakthrough in video synthesis because it can handle open-domain and time-varying prompts, in contrast to earlier approaches that were limited by data availability and computational cost. Phenaki can also generate videos from a still image and a prompt, such as enlarging a cat's eyes or making it yawn. Phenaki has many potential applications, such as entertainment, education, storytelling, and art. To learn more about Phenaki, you can read the paper published by Google Research or watch some of the example videos Phenaki generates on its website or YouTube channel.


Phenaki can generate videos of variable length and quality, up to two minutes long. It can handle open-domain and time-varying prompts, such as stories or descriptions. It leverages a large number of image-text pairs and a smaller number of video-text examples to generalize beyond what is available in video datasets.
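
To make the token-based pipeline more concrete, here is a toy sketch in Python of the three stages mentioned at the top of this article: compressing video into tokens, turning text into tokens, and de-tokenizing tokens back into video. It is not Phenaki's actual code. Every function name, the codebook size, and the stand-in "generator" are assumptions made for illustration, and the sketch further assumes that each new prompt in a story extends the video tokens generated so far.

```python
import hashlib
import random

CODEBOOK_SIZE = 16      # hypothetical number of distinct video tokens
TOKENS_PER_PROMPT = 4   # hypothetical number of video tokens added per prompt


def text_to_tokens(prompt):
    """Stand-in text encoder: hash each word to a small integer id."""
    return [int(hashlib.sha256(w.encode()).hexdigest(), 16) % 1000
            for w in prompt.lower().split()]


def video_to_tokens(frames):
    """Stand-in video compressor: quantize each frame's mean brightness (0..255)."""
    return [int(sum(f) / len(f)) * CODEBOOK_SIZE // 256 for f in frames]


def tokens_to_video(tokens, pixels_per_frame=4):
    """Stand-in de-tokenizer: each token becomes a flat frame at its bucket midpoint."""
    bucket = 256 // CODEBOOK_SIZE
    return [[t * bucket + bucket // 2] * pixels_per_frame for t in tokens]


def generate_video_tokens(text_tokens, previous_tokens):
    """Stand-in generator: seeded pseudo-random tokens, conditioned on the
    prompt tokens and on the video tokens produced so far (not a real model)."""
    seed = sum(text_tokens) + sum(previous_tokens) + len(previous_tokens)
    rng = random.Random(seed)
    return [rng.randrange(CODEBOOK_SIZE) for _ in range(TOKENS_PER_PROMPT)]


def generate_story(prompts):
    """Extend one video, prompt by prompt, then decode all tokens into frames."""
    video_tokens = []
    for prompt in prompts:
        video_tokens += generate_video_tokens(text_to_tokens(prompt), video_tokens)
    return tokens_to_video(video_tokens)


frames = generate_story(["a cat sits on a sofa", "the cat yawns"])
print(len(frames), "frames generated from 2 prompts")
```

In this sketch the video grows by a fixed number of tokens per prompt, which is one simple way to picture how a series of prompts could produce a video of variable length.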


However, training and running the model is computationally expensive and resource-intensive. It may produce unrealistic or inaccurate results for certain prompts or domains. It may also raise ethical or legal issues regarding the use and ownership of the generated videos.