Thursday 14 March 2024

AI takes a giant LEAP

Eki sent me a link to three SORA* videos, and one he made for a competition. AI did the script, narration, video and edit with a prompt. He said some day AI could be doing his job. People who thought ‘what a hoot’ when Midjourney came out a couple of years ago are out of a job today. Better check if you’re on the endangered species list. For the time being OpenAI is keeping Sora under wraps until it can figure out how to control its use. As they say, ‘good luck with…

There’s a monster in our midst, smarter than US. Geoffrey Hinton, the godfather of AI, is spooked by his creation and hopes to hell someone figures out how to rein it in. About half the world’s population lives in countries holding elections in 2024. Fake AI crap will flood social media. What’s to stop it? Trump & Co. are hyperventilating.

Eki says AI is in its pre-teen phase. You can bet that it’s going to get smarter and more versatile. Without answers, it’s better to sit back, pour a glass of something strong, smoke a joint, meditate. Or as Gestalt-ers say ad infinitum: ‘stay in the here and now’. AI is here for better or worse.

*Japanese word for ‘sky’

Sources: Eki, New York Times, Economist

Next week: ‘Double, double, toil and trouble…’:* third-party candidates for US president 2024

*Three witches in Macbeth

Note: When Maggy sent me this blog, it took me a few seconds to remember which giant leap we are talking about - they come at a pace where "a week ago" is ancient history. Since the SORA announcement, there have been many other giant leaps already.

There's a fully autonomous software engineer called Devin, an AI that can not only write code, but also do all the other steps required in making applications: research, plan, test, compile and deploy. Essentially, it is a computer program that can make computer programs without human intervention. Claude 3 was announced, and it apparently outperforms ChatGPT 4. Stable Diffusion 3 is coming too, and it seems to be better than DALL-E 3 or Midjourney - and it also does videos. And then there's also another breakthrough with OpenAI brains - Figure-01 (video above) is a humanoid robot that is not only dexterous, but can also see, hear - and think. Essentially, it is the robot from science fiction films - a real-life cousin of C-3PO (before you ask, it's the golden robot from Star Wars, Maggy). As far as video goes, the "Chinese Amazon", Alibaba, announced a new model called EMO AI. It can take any image, and any song or speech, and make the image perform to the audio.

But back to SORA. Sora is a video-generating AI that can make long (up to a minute) videos that are nearly indistinguishable from actual footage, with a lot of motion and even emotion. I'm of course already using the state-of-the-art-available-now video generators in my work. They are nowhere near SORA - yet - but even as-is they are good enough for many tasks: creating elements for further manipulation, even full-screen videos as long as the clip is short (2-4 seconds or so) and the motion is minimal.

SORA could also be used to create, well, pretty much anything, including material for nefarious purposes. So, just like they did with ChatGPT, OpenAI has a period of testing and aligning the model to acceptable values. We will get our hands on it some day, but it could take months. But when we do, it will be interesting, not least for the future of my discipline, visual effects production.




