The use of people’s work to train AI models without permission or compensation is a longstanding issue, and notable entities such as The New York Times and Getty Images have joined artists and writers in filing lawsuits against AI developers. OpenAI’s CTO, Mira Murati, added to the controversy by saying she was unsure whether Sora, the company’s new text-to-video AI tool, was trained on data from YouTube, Instagram, or Facebook posts. In response, YouTube’s CEO, Neal Mohan, issued a warning to OpenAI, asserting that using YouTube videos to train Sora would constitute a “clear violation” of the platform’s terms of use.
Mohan emphasized that creators who upload content to the platform expect its terms of service to be upheld, and that those terms prohibit actions such as downloading transcripts or video segments. The uncertainty and controversy extend to how OpenAI trains Sora, ChatGPT, and DALL-E, with reports indicating that the company intends to use YouTube video transcriptions to train GPT-5. By contrast, OpenAI’s competitor Google appears to be following the rules, particularly regarding YouTube, which it owns. Google’s AI model Gemini requires similar data for training, but Mohan asserts that it uses only specific videos, in line with the permissions set out in each creator’s licensing agreement.