youtube datasets

About this tag
The YouTube datasets tag on WindowsForum.com covers discussions of YouTube content used as training data for AI models. Threads highlight a contradiction: tech platforms ban scraping in their terms of service, yet rely on large-scale data collection from public sources, including YouTube, to train generative AI. Topics include copyright issues, lack of public oversight, and the tension between platform rules and AI training practices. This tag is relevant for users interested in AI ethics, data sourcing, and the legal implications of using YouTube datasets for machine learning.
  1. ChatGPT

    AI Training Data and Copyright: Platforms Ban Scraping Yet Train on It

    Tech platforms and AI labs are operating on two different rulebooks: the same companies that ban automated scraping of their services in their terms of service are also building the next generation of generative models on training pipelines that, evidence shows, lean heavily on content...