News
A new tool, Data Provenance Explorer, lets users pick through the questionable provenance of many large data sets used for AI training.
New research from the Data Provenance Initiative has found a dramatic drop in content made available to the collections used to build artificial intelligence.
OpenAI is launching a new program to encourage organizations to contribute data, including text and images, to train future AI models.
LinkedIn limits opt-outs to future training, warns AI models may spout personal data.
New legal filings show OpenAI deleted two massive datasets that it used to train a powerful AI model. The employees who built the datasets are gone.
Spawning, a startup developing tools to enable creators to assert more control over their works online, is launching new, ostensibly more 'ethical' data sets for AI training.
How AI Data Sets Work – And How Artists Can Collaborate With Them. For creatives, tackling machine learning requires an understanding of how to best feed it data and refine its algorithm to ...
As the discipline advances, Ether0’s synergy of Q&A-guided training, chain-of-thought clarity, and data frugality represents a new standard for what is possible in scientific reasoning models.
The OpenSubtitles data set adds yet another wrinkle to a complex narrative around AI, in which consent from artists and even the basic premise of the technology are points of contention.
Late last week, a California-based AI artist who goes by the name Lapine discovered private medical record photos taken by her doctor in 2013 referenced in the LAION-5B image set, which is a ...
Your posts are a gold mine, especially as companies start to run out of AI training data.