Reddit signs over its user data to train unnamed AI model

As user data becomes an increasingly hot commodity, an unnamed AI company is reportedly plotting a $60 million swoop to scrape Reddit for AI training purposes. Should we be concerned?

Decades of Reddit ramblings could become fuel for the next generation of AI models.

The self-proclaimed ‘front page of the internet’ has reportedly negotiated a content licensing deal with an unnamed AI company. For the cool price of $60 million, this business, in theory, now has the right to train its AI model using anything and everything posted on Reddit.

Millions of ceaseless posts from the most popular subreddits, the lurkers, and the dregs of questionable topics will become a rolling annual commodity for this undisclosed ‘big player’ of Silicon Valley.

This surprising deal arrives months after Reddit threatened to cut off Google and Bing’s search crawlers if an official deal couldn’t be struck to trade in its data. One source told the Washington Post at the time that the platform ‘can survive’ without search. Perhaps this current AI deal was in the offing back then?

If you use Reddit, they just sold you out to AI. https://t.co/0vjrl6Oyhs

— Reid Southen (@Rahll) February 20, 2024

Though Reddit’s yearly revenue was up by 20% in 2023, it was roughly $200 million shy of the $1 billion target set two years prior. The impending AI pact, paired with an opening for public investment next month, will nonetheless likely send Reddit’s readies way north of this figure.

After a tumultuous few years, this data trade-off makes perfect sense for Reddit, financially speaking. Exactly what it means for consumers and the ever-murky ethics of AI, though, remains up for debate.

It’s one of the worst-kept secrets that our user data is anything but private in the modern world. Remember that recent watchdog study into Meta that showed 48,000 companies had sent the platform data on a single user without consent?

Until recently, most AI companies used the open web to train their models without any sort of verified green light, but a combination of high-profile cases in 2023 appeared to be changing the landscape.

OpenAI’s quibble with the New York Times and Apple’s negotiations with major news partners suggested that AI companies were beginning to establish a strong legal framework for data scraping, in line with copyright laws in other sectors.

The AI being thrust back to the technological stone age after 2 reddit posts: https://t.co/mg0DYwwrc2 pic.twitter.com/o9X9CiwUm3

— squeeb 🍉 (@Squeebus1) February 20, 2024

Similarly, Reddit’s deal shows that host companies are beginning to demand compensation for data, but the key difference here is that its 812 million monthly users have not explicitly given their consent to become part of the AI machine.

Whether or not Reddit’s terms of service are updated in the future for transparency about where our data goes, we know that all digital information prior to this deal is also fair game. In layman’s terms, you can’t ask for permission after the fact, can you?

Perhaps this is the key difference between scraping news outlets and social media platforms for AI learning. The latter are almost entirely populated by user-generated content, the use of which seems to be completely down to their respective management’s discretion.

In the push to create AGI (Artificial General Intelligence) platforms with a more distinctly human quality, this type of deal will likely become more common in the years to come.

While this may sound positive, just think of the endless droves of misinformation and nonsense this unidentified learning machine will be subject to if all goes to plan.

It’s Reddit for Pete’s sake. Those AI safeguards will need reinforcing for sure.
