As user data becomes an increasingly hot commodity, an unnamed AI company is reportedly plotting a $60 million swoop to scrape Reddit for AI training purposes. Should we be concerned?
Decades of Reddit ramblings could become fuel for the next generation of AI models.
The self-proclaimed “front page of the internet” has reportedly negotiated a content licensing deal with an unnamed AI company. For the cool price of $60 million, this business, in theory, now has the right to train its AI model using anything and everything posted on Reddit.
Millions of ceaseless posts from the most popular subreddits, the lurkers, and the dregs of questionable topics will become a rolling annual commodity for this undisclosed “big player” of Silicon Valley.
This surprising deal arrives months after Reddit threatened to cut off Google and Bing’s search crawlers if an official deal couldn’t be struck to trade in its data. One source told the Washington Post at the time that the platform “can survive” without search. Perhaps this current AI deal was in the offing back then?
If you use Reddit, they just sold you out to AI. https://t.co/0vjrl6Oyhs
- Reid Southen (@Rahll) February 20, 2024
Though Reddit’s yearly revenue was up by 20% in 2023, it was roughly $200 million shy of the $1 billion target set two years prior. The impending AI pact, paired with an opening for public investment next month, will likely send Reddit’s readies way north of this figure.
After a tumultuous few years, this data trade-off makes perfect sense for Reddit, financially speaking. Exactly what it means for consumers and for the ever-murky ethics of AI, though, remains up for debate.
It’s one of the worst-kept secrets that our user data is anything but private in the modern world. Remember that recent watchdog study into Meta that showed 48,000 companies had sent the platform data on a single user without consent?