The beef between Microsoft and Reddit came to light after I published a story revealing that Reddit is currently blocking every crawler from every search engine except Google, which earlier this year agreed to pay Reddit $60 million a year to scrap the site for its generative AI products.
I know the author meant “scrape”, but sometimes it really does feel like AI is just scrapping the old internet for parts.
[Joke] See, Reddit’s doing a good thing here! They’re making sure nobody ends up toxifying their own dataset by using Reddit’s garbage heap of bot posts!
Yeah, aren’t like over half of reddit comments/posts by bots these days?
yep, and the longer that goes on, the less value the dataset has. it's getting stale.
google needs an ‘ignore reddit’ checkbox. i’m sick of having to manually add -reddit
Hey good news. Turns out you can use bing and not get back Reddit results
yeah, but then i get back bing results. no one needs that
There’s a browser extension for that. It also works on Pinterest and other useless sites. https://iorate.github.io/ublacklist/docs