Microsoft Probing If DeepSeek-Linked Group Improperly Obtained data the same way OpenAI did. –FTFY
OMG Competition!
QUICK, they’re a foreign threat! They’re coming right for us!!!
When a writer copies someone else’s work without cites or compensation, it’s called “plagiarism.” But when an AI does it, it’s called “LLM training.”
When a reader reads someone else’s work that’s called “reading”. But when an AI does it, it’s called “training”.
Unless that AI is not OpenAI, then it’s “plagiarism” still.
What the fuck is Microsoft getting involved for?! Maybe concentrate on not providing shitty fucking software fuck heads!
They have a large stake in OpenAI, last I checked.
Stealing from thieves isn’t a crime.
Especially not when China turns around and Robin Hoods it back to the world.
Just saying.
Making R1 open source really makes it such a big FU to all the grifters asking for billions for AI in the US. Especially funny because High-Flyer is a hedge fund firm themselves. The AI race should only be determined by what you do with it, not by protecting how much IP you hoovered up and then crying about it being copied by others.
China really did one on our oligarchs haha
Beautiful
These parasites expect me to side with them?
What data? The one OpenAI illegally obtained first?!
Lol it’s like fucking Lavrov from fucking Russia screaming “this is against international law” when Europe froze their assets.
Bro… US reaction here is so pathetic…
The behavior is indicative of a bigger issue. They really do think only they are allowed to cheat and steal to win lol
What’s the game plan if they did?
Trade restrictions?
China already proved those did fuck all to stop them from developing their own model.
Ducking knew this ai bubble would burst sooner or later, just glad we can finally get on with it now.
- Name the company involved a military asset
- Forbid US companies from hosting the models
- Pressure foreign companies to not work with them
I ducking knew it too, I’ve been along for the ride though. The models still do have some niche applications where they’re actually useful.
This whole thing with OpenAI and Microsoft whinging about fair play is truly laughable though. What clowns.
As a side note, it took a few tries to write ducking, my keyboard kept correcting it to fucking. We’re definitely 2 different people. Lol.
“You can’t steal that public data! We stole it first!”
And considering that’s exactly what Microsoft did to Apple with point and click, what irony!
They didn’t steal it from Smith & Wesson?
They both stole point and click from Xerox if my memory serves me correctly
Apple did pay Xerox for it if I’m remembering right
yeah xerox invented the GUI and mouse
Actually, it was invented by Douglas Engelbart at the Stanford Research Institute in the 60s
https://dougengelbart.org/content/view/162/000/
Xerox (re)made it for the PC in the 80s.
Ah TIL! I didn’t know it originated elsewhere.
So while it’s true Apple and Microsoft got the idea from Xerox, Xerox didn’t originate it.
Oh really? Rabbit hole unlocked
https://www.youtube.com/watch?v=UFcb-XF1RPQ
The relevant part of Pirates of Silicon Valley. After which you should watch the whole thing. It’s fan fiction, but it’s the best explanation of what happened between Apple and Microsoft leading into the 1990s.
The irony is overwhelming
An irony curtain.
Somebody better call the WAHMBULANCE!
Surely they’d like some cheese to go with that whine?
Chinese company:
Truly, you have a dizzying intellect.
Microsoft:
AND I’M JUST GETTING STARTED! Where was I?
Chinese company:
Stealing data…
What, you mean like Microsoft, uh, OpenAI did?
Yep, NOW it’s a problem, though! Because it’s someone else doing the same thing, someone who isn’t part of the human centipede starting at Trump’s colon.
You mean they ripped off the copyrighted material that OpenAI ripped off?
we stole it fair and square
Are they worried that DeepSeek took stuff written by others, mixed it up, and repackaged it as its own?
Well, yeah, that’s all AI is. An expensive weighted pachinko machine, that uses human made content, and remixes it.
The question isn’t whether they’ve used the same information. It’s whether they’ve faked the process to achieve that 20x efficiency.
Look at it like a dictionary. Writing one from scratch is a huge task, no matter how many other books exist. How do you even go about finding all of the words?
But if other people have already written dictionaries, you can just use their word lists and go from there.
It’s more efficient, but only because it’s a completely different task.
No AI company has ever made any of their own content to train their models, they took what others created, remixed it, and presented it as something new.
This AI model did the same thing.
AI lost its job to AI.
Yes, but that doesn’t mean it is more efficient, which is what the whole thing is about.
Let’s pretend we’re not talking about AI, but tuna fishing. OpenTuna is sending hundreds of ships to the ocean to go fishing. It’s extremely expensive, but it gets results.
If another fish distributor shows up out of nowhere selling tuna for 1/10 the price, it would be amazing. But if you found out that they could sell them cheap because they were stealing the fish from OpenTuna warehouses, you wouldn’t argue that the secret to catching fish going forward is theft and stop building boats.
Yes, I would.
So what happens when OpenTuna runs out of fish to steal and there are no more boats?
Information doesn’t stop being created. AI models need to be constantly trained and updated with new information. One of the biggest issues with GPT3 was the 2021 knowledge cutoff.
Let’s pretend you’re building a legal analysis AI tool that scrapes the web for information on local, state, and federal law in the US. If your model was from January 2008 and was never updated, then gay marriage wouldn’t be legal in the US, the ACA wouldn’t exist, Super PACs would be illegal, the Consumer Financial Protection Bureau wouldn’t exist, zoning ordinances in pretty much every city would be out of date, and openly carrying a handgun in Texas would get you jail time.
It would essentially be a useless tool, and copying that old training data wouldn’t make a better product no matter how cheap it was to do.
Once tuna runs out, and we run out of boats?
Maybe we then stop destroying the tuna population?
Or, to bring this back to point: the environment will be better off once the AI bubble collapses.
That’s a very important, but entirely separate conversation.
Is it worth it? Let me work it. I put my thing down, flip it and reverse it.