The catarrhine who invented a perpetual motion machine, by dreaming at night and devouring its own dreams through the day.

  • 0 Posts
  • 517 Comments
Joined 10 months ago
Cake day: January 12th, 2024



  • When it comes to how people feel about AI translation, there is a definite distinction between utility and craft. Few object to using AI in the same way as a dictionary, to discern meaning. But translators, of course, do much more than that. As Dawson puts it: “These writers are artists in their own right.”

    That’s basically my experience.

    LLMs are useful for translation in three situations:

    • declension/conjugation table - faster than checking a dictionary
    • listing potential translations for a word or expression
    • a second round of spelling/grammar proofing, just to catch issues that you missed

    Past that, LLM-based translations are a sea of slop: they screw up the tone and style, add stuff not present in the original, repeat sentences, remove critical bits, pick unsuitable synonyms, and so on. All the bloody time.

    And if you’re handling dialogue, they will fuck it up even in shorter excerpts, by making all characters sound the same.





  • For real. Companies being extra pushy with their product always makes me picture their decision makers saying:

    “What do you mean, «we’re being too pushy»? Those are customers! They are not human beings, nor do they deserve to be treated as such! This filth is stupid and un-human-like, it can’t even follow simple orders like «consume our product»! Here we don’t appeal to its reason, we smear advertisement on its snout until it needs to open the mouth to breathe, and then we shove the product down its throat!”

    Is this accurate? Probably not. But it does feel like this, especially when they’re trying to force a product with limited use cases down everyone’s throats, even after plenty of potential customers said “eeew no”. Such as machine text and image generation.




  • Bots are parasites: they only thrive if the host population is large enough to maintain them. Once the hosts are gone, the parasites are gone too.

    In other words: botters only bot a platform when they expect human beings to see and interact with the output of their bots. As such they can never become the majority: once they do, botting there becomes pointless.

    That applies even to repost bots - you could have other bots upvoting the repost, but you won’t do it unless you can sell the account to an advertiser, and the advertiser will only buy it if they can “reach” an “audience” (i.e. spam humans).






  • I have a way to make it work.

    Have the monkey type a single character. Just one. 29/30 of the time, it won’t be the same character as the first one in Shakespeare’s complete works; discard that sheet of paper, then try again. 1/30 of the time the monkey will type the right character; when they do, keep that sheet of paper and make copies of it.

    Now, instead of giving a completely blank sheet to the monkey, give them one of those copies. And let them type the second character. If it’s different from the actual second character in Shakespeare’s works, discard that sheet and give them a new copy (with the right 1st char still there - the monkey did type it out!). Do this until the monkey types the correct second character. Keep that sheet with 2 correct chars, make copies of it, and repeat the process for the third character.

    And then the fourth, the fifth, and so on.
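
    A minimal sketch of that discard-or-copy loop, if anyone wants to play with it. The 30-key layout (26 letters plus four punctuation keys) is my own assumption, picked only to match the 1/30 odds, and `type_with_selection` and `average_keystrokes_per_char` are made-up names for illustration. It counts how many keystrokes the monkey needs per kept character; on average that should hover around 30.

```python
import random
import string

# Hypothetical 30-key typewriter: 26 letters plus 4 punctuation keys.
# This layout is an assumption, chosen only to match the 1/30 odds.
KEYS = string.ascii_lowercase + " ,.;"


def type_with_selection(target, rng):
    """Count the keystrokes needed to reproduce `target` when every wrong
    character is discarded and every right one is kept and copied."""
    keystrokes = 0
    kept = ""  # the sheet that gets copied after each correct character
    for wanted in target:
        while True:
            keystrokes += 1
            typed = rng.choice(KEYS)
            if typed == wanted:
                # Right character: keep the sheet and move to the next one.
                kept += typed
                break
            # Wrong character: discard the sheet, hand out a fresh copy.
    assert kept == target
    return keystrokes


def average_keystrokes_per_char(target, runs, seed=0):
    """Average keystrokes per kept character over `runs` repetitions."""
    rng = random.Random(seed)
    total = sum(type_with_selection(target, rng) for _ in range(runs))
    return total / (runs * len(target))


if __name__ == "__main__":
    sample = "to be, or not to be"
    print(average_keystrokes_per_char(sample, runs=200))
```

    The point of the sketch is that the cost grows linearly with the length of the text, not exponentially, because correct characters are never thrown away.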

    Since swapping sheets all the time takes more time than letting the monkey go wild, let’s increase the time per typed character (right or wrong), from 1 second to… let’s say, 60 times more. A whole minute. And since the monkey will type junk 29/30 of the time, it’ll take around 30min on average to type each right character.

    It would take even longer, right? Well… not really. Shakespeare’s complete works have around 5 million characters, so the process should take 5*10⁶ * 30min = 2.5 million hours, or 285 years.

    But we could do it even better. This approach has a single monkey doing all the work; the paper has 200k of them. We could split Shakespeare’s complete works into 200k strings of 25 chars each, and assign each string to a monkey. Each monkey would complete their assignment, on average, after 12h30min; some will take a bit longer, but now we aren’t talking about the heat death of the universe or even centuries, it’ll take at most a few days.
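
    For anyone who wants to check the arithmetic, here’s a quick sketch using only the figures from this comment (5 million characters, 1/30 odds, one minute per keystroke, 200k monkeys). The constant and function names are mine, made up for illustration.

```python
TOTAL_CHARS = 5_000_000     # rough size of Shakespeare's complete works
KEYS = 30                   # the right character comes up 1 time in 30
MINUTES_PER_KEYSTROKE = 1   # typing plus swapping sheets
MONKEYS = 200_000           # number of monkeys taken from the paper

# Expected time to land one correct character: 30 tries of 1 minute each.
MINUTES_PER_CHAR = KEYS * MINUTES_PER_KEYSTROKE


def single_monkey_years():
    """Expected duration when one monkey types everything, in years."""
    hours = TOTAL_CHARS * MINUTES_PER_CHAR / 60
    return hours / (24 * 365)


def chars_per_monkey():
    """Length of the string assigned to each monkey."""
    return TOTAL_CHARS // MONKEYS


def parallel_hours():
    """Expected duration of one monkey's 25-character assignment, in hours."""
    return chars_per_monkey() * MINUTES_PER_CHAR / 60


if __name__ == "__main__":
    print(f"single monkey: {single_monkey_years():.0f} years")   # about 285
    print(f"chars per monkey: {chars_per_monkey()}")             # 25
    print(f"parallel: {parallel_hours()} hours per monkey")      # 12.5
```

    Those are averages; the slowest of the 200k monkeys will need longer than 12h30min, which is why the whole job lands in the “few days” range rather than half a day.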


    Why am I sharing this? I’m not invalidating the paper, mind you, it’s cool maths.

    I ran into this metaphor of monkeys typing Shakespeare quite a bit in my teen years, when I could still be arsed to argue with creationists. You know, the sort of people who think that complex life can’t appear due to random mutations, just like a monkey can’t type the full works of Shakespeare.

    Complex life is not the result of a single “big” mutation, like a monkey typing the full thing out of the blue; it involves selection and inheritance, like the sheets of paper being copied or discarded.

    And just like assigning tasks to different monkeys, multiple mutations can pop up independently and get recombined. Not just among sexual beings; even bacteria can transmit genes horizontally.

    Already back then (inb4 yes, I was a weird teen…) I developed the skeleton of this reasoning. Now I just plugged in the numbers that the paper uses, and here we go.



  • I’m not expecting a big exodus, but rather a slow decline in both the number of users and their engagement. With a few peaks here and there that seem to reverse the downward trend, but each peak being smaller than the one before.

    They won’t be leaving for the same reason as most people here did, pissed at the IPO-related changes (such as killing 3rd party apps). It’ll be more like “…meh, why would I check Reddit? There’s better stuff elsewhere.” We can already see the decline of the content quality on Reddit now; it’ll only get worse over time.

    I think that most will end up on Discord. Some on Bluesky, and some will simply touch grass. Conservatives might end up on Minitrue “truth social” or crap like that.

    Facebook might absorb some of the former Reddit users. It feels disgusting for the privacy conscious, but for them it’ll be a simple matter of not finding interesting stuff on Reddit.

    The same applies to Reddit’s net profit - for now, that value extraction still creates a small peak in raw profit, to the point that the bottom line became positive; later on the peak will barely reach the surface; and later still, value extraction will be necessary to avoid making the bottom line too negative.




  • I fucked it up and switched the terms, sorry. Look for “value extraction” instead; you’ll find multiple references to the concept such as this or Mazzucato’s “The Value of Everything”.

    To keep it short: you create value when you produce desirable goods/services for the customers; however, when you extract it, you’re picking the value that was already created (by society, your customers, or even your own business) and turning it into profit. The latter is faster but unsustainable, as that value doesn’t pop up from nowhere, so when a business shifts from value creation to value extraction it’ll get some quick cash and then go kaboom.

    In Reddit’s case, this value is mostly users willing to generate, curate, and share content with the platform, and other users knowing this:

    • someone recommends you a product/brand. The person might be wrong, but you can be reasonably sure that they aren’t a corporation astroturfing their own product. Someone else might criticise it instead;
    • you hop into your favourite subreddit and, while the content there isn’t the best, it’s still good enough - because the mods gave some fucks about growing their subreddits;
    • you discuss some controversial topic. You might get dogpiled, but at least you know that the dogs piling on you are human beings, who sometimes might listen to reason; a bot never will;
    • et cetera.

    All that value was being slowly extracted over the last few years, but the changes in 2023/2024 did it the hardest.