@the_sisko

the_sisko@startrek.website · 1 year ago

the_sisko@startrek.website · 2 years ago

the_sisko@startrek.website · edit-2 2 years ago

Not a “hater” in terms of trying/wanting to be mean, but I do disagree. I think a lot of people downvoting are frustrated because this attitude takes an issue in one application (yay), for one distro, and says “this is why Linux sucks / can’t be used by normies”. Clearly that’s not true of this specific instance, especially given that yay is basically a developer tool. At best, “this is why yay sucks”. (yay is an AUR helper - a tool to help you compile and install software that’s completely unvetted - see the big red banner. Using the AUR is definitely one of those things that puts you well outside the realm of the “common person” already.)

Maybe the more charitable interpretation is “these kinds of issues are what common users face”, and that’s a better argument (setting aside the fact that this specific instance isn’t really part of that group). I think most people agree that there are stumbling blocks, and they want things to be easier for new users. But doom-y language like this, without concrete steps or ideas, doesn’t feel particularly helpful. And it can be frustrating – thus the downvotes.

the_sisko@startrek.website · 2 years ago

Usually it’s a bunch of different string hashes of the text content. They could be different hashing algorithms, but it’s more common to take a single hash algorithm and simply create a bunch of hash functions that operate on different parts of the data.

If it’s not text data, there’s a whole bunch of other hashing strategies but I only ever saw bloom filters used with text.
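To make the "one algorithm, different parts of the data" idea concrete, here is a rough sketch. The function name, the choice of SHA-256, and the sizes are all my own assumptions, not taken from any particular library:

```python
import hashlib

def bit_positions(text: str, num_hashes: int = 4, num_bits: int = 1024) -> list[int]:
    """Map text to num_hashes bit positions by hashing different slices of it."""
    data = text.encode("utf-8")
    # Ceiling division so every byte lands in exactly one slice.
    step = max(1, -(-len(data) // num_hashes))
    positions = []
    for i in range(num_hashes):
        chunk = data[i * step:(i + 1) * step]
        # Prefix the slice index so equal slices at different offsets differ.
        digest = hashlib.sha256(bytes([i]) + chunk).digest()
        positions.append(int.from_bytes(digest[:8], "big") % num_bits)
    return positions

original = bit_positions("buy cheap pills now!!")
tweaked = bit_positions("buy cheap pills now!?")
```

Because each position depends on only one slice, `original` and `tweaked` agree on the first three positions and can differ only in the last one.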

the_sisko@startrek.website · 2 years ago

People aren’t misunderstanding the issue. Third party cookie support is being dropped by all browsers. Chrome is also dropping them, but replacing them with topics. Sure, topics is less invasive than third party cookies, but it is still more invasive than the obvious user friendly approach of not having an invasive tracker built into your browser. No other major browser vendor is considering supporting topics. So they’re doing an objectively user unfriendly thing here. This is the shit that happens when the world’s largest internet advertising company also controls the browser.

the_sisko@startrek.website · 2 years ago

A classic use for them is spam filtering.

Suppose you have a set of spam detection systems/rules which are somewhat expensive to execute, e.g. an ML model or a keyword blocklist. Spam tends to come in waves, and frequently it can be as simple as reposting the same message dozens of times.

Once your systems determine a piece of content is spam (or you manually flag content), it’s a good idea to insert the content into a bloom filter. This means that future posts of the identical content will be flagged without needing to execute the expensive checks, especially if there’s a surge of content stressing your systems.

Since it’s probabilistic, you can’t use this unless you have some sort of manual reviewing queue or system, as it’s possible for false positives to be flagged. However, you can also run more intensive checks once you’ve flagged content, to detect false positives.

The false positives can also be a feature, not a bug: with careful choice of hash functions, your bloom filter can actually detect slightly modified content, since most of the hashes may still be the same.

I’ve worked at companies which use this strategy so it’s very real world.
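A minimal sketch of that flow, to give a feel for it. Everything here is hypothetical: the `BloomFilter` class, the sizes, and `expensive_spam_check` (a stand-in for the real model or blocklist). This version hashes the whole message with a salted index, so it only catches identical reposts:

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits: int = 8192, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str):
        data = item.encode("utf-8")
        for i in range(self.num_hashes):
            # Same algorithm each time, salted with the hash index.
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def expensive_spam_check(message: str) -> bool:
    # Hypothetical stand-in for an ML model or keyword blocklist.
    return "free crypto" in message.lower()

known_spam = BloomFilter()
stats = {"expensive_calls": 0}

def is_probably_spam(message: str) -> bool:
    if message in known_spam:
        # Cheap path. May be a false positive, so route it to review.
        return True
    stats["expensive_calls"] += 1
    if expensive_spam_check(message):
        known_spam.add(message)
        return True
    return False
```

The second time the same spam message arrives, `is_probably_spam` returns from the cheap path and `stats["expensive_calls"]` does not increase.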

the_sisko@startrek.website · 2 years ago

In other news, emacs still didn’t ship my init.el as part of the default configuration! Lol

the_sisko@startrek.website · 2 years ago

I’d argue that’s not true. That’s what the extern keyword is for. If you do #include <stdio.h>, you don’t get the actual printf function defined by the preprocessor. You just get an extern declaration (though extern is optional for function declarations). The preprocessed source code that is fed to cc is still not complete, and cannot be used until it is linked to an object file that defines printf. So really, the unnamed “C preprocessor output language” can access functions or values from elsewhere.

the_sisko@startrek.website · 2 years ago

I would imagine the risk of bias here is much lower than, for example, the predictive policing systems that are already in use in US police departments. Or the bias involved in ML models for making credit decisions. 🙃

the_sisko@startrek.website · 2 years ago

someone playing music on their phone though the car audio (super common now) tapping the phone to ignore a call is just as much a crime as texting a novel to an ex.

They are all crimes. Set up your music before you go, or use voice command. Ignore the call with voice command or just let it go to voicemail. Lol. It’s not hard.

And you are kidding yourself if you think almost every person driving for a living is not at some level forced to use their phone by their company (I was)

This is a great example of the strength of this system: this company will find its drivers and vehicles getting ticketed a lot, and they’ll have to come up with a way to allow drivers to do their jobs without interacting with their phones while moving at high speeds.

I would much rather have someone pulled over when driving erratically then the person getting an automated ticket 3 weeks after mowing down a pedestrian.

The camera doesn’t magically remove traffic enforcement humans from the road. They can still pull over the obviously drunk/erratic driver.

the_sisko@startrek.website · 2 years ago

I literally watched cops driving while on their phone everyday after it was made illegal. Nothing was done, Nothing changed, they hand out tickets while breaking the same rules.

I mean yeah, fuck the police :) Seems like we’re in agreement here.

Might kill someone is a precrime, a issue with these tickets in this case is that without the AI camera nothing would have been seen (literally victimless). If someone crashes into anything while on their phone the chances it will be used in prosecution is low.

Using your fucking phone while driving is the crime. This isn’t some “thought police” situation. Put the phone away, and you won’t get the ticket. It’s that simple. We don’t need to wait for a person to mow down a pedestrian in order to punish them for driving irresponsibly.

In the same spirit, if a person gets drunk and drives home, and they don’t kill somebody – well that’s a crime and they should be punished for it.

And if you can’t handle driving responsibly, then the privilege of driving on public roads should be revoked.

I don’t think texting while driving is a good idea, like not wearing a seatbelt. However this is offloading a lot to AI, distracted driving is not well defined and considering the nuances I don’t want to leave any part to AI. Here is an example: eating a bowl of soup while operating a vehicle would be distracted right? What if the soup was in a cup? What if the soup was made of coffee beans?

This is such a weird ad absurdum argument. Nobody is telling some ML system “make a judgment call on whether the coffee bean soup is a distraction.” The system is identifying people violating a cut-and-dried law: using their phone while driving, or not wearing a seatbelt. Assuming it can do it in an unbiased way (which is a huge if, to be fair), then there’s no slippery slope here.

For what it’s worth, I do worry about ML system bias, and I do think the seatbelt enforcement is a bit silly: I personally don’t mind if a person makes a decision that will only impact their own safety. I care about the irresponsible decisions that people make affecting my safety, and I’d be glad for some unbiased enforcement of the traffic rules that protect us all.

the_sisko@startrek.website · 2 years ago

I’m definitely a fan of better enforcement of traffic rules to improve safety, but using ML* systems here is fraught with issues. ML systems tend to learn the human biases that were present in their training data and continue to perpetuate them. I wouldn’t be shocked if these traffic systems, for example, disproportionately impact some racial groups. And if the ML system identifies those groups more frequently, even if the human review were unbiased (unlikely), the outcome would still be biased.

It’s important to see good data showing these systems are fair, before they are used in the wild. I wouldn’t support a system doing this until I was confident it was unbiased.

* It’s all machine learning - NOT artificial intelligence. No intelligence involved, just mathematical parameters “learned” by an algorithm and applied to new data.

the_sisko@startrek.website · 2 years ago

Ah yes, the famously victimless crime of using your phone while driving. Honestly screw anybody who does that, they deserve to be ticketed each time, cause each time they might kill somebody.

the_sisko@startrek.website · 2 years ago

I mean you still get served the ads that provide them revenue. But it’s not like I’m assigning you personal responsibility for keeping them in business, or saying you’re wrong or bad for staying. Just sharing why I want people to get off the platform quicker.

the_sisko@startrek.website · 2 years ago

It seems obvious to me. Twitter has historically been used by public figures, and especially public institutions like local governments, transit agencies, etc, to make official announcements & statements. Of course having that on a centrally owned social media site was never good, but now with Space Karen making it actively hostile to users (and trying to prevent logged out users from seeing that info), it’s very bad. The sooner Twitter completes its inevitable collapse, the sooner those public figures & institutions will move to a better way to deliver those - Mastodon, RSS, webpages, whatever.

IMO it’s in the public’s best interest for all the holdouts to get out now so we can move on.

the_sisko@startrek.website · 2 years ago

Sphinx has warnings for these already. They’re just suppressed and ignored :)

the_sisko@startrek.website · 2 years ago

They probably know what it is, but it’s a bad point if they’re trying to paint DAGs as esoteric CS stuff for the average programmer. I needed to use a topological sort for work coding 2 weeks ago, and any time you’re using a build system, even as simple as Make, you’re using DAGs. Acting like it’s a tough concept makes me wonder why I should accept the rest of the argument.
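For a sense of how small this is in practice, here is a sketch using Python's standard `graphlib` module (3.9+). The Make-style targets are made up for illustration:

```python
from graphlib import TopologicalSorter

# Hypothetical Make-style targets: each key depends on the names in its set.
deps = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c", "util.h"},
    "util.o": {"util.c", "util.h"},
}

# static_order() yields every node after all of its dependencies.
build_order = list(TopologicalSorter(deps).static_order())
```

The exact order among independent files can vary, but every target comes after the things it depends on, so `app` is always last.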

Can’t say I have a strong feeling about Gradle though 🤷‍♀️

the_sisko@startrek.website · 2 years ago

It’s a cathartic, but not particularly productive vent.

Yes, there are stupid lines of time.sleep(1) written in some tests and codebases. But also, there are test setUp() methods which do expensive work per-test, so that the runtime grew too fast with the number of tests. There are situations where there was a smarter algorithm and the original author said “fuck it” and did the N^2 one. There are container-oriented workflows that take a long time to spin up in order to run the same tests. There are stupid DNS resolution timeouts because you didn’t realize that the third-party library you used would try to connect to an API which is not reachable in your test environment… And the list goes on…
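The setUp() case is easy to show with a toy example. This is only a sketch: `load_fixture` is a hypothetical stand-in for the expensive work, and the call counter exists just to make the difference visible:

```python
import unittest

def load_fixture():
    # Hypothetical stand-in for expensive work (DB load, container start, ...).
    load_fixture.calls += 1
    return {"users": ["alice", "bob"]}

load_fixture.calls = 0

class PerTestSetup(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so cost grows with the test count.
        self.data = load_fixture()

    def test_has_alice(self):
        self.assertIn("alice", self.data["users"])

    def test_has_bob(self):
        self.assertIn("bob", self.data["users"])

class PerClassSetup(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class; only safe if tests don't mutate it.
        cls.data = load_fixture()

    def test_has_alice(self):
        self.assertIn("alice", self.data["users"])

    def test_has_bob(self):
        self.assertIn("bob", self.data["users"])

def count_loads(case) -> int:
    """Run one TestCase class and report how often the fixture was loaded."""
    load_fixture.calls = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    suite.run(unittest.TestResult())
    return load_fixture.calls
```

With two tests each, `count_loads(PerTestSetup)` is 2 and `count_loads(PerClassSetup)` is 1. Neither version is "stupid"; the per-test one is the safer default until the fixture gets expensive.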

I feel like it’s the “easy way out” to create some boogeyman, the stupid engineer who writes slow, shitty code. I think it’s far more likely that these issues come about because a capable person wrote software under one set of assumptions, and then the assumptions changed, and now the code is slow because the assumptions were violated. There’s no bad guy here, just people doing their best.