Look, we can debate the proper and private way to do Captchas all day, but if we remove the existing implementation we will be plunged into a world of hurt.
I run tucson.social - a tiny instance with barely any users and I find myself really ticked off at other admins’ abdication of duty when it comes to engaging with the developers.
For all the Fediverse discussion on this, where are the GitHub issue comments? Where is our attempt to convince the devs on this?
No, seriously WHERE ARE THEY?
Oh, you think that the mere existence of an “Issue” to bring back Captchas is the best you can do?
NO it is not the best we can do, we need to be applying some pressure to the developers here and that requires EVERYONE to do their part.
The Devs can’t make Lemmy an awesome place for us if we admins refuse to meaningfully engage with the project and provide feedback on crucial things like this.
So are you an admin? If so, we need more comments here: https://github.com/LemmyNet/lemmy/issues/3200
We need to make it VERY clear that Captcha is required before v0.18’s release. Not after when we’ll all be scrambling…
EDIT: To be clear I’m talking to all instance admins, not just Beehaw’s.
UPDATE: Our voices were heard! https://github.com/LemmyNet/lemmy/issues/3200#issuecomment-1600505757
The important part was that this was a decision to re-implement the old (if imperfect) solution in time for the upcoming release. mCaptcha and better techs are indeed the better solution, but at least we won’t make ourselves more vulnerable at this critical juncture.
I find it reasonably amusing that many people’s solutions seem to be “just defederate bro”. As if this conversation isn’t happening on an instance which chose to defederate and received thousands of negative comments, from other instances, about this choice. We’re still being harassed by users from other instances, on posts all over our instance, who are unhappy with this.
I also find it amusing that many people say the solution is to build your own solution. Do you not want the fediverse to grow? If you want people to feel like they can just spin up their own instances, you need to stop assuming that they have the ability to do their own development, their own sysop and sysad, their own security, their own community management, their own… everything. People are not omniscient and the outright hostility towards someone asking for help, or surfacing their opinion on the matter isn’t helping.
Without adequate tools, I don’t see how most instances aren’t driven towards simply existing on their own. Large instances need tools to deal with malicious actors, as they are the targets. The solution to defederate ignores the ability for people to just spin up new instances, to hijack existing small instances with less resources for security, sysops, to watch/manage their DB, to prevent malicious actors. I’ve already seen proposed solutions which involve scraping for all instances with less than a certain number of users to defederate on principle (inactive, too many users/post ratio). We’re fighting spam bots right now, who are targeting instances which don’t have captcha enabled.
Follow this thinking through to its conclusion. If the solution is to defederate, and there are potentially unlimited attack vectors, what must a large instance do to not overburden its resources? Switch from blacklist to whitelist? Defederate from all small instances? How is this sustainable for the fediverse? If you want people to be interacting with each other, you need to provide the tools for this to happen in the presence of malicious actors. You can’t just assume these malicious actors won’t exist, or will just overcome any and all obstacles you throw in their way because you’re smart enough to understand how to bypass captcha or other issues.
This isn’t just an issue of whether captcha or some other anti-spam measure is used, it’s an issue about the overall health of the fediverse. Please think wider about the impact before offering your 2c about how captchas are worthless or how you hate cloudflare. I don’t think the user that posted this cares about the soapbox you want to preach from- they’re looking for solutions.
I also find it amusing that many people say the solution is to build your own solution. Do you not want the fediverse to grow? If you want people to feel like they can just spin up their own instances, you need to stop assuming that they have the ability to do their own development, their own sysop and sysad, their own security, their own community management, their own… everything. People are not omniscient and the outright hostility towards someone asking for help, or surfacing their opinion on the matter isn’t helping.
to underscore this: if we had to do all of this, this instance would not exist and/or we would have shut off applications about 10,000 people ago. we do not have the capabilities to do all of this even now with like a dozen people volunteering to help us! we are one of the largest instances on Lemmy and one of the most active! please recognize how ridiculous and burdensome it is to just throw more non-inbuilt tech at problems like this, and how exclusionary that is going to be to anybody who is without free time and extreme tech-savviness. if you want this space to grow it needs to be at a point where people can just use it and not have to worry about this shit.
I’m a DevOps/SysOps/SecOps engineer - have been for over a decade now. Even if I CAN do all the things listed, it takes time to do it. It takes time to configure your networking layer, especially when documentation of the underlying app is in flux and never 100% correct. It takes time to secure your server, especially when the “prod” configuration in the repo isn’t really that secure at all.
Folks saying to just “code it myself” - sure, let me stop doing my day job and start planning on this completely unpaid enhancement. Let me tell my wife - “Sorry babe, gotta prove this internet person wrong and it must be today - can’t go to board game night with you”. I mean, I’ll actually likely end up coding it myself, but when I can. Not when the trolls who say “Oh, come on, it’ll be EZ” - yeah, I know better than that.
Folks just say to “Use other solutions” - Great! I already budgeted 150/month of my own money. Oh wait, that doesn’t matter much when I have to worry about instances that can’t spend that type of scratch.
The 2 Lemmy devs have funding. About 1500 total from community support, with the rest coming from a sponsorship/incubator type deal. A deal which pays out when targets/goals are achieved.
Which made me laugh at this: “sure, let me stop doing my day job and start planning on this completely unpaid enhancement”
Which is entirely what you are asking the Lemmy devs to do.
Thanks for raising awareness of the spam-bot-account issue.
Well I am making a distinction between creating a newer implementation and rolling back to an older, known implementation. It’s why I find it bizarre when folks point out that there’s a new feature request and a PR is guaranteed accepted - yes, but that will take more time than reverting some commits and maybe retrofitting if needed. The entire point I was trying to make is that they could just roll back, and when the new feature is ready, we can go right to it. I’m not (at least intentionally) asking for grandiose work and assuming going back is quicker and more readily available than waiting for a new solution to be implemented.
The developers published an article where they said they currently do not receive sponsorship money. Continued payout is tied to features that they are delaying in order to improve stability and robustness.
Personally, I find it reasonably amusing that defending an open source, arguably collectivist project requires appeals to individualism.
“You can build it” “Just defederate” “It’s the instance owner’s responsibility” “You can do X for your instance, it’s in your control”
Like, which is it? Is this a collective undertaking by a community of multiple stakeholders or is this the devs’ individual project and they don’t have to listen to anyone?
Is this a collective undertaking by a community of multiple stakeholders or is this the devs’ individual project and they don’t have to listen to anyone?
Devs, especially extremely busy ones “listen” via pull requests. Instead of badgering the devs, put together some devs of your own, get some code working, and submit it as a PR.
If they don’t accept it, you now have code that does what you want, and it would be easy to create your own fork.
Yeah, and this would work fine for new features. But for removing existing features in a way that alters the entire ecosystem regardless of whether you upgrade or not? This isn’t at all the same, and casting it as such isn’t honest.
I feel like folks keep making this a technical merit discussion when that’s not at all what it is. A better technical solution is required, I agree. I’m not even disagreeing that captcha can be bypassed - but so can a lock, or a door, or any security feature really given a sufficiently intelligent threat.
But so far the captcha has already made some difference in what instances have spam account problems and those that don’t. To argue that it isn’t perfect is a logical fallacy that’s making my head hurt. Shall we get rid of door locks because they can be picked? Should we get rid of garage doors entirely with the new hacking devices available - obviously the security isn’t perfect so why have it at all?
Since when did perfect become the enemy of good? We had a good solution… And now we’re throwing it out for a better one, fine! But leave the good one in place until then.
We had a good solution…
Did we? Do you know why it was removed in the first place?
I’ve already seen proposed solutions which involve scraping for all instances with less than a certain number of users to defederate on principle (inactive, too many users/post ratio). We’re fighting spam bots right now, who are targeting instances which don’t have captcha enabled.
There are folks that are running their own instances as well, as single user instances or are working to get the federation to the point to open it up in anticipation for a larger flood. That doesn’t make us spammers at all.
The questions of how to handle it are legitimate. In the end I feel the “fediverse” will need some user only instances (that is instances that just host users and not loads of communities) to help with the load and scaling issues MANY are seeing. Beehaw seems to have handled the influx to date the best, others like lemmy.ml and lemmy.world seem to have service level impacts that I can only really assume are due to scaling and load. And that’s supposed to be the entire point right?
You ALL have a responsibility to communicate back to lemmy devs to try to stop it.
No I don’t. Stop trying to brigade people to an issue. If you have an issue with it… Fork the lemmy UI code and make your own. Or stay on pre 0.18 code.
It’s one thing to bring awareness to the issue. It’s another to demand that I take action on something that’s not only a non-issue for me (and likely many other admins of instances) but that the devs don’t have to support. You’re not paying them… you’re not their mother. You don’t get to force them to do anything they don’t want to do.
Honestly the captchas that lemmy uses are terrible anyway. https://addons.mozilla.org/en-US/firefox/addon/2captcha-solver/ You can even solve them yourself as a browser extension… There’s no point to them in today’s world.
You’re not paying them… you’re not their mother. You don’t get to force them to do anything they don’t want to do.
I’m trying to think of what it would be like if one of my projects had a defined roadmap and then I suddenly got hundreds of messages a day telling me I have to do something. lol, no. Maybe if I was actually being paid well for the project.
Exactly, instance admins that want to keep CAPTCHA have two good options here:
- Stay on 0.17.x until 0.18.y drops that re-implements CAPTCHA satisfactorily
- Fork and modify lemmy to version 0.18-captcha, undo the commit removing the old Captcha code.
I totally get the project maintainers are stubborn but no one has a “responsibility to stop the devs from doing it”. It reeks of open-source entitlement.
It reeks of open-source entitlement.
I used to contribute to a very large open source project. One day I posted a blog about our project not really needing users, except that some small portion of users turned into developers. The users were incensed. “How can you not need us?” It was a “The customer is always right” mindset, except that doesn’t work with open source when they’re using something they downloaded for free.
That said, Lemmy might be a special exception, because its goal is to have a lot of users – network effects are important to the health and longevity of social media platforms. So Lemmy might actually need the users to be a healthy project. Unfortunately, this will create a bunch of entitled users in the process :/
Eh, this situation seems more like the “admins”/power users of the software saying “How can you not need us?” - and for them, that’s more of a point. These are the people who submit bug reports, code features or plugins on a weekend, and generally turn your one product into a rich ecosystem of interconnected experiences. One can argue that the project doesn’t technically require their participation, but they do enhance the project in many different ways.
Open-source entitlement is a thing, but I’m not sure that this is the same thing. I for one would be happy to submit changes (and even have a couple brewing for my own use on my instance). Just don’t make the spam problem worse in the meantime by pushing out a version that’s missing a crucial (if imperfect) feature.
“The customer is always right” mindset, except that doesn’t work with open source when they’re using something they downloaded for free.
You’ve put your finger on the thing that was bothering me about the tone of the original post - it’s very similar to a Nextdoor post.
You won’t see me making call to action posts for undelivered features or other small-fry items. I’m a dev, I get it.
But there are always times where vulnerabilities come up and a dev might not otherwise know that it’s being exploited. It’s one thing to have a feature to fix that vulnerability and get to it as part of your own priority list. It’s another when that vulnerability is actively impacting the people using the software - that’s when getting vocal about an issue is appropriate to help me alter my priorities, IMO.
Your concerns about security of the application and community are valid. I get that this is essentially a vulnerability that should be mitigated and fixed. Raising awareness of it is fine.
Where I take issue (though I suppose you didn’t entirely intend this) is the idea that our responsibility is to put pressure on the main developers to fix the issue before the 0.18 release and dictate their priorities for them.
I would rather we discuss workarounds and mitigation steps in the interim, and assist in solving the issues through Pull Requests and discussion on the issues page and forums. I just think it’s a bad idea to point blaming fingers at devs for being slow to respond, or badger them to make these changes, when they are volunteering their own time to share Lemmy with us (some also maintaining Jerboa and Lemmy UI at the same time).
With the way the licensing is, I would rather the project be forked by someone that would want to fix the issue. The repo maintainers are entitled to set their own priorities, just like Lemmy instance admins are allowed to determine how they run the server.
Thank you for the measured take on this.
You are correct, I don’t intend to pressure or cause harm! But I certainly see the results, and it is indeed pressure. As another commenter pointed out, there are many instance admins who work a bit closer to the team on the Matrix chatrooms and that’s their preferred method of communication. Now that I know this, I’ll let things cool down and join myself. I definitely intend to contribute where I can in the codebase, and I wouldn’t dream of escalating to public pressure for smaller concerns.
However, I have a slight, and perhaps pedantic disagreement about making changes. In this case, the request was for not making a change. If it weren’t for the fact that the feature was already ripped out it would be as simple as not removing it (or in this case re-working it a bit). I understand that it isn’t the current reality, and that it required work to revert - and if not for a ton of spambots, I think it would’ve been easier to adapt.
Ultimately it will take time to discuss workarounds and help others implement them, and the deadline is ultimately the arrival of the version that drops the older captcha (or was, in this case - it’s getting merged back in as we speak - might even be done now). With that reality, I had a sense that this could be an existential problem for the early Threadiverse.
I definitely didn’t intend to suggest that the Devs were in any way at fault here. I read the GitHub issues enough to come away with the takeaway that a quick (relative to a new feature) reversion to the prior implementation was possible. To me the feedback they were receiving seemed to be “Admins and devs alike are okay moving forward and opinions to the contrary are minimal, let’s move forward”. My post was definitely intended to be a way to communicate using raw numbers (but not harassment). I’d like to think I’m fairly pragmatic in that if it IS working for folks, then that is a contrary opinion, and that it was missing.
Where I definitely failed was my overly emotional messaging. It’s certainly not an excuse, but my recent autism diagnosis does at least help explain why I have an extremely strong sense of justice and can sometimes react in ways that are less than productive in some ways.
As for the licensing, I agree! I’m talking to some good friends of mine because I want to take my instance WAY further than most others - goal is a non-profit that answers to Tucsonans and residents of larger Pima county rather than someone not in the community. There’s just a lot of features this concept would need that it might diverge so much from the Lemmy vision that it needs to be something new - and hopefully a template for hyper-local social networks that can take on Nextdoor.
I can see better where our disagreement is, and I appreciate you being reasonable about it as well. Thank you for that.
Sounds like you have some great plans coming with your Tucson social project. All the best!
They’ve now said they’re open to a PR that implements captchas in 0.18, which will require new work since it’s not just a matter of reverting the removal from 0.17. I look forward to seeing OP’s submission.
Looks like someone already opened a PR to roll back to a retrofitted solution (I had to wait until the weekend before I could find the time to work on this).
The devs are willing to accept a retro-fitted captcha (rather than just mCaptcha) in time for v0.18 and they communicated as such about 9 hours ago (for me). So for me, my push for visibility is complete unless they block the incoming PR for whatever reason. The devs have been made aware that this is contentious and the community could be impacted negatively and they see the need for it.
For me, that indicates that the Lemmy devs will listen to key, important issues, that impact the health of the larger fediverse as long as the community is clear about what the largest issues actually are.
A lot of folks here characterized me as someone wanting to “brigade”, but that’s not quite true. I just know that sometimes developers don’t know what’s going on with admins unless the admins are loud, clear, and coordinated. That doesn’t mean that I was asking folks to “force” the devs to do anything or be abusive, just that enough feedback might convince them to see things from a different perspective than a perfect technical solution.
A lot of folks here characterized me as someone wanting to “brigade”, but that’s not quite true. I just know that sometimes developers don’t know what’s going on with admins unless the admins are loud, clear, and coordinated.
The language of your post was quite hostile and painted (and continues to paint) the developers as being out of touch with instance admins. The instance admins are already “loud, clear and coordinated”, and are working in full communication with the maintainers.
…and I find myself really ticked off at other Admin’s abdication of duty when it comes to engaging with the developers.
The majority of PR’s coming into the project are coming from instance admins seeking to solve their personal pain points.
Both the issue and the PR you’re referring to were created by ruud, the admin of the single largest Lemmy instance, lemmy.world. Both your signaling to this issue and the outcry it attempted to rally were completely unnecessary.
The language of your post was quite hostile and painted (and continues to paint) the developers as being out of touch with instance admins. The instance admins are already “loud, clear and coordinated”, and are working in full communication with the maintainers.
Right now the instance admins that I’m working with are largely independent with only a couple of outliers. The newer instances that have just joined the fediverse didn’t really echo back their concerns. So while your statement might be true (I dunno, I don’t see any coordination, and it’s not always clear what admin concerns are important.) the rapid growth has brought even more stakeholders and admins to the fediverse. Some far less technical than others. I’m going to need more proof of deeper coordination, because as it stands many Admins say “Devs are tankies” and refuse to federate with the maintainer’s instance, let alone contribute code or money.
The majority of PR’s coming into the project are coming from instance admins seeking to solve their personal pain points. Both the issue and the PR you’re referring to were created by ruud…
This is a new phenomenon, the total lines of code written by the primary devs are still much larger than any other combination of PRs. I don’t envy the position of having to sort through thousands upon thousands of PRs that may or may not align with the project’s vision or code quality standards. Rolling back to a known prior state is almost always lower effort than minting a fresh new implementation.
Also, ruud did not create the PR I’m referring to, that honor goes to TKillFree. Heck, why do you think I’m attacking the author here rather than trying to bring more weight to his Github issue? It’s because of ruud that I even know what’s going on - and the instance admins I know were pretty clueless about the pending change.
I’ll grant you that my tone and signalling needs work, but I do think that an attempt to rally more folks did indeed influence the solutions that the maintainers were willing to accept. From “New, better implementation only - remove the existing flawed one now” to “Okay we can keep the flawed method, but we need an enhanced version and soon”.
At this point it’s hard to tell because we don’t live in a universe where I didn’t make that post to compare. Maybe you’re right and this would’ve all shaken out eventually.
Right now the instance admins that I’m working with are largely independent with only a couple of outliers. The newer instances that have just joined the fediverse didn’t really echo back their concerns. So while your statement might be true (I dunno, I don’t see any coordination, and it’s not always clear what admin concerns are important.) the rapid growth has brought even more stakeholders and admins to the fediverse.
Are you in the Matrix server? This is where the coordination is happening between both maintainers and the development community, as well as playing out across Github issues.
I’m going to need more proof of deeper coordination, because as it stands many Admins say “Devs are tankies” and refuse to federate with the maintainer’s instance, let alone contribute code or money.
You just said you’re only interacting with a small group of independent admins, but now you’re making a conflated statement of “many Admins”. They have a right to their opinion, but they can’t also expect the maintainers and devs/admins who are contributing code to listen to their demands when they’re bringing nothing to the table except complaints and personal attacks.
You have the choice to either support with code, support with money, or be happy with what you got for free and if you don’t like something you can make changes to it yourself.
The only reason you got what you wanted in the end was because someone else put in the work to make it happen, which I’m certain would have happened regardless of your post because it was already being raised both in the Matrix channel amongst other admins as well as by ruud.
There are other options.
I’m just a hobbyist, but I have built a couple websites with a few hundred users.
A stupidly simple and effective option I’ve been using for several years now, is adding a dummy field to the application form. If you add an address field, and hide it with CSS, users won’t see it and leave it blank. Bots on the other hand will see it and fill it in, because they always fill in everything. So any application that has an address can be automatically dropped. Or at least set aside for manual review.
I don’t know how long such a simple trick will work on larger sites. But other options are possible.
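For anyone wanting to try the hidden-field trick, here is a minimal sketch of the server-side check in Python. The field name `address`, the CSS class in the comment, and the `triage` helper are all made up for illustration; this is not how Lemmy or any particular site does it, just the general shape of the idea.

```python
# Hypothetical name for the decoy input. Real users never see it because the
# stylesheet hides it (for example `.extra-field { display: none; }`).
HONEYPOT_FIELD = "address"


def looks_like_bot(form: dict) -> bool:
    """Return True when the hidden decoy field came back filled in."""
    return bool(str(form.get(HONEYPOT_FIELD, "")).strip())


def triage(form: dict) -> str:
    """Set suspicious applications aside instead of silently dropping them."""
    return "manual_review" if looks_like_bot(form) else "accept"


human = {"username": "alice", "answer": "I like hiking", "address": ""}
bot = {"username": "xk3j9", "answer": "qwerty", "address": "123 Fake St"}

print(triage(human))  # accept
print(triage(bot))    # manual_review
```

Routing to manual review rather than rejecting outright is a judgment call: it costs some admin time, but it avoids losing a real user whose browser autofilled the hidden field.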
Fun fact, I purposefully goaded the bots into attacking my instance.
Turns out they aren’t even using the web form, they’re going straight to the register api endpoint with python. The api endpoint lives at a different place from the signup page and putting a captcha in front of that page was useless in stopping the bots. Now, we can’t just challenge requests going to the API endpoint since it’s not an interactive session - it would break registration for normal users as well.
The in-built captcha was part of the API form in a way that prevented this attack where the standard Cloudflare rules are either too weak (providing no protection) or too strong (breaking functionality).
In my case I had to create some special rules to exclude python clients and other bots while making sure to keep valid browser attempts working. It was kind of a pain, actually. There’s a lot of Lemmy that seems to trip the optional OWASP managed rules so there’s a lot of “artisanally crafted” exclusions to keep the site functional.
Anyways, I guess my point is form interaction is just one way to spam sites, but this particular attacker is using the backend API and forgoing the sign-up page entirely. Hidden fields wouldn’t be useful here, IMO.
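To illustrate why a captcha that is validated by the register endpoint itself stops this kind of script while a challenge on the signup page does not, here is a rough Python sketch. Everything in it is an assumption made for illustration (the `captcha_id` and `captcha_answer` payload keys, the in-memory store, the status strings); it is not Lemmy's actual code.

```python
import hmac
import secrets
import time
from typing import Optional

CAPTCHA_TTL_SECONDS = 600  # assumed lifetime of a challenge

# challenge id -> (expected answer, expiry timestamp); in-memory for the sketch
_pending = {}


def issue_captcha(answer: str, now: Optional[float] = None) -> str:
    """Store the expected answer and hand back an opaque challenge id."""
    now = time.time() if now is None else now
    captcha_id = secrets.token_urlsafe(16)
    _pending[captcha_id] = (answer, now + CAPTCHA_TTL_SECONDS)
    return captcha_id


def register(payload: dict, now: Optional[float] = None) -> tuple:
    """Handle a registration request; the captcha is checked here, in the API."""
    now = time.time() if now is None else now
    captcha_id = str(payload.get("captcha_id", ""))
    entry = _pending.pop(captcha_id, None)  # pop makes each challenge single use
    if entry is None:
        return 400, "captcha_missing_or_unknown"
    expected, expires_at = entry
    if now > expires_at:
        return 400, "captcha_expired"
    given = str(payload.get("captcha_answer", "")).lower().encode("utf-8")
    if not hmac.compare_digest(given, expected.lower().encode("utf-8")):
        return 400, "captcha_incorrect"
    return 200, "registered"


cid = issue_captcha("x7Kp")
direct_post = {"username": "bot", "password": "pw"}
browser_post = {"username": "alice", "password": "pw",
                "captcha_id": cid, "captcha_answer": "x7kp"}

print(register(direct_post))   # (400, 'captcha_missing_or_unknown')
print(register(browser_post))  # (200, 'registered')
print(register(browser_post))  # (400, 'captcha_missing_or_unknown')
```

The point of the sketch is only that a script posting straight to the endpoint has to carry a valid, unused challenge, no matter which page it skipped. It says nothing about how hard the challenge itself is to solve.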
Couldn’t the bots just be programmed to not fill out that field? Or not fill out any field flagged as hidden?
You’d think so.
But it’s not flagged as hidden. Instead you use CSS to set display as none. So the bot needs to do more than look at the direct HTML. It needs to fully analyze all the linked HTML, CSS, and even JavaScript files. Basically it needs to be as complex as a whole browser. It can’t be a simple script anymore. It becomes impractically complicated for the bot maker.
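A small Python sketch of what a simple script sees, assuming a made-up form where the decoy is hidden only by an external stylesheet rule such as `.extra-field { display: none; }`. The markup, field names, and class name are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical signup markup. Nothing here marks "address" as hidden,
# because the hiding rule lives in a separate CSS file.
SIGNUP_HTML = """
<form action="/apply" method="post">
  <input name="username" type="text">
  <input name="answer" type="text">
  <input name="address" type="text" class="extra-field">
  <input name="csrf" type="hidden" value="abc123">
</form>
"""


class InputCollector(HTMLParser):
    """Collect every input a naive script would treat as fillable."""

    def __init__(self):
        super().__init__()
        self.fillable = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # A simple script can skip type="hidden", but it has no way to
        # know about display:none without loading and applying the CSS.
        if tag == "input" and attributes.get("type") != "hidden":
            self.fillable.append(attributes.get("name"))


collector = InputCollector()
collector.feed(SIGNUP_HTML)
print(collector.fillable)  # ['username', 'answer', 'address']
```

The decoy shows up in the fillable list alongside the real fields, which is exactly what the trick relies on.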
This might work against very generic bots, but it won’t work against specialized bots. Those wouldn’t even need to parse the DOM, just recreate the HTTP requests.
Which is why you’d need something else for popular sites worth targeting directly. But there are more options than standard captchas. Replacing them isn’t necessarily a bad idea.
This is what I’m worried about. As the fediverse grows and gains popularity it will undoubtedly become worth targeting. It’s not hard to imagine it becoming a lucrative target for things like astroturfing, vote brigading etc bots. For centralized sites it’s not hard to come up with some solutions to at least minimize the problem. But when everyone can just spin up a Lemmy, Kbin, etc instance it becomes a much, much harder problem to tackle because instances can also be run by bot farms themselves, where they have complete control over the backend and frontend as well. That’s a pretty scary scenario which I’m not sure can be “fixed”. Maybe something can be done on the ActivityPub side, I don’t know.
That’s where simple defederation happens. It’s mostly why Beehaw cut off lemmy.world.
What if you have 100s or 1000s of such instances? At some point you defeat the entire purpose of the federation.
Yes, but it would take more work specific to this problem, which if it’s not a widespread technique would be viewed as impractical.
When you automate a browser process like signing up, you very likely manually set in your code the fields you want to fill, not sure why a bot would do that automatically… I don’t think this would be effective at all
The bots for the most part are generic. They fill in all fields with randomly generated nonsense mostly. If the site is large enough you could make a bespoke script, which is why I’m not sure how well it will scale to large sites.
But that’s only the simplest option. Another I’ve seen is using a collection of movie posters, you have the user pick the title from 5 or 6 options. There are lots of simple ways to defeat bots of all kinds.
Thanks for sharing that tip, I’m working with someone doing a small instance and we aren’t sure we want to be allowing applications, but if we do this is good to think about!
Despite what you’re implying, the devs have no duty to fix admin-reported problems using admin-dictated solutions.
They have already said they would accept a PR adding support for captchas. Someone will undoubtedly do this before long.
Until then, why the urgency? What is it that’s preventing you from keeping your instance on 0.17?
I disagree, once your open source project “sprouts wings” you enter an unspoken power battle. If enough of the community disagrees with something the chance of a successful fork grows. Once a project is forked away, you no longer have any control at all.
Also, even if I don’t upgrade to v0.18, I have to live in a fediverse that has other instances that WILL, and they might pose a problem with increased spam.
I disagree, once your open source project “sprouts wings” you enter an unspoken power battle
You’ve seen Hackers one too many times. Again you can run your instance however you want, and can defederate from instances that don’t implement things the way you are demanding they should, but you do not dictate how others (or the developers) run things.
The beauty of open source is you can always fork your own. The beauty of federation is you can block whoever you want or whatever instance you want.
Other than that, you have no right to demand anything of anyone.
No, I was around when SysV Init was “replaced” by Systemd and how that impacted the Debian project (and other distros).
But you know what, sure, let’s stick to your bad faith, insulting interpretation, after all it is more becoming of an internet troll. I’m sure it’ll get you lots of updoots from similarly trollish individuals.
Personally, I believe in something called collective responsibility, and that does include expecting community members to do their fair share. But it sounds like you envision federations as mini fiefdoms.
I’m not part of this conversation, I am not a mod, I am not an admin, and I’m not necessarily informed enough to make any determination on who is right and wrong. However,
You’ve seen Hackers one too many times.
There’s no such thing.
I’ll give you that one. Speaking of which, I should watch it again, I haven’t done so this year yet.
Also, even if I don’t upgrade to v0.18, I have to live in a fediverse that has other instances that WILL, and they might pose a problem with increased spam.
A fork avoids this problem how?
I disagree, once your open source project “sprouts wings” you enter an unspoken power battle. If enough of the community disagrees with something the chance of a successful fork grows. Once a project is forked away, you no longer have any control at all.
Who’s writing the code for the fork? If you see them, can you ask them to just submit the PR that the devs said they’ll approve?
That assumes that the fork would be mCaptcha rather than a simple reversion to the existing captcha. But yeah, the fork would initially be a roll back until mCaptcha is implemented either in our own or in the base repo.
And then you’ll need to convince every instance admin to swap to this fork.
Right but to your other point, the admins who don’t fork will send you spam.
… once again, the devs already said they would accept a PR with mCaptcha. I don’t see why any capable dev would fork a project rather than just contribute code. The community can disagree all they want - it takes actual programmers to split.
And if other instances start becoming spambots, just defederate.
Once a project is forked away, you no longer have any control at all.
What does that mean in the context of lemmy’s license? As I understand it, everyone is allowed to fork it away, but not allowed to change the license. Which allows everyone to fork it further away or back.
I don’t understand what control means in this context. Isn’t it a thing people can just modify and use, now and for all future?
That’s a bit decontextualized, but the idea is that other than the license terms ensuring that derivatives are also open source, there is also a power of community consensus and popular appeal. Your project will go further and get more improvements if it is popular and used by other developers. It’s less about forking having actual power, and more about what happens when folks feel they must fork because of a core issue with something the original project did that might take a while to be resolved. That can pull a larger group of people over to the fork and make it garner more interest than the original project.
So, do you think we need to step up to the developers to implement captcha or give way to the community and support a fork with better anti-spam measures?
I’m already in talks with some other admins about a potential fork. Initially we’d just roll back ONLY the captcha change, then work on a better implementation and roll it out in a way that doesn’t leave instances exposed.
It would be seamless for most users since it’s essentially the same thing as before, just with the Captcha code still included.
Instead of going to the effort of managing a fork, why not just submit a PR with hCaptcha to the base repo 🤣 As a Dev, it’s not that hard.
Sounds good, thank you for your efforts! I think the decision is bad for the current state of the lemmyverse, we need some roadblocks for spam bots.
I think forks and OG projects can live side by side, even more so, they can have a symbiotic relationship. The beauty of open source that we can learn from each other.
Captchas are pretty much worthless. They’re easily bypassed for basically free. You’re better off putting your instance behind Cloudflare with their captcha.
Okay, so do you mind explaining why the servers onboarding the most spam users are the ones without Captchas?
If they are so ineffective, why are they effective now?
Invisible captchas are about as useful as graphical ones and are significantly less annoying to the end user
Sure, so implement them in v0.18 rather than leaving that essential feature for a future release - that’s all I personally want.
I don’t care about the technical implementation of the Captcha, but given the current threat landscape of low effort bot attacks, removing the feature in the meantime just makes the fediverse worse off.
I’m mixed about this. When applied correctly, a graphical captcha will let zero bots in, at the expense of false positives and frustrated users. On the other hand, invisible / proof-of-work captchas will let a fraction of the bots in, while providing better experience for legitimate users. Pick your poison basically.
When applied correctly, a graphical captcha will let zero bots in
Absolutely untrue. There are services that will solve captchas for you for hundredths of a penny. It’s essentially free.
To be fair, “Captcha” can now mean those ai photo discrimination tests. Captcha: “Select the cats” - Me: “You call these cats?” Looks at the cartoon depictions of nightmare fuel “cats” as depicted by Picasso.
There are still graphical tests we perform that are much harder for computers to perform - at least without near-nation-state sized financial backing.
Yes, the ol’ scrambled captcha has been solved by multiple approaches these days, but it’s not nation states I’m seeking to keep out (and I’ll be fucked if they ever tried, I might add), I’m just looking to make it harder for some internet edgelord’s low effort spam attempts.
Sure but you can pay a company in India a few bucks for a few hundred captcha solves. It doesn’t matter what the captcha is, because a human is actually solving them, you’re just outsourcing it for literal pennies. It’s not difficult, either
Look, you keep returning to a point I’m not making, and it seems like it’s in bad faith.
You keep saying how captchas aren’t perfect. They never needed to be, and any sufficiently advanced attacker can bypass them. We’ve gone over that at length; you returning to this argument just shows how little else you have beyond a “Mondays always suck” / “Evil shall persist” mindset.
Your entire position is chasing me with “oh, but captcha doesn’t solve ALLLLLL bots”. Yeah, and laws don’t deter ALLL crime either.
Shall we remove these pesky laws of civil society? I mean, after all, why abide by rules that any one person can choose not to follow? What good are they anyways?
You know it’s an inane point that has no logical conclusion, but I think you probably already know that and I’m done assuming good faith in your trolling.
Then it’s no longer only a bot, right? There are real humans working on those captcha farms. Those captcha farms also won’t solve the captcha instantly; there will be some delay for a human to solve it. You’re effectively turning a graphical captcha into a proof-of-work captcha this way, which will have the same effect as mCaptcha due to increased cost (in this case, captcha farm cost instead of computational cost) for the bot operator.
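For anyone who hasn’t seen a proof-of-work captcha before, here is a rough hashcash-style sketch of the general idea in Python. To be clear, this is NOT mCaptcha’s actual protocol or API; the function names (`solve`, `verify`), the challenge string, and the difficulty value are all made up for illustration. The point is only that the client has to burn CPU to find a nonce, while the server checks it with a single hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (expensive, ~2**difficulty hashes)."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    # Hypothetical per-signup challenge; a real server would issue a random one.
    nonce = solve("signup-challenge-123", difficulty=12)
    print(nonce, verify("signup-challenge-123", nonce, 12))
```

Raising the difficulty by one bit roughly doubles the work per signup, which is the "increased cost for the bot operator" knob, just paid in CPU time instead of captcha farm fees.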
Because this spam-bot seems to be currently only targeting these instances.
So what you’re saying is that a poorly constructed door is better than none at all? Huh. That was my exact point.
No, I am saying that this bot seems to specifically look for instances without captcha and doesn’t even try others. Low hanging fruit and all that. If all admins enable captchas the bot would just switch to those and circumvent the cheap captcha that is currently implemented in Lemmy.
So the solution is to force everyone to be low hanging fruit in the meantime?
Look, I get where everyone is going in terms of improvements, but to remove an already working solution and leaving folks exposed in the meantime is not how we should be rolling improvements.
See my other comment. Lemmy already implements other ways to prevent this from happening that are much more effective.
Email validation works only until my domain gets blacklisted…
Manual registration only works up until a certain size…
What other effective solution shall I consider? Those aren’t very effective to me.
While I agree in the practical sense (I use CloudFlare myself), it kind of goes against the spirit of the fediverse as it centralises around a single corporation.
I don’t fully understand your argument. You’re using a centralized caching layer, sure, but the actual application that matters is still federated?
If everyone uses cloudflare and cloudflare goes down…
I find myself really ticked off at other Admin’s abdication of duty when it comes to engaging with the developers.
Abdication of duty? Seriously? Do you think this is a job for people? Or that people that want a privacy related instance are “abdicating their duty” by not using captcha? Talk about hyperbole.
Run your instance how you want. Raise an issue with the devs if you want. Throw a fit if you want. But do not attempt to tell others how to run their instances or talk for other people and their “duties” when it comes to their own servers.
We need to make it VERY clear that Captcha is required before v0.18’s release. Not after when we’ll all be scrambling…
You would honestly be surprised. Captcha isn’t nearly as effective at stopping spam. It only stops the lowest hanging fruit.
Most of the “spambot” developers started using AI-based tools a while back.
It only helps stop the lowest-hanging fruit.
Also, due to the way federation and all works… well, just remember, there are a million ways for spammers to get access currently…
Not just AI tools. They outsource captcha solving to cheap human labour.
You are 100% correct, I had forgotten all about that happening.
Sure, I agree that the current implementation isn’t the most robust in stopping all conceivable bots. Heck, it’s quite poor as some others have pointed out.
The reality is, though, that it is currently making a difference for many server admins, now, today.
Let’s use a convoluted metaphor!
It’s as if each lemmy instance has some poorly constructed umbrellas (old captcha). Now a storm has arrived (bot signups) and while the umbrella is indeed leaky, the umbrella operator is not as wet as they would be without it. Now imagine that these magical, auto-upgrading umbrellas receive an update during this storm that removes the fabric entirely while they work on making a less leaky solution. It would be madness right? It’s not about improving on the product, that’s desired and good! It’s about making sure the old way of doing things is there until the newer solution is delivered and present.
As a user of this “magical umbrella”, I’d be scrambling because a feature that was working (albeit poorly and imperfectly) suddenly doesn’t exist at all anymore. Good thing I have a MUCH bigger umbrella that I pay $$$ for (cloudflare) to set up in the meantime. However this huge umbrella is too big, and if I don’t cut some holes in it, it’ll be too “dark” to function. So not even this solution is perfect.
The reality is, though, that it is currently making a difference for many server admins, now, today.
Don’t hear me wrong- I am not advocating for its removal. I am not saying it’s not currently effective!
Let’s be perfectly honest, it is THE most effective tool at our disposal currently.
I am just saying, as this platform explodes in size- since the implementation for the captcha is also open source, it’s only a matter of time before it’s rendered completely inoperable- to where it only stops the easiest attacks.
Nutomic has said they’re open to restoring captchas, but it will require a fair amount of work to bring the 0.17 implementation into 0.18, which they currently don’t have the bandwidth to implement.
They’ve also said they’re open to PRs, so if someone really wants this feature they can open a PR for inclusion in the 0.18 release.
NO it is not the best we can do, we need to be applying some pressure to the developers here and that requires EVERYONE to do their part.
I sure hope you’re supporting them financially considering the demands you’re making that require their time and labor.
Someone has already submitted a PR with the changes the dev recommended. The captcha stuff is in a new db table instead of in-memory at the websocket server.
However, from one of the devs:
One note, is that captchas (and all signup blocking methods) being optional, it still won’t prevent people from creating bot-only instances. The only effective way being to block them, or switch to allow-only federation.
Once people discover the lemmy-bots that have been made that can bypass the previous captcha method, it also won’t help (unless a new captcha method like the suggested ones above are implemented).
The root of the issue seems to be that they’ve removed websockets, for the following reasons:
Huge burden to maintain, both on the server and in lemmy-ui. Possible memory leaks. Not scalable.
I can understand them wanting to make their lives a bit easier (see “huge burden to maintain”) - Lemmy has exploded recently (see “not scalable”) and there are far bigger issues to fix, and an even larger number of bad actors (see “possible memory leaks”) who have learned about Lemmy at the same time as everyone else and want to exploit or break it.
Just enable admin approval and put a sensible registration rate limit. Works better without being a massive accessibility problem with dubious help against bots.
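To make “a sensible registration rate limit” a bit more concrete, here is a minimal sliding-window sketch in Python. This is not Lemmy’s built-in rate limiter and doesn’t use any of its settings; the class name `RegistrationLimiter` and the numbers are hypothetical, purely to show the mechanism of “at most N signups per IP per window”.

```python
from collections import defaultdict, deque


class RegistrationLimiter:
    """Allow at most `max_signups` registrations per IP within `window_seconds`."""

    def __init__(self, max_signups: int, window_seconds: float):
        self.max_signups = max_signups
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # ip -> timestamps of accepted signups

    def allow(self, ip: str, now: float) -> bool:
        recent = self._seen[ip]
        # Forget signups that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_signups:
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = RegistrationLimiter(max_signups=3, window_seconds=3600)
    for t in range(5):
        # 203.0.113.7 is a documentation-only example address.
        print(t, limiter.allow("203.0.113.7", now=float(t)))
```

Note that a per-IP limit like this does little against an attacker rotating through many addresses, which is part of why it gets paired with admin approval in the suggestion above.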
Sure, that might work for me, but it doesn’t scale well for many other larger instances.
I’m not saying to not improve, quite the contrary, improvement is important. I’m saying don’t take away the ONE thing that’s preventing the spam issue from getting worse.
To be clear, I am a developer in real life. I’m not just talking out of my ass. There are ways to roll out a new implementation without leaving everyone exposed.
Since you’re a dev, submit a PR for a new captcha. I’m not even using the feature on my instance as I have open signups disabled. So no, I won’t be hassling the devs. If something comes up that I want changed badly enough, I’ll implement it myself.
Maybe the problem is with running larger instances without enough staff?
I do see a potential problem in that lack of attention will result in waves of defederation over time. But I don’t think captchas will provide a long-term solution. Long-form applications work well for mid-sized sites and smaller… or at least will until bots start using AI to fill them out.
I know I’m veering kinda OT right now but speaking of captchas, they can also be used as a troll throttle by requiring captchas for posting if heuristics (think spamassassin) say that a user is being inflammatory, or falling for troll bait, or such. In case you understand German, have a video.
One specific feature of such a system is that it never absolutely denies users to post their comment as-is, but it may require them to solve multiple captchas (by claiming that the previous ones failed). That is, it bogs down to a simple psychological equation: Do I really care about being an assclown or feeding trolls enough to jump through those hoops. Especially the discouraging of troll feeding is highly effective as when trolls don’t get engagement, they leave.
Haha, very entertaining :P
Related issue: https://github.com/LemmyNet/lemmy/issues/3204
The devs seem to prefer mCaptcha (a proof-of-work captcha) over graphical captchas.
Back to my original point, it’s fantastic that the work is planned, but unless they roll back the removal, v0.18 is going to be a huge headache, and not just for the admins of servers running v0.18, but everyone else too.
Another option is to put the instance behind cloudflare and enable the highest security settings on the signup/login page using a page rule. Just be careful not to apply it to the whole site to avoid cloudflare blocking ActivityPub traffic.
Yeah that’s definitely an option for me, I love cloudflare. I just know many who prefer to host from home or not use a cloud firewall solution.
Fun fact, the OWASP managed rules break a bunch of things too. I’ve managed to carve out enough exceptions in the rule to be useful just now, but it took some trial and error.
Then I’ll have to change that again in the new version if it ships the new API and removes websockets. As it stands websockets largely bypasses a lot of what Cloudflare does - so an API is likely to cause more issues not less as we figure out that “Hey, the POST action here causes OWASP to trip because it’s not as sanitized of an input as it could be”.
Another poster said this would be trivial. I mean, it is for static stuff. But doing all the federation, allowing API interactions, and being somewhat resilient to malicious actors is a hard balance to find when changes move at a quick pace.
Who is impacted? Everyone, or just instances upgrading to 0.18?
To be honest, your post doesn’t really explain the current situation and impact. It’s a call to arms, but I have no idea how important or impactful it is.
Everyone is impacted, but especially moderators and admins. Moderators will see more spam if Captcha is removed, even if their own instance isn’t on v0.18 - they will exist in a fediverse with instances that are on v0.18.
Admins are impacted because Captcha served as a decent way, when coupled with email validation, to combat spam account sign ups.
Thank you for your response, makes sense. Hmmm. On one hand, I agree with the developers in that they have to develop features that the foundation requires. They have full time employment contract with them.
They have to prioritize tasks and features using a whole different set of variables than what the users deem important.
I think both groups’ intentions are understandable. But I think this just highlights the importance and the need of open source contribution. We need more volunteers to implement features desired by the community.
Hunh.
I just had a surge of user registrations on my instance.
All passed the captcha. All passed the email validation.
All had a valid-sounding response.
I am curious to know if they are actual users, or… if I just became the host of a spam instance. :-/
Doesn’t appear to be an easy way to determine.
Hmmm, I’d check the following:
- Do the emails follow a pattern? (randouser####@commondomain.com)
- Did the emails actually validate, or do you just not see bouncebacks? There is a DB field for this that admins can query (I’ll dig it up after I make this high level post)
- Did the surge come from the same IP? Multiple? Did it use something that doesn’t look like a browser?
- Did the surge traffic hit /signup or did it hit /api/v3/register exclusively?
With those answers I should be able to tell if it’s the same or similar attacker getting more sophisticated.
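For the first check (do the emails follow a pattern?), something like this rough Python sketch can help when you have more signups than you can eyeball. It assumes you have already exported the email addresses into a plain list; `email_shape` and `suspicious_shapes` are names I made up, and the threshold is arbitrary. It only catches the simple “same stem plus digits at the same domain” case, nothing cleverer.

```python
import re
from collections import Counter


def email_shape(email: str) -> str:
    """Lowercase the address and collapse every run of digits in the local part to '#'."""
    local, _, domain = email.lower().rpartition("@")
    return re.sub(r"\d+", "#", local) + "@" + domain


def suspicious_shapes(emails, min_count=3):
    """Return shapes shared by at least `min_count` addresses."""
    counts = Counter(email_shape(e) for e in emails)
    return {shape: n for shape, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Made-up addresses on reserved example domains.
    signups = [
        "randomuser1234@example.com",
        "randomuser0042@example.com",
        "randomuser9@example.com",
        "alice@example.org",
    ]
    print(suspicious_shapes(signups))
```

A hit here is only a hint to look closer; plenty of real people have digits in their address.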
Some patterns I noticed in the attacks I’ve received:
- it’s exactly 9 attempts every 30 minutes from the user agent “python/requests”
- The users that did not get an email bounceback were still not authenticated hours later (maybe the attacker lucked out with a real email that didn’t bounce back?). There was no effort to verify from what I could determine.
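If you want to look for that same “N attempts every 30 minutes from one user agent” pattern in your own logs, here is a minimal sketch. It assumes the access log has already been parsed into `(unix_timestamp, user_agent, path)` tuples, which is my assumption and not any particular log format; the function names are hypothetical too. The `/api/v3/register` path is the one mentioned in the checklist above.

```python
from collections import Counter

REGISTER_PATH = "/api/v3/register"
WINDOW_SECONDS = 30 * 60


def attempts_per_window(records):
    """Count registration hits per (user agent, 30-minute bucket)."""
    counts = Counter()
    for timestamp, user_agent, path in records:
        if path == REGISTER_PATH:
            counts[(user_agent, int(timestamp // WINDOW_SECONDS))] += 1
    return counts


def repeating_bursts(records, burst_size=9, min_windows=2):
    """User agents that hit `burst_size` or more attempts in at least `min_windows` buckets."""
    flagged = Counter()
    for (user_agent, _bucket), hits in attempts_per_window(records).items():
        if hits >= burst_size:
            flagged[user_agent] += 1
    return {ua: n for ua, n in flagged.items() if n >= min_windows}


if __name__ == "__main__":
    # Synthetic records: two bursts of 9 from a script, plus one browser signup.
    records = [(float(t), "python/requests", REGISTER_PATH) for t in range(9)]
    records += [(1800.0 + t, "python/requests", REGISTER_PATH) for t in range(9)]
    records.append((100.0, "Mozilla/5.0", REGISTER_PATH))
    print(repeating_bursts(records))
```

Fixed buckets can split a burst that straddles a boundary, so treat a miss as inconclusive rather than as an all-clear.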
Some vulnerabilities I know that can be exploited and would expect to see next:
- ChatGPT is human enough sounding for the registration forms. I’ve got no idea why folks think this is the end-all solution when it could be faked just as easily.
- Duplicate Email conflicts can be bypassed by using a “+category” in your email. ie (someuser+lemmy@somedomain.com) This would allow someone to associate potentially hundreds of spam accounts with a single email.
ChatGPT is human enough sounding for the registration forms. I’ve got no idea why folks think this is the end-all solution when it could be faked just as easily.
I think it would be interesting if we could find a prompt that doesn’t work well with LLMs. Originally they struggled with math for example, but I wonder if it’d be possible to make a math problem that’s simple enough for most humans to solve but which trips up LLMs into outputting garbage.
Duplicate Email conflicts can be bypassed by using a “+category” in your email.
I personally use this to track who sends my email address where, since people usually don’t strip this from the address. It’s definitely abusable, but also has legitimate uses.
Not so sure on the LLM front, GPT4+Wolfram+Bing plugins seems to be a doozy of a combo. If anything there should be perhaps a couple interactable elements on the screen that need to be interacted with in a dynamic order that’s newly generated for each signup. Like perhaps “Select the bubble closest to the bottom of the page before clicking submit” on one signup and “Check the box that’s the furthest to the right before clicking submit”?
Just spitballin it there.
As for the category on email address - certainly not suggesting they remove supporting it, buuuuutttt if we’re all about making sure 1 user = 1 email address, then perhaps we should make the duplication check a bit more robust to account for these types of emails. After all someuser+lemmy@somedomain.com is the same as someuser@somedomain.com but the validation doesn’t see that. Maybe it should?
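A minimal sketch of what a “more robust” duplicate check could look like, in Python. This is not Lemmy’s actual validation code; `canonical_email` and `is_duplicate` are hypothetical names. It also assumes the provider treats everything after `+` as a tag and ignores case in the local part, which is a common convention but not something every mail server guarantees.

```python
def canonical_email(email: str) -> str:
    """Strip a '+tag' suffix from the local part and lowercase the whole address."""
    local, _, domain = email.strip().lower().rpartition("@")
    local = local.split("+", 1)[0]
    return f"{local}@{domain}"


def is_duplicate(new_email: str, existing_emails) -> bool:
    """True if the new address collapses to the same mailbox as an existing one."""
    seen = {canonical_email(e) for e in existing_emails}
    return canonical_email(new_email) in seen


if __name__ == "__main__":
    existing = ["someuser@somedomain.com"]
    print(is_duplicate("someuser+lemmy@somedomain.com", existing))
    print(is_duplicate("otheruser@somedomain.com", existing))
```

The canonical form would only be used for the uniqueness comparison; mail should still be sent to the address exactly as the user typed it, so legit uses of `+category` keep working.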
I like your idea of interaction-based authentication. Extra care would need to go into making sure it’s accessible, but otherwise I think that would be a stronger challenge for LLMs to solve. (Keep in mind LLMs can still receive the page’s HTML as context, but that seems like it could present as a stronger challenge even still.)
perhaps we should make the duplication check a bit more robust to account for these types of emails
This makes sense to me. I could be wrong, but the assumption of 1 email = 1 user doesn’t seem unreasonable, especially since there’s no cost to making a new email address.
When it comes to LLMs we could use questions which they refuse to answer.
Obviously ‘How to build a pipe bomb’ is out of the question, but something like ‘What’s your favorite weapon of mass destruction?’, or ‘If you’d need to hide a body, how would you do it?’ might be viable
- Different providers, no pattern. Some gmail. some other.
- Not sure
- Also- not sure.
- Not sure of that either!
But, here is the interesting part- Other than a few people I have personally invited, I don’t think anyone else has ever requested to join.
Then, out of the blue, boom, a ton of requests. And- then, nothing followed after.
The responses, sounded human enough. spez bad, reddit sinking, etc.
But, the traffic itself, didn’t follow… what I would expect from social media spreading. /shrugs.
Curious if you got a mention somewhere on reddit. It used to happen to our novelty sub whenever a thread blew up and suddenly thousands of eyes were on a single comment with the subreddit link.
Huh, that is interesting, yeah, that pattern is very anomalous. If you have DB access you can try to run this query to return all un-verified users and see if you can identify if the email activations are being completed:
SELECT p.id, p.name, l.email FROM person AS p LEFT JOIN local_user AS l ON p.id=l.person_id WHERE p.local=true AND p.banned=false AND l.email_verified='f'
Only 7 accounts still pending, 2 of which, are unrelated to the above flood.
The email addresses are left out for privacy- however, they are EXTREMELY normal sounding email addresses.
Based on the provided emails, usernames, and request messages- I’d say, it certainly looks like legit users.
Just- very odd timing.
5 huh? That’s actually notable. So far I haven’t seen a real human user take longer than a couple of hours to validate. Human registrations on my instance seem to have a 30% attrition. That is, of 10 real human users, I can reasonably expect that 3 won’t complete the flow. It seems like your case might be nearing 40-50% which isn’t unheard of, but couple this with how quickly these accounts were created - I think you are looking at bots.
The kicker is, though, if one of them IS a real user, it’s going to be almost impossible to find out.
This is indeed getting more sophisticated.
I wish I could see this time period on a cloudflare security dashboard, I’m sure there could be a few more indicators there.
cloudflare security dashboard
Didn’t really see anything that stood out there either. A handful of users accessing via tor, but that’s about it.
Ended up turning the security policy from low, back up a bit though, forgot I turned it down while troubleshooting some federation issues.
Oh! I just remembered something. Isn’t there a site that recommends a lemmy instance? Might it make sense that multiple users found your website because they change the recommendation to distribute new users to smaller instances (hourly perhaps)? Does that sort of pattern hold in this case?
ChatGPT is human enough sounding for the registration forms. I’ve got no idea why folks think this is the end-all solution when it could be faked just as easily.
A simple deterrent for this could be to “hide” some information in the rules and request that information in the registration form. Not only are you ensuring that your users have at least skimmed the rules, you’re also raising the bar of difficulty for spammers using LLMs to generate human-sounding applications for your instance. Granted it’s only a minor deterrent, this does nothing if the adversary is highly motivated, but then again the same can be said of a lot of anti-spammer solutions. :)
I think what you can do is take a small subset of users that have registered in your instance and observe their behavior. If you notice a lot of them acting in bad faith, then it’s likely that a lot of the user registrations in your instance are bots. How active are the users in your instance in terms of posting and commenting?
Been keeping an eye- I don’t think any of them are actually even active. At least, in the sense I don’t see any posts/comments.
I mean for now it seems okay. I took the liberty of checking out your instance and it seems okay to me too, but still keep an eye out for bad actors.
My current assumption- based on the data I dug up, it appears to be legit traffic originating from reddit.
I just don’t think the users realize their account was approved… perhaps. /shrugs.
Unexpected wave of traffic I suppose.
Possibly people who don’t get approved immediately move on to another server and settle in.
Glad to know I was here and did my part by reading this post. We couldn’t have succeeded without me!🫡
There’s nothing stopping instance owners from incorporating their own security measures into their infrastructure as they see fit, such as a reverse proxy with a modern web application firewall, solutions such as Cloudflare and the free captcha capabilities they offer, or a combination of those and/or various other protective measures. If you’re hosting your own Lemmy instance and exposing it to the public, and you don’t understand what would be involved in the above examples or have no idea where to start, then you probably shouldn’t be hosting a public Lemmy instance in the first place.
It’s generally not a good idea to rely primarily on security to be baked into application code and call it a day. I’m not up to date on this news and all of the nuances yet, I’ll look into it after I’ve posted this, but what I said above holds true regardless.
The responsibility of security of any publicly hosted web application or service rests squarely on the owner of the instance. It’s up to you to secure your infrastructure, and there are very good and accepted best practice ways of doing that outside of application code. Something like losing baked in captcha in a web application should come as no big deal to those who have the appropriate level of knowledge to responsibly host their instance.
From what this seems to be about, it seems like a non-issue, unless you’re someone who is relying on baked in security to cover for your lack of expertise in properly securing your instance and mitigating exploitation by bots yourself.
I’m not trying to demean anyone or sound holier than thou, but honestly, please don’t rely on the devs for all of your security needs. There are ways to keep your instance secure that don’t require their involvement, and that are best practice anyways. Please seek to educate yourself if this applies to you, and shore up the security of your own instances by way of the surrounding infrastructure.
I think that’s a heck of a loaded assumption there that I’m relying on the Devs here
Cloudflare ✅ Strict Firewall Rules ✅ Hosting on an actual cloud provider rather than my home ✅ CSAM ✅
However, that’s come with other tradeoffs in usability, speed, and federation experience.
Just today I found that the OWASP managed rules in Cloudflare end up blocking certain functions of the site, sure I’ll be adding an exception/rule for that, but it’s not a straight forward task. Heck, the removal of websockets will require quite a few changes in my Cloudflare config.
Sure, someone truly concerned with security knows to do this, but that’s definitely not going to be everyone, and now with the current spam situation we’re turning individual instance problems into “everyone problems”.
Can you elaborate which functions are blocked by the managed rules? I haven’t noticed anything legit being blocked yet, just a bunch of obviously malicious things.
Yeah, I can’t seem to upload photos without whitelisting /pictrs/ from the OWASP managed ruleset. It wasn’t being “blocked” but it was trying to do a managed challenge and the lemmy-ui’s code didn’t really understand what to do with it, so it would just throw an error on upload.
I would recommend reconsidering that solution - I’ve already seen some malicious image uploads which Cloudflare has caught and prevented. For example:
Maybe you can check which specific rule from the ruleset was being triggered? For me, legit uploads are still working with the default ruleset (as you can see by the screenshot I uploaded in this very comment), so maybe you enabled some extra rules?
Interesting, well, I guess I sound vague because the error was pretty vague:
Cloudflare OWASP Core Ruleset 949110: Inbound Anomaly Score Exceeded
So yeah, I wouldn’t even think of Disabling the standard Cloudflare Managed Ruleset, and I think this issue is limited to this OWASP one only. I think I’m still safe here, but I think I can just exclude only this one particular rule.
However, that’s come with other tradeoffs in usability, speed, and federation experience.
Like what? If properly configured none of the things listed should negatively impact hosting a Lemmy instance.
sure I’ll be adding an exception/rule for that, but it’s not a straight forward task.
It honestly should be to someone who would be hosting any public web application using Cloudflare. Cloudflare makes all of this quite easy, even to those with less experience.
Heck, the removal of websockets will require quite a few changes in my Cloudflare config.
What config are you referring to? In the Cloudflare console? For websockets changing to a REST API implementation there should be nothing at all you need to do.
Sure, someone truly concerned with security knows to do this, but that’s definitely not going to be everyone
And it shouldn’t have to be everyone, only those who take on the responsibility of hosting a public web application such as a Lemmy instance.
No matter the capabilities inherent in what you choose to host, the onus rests on the owner of the infrastructure to secure it.
Everyone should be free to host anything they want at whatever level of security (even none) if that’s what they want to do. But it’s not reasonable nor appropriate to expect it to be done for you by way of application code. It’s great if security is baked in, that’s wonderful. But it doesn’t replace other mitigations that according to best practices should rightfully be in place and configured in the surrounding infrastructure.
In the case of the captcha issue we’re discussing here, there’s more than enough appropriate, free solutions that you can use to cover you appropriately.
I’m surprised some large instances aren’t using Cloudflare. It takes a few minutes to set up and the added benefit of caching alone is worth it. Let alone the bot/ddos protection.
I know right? The free tier would be enough to handle most anything and would take a tremendous load off of the origin server with proper Cache Rules in place. I can’t remember which instance it was, but one of the big ones started to use Cloudflare but then backtracked because of “problems”. When I saw that, I couldn’t help but think that they just didn’t know what they were doing. My instance is currently behind Cloudflare, and I’ve had no problem whatsoever with anything.