FOSS infrastructure is under attack by AI companies

WorkingLemmy@lemmy.world · 2 days ago

carrylex@lemmy.world · 20 hours ago

While AI crawlers are a problem, I’m also kind of astonished that so many projects don’t use tools like rate limiters or IP blocklists. These are pretty simple to set up, cause little or no additional load, and don’t cause collateral damage for legitimate users who just happen to use a different browser.
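
For readers unfamiliar with the idea, a per-IP rate limiter can be sketched as a token bucket. This is only an illustration, not something from the comment above: the class name `TokenBucketLimiter`, its `rate` and `burst` parameters, and the injectable `clock` are all assumptions made for the example, and a real deployment would more likely use the limiter built into the web server or reverse proxy.

```python
import time


class TokenBucketLimiter:
    """Per-client token bucket: refills `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.buckets = {}   # client_ip -> (tokens_left, time_of_last_request)

    def allow(self, client_ip):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # A client seen for the first time starts with a full bucket.
        tokens, last = self.buckets.get(client_ip, (self.burst, now))
        # Refill in proportion to the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[client_ip] = (tokens - 1, now)
            return True
        self.buckets[client_ip] = (tokens, now)
        return False
```

With `rate=1.0` and `burst=2`, one address gets two requests immediately and then roughly one per second, while other addresses are unaffected. Note that this sketch never evicts old entries, so a long-running service would need to prune `buckets` periodically.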

bountygiver [any]@lemmy.ml · 17 hours ago

The article posted yesterday mentioned that a lot of these requests are only made once per IP address. The botnet is absolutely huge.
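
One way to check whether this applies to your own traffic is to measure what share of client addresses show up exactly once in an access log. The sketch below assumes the IPs have already been extracted from the log into a list; the function name `single_request_share` is made up for this example.

```python
from collections import Counter


def single_request_share(client_ips):
    """Return the fraction of distinct IPs that made exactly one request."""
    counts = Counter(client_ips)
    if not counts:
        return 0.0
    singles = sum(1 for n in counts.values() if n == 1)
    return singles / len(counts)


# Made-up sample: three addresses appear once, one appears twice.
sample = [
    "203.0.113.1",
    "203.0.113.2",
    "203.0.113.3",
    "198.51.100.9",
    "198.51.100.9",
]
print(single_request_share(sample))  # 0.75
```

If that share is close to 1.0, per-IP thresholds will almost never trigger, because no single address sends enough requests to cross them.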

MTK@lemmy.world · 19 hours ago

IP based blocking is complicated once you are big enough or providing service to users is critical.

For example, if you are providing some critical service such as health care, you cannot have a situation where a user cannot access health care info without hard proof that they are causing an issue and that you did your best to not block the user.

Let’s say you have a household of 5 people with 20 devices on the LAN. One device can be infected and running some bot, and you do not want to block all 5 people and 20 devices.

Another example is double NAT: you could have literally hundreds or even thousands of people behind one IP.

litchralee@sh.itjust.works · edit-2 18 hours ago

Let’s say you have a household of 5 people with 20 devices on the LAN. One device can be infected and running some bot, and you do not want to block all 5 people and 20 devices.

Why not, though? If a home network is misbehaving, whoever is maintaining that network needs to 1) be aware that there’s something wrong, and 2) fix it on their end. Most homes don’t have a Network Operations Center to contact, but throwing an error code in a web browser is often effective since someone in the household will notice. Unlike institutional users, home devices are not totally SOL when blocked, as they can be moved to cellular networks or other WiFi networks.

At the root of the problem, NAT deprives the users behind it of agency: they’re all in the same barrel, and the maxim about bad apples will apply. You’re right that it gets even worse for CGNAT, but that’s more a reason to refuse all types of NAT and prefer end-to-end IPv6.

carrylex@lemmy.world · 13 hours ago

IP based blocking is complicated once you are big enough

It’s literally as simple as importing an ipset into iptables and refreshing it from time to time. There are even predefined tools for that.
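
A rough sketch of what that refresh could look like, assuming the blocklist is already available as a list of CIDR strings (where the list comes from is left out). The function name `build_ipset_restore` and the set name `blocklist` are placeholders for this example. It generates input for `ipset restore` that fills a temporary set and then swaps it in, so the live set is never empty mid-update.

```python
import ipaddress


def build_ipset_restore(set_name, cidrs):
    """Return text for `ipset restore` that replaces the contents of `set_name`."""
    tmp = set_name + "-tmp"
    # Parse and de-duplicate; strict=False normalises entries like 192.0.2.5/24.
    nets = {ipaddress.ip_network(c.strip(), strict=False) for c in cidrs if c.strip()}
    # A `family inet` set only holds IPv4, so IPv6 entries are skipped here.
    v4 = sorted(n for n in nets if n.version == 4)
    lines = [
        f"create {set_name} hash:net family inet",
        f"create {tmp} hash:net family inet",
        f"flush {tmp}",
    ]
    lines += [f"add {tmp} {n}" for n in v4]
    # Swap the freshly filled set into place, then drop the temporary one.
    lines += [f"swap {tmp} {set_name}", f"destroy {tmp}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Documentation-range addresses, used purely as sample data.
    print(build_ipset_restore("blocklist", ["198.51.100.0/24", "192.0.2.5/24"]), end="")
```

The output would be piped into `ipset -exist restore` (the `-exist` flag makes the `create` lines harmless when the set already exists), and the set is referenced once from the firewall with a rule along the lines of `iptables -I INPUT -m set --match-set blocklist src -j DROP`. Running the script from a timer covers the "refreshing it from time to time" part.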