• carrylex@lemmy.world
    arrow-up
    7
    ·
    20 hours ago

    While AI crawlers are a problem, I’m also kind of astonished that so many projects don’t use tools like rate limiters or IP blocklists. These are pretty simple to set up, cause little or no additional load, and don’t cause collateral damage for legitimate users who just happen to use a different browser.
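For readers who have not set one up, here is a minimal sketch of what a per-IP rate limiter does, written as a token bucket. The class name, the `rate` and `burst` parameters, and the injectable `clock` are illustrative assumptions, not the API of any particular tool; in practice this usually lives in the reverse proxy or firewall rather than in application code.

```python
import time


class PerIpRateLimiter:
    """Token bucket per client IP: `rate` requests/second, bursts up to `burst`."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate      # tokens added per second
        self.burst = burst    # maximum tokens a bucket can hold
        self.clock = clock    # injectable so the limiter can be tested
        self.buckets = {}     # ip -> (tokens remaining, time of last update)

    def allow(self, ip: str) -> bool:
        now = self.clock()
        # A client we have not seen yet starts with a full bucket.
        tokens, last = self.buckets.get(ip, (float(self.burst), now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False
```

A request handler would call `allow(client_ip)` and answer with HTTP 429 when it returns `False`. Note that this sketch never evicts old buckets, so a real deployment would need some expiry.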

    • bountygiver [any]@lemmy.ml
      English
      arrow-up
      7
      ·
      17 hours ago

      The article posted yesterday mentioned that a lot of these requests are made only once per IP address. The botnet is absolutely huge.
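A small sketch of why that matters for per-IP limits: a cap on requests per address rejects almost everything from one noisy client, but nothing from a crawl that uses each address once. The function name, the cap of 10, and the addresses are made up for illustration.

```python
from collections import Counter


def blocked_requests(request_ips, max_per_ip):
    """Count the requests a per-IP cap would reject within one window."""
    seen = Counter()
    blocked = 0
    for ip in request_ips:
        seen[ip] += 1
        if seen[ip] > max_per_ip:
            blocked += 1
    return blocked


# One noisy client: 1000 requests from a single address.
single_source = ["203.0.113.5"] * 1000

# Distributed crawl: 1000 requests, each from a different (hypothetical) address.
distributed = [f"10.{i // 256}.{i % 256}.1" for i in range(1000)]

print(blocked_requests(single_source, max_per_ip=10))  # 990
print(blocked_requests(distributed, max_per_ip=10))    # 0
```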

    • MTK@lemmy.world
      arrow-up
      3
      ·
      19 hours ago

      IP based blocking is complicated once you are big enough or providing service to users is critical.

      For example, if you are providing a critical service such as health care, you cannot block a user from accessing health care info unless you have hard proof that they are causing an issue and that you did your best to avoid blocking them.

      Let’s say you have a household of 5 people with 20 devices in the LAN, one can be infected and running some bot, you do not want to block 5 people and 20 devices.

      Another example: with double NAT, you could have literally hundreds or even thousands of people behind one IP.

      • litchralee@sh.itjust.works
        English
        arrow-up
        5
        arrow-down
        1
        ·
        edit-2
        18 hours ago

        Let’s say you have a household of 5 people with 20 devices in the LAN, one can be infected and running some bot, you do not want to block 5 people and 20 devices.

        Why not, though? If a home network is misbehaving, whoever maintains that network needs to 1) be aware that something is wrong, and 2) fix it on their end. Most homes don’t have a Network Operations Center to contact, but throwing an error code in a web browser is often effective, since someone in the household will notice. Unlike institutional users, home devices are not totally SOL when blocked, as they can be moved to cellular networks or other WiFi networks.

        At the root of the problem, NAT deprives the users behind it of agency: they’re all in the same barrel, and the maxim about bad apples will apply. You’re right that it gets even worse for CGNAT, but that’s more a reason to refuse all types of NAT and prefer end-to-end IPv6.
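To make the IPv6 point concrete, here is a rough sketch of how a blocker could pick the unit it blocks: a single address for IPv4 (which, behind NAT, may be a whole household or more) and the enclosing prefix for IPv6. The helper name `block_key` and the /64 default are assumptions for illustration; ISPs often delegate larger prefixes such as /56 or /48 per customer, so the right width depends on the network.

```python
import ipaddress


def block_key(addr: str, v6_prefix: int = 64) -> str:
    """Return the network a block on `addr` would cover."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        # One IPv4 address; everyone NATed behind it shares this key.
        return str(ipaddress.ip_network(f"{ip}/32"))
    # For IPv6, key on the enclosing prefix rather than the single address,
    # since one host can use many addresses inside its prefix.
    return str(ipaddress.ip_network(f"{ip}/{v6_prefix}", strict=False))


print(block_key("203.0.113.5"))          # 203.0.113.5/32
print(block_key("2001:db8:1:2::abcd"))   # 2001:db8:1:2::/64
```

Two addresses in the same /64 map to the same key, while a neighbour in a different /64 is unaffected.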

      • carrylex@lemmy.world
        arrow-up
        1
        arrow-down
        3
        ·
        13 hours ago

        IP based blocking is complicated once you are big enough

        It’s literally as simple as importing an ipset into iptables and refreshing it from time to time. There are even ready-made tools for that.
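A minimal sketch of the refresh step, assuming the blocklist is already available as a list of IPv4 addresses or CIDR ranges. The function name and the set name `blocklist` are hypothetical. It only builds the text that would be piped into `ipset restore`; it does not touch the firewall itself. Filling a temporary set and then swapping it in means the live set is never half-populated during a refresh.

```python
import ipaddress


def ipset_restore_script(cidrs, set_name="blocklist"):
    """Build input for `ipset restore` that replaces `set_name` with `cidrs`."""
    tmp = f"{set_name}-tmp"
    lines = [
        f"create {set_name} hash:net -exist",  # make sure the live set exists
        f"create {tmp} hash:net -exist",
        f"flush {tmp}",                        # clear leftovers from a failed run
    ]
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr.strip(), strict=False)
        if net.version == 4:                   # hash:net defaults to IPv4 (family inet)
            lines.append(f"add {tmp} {net} -exist")
    # Swap the freshly filled set into place, then drop the temporary one.
    lines += [f"swap {tmp} {set_name}", f"destroy {tmp}"]
    return "\n".join(lines) + "\n"


print(ipset_restore_script(["192.0.2.0/24", "198.51.100.7"]))
```

The output would be fed to `ipset restore` from a cron job or timer, with a single rule such as `iptables -I INPUT -m set --match-set blocklist src -j DROP` added once to reference the set.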