**Hans-Christoph Steiner** @eighthave@librem.one · Oct 12, 2025, 11:32

Ever more websites are using #Cloudflare to block #AI scrapers. Cloudflare is still a man-in-the-middle #MITM attack on the web, but I do think people should have the ability to block the AI crap. So I now have some sympathy for using Cloudflare. What if we had real gov #eID that could be used for captchas? This requires privacy-respecting services that only see the data they need, e.g. "are you a human with an eID? yes/no". There are concerns with eIDs, but they lie in the implementations, not the core idea.

**Tuxicoman** @tuxicoman@social.jesuislibre.net · Oct 12, 2025, 11:37

@eighthave but a client-side script could automate the interaction with a real ID card, then automate scraping using this ID.

And no doubt people will rent out the use of their ID to scraping companies...

**Hans-Christoph Steiner** @eighthave@librem.one · Oct 12, 2025, 11:42

@tuxicoman sounds like a well-known problem with known solutions, for example, APIs with rate limiting, tokens, etc.

**Tuxicoman** @tuxicoman@social.jesuislibre.net · Oct 12, 2025, 11:46

@eighthave you mean throttling usage per "detected" same request origin (same id here)?
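
For illustration only, the per-ID throttling discussed in this exchange could be sketched as a token bucket keyed by an opaque identifier. This is a minimal sketch under stated assumptions, not anything either poster proposed in detail: the class name `PerIdRateLimiter`, its parameters, and the idea that the service only sees a pseudonymous per-ID token are all hypothetical.

```python
import time


class PerIdRateLimiter:
    """Token bucket per opaque ID token (hypothetical; not a real eID API)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens refilled per second
        self.burst = burst         # maximum tokens a bucket can hold
        self.clock = clock         # injectable clock, useful for testing
        self._buckets = {}         # id_token -> (tokens, last_timestamp)

    def allow(self, id_token):
        """Return True if this ID may make a request right now."""
        now = self.clock()
        tokens, last = self._buckets.get(id_token, (self.burst, now))
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[id_token] = (tokens - 1, now)
            return True
        self._buckets[id_token] = (tokens, now)
        return False
```

A limiter like this caps how much any single ID can fetch, but it does not by itself address the rented-ID concern raised earlier in the thread: many rented IDs would each get their own bucket.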

**Tuxicoman** @tuxicoman@social.jesuislibre.net · Oct 12, 2025, 11:51

@eighthave

It looks like you want to do the same as copyright holders: control usage of published content. DRM, in a sense.

There could be watermarks in your content that you could retrieve in the AI outputs, but that's fragile.

There could be legal obligations for AI companies to justify the source of their materials (we have this for food and for intellectual property) and show proof of their rights of use.

How can AI companies train on Anna's Archive, and it's fine for them... and only them?

**Tuxicoman** @tuxicoman@social.jesuislibre.net · Oct 12, 2025, 11:55

@eighthave

Also I suspect those companies can embed their customers in the process.

Gmail users give other users' content to Gmail.

Facebook users give pictures containing other people to Facebook.

Same for contact books in WhatsApp...

This is a real problem.

**Hans-Christoph Steiner** @eighthave@librem.one · Oct 12, 2025, 20:12

@tuxicoman that's not how copyright law works. Currently everything is copyrighted once it's created, and that includes emails. Just because someone forwards someone else's email does not mean that they can grant a license for a text that someone else created.
