Forum: Jacob's Hideout BBS

Venting about forums.debian.net

From Stefan Monnier@3:633/10 to All on Monday, January 19, 2026 23:50:01

Is forums.debian.net linked to Debian or is it some kind of scam using "debian.net" to confuse the user into thinking it's legitimate?

I just failed their "test for human" when trying to register, and the
only reason I tried to register was to see some replies which are hidden
behind an obnoxious "You must be a registered member and logged in to
view the replies in this topic".

That seems rather contrary to Debian's general philosophy.

- Stefan

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Jeffrey Walton@3:633/10 to All on Tuesday, January 20, 2026 00:00:01

On Mon, Jan 19, 2026 at 5:40?PM Stefan Monnier <monnier@iro.umontre
al.ca> wrote:

Is forums.debian.net linked to Debian or is it some kind of scam using "debian.net" to confuse the user into thinking it's legitimate?

I just failed their "test for human" when trying to register, and the
only reason I tried to register was to see some replies which are hidden behind an obnoxious "You must be a registered member and logged in to
view the replies in this topic".

That seems rather contrary to Debian's general philosophy.

forums.debian.net is administratively controlled by the Debian
project; see <https://wiki.debian.org/DebianNetDomains>. I'm guessing
Debian is also the technical contact for the domain and site. I can
only say "guess" because the registrar redacted both administrative
and technical contacts for the domain.

Jeff

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Stefan Monnier@3:633/10 to All on Tuesday, January 20, 2026 05:50:01

Is forums.debian.net linked to Debian or is it some kind of scam using
"debian.net" to confuse the user into thinking it's legitimate?

forums.debian.net is administratively controlled by the Debian
project; see <https://wiki.debian.org/DebianNetDomains>.

Having spent more time around it, and recovered from my frustration,
I can confirm that it seems very legitimate.

[ And I finally managed to register, after finding the fault in my
human reasoning (I stupidly hadn't noticed that there were more fields
to fill beyond the bottom of the window. I guess a bot wouldn't
have made that mistake ?). ]

- Stefan

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From tomas@3:633/10 to All on Tuesday, January 20, 2026 07:50:01

On Mon, Jan 19, 2026 at 11:40:06PM -0500, Stefan Monnier wrote:

Is forums.debian.net linked to Debian or is it some kind of scam using
"debian.net" to confuse the user into thinking it's legitimate?

forums.debian.net is administratively controlled by the Debian
project; see <https://wiki.debian.org/DebianNetDomains>.

Having spent more time around it, and recovered from my frustration,
I can confirm that it seems very legitimate.

Thank the LLM craze for that. Nearly every admin of an open site I
know has seen its site trampled to the ground by those harvesters.
Primitive accumulation, Marx's original sin of simple robbery, "had
eventually to be repeated, lest the motor of capital accumulation
suddenly die down" (Hannah Arendt, as quoted by Shoshana Zuboff.
Arendt, in turn attributes this idea to Rosa Luxemburg).
This is why we can't have nice things.
Cheers
--
tom�s

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Henrik Ahlgren@3:633/10 to All on Tuesday, January 20, 2026 10:10:01

Stefan Monnier <monnier@iro.umontreal.ca> writes:

[ And I finally managed to register, after finding the fault in my
human reasoning (I stupidly hadn't noticed that there were more fields
to fill beyond the bottom of the window. I guess a bot wouldn't
have made that mistake ?). ]

Hmm, I don't get any captcha in the registration form. Or does it occur
only after submitting the form (I did not go that far)? But the site is
behind the very popular Anubis[1] that "weights the soul of your
connection" - however I believe "soul weighting" is only possible if you
have Javascript enabled. Perhaps you get worse treatment without JS.

I fully agree that restricting freely accessible information with a registration requirement goes against the spirit of Debian. I also hate captchas with passion, but I suppose you need to compromise with LLM
robots going wild.

[1] https://github.com/TecharoHQ/anubis

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Svetlana Tkachenko@3:633/10 to All on Tuesday, January 20, 2026 11:50:01

but I suppose you need to compromise with LLM
robots going wild.

Are they not required to follow do_not_track http headers or robots.txt ? If LLM robots do not obey these instructions, they should be probably reported to their hosting provider.

S

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From tomas@3:633/10 to All on Tuesday, January 20, 2026 12:10:02

On Tue, Jan 20, 2026 at 09:47:34PM +1100, Svetlana Tkachenko wrote:

but I suppose you need to compromise with LLM
robots going wild.

Are they not required to follow do_not_track http headers or robots.txt ? If LLM robots do not obey these instructions, they should be probably reported to their hosting provider.

They don't always care. Their hosting provider isn't always in position
to care.
There's enough betting money in this pool to motivate actors to break
and/or creatively bend the rules.
Recommended reading:
https://medium.com/@kolla.gopi/the-cloudflare-perplexity-standoff-why-robots-txt-is-broken-for-the-ai-era-1b9d309bdc2b
Cheers
--
t

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Steve McIntyre@3:633/10 to All on Tuesday, January 20, 2026 13:10:01

svetlana@members.fsf.org wrote:

but I suppose you need to compromise with LLM
robots going wild.

Are they not required to follow do_not_track http headers or robots.txt ? If LLM robots do not obey these
instructions, they should be probably reported to their hosting provider.

Hahahaha. No.

The current crop of LLM morons do not care at all about following
accepted rules or norms. They just want to grab all the data, screw
everybody else. They're ignoring robots.txt, so service admins started
blocking netblocks some time ago.

Now we have the LLM morons using botnets to evade those blocks. We
have massive amounts of downloads coming from random residential IPs
all over the world, carefully spread out to make it more difficult to
block them.

These morons are why we can't have nice things.

--
Steve McIntyre, Cambridge, UK. steve@einval.com Can't keep my eyes from the circling sky,
Tongue-tied & twisted, Just an earth-bound misfit, I...

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Greg Wooledge@3:633/10 to All on Tuesday, January 20, 2026 14:20:01

On Tue, Jan 20, 2026 at 12:03:38 +0000, Steve McIntyre wrote:

svetlana@members.fsf.org wrote:

Are they not required to follow do_not_track http headers or robots.txt ? If LLM robots do not obey these
instructions, they should be probably reported to their hosting provider.

Hahahaha. No.

The current crop of LLM morons do not care at all about following
accepted rules or norms. They just want to grab all the data, screw
everybody else. They're ignoring robots.txt, so service admins started blocking netblocks some time ago.

Now we have the LLM morons using botnets to evade those blocks. We
have massive amounts of downloads coming from random residential IPs
all over the world, carefully spread out to make it more difficult to
block them.

These morons are why we can't have nice things.

I can confirm this. My own wiki was slammed really hard by this,
resulting in my having to take substantial actions to limit the
availability of some "pages".

The issue isn't even that the LLM bots are harvesting every wiki page.
If it were only that, I wouldn't mind. The first problem is wikis allow
you to request the difference between any two revision of a page. So,
let's say a page has 100 revisions. You can request the diff between
revision 11 and revision 37. Or the diff between revision 14 and
revision 69. And so on, and so on.

What happens is the bots request *every single combination* of these
diffs, each one from a random IP address, often (but not always) with
a falsified user-agent header.

I've blocked all the requests that give a robotic user-agent, but there's really nothing I can do about the ones that masquerade as Firefox or
whatever, unless I need to take it a step further and block all requests
that ask for a diff. I haven't had to do that yet. Maybe the LLM herders
have finally put *some* thought into what they're doing and reduced the stupidity level...? Dunno.

Compounding that, MoinMoin has some sort of bizarre calendar thing
that I've never used and don't really understand. But apparently
there's a potential page for every single date in a range that spans
multiple centuries. I've deleted all of those pages *multiple* times,
but spam bots got those pages into their "try to edit" caches, so they
kept coming back. Meanwhile, the LLM harvesters got those pages into
their "try to fetch" caches, so they would keep requesting them, even
though the pages didn't exist any longer.

So, another action I had to take was to block every request that tries
to hit one of those calendar pages, at the web server level, before it
could even make it to the wiki engine.

So, I've got this:

# less /etc/nginx/conf.d/badclient.conf
map $http_user_agent $badclient {
default 0;

"~BLEXBot/" 1;
"~ClaudeBot/" 1;
"~DotBot/" 1;
"~facebookexternalhit/" 1;
"~PetalBot;" 1;
"~SemrushBot" 1;
"~Thinkbot/" 1;
"~Twitterbot/" 1;

"~BadClient/" 2;
}

map $request_uri $badrequest {
default 0;
"~/MonthCalendar/" 1;
"~/MonthCalendar\?" 1;
"~/SampleUser/" 1;
"~/WikiCourse/" 1;
"~/WikiKurs/" 1;
}

And this:

# less /etc/nginx/sites-enabled/mywiki.wooledge.org
server {
listen 80;
listen 443 ssl;
server_name mywiki.wooledge.org;

if ($badclient) {
return 403;
}
if ($badrequest) {
return 403;
}
...
}

So far, these changes (combined with the brute force removal of the MonthCalendar et al. pages) have been sufficient.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Davidson@3:633/10 to All on Tuesday, January 20, 2026 21:40:01

On Tue, 20 Jan 2026, Greg Wooledge wrote:

On Tue, Jan 20, 2026 at 12:03:38 +0000, Steve McIntyre wrote:

svetlana@members.fsf.org wrote:

Are they not required to follow do_not_track http headers or robots.txt ? If LLM robots do not obey these
instructions, they should be probably reported to their hosting provider. >>

Hahahaha. No.

The current crop of LLM morons do not care at all about following
accepted rules or norms. They just want to grab all the data, screw
everybody else. They're ignoring robots.txt, so service admins started
blocking netblocks some time ago.

Now we have the LLM morons using botnets to evade those blocks. We
have massive amounts of downloads coming from random residential IPs
all over the world, carefully spread out to make it more difficult to
block them.

These morons are why we can't have nice things.

I can confirm this. My own wiki was slammed really hard by this,
resulting in my having to take substantial actions to limit the
availability of some "pages".

The issue isn't even that the LLM bots are harvesting every wiki page.
If it were only that, I wouldn't mind. The first problem is wikis allow
you to request the difference between any two revision of a page. So,
let's say a page has 100 revisions. You can request the diff between revision 11 and revision 37. Or the diff between revision 14 and
revision 69. And so on, and so on.

What happens is the bots request *every single combination* of these
diffs, each one from a random IP address, often (but not always) with
a falsified user-agent header.

I've blocked all the requests that give a robotic user-agent, but there's really nothing I can do about the ones that masquerade as Firefox or whatever, unless I need to take it a step further and block all requests
that ask for a diff. I haven't had to do that yet. Maybe the LLM herders have finally put *some* thought into what they're doing and reduced the stupidity level...? Dunno.

Compounding that, MoinMoin has some sort of bizarre calendar thing
that I've never used and don't really understand. But apparently
there's a potential page for every single date in a range that spans
multiple centuries. I've deleted all of those pages *multiple* times,
but spam bots got those pages into their "try to edit" caches, so they
kept coming back. Meanwhile, the LLM harvesters got those pages into
their "try to fetch" caches, so they would keep requesting them, even
though the pages didn't exist any longer.

So, another action I had to take was to block every request that tries
to hit one of those calendar pages, at the web server level, before it
could even make it to the wiki engine.

So, I've got this:

# less /etc/nginx/conf.d/badclient.conf
map $http_user_agent $badclient {
default 0;

"~BLEXBot/" 1;
"~ClaudeBot/" 1;
"~DotBot/" 1;
"~facebookexternalhit/" 1;
"~PetalBot;" 1;
"~SemrushBot" 1;
"~Thinkbot/" 1;
"~Twitterbot/" 1;

"~BadClient/" 2;
}

map $request_uri $badrequest {
default 0;
"~/MonthCalendar/" 1;
"~/MonthCalendar\?" 1;
"~/SampleUser/" 1;
"~/WikiCourse/" 1;
"~/WikiKurs/" 1;
}

And this:

# less /etc/nginx/sites-enabled/mywiki.wooledge.org
server {
listen 80;
listen 443 ssl;
server_name mywiki.wooledge.org;

if ($badclient) {
return 403;

[rest snipped]

Why not

402 Payment Required

instead, with instructions on how pay for the privilege of getting on
a whitelist?

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Greg Wooledge@3:633/10 to All on Tuesday, January 20, 2026 22:00:01

On Tue, Jan 20, 2026 at 15:35:10 -0500, Davidson wrote:

Why not

402 Payment Required

instead, with instructions on how pay for the privilege of getting on
a whitelist?

For one thing, I know they're never going to pay me.

For another, I'm not doing this because I hate AI or whatever. I'm
doing it for *survival*. The host that my wiki runs on *cannot*
process those thousands of useless dynamic page requests. If I were
to say "hey, pay me X dollars, and you can send as many page diff
requests as you want", I would be making a promise of services I cannot deliver.

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Stefan Monnier@3:633/10 to All on Wednesday, January 21, 2026 01:10:01

We tracked it back to bots hitting our wiki and trying to make
anonymous edits. The bots would try to make edits, and that would
spin-up that useless Wiki Editor from Wikimedia. The edit would
eventually fail (during Save) because the bot was not authenticated.
I seem to recall we were seeing between 3 and 7 edits per second from
bots.

AFAICT the only workable avenue is to limit web sites to static pages.
Anything that requires more resources from the server needs to be firmly protected behind many layers of defenses. ?

- Stefan

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Bigsy Bohr@3:633/10 to All on Wednesday, January 21, 2026 15:40:01

On 2026-01-20, Jeffrey Walton <noloader@gmail.com> wrote:

On Tue, Jan 20, 2026 at 3:53?PM Greg Wooledge <greg@wooledge.org> wrote:

On Tue, Jan 20, 2026 at 15:35:10 -0500, Davidson wrote:

Why not

402 Payment Required

instead, with instructions on how pay for the privilege of getting on
a whitelist?

For one thing, I know they're never going to pay me.

For another, I'm not doing this because I hate AI or whatever. I'm
doing it for *survival*. The host that my wiki runs on *cannot*
process those thousands of useless dynamic page requests. If I were
to say "hey, pay me X dollars, and you can send as many page diff
requests as you want", I would be making a promise of services I cannot
deliver.

++.

The Crypto++ project got kicked off of GoDaddy hosting because of
virtual CPU usage. Our CPU usage would also affect co-located sites.
That's when GoDaddy stepped in and closed us down.

We tracked it back to bots hitting our wiki and trying to make
anonymous edits. The bots would try to make edits, and that would
spin-up that useless Wiki Editor from Wikimedia. The edit would
eventually fail (during Save) because the bot was not authenticated.
I seem to recall we were seeing between 3 and 7 edits per second from
bots.

The bots were also causing boatloads of OOM kills on our VM because
the wiki editor was so heavy-weight. We had to constantly repair our
SQL database because Linux was killing the MySQL instance.

Eventually we had to move to Hostinger hosting.

How did that solve the problem (I'm ignorant)?

The thing is with these big or little but soon to big and overvalued on
the Dow Jones or whatever it is companies, you can't just call them on the phone and ask them to be reasonable. I'm not even sure they control or understand exactly what these robots are doing. I think they're already
out of control and on the loose, as it were.

Jeff

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Stefan Monnier@3:633/10 to All on Thursday, January 22, 2026 00:50:02

Stefan Monnier [2026-01-20 19:06:28] wrote:

AFAICT the only workable avenue is to limit web sites to static pages. Anything that requires more resources from the server needs to be firmly protected behind many layers of defenses. ?

Hmm... and if forums.debian.net follows the above principle and caches
the first page of every topic for static delivery, I guess the result
would be that this first page would end with something like "You must be
a registered member and logged in to view the replies in this topic". ?

- Stefan

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Svetlana Tkachenko@3:633/10 to All on Thursday, January 22, 2026 01:50:01

Hi Jeff

Jeffrey Walton wrote:

At Hostinger we got a VM with more resources and a swap file for about
the same price. And we got network protections from bots at no
charge.

Via cloudflare or some router magic? How does the network protection work?

Sveta

--- PyGate Linux v1.5.2
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

Who's Online
Recent Visitors
- Wang Bu
  Tuesday, January 27, 2026 01:45:00
  from Manila, Philippines via Telnet
- Wang Bu
  Saturday, January 24, 2026 14:15:55
  from Manila, Philippines via Telnet
- Wang Bu
  Saturday, January 24, 2026 06:56:24
  from Manila, Philippines via Telnet
- Guest
  Friday, January 09, 2026 18:03:22
  from Asdf via RLogin

System Info

Sysop:	Jacob Catayoc
Location:	Pasay City, Metro Manila, Philippines
Users:	5
Nodes:	4 (0 / 4)
Uptime:	19:03:57
Calls:	117
Calls today:	117
Files:	367
D/L today:	540 files (253M bytes)
Messages:	70,845
Posted today:	26

Venting about forums.debian.net

Who's Online

Recent Visitors

System Info