Forum: Jacob's Hideout BBS

groff: how to completely disable hyphenation?

From Kalevi Kolttonen@3:633/10 to All on Thursday, March 26, 2026 14:08:23

Hello!

I am very sorry for posting this off-topic question,
but there seems to be no active groff-related group.

My question is simple: How do I completely disable
groff's hyphenation? I am writing a document in Finnish
using this command:

groff -me -Kutf8 -Tpdf proto > proto.pdf

Everything works fine, but the hyphenation is making
mistakes and I want it off. ChatGPT told me to insert
these at the top of the document:

.nh
.hy 0

It seems to work for a while, but then hyphenation
is suddenly active again! Can anyone here help me?

br,
KK

--- PyGate Linux v1.5.13
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Lew Pitcher@3:633/10 to All on Thursday, March 26, 2026 14:38:11

On Thu, 26 Mar 2026 14:08:23 +0000, Kalevi Kolttonen wrote:

Hello!

I am very sorry for posting this off-topic question,
but there seems to be no active groff-related group.

My question is simple: How do I completely disable
groff's hyphenation? I am writing a document in Finnish
using this command:

groff -me -Kutf8 -Tpdf proto > proto.pdf

Everything works fine, but the hyphenation is making
mistakes and I want it off. ChatGPT told me

Remember, ChatGPT is a text-prediction program, and has no
experience or intrinsic knowledge of groff. If you use it,
you have to audit its advice, which usually means that you
have to have experience or intrinsic knowledge of the subject
matter.

to insert these at the top of the document:

.nh
.hy 0

According to the Nroff/Troff User's Manual by Ossanna & Kernighan,
both of these requests do the same thing: turn off hyphenation.
The .nh request explicitly turns off hyphenation, while .hy 0
selects a hyphenation mode, with "0" representing "OFF".

Using both seems to me to be overkill; you only need one.

It seems to work for a while, but then hyphenation
is suddenly active again! Can anyone here help me?

With hyphenation explicitly turned off, it's likely that your
document uses a macro that turns it back on, either explicitly,
or as a side effect.

I don't often use groff, so I can't tell you which macros might
do that. Take a look at your document, comparing the pdf with
the input, to see /where/ in the input the hyphenation turns back
on. That way, you can see what macros/commands/etc that you've used
before that point that might turn hyphenation back on.

br,
KK

Sorry I couldn't be of more help
--
Lew Pitcher
"In Skills We Trust"
Not LLM output - I'm just like this.

From Lew Pitcher@3:633/10 to All on Thursday, March 26, 2026 14:45:25

On Thu, 26 Mar 2026 14:38:11 +0000, Lew Pitcher wrote:

Sorry I couldn't be of more help

This passage from the Nroff/Troff user's manual might help you
identify the element that re-enables hyphenation:
"Automatic hyphenation may be switched off and on. When switched on
with hy, several variants may be set. A hyphenation indicator
character may be embedded in a word to specify desired hyphenation
points, or may be prepended to suppress hyphenation. In addition,
the user may specify a small list of exception words.
Only words that consist of a central alphabetic string surrounded
by (usually null) non-alphabetic strings are candidates for automatic
hyphenation. Words that contain hyphens (minus), em-dashes (\(em),
or hyphenation indicator characters are always subject to splitting
after those characters, whether automatic hyphenation is on or off.

.nh hyphenate - E
Automatic hyphenation is turned off.

.hy N on, N = 1 on, N = 1 E
Automatic hyphenation is turned on for N >= 1, or off for N = 0.
If N = 2, last lines (ones that will cause a trap) are not hyphenated.
For N = 4 and 8, the last and first two characters respectively of a
word are not split off. These values are additive; i.e., N = 14 will
invoke all three restrictions.

.hc c \% \% E
Hyphenation indicator character is set to c or to the default \%.
The indicator does not appear in the output.

.hw word ... ignored -
Specify hyphenation points in words with embedded minus signs.
Versions of a word with terminal s are implied; i.e., dig-it implies
dig-its. This list is examined initially and after each suffix stripping.
The space available is small - about 128 characters."

HTH
--
Lew Pitcher
"In Skills We Trust"
Not LLM output - I'm just like this.

From Kalevi Kolttonen@3:633/10 to All on Thursday, March 26, 2026 14:59:27

Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:

Sorry I couldn't be of more help

Thanks, it did help somewhat. I just went through the document.
The culprit was a missing terminating command. I had this:

.(l
\(bu \fIThe Flying Saucers Are Real\fP (1950)
.br
\(bu \fIFlying Saucers from Outer Space\fP (1953)
.br
\(bu \fIThe Flying Saucer Conspiracy\fP (1955)
.br
\(bu \fIFlying Saucers: Top Secret\fP (1960)
.br
\(bu \fIAliens from Space: The Real Story of Unidentified Flying Objects\fP (1973)

And it was missing the terminating:

.)l

Adding .)l fixed the problem.
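
For anyone chasing the same kind of problem, here is a rough sketch of a checker for unclosed -me blocks. It assumes the paired macros always start a line and follow the ".(x" / ".)x" naming pattern with a single letter; the function name find_unclosed_blocks is made up for illustration and is not part of groff.

```python
import re

# Matches -me style paired macros at the start of a line:
# ".(l" opens a block, ".)l" closes it (assumed single-letter names).
PAIR_RE = re.compile(r"^[.']\s*([()])([a-z])\b")


def find_unclosed_blocks(lines):
    """Return sorted (line_number, message) pairs for unbalanced block macros."""
    open_blocks = []  # stack of (line_number, letter) for blocks still open
    problems = []
    for number, line in enumerate(lines, start=1):
        match = PAIR_RE.match(line)
        if not match:
            continue
        kind, letter = match.groups()
        if kind == "(":
            open_blocks.append((number, letter))
        elif open_blocks and open_blocks[-1][1] == letter:
            open_blocks.pop()  # closes the most recently opened block
        else:
            problems.append((number, ".)" + letter + " without matching .(" + letter))
    # Anything left on the stack was opened but never closed.
    problems.extend(
        (number, ".(" + letter + " never closed") for number, letter in open_blocks
    )
    return sorted(problems)


sample = [".(l", "\\(bu first item", ".br", "\\(bu second item", "Body text."]
for number, message in find_unclosed_blocks(sample):
    print(f"line {number}: {message}")
```

To run it over a real file, something like find_unclosed_blocks(open("proto").read().splitlines()) should do. It only reports where a block was opened, not where the closing macro belongs.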

br,
KK

From boltar@3:633/10 to All on Thursday, March 26, 2026 15:07:26

On Thu, 26 Mar 2026 14:38:11 -0000 (UTC)
Lew Pitcher <lew.pitcher@digitalfreehold.ca> gabbled:

On Thu, 26 Mar 2026 14:08:23 +0000, Kalevi Kolttonen wrote:

Hello!

I am very sorry for posting this off-topic question,
but there seems to be no active groff-related group.

My question is simple: How do I completely disable
groff's hyphenation? I am writing a document in Finnish
using this command:

groff -me -Kutf8 -Tpdf proto > proto.pdf

Everything works fine, but the hyphenation is making
mistakes and I want it. ChatGPT told me

Remember, ChatGPT is a text-prediction program, and has no

It's a lot more complicated than that. If you want to see the output of
a text-prediction program, download some Markov chain code; it'll just
output (mostly) grammatically correct junk.

From Lawrence D'Oliveiro@3:633/10 to All on Thursday, March 26, 2026 19:48:46

On Thu, 26 Mar 2026 14:38:11 -0000 (UTC), Lew Pitcher wrote:

I don't often use groff ...

Maybe not directly, but remember it is part of the toolchain every
time you run the man(1) command on a GNU-based system.

From Popping Mad@3:633/10 to All on Saturday, May 30, 2026 14:30:16

On 3/26/26 10:38 AM, Lew Pitcher wrote:

Remember, ChatGPT is a text-prediction program, and has no
experience or intrinsic knowledge of groff.

it also has no intrinsic logic

From boltar@3:633/10 to All on Sunday, May 31, 2026 07:25:33

On Sat, 30 May 2026 14:30:16 -0400
Popping Mad <rainbow@colition.gov> gabbled:

On 3/26/26 10:38 AM, Lew Pitcher wrote:

Remember, ChatGPT is a text-prediction program, and has no
experience or intrinsic knowledge of groff.

it also has no intrinsic logic

It does in the sense of the low-level ANN programming that allows it to
function, but it builds up its own logic and knowledge as it's trained.
People are too dismissive of these systems - they're a lot smarter than
a lot of people, particularly in IT, would like to believe.

From Lawrence D'Oliveiro@3:633/10 to All on Monday, June 01, 2026 00:11:23

On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

People are too dismissive of these systems - they're a lot smarter
than a lot of people particularly in IT would like to believe.

We look at those who get so attached to using these systems that they
end up overlooking some quite glaring shortcomings.

From boltar@3:633/10 to All on Monday, June 01, 2026 08:17:22

On Mon, 1 Jun 2026 00:11:23 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

People are too dismissive of these systems - they're a lot smarter
than a lot of people particularly in IT would like to believe.

We look at those who get so attached to using these systems that they
end up overlooking some quite glaring shortcomings.

Yes, they certainly make some mistakes. But they're not simply the
turbo-charged Markov chains that some people seem to think; there is
some kind of thinking going on.

From Lawrence D'Oliveiro@3:633/10 to All on Tuesday, June 02, 2026 00:09:50

On Mon, 1 Jun 2026 08:17:22 -0000 (UTC), boltar wrote:

On Mon, 1 Jun 2026 00:11:23 -0000 (UTC), Lawrence D'Oliveiro wrote:

On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

People are too dismissive of these systems - they're a lot smarter
than a lot of people particularly in IT would like to believe.

We look at those who get so attached to using these systems that
they end up overlooking some quite glaring shortcomings.

Yes, they certainly make some mistakes. But they're not simply turbo
charged markov chains that some people seem to think, there is some
kind of thinking going on.

I rest my case.

From boltar@3:633/10 to All on Tuesday, June 02, 2026 08:21:03

On Tue, 2 Jun 2026 00:09:50 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Mon, 1 Jun 2026 08:17:22 -0000 (UTC), boltar wrote:

On Mon, 1 Jun 2026 00:11:23 -0000 (UTC), Lawrence D'Oliveiro wrote:

On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

People are too dismissive of these systems - they're a lot smarter
than a lot of people particularly in IT would like to believe.

We look at those who get so attached to using these systems that
they end up overlooking some quite glaring shortcomings.

Yes, they certainly make some mistakes. But they're not simply turbo
charged markov chains that some people seem to think, there is some
kind of thinking going on.

I rest my case.

You didn't have a case. Thinking doesn't mean consciousness, it simply means
the application of logic in an intelligent way.

From Paul@3:633/10 to All on Tuesday, June 02, 2026 05:58:35

On Tue, 6/2/2026 4:21 AM, boltar@caprica.universe wrote:

On Tue, 2 Jun 2026 00:09:50 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:

On Mon, 1 Jun 2026 08:17:22 -0000 (UTC), boltar wrote:

On Mon, 1 Jun 2026 00:11:23 -0000 (UTC), Lawrence D'Oliveiro wrote:

On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

People are too dismissive of these systems - they're a lot smarter
than a lot of people particularly in IT would like to believe.

We look at those who get so attached to using these systems that
they end up overlooking some quite glaring shortcomings.

Yes, they certainly make some mistakes. But they're not simply turbo
charged markov chains that some people seem to think, there is some
kind of thinking going on.

I rest my case.

You didn't have a case. Thinking doesn't mean consciousness, it simply means
the application of logic in an intelligent way.

This is the first exposure I had to "automatons", in high school.
One of my fellow students did a port of ELIZA to the language
our terminal used. He converted the program to APL (Iverson/IBM).
(Today, that person has a PhD in Artificial Intelligence, and
presumably has retired. His hobby activity did give him
something to do.)

https://en.wikipedia.org/wiki/ELIZA

"ELIZA is an early natural language processing computer program developed
from 1964 to 1967 at MIT by Joseph Weizenbaum.

Weizenbaum intended the program as a method to explore communication between
humans and machines.

He was surprised that some people, including his secretary, attributed human-like feelings <===
to the computer program, a phenomenon that came to be called the ELIZA effect."

Even back then, there were a few people who could not separate truth from
fiction when examining the output. The output wasn't exactly compelling.
While testing that port, we spent most of our time laughing at the corny
output.

Having that experience back then makes it easier to have some perspective
when viewing how this version works. Who would have guessed this would happen?

Paul

From boltar@3:633/10 to All on Tuesday, June 02, 2026 10:38:10

On Tue, 2 Jun 2026 05:58:35 -0400
Paul <nospam@needed.invalid> gabbled:

Even back then, there were a few people who could not separate truth from fiction,

Sadly there always were and always will be far too many gullible idiots
like that in any population. You only have to look at current politics
and conspiracy theories to realise.

From Kaz Kylheku@3:633/10 to All on Monday, June 08, 2026 23:34:13

On 2026-05-31, boltar@caprica.universe <boltar@caprica.universe> wrote:

On Sat, 30 May 2026 14:30:16 -0400
Popping Mad <rainbow@colition.gov> gabbled:

On 3/26/26 10:38 AM, Lew Pitcher wrote:

Remember, ChatGPT is a text-prediction program, and has no
experience or intrinsic knowledge of groff.

it also has no intrinsic logic

It does in the sense of the low-level ANN programming that allows it to
function, but it builds up its own logic and knowledge as it's trained.
People are too dismissive of these systems - they're a lot smarter than
a lot of people, particularly in IT, would like to believe.

They are a lot dumber than people believe.

A simple hash table token predictor trained on a large body of text,
using 4-gram keys (strings of four words) to predict the fifth word,
will produce some amazing outputs.

It's because you have a clear grasp of the scope of the data, and
the approach in the N-gram hash table program, that you know
there is no thought behind the clever-looking piece of output,
no poet or sage.
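
A minimal sketch of that kind of predictor, assuming whitespace-separated words as tokens and a plain dict as the hash table. The names train and generate are illustrative only and not taken from any particular program.

```python
import random
from collections import defaultdict


def train(words, n=4):
    """Map each run of n consecutive words to the words seen right after it."""
    table = defaultdict(list)
    for i in range(len(words) - n):
        key = tuple(words[i:i + n])
        table[key].append(words[i + n])
    return table


def generate(table, seed, length, rng=random):
    """Extend seed (a tuple of n words) by repeatedly predicting the next word."""
    output = list(seed)
    n = len(seed)
    for _ in range(length):
        followers = table.get(tuple(output[-n:]))
        if not followers:
            break  # this context never appeared in the training text
        # Picking from the raw list weights each follower by how often it occurred.
        output.append(rng.choice(followers))
    return output


corpus = "a b c d e a b c d f".split()
table = train(corpus, n=4)
print(" ".join(generate(table, ("b", "c", "d", "e"), 3)))
```

The whole "model" is the lookup table, so with a toy corpus it is easy to see that every output word is copied from a position in the training text that followed the same four-word context.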

An algorithm predicting using much more sophisticated linear algebra,
over a data set of terabytes upon terabytes of text, encompasses a scale
that is so completely alien to your experience and scope, that your
intuitions about it have no hope of being correct.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

From boltar@3:633/10 to All on Tuesday, June 09, 2026 08:46:00

On Mon, 8 Jun 2026 23:34:13 -0000 (UTC)
Kaz Kylheku <046-301-5902@kylheku.com> gabbled:

On 2026-05-31, boltar@caprica.universe <boltar@caprica.universe> wrote:

On Sat, 30 May 2026 14:30:16 -0400
Popping Mad <rainbow@colition.gov> gabbled:

On 3/26/26 10:38 AM, Lew Pitcher wrote:

Remember, ChatGPT is a text-prediction program, and has no
experience or intrinsic knowledge of groff.

it also has no intrinsic logic

It does in the sense of the low-level ANN programming that allows it to
function, but it builds up its own logic and knowledge as it's trained.
People are too dismissive of these systems - they're a lot smarter than
a lot of people, particularly in IT, would like to believe.

They are a lot dumber than people believe.

A simple hash table token predictor trained on a large body of text
using 4-grams keys (strings of four words) to predict the fifth word
will produce some amazing outputs.

Oh please. I've written at least 4 Markov chain programs in my career and
they don't get anywhere close to what an LLM can output. For a start, LLMs
don't simply spit out chunks of text they've ingested joined together by
keywords; they have some kind of - limited - understanding of the subject
in hand or they wouldn't be able to extrapolate and interpolate. If you
don't believe me, try it. Then there's the part no one mentions - the
ability to understand the input text in a meaningful way. Something no
Markov chain can do.

An algorithm predicting using much more sophisticated linear algebra,
over a data set of terabytes upon terabytes of text, encompasses a scale
that is so completely alien to your experience and scope, that your
intuitions about it have no hope of being correct.

Don't patronise me.

From Kaz Kylheku@3:633/10 to All on Tuesday, June 09, 2026 21:15:50

On 2026-06-09, boltar@caprica.universe <boltar@caprica.universe> wrote:

On Mon, 8 Jun 2026 23:34:13 -0000 (UTC)
Kaz Kylheku <046-301-5902@kylheku.com> gabbled:

On 2026-05-31, boltar@caprica.universe <boltar@caprica.universe> wrote:

On Sat, 30 May 2026 14:30:16 -0400
Popping Mad <rainbow@colition.gov> gabbled:

On 3/26/26 10:38 AM, Lew Pitcher wrote:

Remember, ChatGPT is a text-prediction program, and has no
experience or intrinsic knowledge of groff.

it also has no intrinsic logic

It does in the sense of the low-level ANN programming that allows it to
function, but it builds up its own logic and knowledge as it's trained.
People are too dismissive of these systems - they're a lot smarter than
a lot of people, particularly in IT, would like to believe.

They are a lot dumber than people believe.

A simple hash table token predictor trained on a large body of text
using 4-grams keys (strings of four words) to predict the fifth word
will produce some amazing outputs.

Oh please. I've written at least 4 markov chain programs in my career and they don't get anywhere close to what an LLM can output.

Right; nobody is saying that. My point is that these simple programs
have already produced some outputs that have made people go "wow!".

The point is that people going "wow!" is not a reliable yardstick of anything.

don't simply spit out chunks of text they've ingested joined together by keywords, they have some kind of - limited - understanding of the subject in hand or they wouldn't be able to extrapolate and interpolate.

These things purely interpolate. They do it at such a scale of data that
you can't tell interpolation from extrapolation. Except maybe if it
happens to land in your area of extensive expertise.

If you don't believe me, try it. Then there's the part no one mentions -
the ability to understand the input text in a meaningful way. Something
no Markov chain can do.

The meaning is entirely bundled in the training data.

You can extract meaning from a book. Yet a book doesn't think.

You don't feel that it can think because you are the one doing the
search for meaning, and the texts you are finding were obviously written
by the author, so you project the thinking ability onto that author.

The LLM interposes itself as a middle man broker of information in such
a way that it appears to be doing the thinking, but all the thinking
already went into the training material.

An algorithm predicting using much more sophisticated linear algebra,
over a data set of terabytes upon terabytes of text, encompasses a scale
that is so completely alien to your experience and scope, that your
intuitions about it have no hope of being correct.

Don't patronise me.

By "your", I mean everyone, you and me. I should say "we". Our intuitions
are completely in uncharted waters when faced with a contextual
path-finding engine that wades through terabytes of compressed text,
with a high degree of statistical accuracy.

I like to think of it as a transformation that lets me have a
quasi-conversation with the training data. I.e. the training data
perhaps holds some scattered pieces of info I'd like to retrieve,
and the LLM provides a conversational query system to get at it.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

From boltar@3:633/10 to All on Wednesday, June 10, 2026 08:19:51

On Tue, 9 Jun 2026 21:15:50 -0000 (UTC)
Kaz Kylheku <046-301-5902@kylheku.com> gabbled:

On 2026-06-09, boltar@caprica.universe <boltar@caprica.universe> wrote:

Oh please. I've written at least 4 markov chain programs in my career and
they don't get anywhere close to what an LLM can output.

Right; nobody is saying that. My point is that these simple programs
have already produced some outputs that have made people go "wow!".

For a few seconds, until they read all of the output and realise it's
either grammatically correct but incoherent gibberish or simply a dump
of the input text, depending on key length.

The point is that people going "wow!" is not a reliable yardstick of anything.

No, but still going wow after years of use is.

don't simply spit out chunks of text they've ingested joined together by
keywords, they have some kind of - limited - understanding of the subject in
hand or they wouldn't be able to extrapolate and interpolate.

These things purely interpolate. They do it at such a scale of data that
you can't tell interpolation from extrapolation. Except maybe if it
happens to land in your area of extensive expertise.

Like I said, why don't you try asking it to extrapolate on data that it
couldn't possibly have ingested because it's personal to you or uses long
floating point values.

If you don't believe me, try it. Then there's the part no one mentions -
the ability to understand the input text in a meaningful way. Something
no Markov chain can do.

The meaning is entirely bundled in the training data.

Yes, and?

You can extract meaning from a book. Yet a book doesn't think.

A book doesn't do anything; it's memory, not compute. I'd have thought
that distinction was obvious.

The LLM interposes itself as a middle man broker of information in such
a way that it appears to be doing the thinking, but all the thinking
already went into the training material.

Probably the same could be said for you when you were at school.

By your, I mean everyone, you and me. I should say "we". Our intuitions
are completely in uncharted waters when faced with a contextual
path-finding engine that wades through terabytes of compressed text,
with a high degree of statistical accuracy.

I suggest you familiarise yourself with John Searle's Chinese Room. It
doesn't matter how it works internally, it's the output that matters.

From Paul@3:633/10 to All on Wednesday, June 10, 2026 09:14:39

On Wed, 6/10/2026 4:19 AM, boltar@caprica.universe wrote:

I suggest you familiarise yourself with John Searle's Chinese Room. It
doesn't matter how it works internally, it's the output that matters.

It's because it is a black box, that we cannot use it.

Half an answer, is no answer at all.

It only seems to be able to access the statistical nature of
the answers, if set to "high reasoning". Which requires multiple
runs, and the LLM-AI has been given some capability to compare
the answers after the run for some sort of convergence property.

For the customer, this will burn up 10x as many tokens.

If it was not for this capability, the "answer" to the LLM-AI
is a "black box" to it as well. It does not know where the
answer is coming from. It can produce URLs it claims are "cites".
But not all the URLs are authoritative. For example, today,
I was given an answer and one of the cites, was AI slop from
some web site. Absolutely nothing in the cited web page was
an attempt at an authoritative answer. I've even had cited
web pages where the keywords of the question do not appear.

Why would I pay money for this ?

*******

You can run queries on a home machine.

When I first ran this query, it was on a data center machine.

"What are your capabilities ?"

At the time, that caused the poor thing to have a nervous
breakdown (it's an unbounded question). The safety timer
went off after 15 seconds, and the answer was *erased* from
the screen, so I cannot give sample text showing how unhinged
it was.

The following picture shows the same question run on a home machine,
with a later model. The answer has a certain canned quality to it,
so maybe the safety timer event was used in preparing newer models.
Perhaps the prompt has instructions on what to do with the
unbounded questions.

[Picture] Run-Queries-With-Network-Cable-Disconnected.gif

https://postimg.cc/k20gzLYy

https://imgur.com/a/It9jbsL

I can't be sure, but it appears to be telling me that I can't
have more than 4096 output tokens. The computer this was run
on, does not have a very good video card. I don't think
the video card could be used, as it is too small (GTX1050
with 4GB of VRAM).

Paul

From boltar@3:633/10 to All on Wednesday, June 10, 2026 15:01:24

On Wed, 10 Jun 2026 09:14:39 -0400
Paul <nospam@needed.invalid> gabbled:

On Wed, 6/10/2026 4:19 AM, boltar@caprica.universe wrote:

I suggest you familiarise yourself with John Searle's Chinese Room. It
doesn't matter how it works internally, it's the output that matters.

It's because it is a black box, that we cannot use it.

Huh? No one is asking it to fly a passenger plane.

Half an answer, is no answer at all.

Except most of the time it gives a full answer. ChatGPT has saved me a
huge amount of time because I haven't had to wade through the rubbish and
arguments on Stack Overflow to find how to do something. It usually
provides something useful even if it needs tweaking.

some web site. Absolutely nothing in the cited web page was
an attempt at an authoritative answer. I've even had the cite
web pages, where the keywords of the question do not appear.

Why would I pay money for this ?

You don't have to, it's free.

When I first ran this query, it was on a data center machine.

"What are your capabilities ?"

At the time, that caused the poor thing to have a nervous
breakdown (it's an unbounded question). The safety timer
went off after 15 seconds, and the answer was *erased* from
the screen, so I cannot give sample text showing how unhinged
it was.

I asked ChatGPT the same and it produced a sensible response. Perhaps
expecting it to work on a home machine is asking a lot.

[Picture] Run-Queries-With-Network-Cable-Disconnected.gif

https://postimg.cc/k20gzLYy

Didn't work.

https://imgur.com/a/It9jbsL

Not available in my region.

From Kaz Kylheku@3:633/10 to All on Tuesday, June 23, 2026 22:56:54

On 2026-06-10, boltar@caprica.universe <boltar@caprica.universe> wrote:

On Tue, 9 Jun 2026 21:15:50 -0000 (UTC)
Kaz Kylheku <046-301-5902@kylheku.com> gabbled:

These things purely interpolate. They do it at such a scale of data that
you can't tell interpolation from extrapolation. Except maybe if it
happens to land in your area of extensive expertise.

Like I said, why don't you try asking it to extrapolate on data that it
couldn't possibly have ingested because it's personal to you or uses long
floating point values.

People also believe that psychics know something about them personally,
which then "proves" they indeed have telepathic powers.

Same kind of con job.

You can extract meaning from a book. Yet a book doesn't think.

A book doesn't do anything; it's memory, not compute. I'd have thought
that distinction was obvious.

No it isn't; the weights of a trained LLM are also memory, not compute.

Getting something out of a book or LLM is a function.

In the case of the LLM, the retrieval function is sophisticated. It
allows you to have a pseudo-conversation with a vast body of texts to
extract a wide range of information. (And that is undeniably useful,
at least in situations in which you have a way to check the veracity of
the output.)

By "your", I mean everyone, you and me. I should say "we". Our intuitions
are completely in uncharted waters when faced with a contextual
path-finding engine that wades through terabytes of compressed text,
with a high degree of statistical accuracy.

I suggest you familiarise yourself with John Searle's Chinese Room. It
doesn't matter how it works internally, it's the output that matters.

I'm addressing the wrong ideas that the thing is thinking, or that
it has access to information outside of its training data; I am not
claiming that the output doesn't matter or isn't useful.

Those wrong ideas are not supported by only-the-output-matters
argumentation, which amounts to nothing more than a feeble attempt
to shut down the discussion.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

--- PyGate Linux v1.5.17
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From Kalevi Kolttonen@3:633/10 to All on Tuesday, June 23, 2026 23:37:32

Kaz Kylheku <046-301-5902@kylheku.com> wrote:

I'm addressing the wrong ideas that the thing is thinking, or that
it has access to information outside of its training data; I am not
claiming that the output doesn't matter or isn't useful.

ChatGPT regularly performs web searches when it needs
to do so. So it definitely has access to information outside
of its training data.

br,
KK

--- PyGate Linux v1.5.17
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

From boltar@3:633/10 to All on Wednesday, June 24, 2026 07:13:10

On Tue, 23 Jun 2026 22:56:54 -0000 (UTC)
Kaz Kylheku <046-301-5902@kylheku.com> gabbled:

On 2026-06-10, boltar@caprica.universe <boltar@caprica.universe> wrote:

Like I said, why don't you try asking it to extrapolate on data that it
couldn't possibly have ingested because it's personal to you or uses long
floating point values.

People also believe that psychics know something about them personally,
which then "proves" they indeed have telepathic powers.

A stupid analogy.

Same kind of con job.

Except the difference is I can ask ChatGPT to write some code and it will,
and if it's not too long it'll work. I don't care how it does it, but it
does it, and that's good enough for me.

A book doesn't do anything, it's memory, not compute. I'd have thought that
distinction was obvious.

No it isn't; the weights of a trained LLM are also memory, not compute.

Do you understand how neural nets work? The one inside your head seems to
be having trouble with the concept.

I suggest you familiarise yourself with John Searle's Chinese Room. It
doesn't matter how it works internally, it's the output that matters.

I'm addressing the wrong ideas that the thing is thinking, or that
it has access to information outside of its training data; I am not
claiming that the output doesn't matter or isn't useful.

If it's not thinking or reasoning, what would you call it? Given you seem
to think it's a sophisticated Markov chain - proving you understand neither
Markov chains nor LLMs - I don't think I'll give much weight to your opinion.

Those wrong ideas are not supported by only-the-output-matters
argumentation, which amounts to nothing more than a feeble attempt
to shut down the discussion.

Whatever. LLMs are not just dumb statistical output machines.

--- PyGate Linux v1.5.18
* Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)

