• groff: how to completely disable hyphenation?

    From Kalevi Kolttonen@3:633/10 to All on Thursday, March 26, 2026 14:08:23
    Hello!

    I am very sorry for posting this off-topic question,
    but there seems to be no active groff-related group.

    My question is simple: How do I completely disable
    groff's hyphenation? I am writing a document in Finnish
    using this command:

    groff -me -Kutf8 -Tpdf proto > proto.pdf

    Everything works fine, but the hyphenation is making
    mistakes and I want it off. ChatGPT told me to insert
    these at the top of the document:

    .nh
    .hy 0

    It seems to work for a while, but then hyphenation
    is suddenly active again! Can anyone here help me?

    br,
    KK

    --- PyGate Linux v1.5.13
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Lew Pitcher@3:633/10 to All on Thursday, March 26, 2026 14:38:11
    On Thu, 26 Mar 2026 14:08:23 +0000, Kalevi Kolttonen wrote:

    Hello!

    I am very sorry for posting this off-topic question,
    but there seems to be no active groff-related group.

    My question is simple: How do I completely disable
    groff's hyphenation? I am writing a document in Finnish
    using this command:

    groff -me -Kutf8 -Tpdf proto > proto.pdf

    Everything works fine, but the hyphenation is making
    mistakes and I want it. ChatGPT told me

    Remember, ChatGPT is a text-prediction program, and has no
    experience or intrinsic knowledge of groff. If you use it,
    you have to audit its advice, which usually means that you
    have to have experience or intrinsic knowledge of the subject
    matter.

    to insert these at the top of the document:

    .nh
    .hy 0

    According to the Nroff/Troff User's Manual by Ossanna & Kernighan,
    both of these requests do the same thing: they turn off hyphenation.
    The .nh request turns hyphenation off explicitly, while .hy 0
    selects a hyphenation mode, with "0" representing "OFF".

    Using both seems to me to be overkill; you only need one.

    It seems to work for a while, but then hyphenation
    is suddenly active again! Can anyone here help me?

    With hyphenation explicitly turned off, it's likely that your
    document uses a macro that turns it back on, either explicitly,
    or as a side effect.

    I don't often use groff, so I can't tell you which macros might
    do that. Take a look at your document, comparing the pdf with
    the input, to see /where/ in the input the hyphenation turns back
    on. That way, you can see which macros, commands, etc. you used
    before that point that might turn hyphenation back on.

    br,
    KK

    Sorry I couldn't be of more help
    --
    Lew Pitcher
    "In Skills We Trust"
    Not LLM output - I'm just like this.

  • From Lew Pitcher@3:633/10 to All on Thursday, March 26, 2026 14:45:25
    On Thu, 26 Mar 2026 14:38:11 +0000, Lew Pitcher wrote:

    On Thu, 26 Mar 2026 14:08:23 +0000, Kalevi Kolttonen wrote:

    Hello!

    I am very sorry for posting this off-topic question,
    but there seems to be no active groff-related group.

    My question is simple: How do I completely disable
    groff's hyphenation? I am writing a document in Finnish
    using this command:

    groff -me -Kutf8 -Tpdf proto > proto.pdf

    Everything works fine, but the hyphenation is making
    mistakes and I want it. ChatGPT told me

    Remember, ChatGPT is a text-prediction program, and has no
    experience or intrinsic knowledge of groff. If you use it,
    you have to audit its advice, which usually means that you
    have to have experience or intrinsic knowledge of the subject
    matter.

    to insert these at the top of the document:

    .nh
    .hy 0

    According to the Nroff/Troff User's Manual by Ossanna & Kernighan,
    both of these macros do the same thing; turn off hyphenation (the
    .nh
    macro explicitly turns off hyphenation, while the
    .hy 0
    macro selects a hyphenation mode, with "0" representing "OFF").

    Using both seems to me to be overkill; you only need one.

    It seems to work for a while, but then hyphenation
    is suddenly active again! Can anyone here help me?

    With hyphenation explicitly turned off, it's likely that your
    document uses a macro that turns it back on, either explicitly,
    or as a side effect.

    I don't often use groff, so I can't tell you which macros might
    do that. Take a look at your document, comparing the pdf with
    the input, to see /where/ in the input the hyphenation turns back
    on. That way, you can see what macros/commands/etc that you've used
    before that point that might turn hyphenation back on.

    br,
    KK

    Sorry I couldn't be of more help

    This passage from the Nroff/Troff user's manual might help you
    identify the element that re-enables hyphenation:
    "Automatic hyphenation may be switched off and on. When switched on
    with hy, several variants may be set. A hyphenation indicator
    character may be embedded in a word to specify desired hyphenation
    points, or may be prepended to suppress hyphenation. In addition,
    the user may specify a small list of exception words.
    Only words that consist of a central alphabetic string surrounded
    by (usually null) non-alphabetic strings are candidates for automatic
    hyphenation. Words that contain hyphens (minus), em-dashes (\(em),
    or hyphenation indicator characters are always subject to splitting
    after those characters, whether automatic hyphenation is on or off.

    .nh hyphenate - E
    Automatic hyphenation is turned off.

    .hy N on, N = 1 on, N = 1 E
    Automatic hyphenation is turned on for N ≥ 1, or off for N = 0.
    If N = 2, last lines (ones that will cause a trap) are not hyphenated.
    For N = 4 and 8, the last and first two characters respectively of a
    word are not split off. These values are additive; i.e., N = 14 will
    invoke all three restrictions.

    .hc c \% \% E
    Hyphenation indicator character is set to c or to the default \%.
    The indicator does not appear in the output.

    .hw word ... ignored -
    Specify hyphenation points in words with embedded minus signs.
    Versions of a word with terminal s are implied; i.e., dig-it implies
    dig-its. This list is examined initially and after each suffix stripping.
    The space available is small - about 128 characters."
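
The additive values in the quoted passage behave like bit flags. As a rough illustration only (a Python sketch, not part of troff or groff; the names HY_FLAGS and describe_hy are made up here, and the flag meanings are taken solely from the passage quoted above, so groff's own documentation should be checked before relying on them), a value of N could be decoded like this:

```python
# Restriction flags for ".hy N" as listed in the quoted manual passage.
# Assumption: these meanings come only from that passage; groff may differ.
HY_FLAGS = {
    2: "do not hyphenate last lines (ones that will cause a trap)",
    4: "do not split off the last two characters of a word",
    8: "do not split off the first two characters of a word",
}


def describe_hy(n):
    """Describe what ".hy n" asks for, according to the quoted passage."""
    if n == 0:
        return ["automatic hyphenation off"]
    notes = ["automatic hyphenation on"]
    # Each restriction applies when its bit is set in n.
    notes += [text for bit, text in HY_FLAGS.items() if n & bit]
    return notes


for value in (0, 1, 14):
    print(value, describe_hy(value))
```

With N = 14 (8 + 4 + 2) all three restrictions are listed, which matches the manual's remark that the values are additive.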


    HTH
    --
    Lew Pitcher
    "In Skills We Trust"
    Not LLM output - I'm just like this.

  • From Kalevi Kolttonen@3:633/10 to All on Thursday, March 26, 2026 14:59:27
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    Sorry I couldn't be of more help

    Thanks, it did help somewhat. I just went through the document.
    The culprit was a missing terminating command. I had this:

    .(l
    \(bu \fIThe Flying Saucers Are Real\fP (1950)
    .br
    \(bu \fIFlying Saucers from Outer Space\fP (1953)
    .br
    \(bu \fIThe Flying Saucer Conspiracy\fP (1955)
    .br
    \(bu \fIFlying Saucers: Top Secret\fP (1960)
    .br
    \(bu \fIAliens from Space: The Real Story of Unidentified Flying Objects\fP (1973)


    And it was missing the terminating:

    .)l

    Adding .)l fixed the problem.
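
A missing closing request like this is easy to overlook in a long file. The following is a minimal sketch, in Python, of a checker that pairs up -me style opening and closing requests such as .(l and .)l and reports any that are left unmatched. It assumes that every request of the form .(x must be closed by .)x with the same letter; the function name find_unclosed and the sample text are invented for illustration, and the sketch says nothing about why the unterminated display affected hyphenation.

```python
import re

# Matches paired requests such as ".(l" or ".)l" at the start of a line.
PAIRED = re.compile(r"^\.\s*([()])([a-z])\b")


def find_unclosed(lines):
    """Return sorted (line_number, request) pairs for unmatched requests."""
    stack = []     # currently open (line_number, letter), innermost last
    problems = []
    for number, line in enumerate(lines, start=1):
        match = PAIRED.match(line)
        if not match:
            continue
        kind, letter = match.groups()
        if kind == "(":
            stack.append((number, letter))
        elif stack and stack[-1][1] == letter:
            stack.pop()                      # closer matches innermost opener
        else:
            problems.append((number, ")" + letter))  # closer with no opener
    # Anything still on the stack was opened but never closed.
    problems.extend((number, "(" + letter) for number, letter in stack)
    return sorted(problems)


sample = r""".nh
.(l
\(bu \fIThe Flying Saucers Are Real\fP (1950)
.br
\(bu \fIFlying Saucers from Outer Space\fP (1953)
Following paragraph text.
""".splitlines()

print(find_unclosed(sample))
```

For a real document, the lines could be read from the input file (proto in the command above) instead of the sample string.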

    br,
    KK

  • From boltar@3:633/10 to All on Thursday, March 26, 2026 15:07:26
    On Thu, 26 Mar 2026 14:38:11 -0000 (UTC)
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> gabbled:
    On Thu, 26 Mar 2026 14:08:23 +0000, Kalevi Kolttonen wrote:

    Hello!

    I am very sorry for posting this off-topic question,
    but there seems to be no active groff-related group.

    My question is simple: How do I completely disable
    groff's hyphenation? I am writing a document in Finnish
    using this command:

    groff -me -Kutf8 -Tpdf proto > proto.pdf

    Everything works fine, but the hyphenation is making
    mistakes and I want it. ChatGPT told me

    Remember, ChatGPT is a text-prediction program, and has no

    It's a lot more complicated than that. If you want to see the output of
    a text prediction program, download some Markov chain code; it'll just
    output (mostly) grammatically correct junk.


  • From Lawrence D'Oliveiro@3:633/10 to All on Thursday, March 26, 2026 19:48:46
    On Thu, 26 Mar 2026 14:38:11 -0000 (UTC), Lew Pitcher wrote:

    I don't often use groff ...

    Maybe not directly, but remember it is part of the toolchain every
    time you run the man(1) command on a GNU-based system.

  • From Popping Mad@3:633/10 to All on Saturday, May 30, 2026 14:30:16
    On 3/26/26 10:38 AM, Lew Pitcher wrote:
    Remember, ChatGPT is a text-prediction program, and has no
    experience or intrinsic knowledge of groff.

    it also has no intrinsic logic

  • From boltar@3:633/10 to All on Sunday, May 31, 2026 07:25:33
    On Sat, 30 May 2026 14:30:16 -0400
    Popping Mad <rainbow@colition.gov> gabbled:
    On 3/26/26 10:38 AM, Lew Pitcher wrote:
    Remember, ChatGPT is a text-prediction program, and has no
    experience or intrinsic knowledge of groff.

    it also has no intrinsic logic

    It does in the sense of the low-level ANN programming that allows it to
    function, but it builds up its own logic and knowledge as it's trained.
    People are too dismissive of these systems - they're a lot smarter than
    a lot of people, particularly in IT, would like to believe.


  • From Lawrence D'Oliveiro@3:633/10 to All on Monday, June 01, 2026 00:11:23
    On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

    People are too dismissive of these systems - they're a lot smarter
    than a lot of people particularly in IT would like to believe.

    We look at those who get so attached to using these systems that they
    end up overlooking some quite glaring shortcomings.

  • From boltar@3:633/10 to All on Monday, June 01, 2026 08:17:22
    On Mon, 1 Jun 2026 00:11:23 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:
    On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

    People are too dismissive of these systems - they're a lot smarter
    than a lot of people particularly in IT would like to believe.

    We look at those who get so attached to using these systems that they
    end up overlooking some quite glaring shortcomings.

    Yes, they certainly make some mistakes. But they're not simply the
    turbo-charged Markov chains that some people seem to think; there is
    some kind of thinking going on.


  • From Lawrence D'Oliveiro@3:633/10 to All on Tuesday, June 02, 2026 00:09:50
    On Mon, 1 Jun 2026 08:17:22 -0000 (UTC), boltar wrote:

    On Mon, 1 Jun 2026 00:11:23 -0000 (UTC), Lawrence D'Oliveiro wrote:

    On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

    People are too dismissive of these systems - they're a lot smarter
    than a lot of people particularly in IT would like to believe.

    We look at those who get so attached to using these systems that
    they end up overlooking some quite glaring shortcomings.

    Yes, they certainly make some mistakes. But they're not simply turbo
    charged markov chains that some people seem to think, there is some
    kind of thinking going on.

    I rest my case.

  • From boltar@3:633/10 to All on Tuesday, June 02, 2026 08:21:03
    On Tue, 2 Jun 2026 00:09:50 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:
    On Mon, 1 Jun 2026 08:17:22 -0000 (UTC), boltar wrote:

    On Mon, 1 Jun 2026 00:11:23 -0000 (UTC), Lawrence D'Oliveiro wrote:

    On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

    People are too dismissive of these systems - they're a lot smarter
    than a lot of people particularly in IT would like to believe.

    We look at those who get so attached to using these systems that
    they end up overlooking some quite glaring shortcomings.

    Yes, they certainly make some mistakes. But they're not simply turbo
    charged markov chains that some people seem to think, there is some
    kind of thinking going on.

    I rest my case.

    You didn't have a case. Thinking doesn't mean consciousness; it simply means
    the application of logic in an intelligent way.


  • From Paul@3:633/10 to All on Tuesday, June 02, 2026 05:58:35
    On Tue, 6/2/2026 4:21 AM, boltar@caprica.universe wrote:
    On Tue, 2 Jun 2026 00:09:50 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> gabbled:
    On Mon, 1 Jun 2026 08:17:22 -0000 (UTC), boltar wrote:

    On Mon, 1 Jun 2026 00:11:23 -0000 (UTC), Lawrence D'Oliveiro wrote:

    On Sun, 31 May 2026 07:25:33 -0000 (UTC), boltar wrote:

    People are too dismissive of these systems - they're a lot smarter
    than a lot of people particularly in IT would like to believe.

    We look at those who get so attached to using these systems that
    they end up overlooking some quite glaring shortcomings.

    Yes, they certainly make some mistakes. But they're not simply turbo
    charged markov chains that some people seem to think, there is some
    kind of thinking going on.

    I rest my case.

    You didn't have a case. Thinking doesn't mean consciousness; it simply means
    the application of logic in an intelligent way.


    My first exposure to "automatons" was in high school.
    One of my fellow students did a port of ELIZA to the language
    our terminal used. He converted the program to APL (Iverson/IBM).
    (Today, that person has a PhD in Artificial Intelligence, and
    presumably has retired. His hobby activity did give him
    something to do.)

    https://en.wikipedia.org/wiki/ELIZA

    "ELIZA is an early natural language processing computer program developed
    from 1964 to 1967[1] at MIT by Joseph Weizenbaum.

    Weizenbaum intended the program as a method to explore communication between
    humans and machines.

    He was surprised that some people, including his secretary, attributed human-like feelings <===
    to the computer program,[3] a phenomenon that came to be called the ELIZA effect.
    "

    Even back then, there were a few people who could not separate truth from
    fiction when examining the output. The output wasn't exactly compelling.
    While testing that port, we spent most of our time laughing at the corny
    output.

    Having had that experience back then makes it easier to keep some
    perspective when viewing how this version works. Who would have guessed
    this would happen?

    Paul

  • From boltar@3:633/10 to All on Tuesday, June 02, 2026 10:38:10
    On Tue, 2 Jun 2026 05:58:35 -0400
    Paul <nospam@needed.invalid> gabbled:
    Even back then, there were a few people who could not separate truth from
    fiction,

    Sadly there always were and always will be far too many gullible idiots
    like that in any population. You only have to look at current politics
    and conspiracy theories to realise.



  • From Kaz Kylheku@3:633/10 to All on Monday, June 08, 2026 23:34:13
    On 2026-05-31, boltar@caprica.universe <boltar@caprica.universe> wrote:
    On Sat, 30 May 2026 14:30:16 -0400
    Popping Mad <rainbow@colition.gov> gabbled:
    On 3/26/26 10:38 AM, Lew Pitcher wrote:
    Remember, ChatGPT is a text-prediction program, and has no
    experience or intrinsic knowledge of groff.

    it also has no intrinsic logic

    It does in the sense of the low-level ANN programming that allows it to
    function, but it builds up its own logic and knowledge as it's trained.
    People are too dismissive of these systems - they're a lot smarter than
    a lot of people, particularly in IT, would like to believe.

    They are a lot dumber than people believe.

    A simple hash table token predictor trained on a large body of text
    using 4-gram keys (strings of four words) to predict the fifth word
    will produce some amazing outputs.
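
For readers who have not seen one, a predictor of the kind described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tokens are whitespace-separated words, the function names train and generate are invented for this example, and the tiny corpus is made up, so the output is far less impressive than what a large body of text would give.

```python
import random
from collections import defaultdict


def train(words, n=4):
    """Map each run of n consecutive words to the words seen right after it."""
    table = defaultdict(list)
    for i in range(len(words) - n):
        table[tuple(words[i:i + n])].append(words[i + n])
    return table


def generate(table, seed, length, rng):
    """Extend seed by repeatedly looking up its last len(seed) words."""
    out = list(seed)
    n = len(seed)
    for _ in range(length):
        followers = table.get(tuple(out[-n:]))
        if not followers:   # context never seen in training: stop early
            break
        out.append(rng.choice(followers))
    return out


# Toy corpus, invented for this sketch.
corpus = ("the cat sat on the mat and the cat sat on the rug "
          "and the dog sat on the mat").split()
table = train(corpus)
print(" ".join(generate(table, corpus[:4], 8, random.Random(0))))
```

Every word it emits was seen in training directly after the same four-word context, which is the sense in which such a program has no thought behind its output.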

    It's because you have a clear grasp of the scope of the data, and
    the approach in the N-gram hash table program, that you know
    there is no thought behind the clever-looking piece of output,
    no poet or sage.

    An algorithm predicting using much more sophisticated linear algebra,
    over a data set of terabytes upon terabytes of text, encompasses a scale
    that is so completely alien to your experience and scope, that your
    intuitions about it have no hope of being correct.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From boltar@3:633/10 to All on Tuesday, June 09, 2026 08:46:00
    On Mon, 8 Jun 2026 23:34:13 -0000 (UTC)
    Kaz Kylheku <046-301-5902@kylheku.com> gabbled:
    On 2026-05-31, boltar@caprica.universe <boltar@caprica.universe> wrote:
    On Sat, 30 May 2026 14:30:16 -0400
    Popping Mad <rainbow@colition.gov> gabbled:
    On 3/26/26 10:38 AM, Lew Pitcher wrote:
    Remember, ChatGPT is a text-prediction program, and has no
    experience or intrinsic knowledge of groff.

    it also has no intrinsic logic

    It does in the sense of the low-level ANN programming that allows it to
    function, but it builds up its own logic and knowledge as it's trained.
    People are too dismissive of these systems - they're a lot smarter than
    a lot of people, particularly in IT, would like to believe.

    They are a lot dumber than people believe.

    A simple hash table token predictor trained on a large body of text
    using 4-gram keys (strings of four words) to predict the fifth word
    will produce some amazing outputs.

    Oh please. I've written at least 4 Markov chain programs in my career and
    they don't get anywhere close to what an LLM can output. For a start, LLMs
    don't simply spit out chunks of text they've ingested joined together by
    keywords; they have some kind of - limited - understanding of the subject
    in hand, or they wouldn't be able to extrapolate and interpolate. If you
    don't believe me, try it. Then there's the part no one mentions - the
    ability to understand the input text in a meaningful way. Something no
    Markov chain can do.

    An algorithm predicting using much more sophisticated linear algebra,
    over a data set of terabytes upon terabytes of text, encompasses a scale
    that is so completely alien to your experience and scope, that your
    intuitions about it have no hope of being correct.

    Don't patronise me.


  • From Kaz Kylheku@3:633/10 to All on Tuesday, June 09, 2026 21:15:50
    On 2026-06-09, boltar@caprica.universe <boltar@caprica.universe> wrote:
    On Mon, 8 Jun 2026 23:34:13 -0000 (UTC)
    Kaz Kylheku <046-301-5902@kylheku.com> gabbled:
    On 2026-05-31, boltar@caprica.universe <boltar@caprica.universe> wrote:
    On Sat, 30 May 2026 14:30:16 -0400
    Popping Mad <rainbow@colition.gov> gabbled:
    On 3/26/26 10:38 AM, Lew Pitcher wrote:
    Remember, ChatGPT is a text-prediction program, and has no
    experience or intrinsic knowledge of groff.

    it also has no intrinsic logic

    It does in the sense of the low-level ANN programming that allows it to
    function, but it builds up its own logic and knowledge as it's trained.
    People are too dismissive of these systems - they're a lot smarter than
    a lot of people, particularly in IT, would like to believe.

    They are a lot dumber than people believe.

    A simple hash table token predictor trained on a large body of text
    using 4-gram keys (strings of four words) to predict the fifth word
    will produce some amazing outputs.

    Oh please. I've written at least 4 markov chain programs in my career and they don't get anywhere close to what an LLM can output.

    Right; nobody is saying that. My point is that these simple programs
    have already produced some outputs that have made people go "wow!".

    The point is that people going "wow!" is not a reliable yardstick of anything.

    don't simply spit out chunks of text they've ingested joined together by keywords, they have some kind of - limited - understanding of the subject in hand or they wouldn't be able to extrapolate and interpolate.

    These things purely interpolate. They do it at such a scale of data that
    you can't tell interpolation from extrapolation. Except maybe if it
    happens to land in your area of extensive expertise.

    If you don't believe me, try it. Then there's the part no one mentions -
    the ability to understand the input text in a meaningful way. Something
    no Markov chain can do.

    The meaning is entirely bundled in the training data.

    You can extract meaning from a book. Yet a book doesn't think.

    You don't feel that it can think because you are the one doing the
    search for meaning, and the texts you are finding were obviously written
    by the author, so you project the thinking ability onto that author.

    The LLM interposes itself as a middle man broker of information in such
    a way that it appears to be doing the thinking, but all the thinking
    already went into the training material.

    An algorithm predicting using much more sophisticated linear algebra,
    over a data set of terabytes upon terabytes of text, encompasses a scale
    that is so completely alien to your experience and scope, that your
    intuitions about it have no hope of being correct.

    Don't patronise me.

    By "your", I mean everyone, you and me. I should say "we". Our intuitions
    are completely in uncharted waters when faced with a contextual
    path-finding engine that wades through terabytes of compressed text,
    with a high degree of statistical accuracy.

    I like to think of it as a transformation that lets me have a
    quasi-conversation with the training data. I.e. the training data
    perhaps holds some scattered pieces of info I'd like to retrieve,
    and the LLM provides a conversational query system to get at it.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From boltar@3:633/10 to All on Wednesday, June 10, 2026 08:19:51
    On Tue, 9 Jun 2026 21:15:50 -0000 (UTC)
    Kaz Kylheku <046-301-5902@kylheku.com> gabbled:
    On 2026-06-09, boltar@caprica.universe <boltar@caprica.universe> wrote:
    Oh please. I've written at least 4 markov chain programs in my career and
    they don't get anywhere close to what an LLM can output.

    Right; nobody is saying that. My point is that these simple programs
    have already produced some outputs that have made people go "wow!".

    For a few seconds, until they read all of the output and realise it's either
    grammatically correct but incoherent gibberish or simply a dump of the input
    text, depending on key length.

    The point is that people going "wow!" is not a reliable yardstick of anything.

    No, but still going wow after years of use is.

    don't simply spit out chunks of text they've ingested joined together by
    keywords, they have some kind of - limited - understanding of the subject in
    hand or they wouldn't be able to extrapolate and interpolate.

    These things purely interpolate. They do it at such a scale of data that
    you can't tell interpolation from extrapolation. Except maybe if it
    happens to land in your area of extensive expertise.

    Like I said, why don't you try asking it to extrapolate on data that it
    couldn't possibly have ingested because it's personal to you or uses long
    floating point values?

    If you don't believe me, try it. Then there's the part no one mentions -
    the ability to understand the input text in a meaningful way. Something
    no Markov chain can do.

    The meaning is entirely bundled in the training data.

    Yes, and?

    You can extract meaning from a book. Yet a book doesn't think.

    A book doesn't do anything; it's memory, not compute. I'd have thought that
    distinction was obvious.

    The LLM interposes itself as a middle man broker of information in such
    a way that it appears to be doing the thinking, but all the thinking
    already went into the training material.

    Probably the same could be said for you when you were at school.

    By your, I mean everyone, you and me. I should say "we". Our intuitions
    are completely in uncharted waters when faced with a contextual
    path-finding engine that wades through terabytes of compressed text,
    with a high degree of statistical accuracy.

    I suggest you familiarise yourself with John Searle's Chinese Room. It
    doesn't matter how it works internally; it's the output that matters.


  • From Paul@3:633/10 to All on Wednesday, June 10, 2026 09:14:39
    On Wed, 6/10/2026 4:19 AM, boltar@caprica.universe wrote:
    On Tue, 9 Jun 2026 21:15:50 -0000 (UTC)
    Kaz Kylheku <046-301-5902@kylheku.com> gabbled:
    On 2026-06-09, boltar@caprica.universe <boltar@caprica.universe> wrote:
    Oh please. I've written at least 4 markov chain programs in my career and
    they don't get anywhere close to what an LLM can output.

    Right; nobody is saying that. My point is that these simple programs
    have already produced some outputs that have made people go "wow!".

    For a few seconds, until they read all of the output and realise it's either
    grammatically correct but incoherent gibberish or simply a dump of the input
    text, depending on key length.

    The point is that people going "wow!" is not a reliable yardstick of anything.

    No, but still going wow after years of use is.

    don't simply spit out chunks of text they've ingested joined together by
    keywords, they have some kind of - limited - understanding of the subject in
    hand or they wouldn't be able to extrapolate and interpolate.

    These things purely interpolate. They do it at such a scale of data that
    you can't tell interpolation from extrapolation. Except maybe if it
    happens to land in your area of extensive expertise.

    Like I said, why don't you try asking it to extrapolate on data that it
    couldn't possibly have ingested because it's personal to you or uses long
    floating point values?

    If you don't believe me, try it. Then there's the part no one mentions -
    the ability to understand the input text in a meaningful way. Something
    no Markov chain can do.

    The meaning is entirely bundled in the training data.

    Yes, and?

    You can extract meaning from a book. Yet a book doesn't think.

    A book doesn't do anything; it's memory, not compute. I'd have thought that
    distinction was obvious.

    The LLM interposes itself as a middle man broker of information in such
    a way that it appears to be doing the thinking, but all the thinking
    already went into the training material.

    Probably the same could be said for you when you were at school.

    By your, I mean everyone, you and me. I should say "we". Our intuitions
    are completely in uncharted waters when faced with a contextual
    path-finding engine that wades through terabytes of compressed text,
    with a high degree of statistical accuracy.

    I suggest you familiarise yourself with John Searle's Chinese Room. It
    doesn't matter how it works internally; it's the output that matters.


    It's because it is a black box that we cannot use it.

    Half an answer is no answer at all.

    It only seems to be able to access the statistical nature of
    the answers if set to "high reasoning", which requires multiple
    runs; the LLM-AI has been given some capability to compare
    the answers after the run for some sort of convergence property.

    For the customer, this will burn up 10x as many tokens.

    If it were not for this capability, the "answer" would be a
    "black box" to the LLM-AI as well. It does not know where the
    answer is coming from. It can produce URLs it claims are "cites".
    But not all the URLs are authoritative. For example, today,
    I was given an answer and one of the cites was AI slop from
    some web site. Absolutely nothing in the cited web page was
    an attempt at an authoritative answer. I've even had cited
    web pages where the keywords of the question do not appear.

    Why would I pay money for this?

    *******

    You can run queries on a home machine.

    When I first ran this query, it was on a data center machine.

    "What are your capabilities ?"

    At the time, that caused the poor thing to have a nervous
    breakdown (it's an unbounded question). The safety timer
    went off after 15 seconds, and the answer was *erased* from
    the screen, so I cannot give sample text showing how unhinged
    it was.

    The following picture shows the same question run on a home machine,
    with a later model. The answer has a certain canned quality to it,
    so maybe the safety timer event was used in preparing newer models.
    Perhaps the prompt has instructions on what to do with the
    unbounded questions.

    [Picture] Run-Queries-With-Network-Cable-Disconnected.gif

    https://postimg.cc/k20gzLYy

    https://imgur.com/a/It9jbsL

    I can't be sure, but it appears to be telling me that I can't
    have more than 4096 output tokens. The computer this was run
    on does not have a very good video card. I don't think
    the video card could be used, as it is too small (GTX1050
    with 4GB of VRAM).

    Paul

  • From boltar@3:633/10 to All on Wednesday, June 10, 2026 15:01:24
    On Wed, 10 Jun 2026 09:14:39 -0400
    Paul <nospam@needed.invalid> gabbled:
    On Wed, 6/10/2026 4:19 AM, boltar@caprica.universe wrote:
    I suggest you familiarise yourself with John Searle's Chinese Room. It
    doesn't matter how it works internally; it's the output that matters.


    It's because it is a black box, that we cannot use it.

    Huh? No one is asking it to fly a passenger plane.

    Half an answer, is no answer at all.

    Except most of the time it gives a full answer. ChatGPT has saved me a huge
    amount of time because I haven't had to wade through the rubbish and
    arguments on Stack Overflow to find out how to do something. It usually
    provides something useful even if it needs tweaking.

    some web site. Absolutely nothing in the cited web page was
    an attempt at an authoritative answer. I've even had the cite
    web pages, where the keywords of the question do not appear.

    Why would I pay money for this?

    You don't have to, it's free.

    When I first ran this query, it was on a data center machine.

    "What are your capabilities ?"

    At the time, that caused the poor thing to have a nervous
    breakdown (it's an unbounded question). The safety timer
    went off after 15 seconds, and the answer was *erased* from
    the screen, so I cannot give sample text showing how unhinged
    it was.

    I asked ChatGPT the same and it produced a sensible response. Perhaps
    expecting it to work on a home machine is asking a lot.

    [Picture] Run-Queries-With-Network-Cable-Disconnected.gif

    https://postimg.cc/k20gzLYy

    Didn't work.

    https://imgur.com/a/It9jbsL

    Not available in my region.


  • From Kaz Kylheku@3:633/10 to All on Tuesday, June 23, 2026 22:56:54
    On 2026-06-10, boltar@caprica.universe <boltar@caprica.universe> wrote:
    On Tue, 9 Jun 2026 21:15:50 -0000 (UTC)
    Kaz Kylheku <046-301-5902@kylheku.com> gabbled:
    These things purely interpolate. They do it at such a scale of data that
    you can't tell interpolation from extrapolation. Except maybe if it
    happens to land in your area of extensive expertise.

    Like I said, why don't you try asking it to extrapolate on data that it
    couldn't possibly have ingested because it's personal to you or uses long
    floating point values.

    People also believe that psychics know something about them personally,
    which then "proves" they indeed have telepathic powers.

    Same kind of con job.

    You can extract meaning from a book. Yet a book doesn't think.

    A book doesn't do anything, it's memory, not compute. I'd have thought
    that distinction was obvious.

    No it isn't; the weights of a trained LLM are also memory, not compute.

    Getting something out of a book or LLM is a function.

    In the case of the LLM, the retrieval function is sophisticated. It
    allows you to have a pseudo-conversation with a vast body of texts to
    extract a wide range of information. (And that is undeniably useful,
    at least in situations in which you have a way to check the veracity of
    the output.)

    By "your", I mean everyone, you and me. I should say "we". Our intuitions
    are completely in uncharted waters when faced with a contextual
    path-finding engine that wades through terabytes of compressed text,
    with a high degree of statistical accuracy.

    I suggest you familiarise yourself with John Searle's Chinese Room. It
    doesn't matter how it works internally, it's the output that matters.

    I'm addressing the wrong ideas that the thing is thinking, or that
    it has access to information outside of its training data; I am not
    claiming that the output doesn't matter or isn't useful.

    Those wrong ideas are not supported by only-the-output-matters
    argumentation, which amounts to nothing more than a feeble attempt
    to shut down the discussion.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Kalevi Kolttonen@3:633/10 to All on Tuesday, June 23, 2026 23:37:32
    Kaz Kylheku <046-301-5902@kylheku.com> wrote:
    I'm addressing the wrong ideas that the thing is thinking, or that
    it has access to information outside of its training data; I am not
    claiming that the output doesn't matter or isn't useful.

    ChatGPT regularly performs web searches when it needs
    to do so. So it definitely has access to information outside
    of its training data.

    br,
    KK

  • From boltar@3:633/10 to All on Wednesday, June 24, 2026 07:13:10
    On Tue, 23 Jun 2026 22:56:54 -0000 (UTC)
    Kaz Kylheku <046-301-5902@kylheku.com> gabbled:
    On 2026-06-10, boltar@caprica.universe <boltar@caprica.universe> wrote:
    Like I said, why don't you try asking it to extrapolate on data that it
    couldn't possibly have ingested because it's personal to you or uses long
    floating point values.

    People also believe that psychics know something about them personally,
    which then "proves" they indeed have telepathic powers.

    A stupid analogy.

    Same kind of con job.

    Except the difference is I can ask ChatGPT to write some code and it will,
    and if it's not too long it'll work. I don't care how it does it, but it
    does, and that's good enough for me.

    A book doesn't do anything, it's memory, not compute. I'd have thought
    that distinction was obvious.

    No it isn't; the weights of a trained LLM are also memory, not compute.

    Do you understand how neural nets work? The one inside your head seems to
    be having trouble with the concept.

    I suggest you familiarise yourself with John Searle's Chinese Room. It
    doesn't matter how it works internally, it's the output that matters.

    I'm addressing the wrong ideas that the thing is thinking, or that
    it has access to information outside of its training data; I am not
    claiming that the output doesn't matter or isn't useful.

    If it's not thinking or reasoning, what would you call it? Given you seem
    to think it's a sophisticated Markov chain - proving you understand neither
    Markov chains nor LLMs - I don't think I'll give much weight to your opinion.

    Those wrong ideas are not supported by only-the-output-matters
    argumentation, which amounts to nothing more than a feeble attempt
    to shut down the discussion.

    Whatever. LLMs are not just dumb statistical output machines.

