• ksh93u+m locale issue with select statement

    From Janis Papanagnou@3:633/10 to All on Sunday, January 11, 2026 05:51:26
    Kornshell doesn't seem to handle umlauts or other non-ASCII Unicode
    characters correctly with the 'select' statement; the display shows
    (for example)

    1) abcdefghijklmnopqrstuvwxyz 15) abcdefghijklmnopqrstuvwxyz
    2) abcdefghijklmnopqrstuvwxyz 16) abcdefghijklmnopqrstuvwxyz
    3) äbcdefghijklmnopqrstuvwxyz 17) ABCDEFGHIJKLMNOPQRSTUVWXYZ
    4) äbcdefghijklmnöpqrstuvwxyz 18) ABCDEFGHIJKLMNOPQRSTUVWXYZ
    5) äbcdefghijklmnöpqrstüvwxyz 19) ÄBCDEFGHIJKLMNOPQRSTUVWXYZ
    6) abcd?fghijklĉnopqrstuvwxyz 20) ÄBCDEFGHIJKLMNÖPQRSTUVWXYZ
    7) abcd?fghijklmnopqrstuvwxyz 21) ÄBCDEFGHIJKLMNÖPQRSTÜVWXYZ
    8) äbcdefghijklmnöpqrßtüvwxyz 22) ABCDEFGHIJKLMNOPQRSTUVWXYZ
    9) abcdefghijklmnopqrstuvwxyz 23) ABCDEFGHIJKLMNOPQRSTUVWXYZ
    10) abcdefghijklmnopqrstuvwxyz 24) ÄBCDEFGHIJKLMNÖPQR?TÜVWXYZ
    11) abcdefghijklmnopqrstuvwxyz 25) ABCDEFGHIJKLMNOPQRSTUVWXYZ
    12) abcdefghijklmnopqrstuvwxyz 26) ABCDEFGHIJKLMNOPQRSTUVWXYZ
    13) abcdefghijklmnopqrstuvwxyz 27) ABCDEFGHIJKLMNOPQRSTUVWXYZ
    14) abcdefghijklmnopqrstuvwxyz

    Apparently it confuses the number of characters with the number of
    octets in the encoding, so the column formatting gets corrupted.
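
    As a minimal sketch of the mismatch described here (not taken from the
    ksh source), the following compares the octet count and the character
    count of the same string. It assumes UTF-8 input; byte_count and
    char_count are hypothetical helper names, and the string is built from
    octal escapes so the result does not depend on the editor or locale.

```shell
# Count the octets in a string, independent of the current locale.
byte_count() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Count UTF-8 characters by counting every byte that is NOT a
# continuation byte (continuation bytes are 0x80..0xBF, i.e. 128..191).
char_count() {
    printf '%s' "$1" | od -An -v -t u1 |
        awk '{ for (i = 1; i <= NF; i++) if ($i < 128 || $i > 191) n++ }
             END { print n + 0 }'
}

s=$(printf '\303\244bc')    # "äbc" as UTF-8: 0xC3 0xA4 'b' 'c'
echo "bytes: $(byte_count "$s")  chars: $(char_count "$s")"
# prints: bytes: 4  chars: 3
```

    A menu that pads each entry using the first number instead of the
    second ends up one column short for every two-byte character.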

    (Playing with other locales doesn't change that effect.)

    Observed in ksh version AJM 93u+m/1.0.8 2024-01-01.

    (Bash handles that correctly.)

    Janis


    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Martijn Dekker@3:633/10 to All on Wednesday, February 25, 2026 00:39:12
    On 11-01-2026 at 04:51, Janis Papanagnou wrote:
    Kornshell doesn't seem to handle umlauts or other non-ASCII Unicode
    characters correctly with the 'select' statement;

    The code for showing the menu is not aware of multibyte locales. Thanks for the report. I've fixed it for the next release. See: https://github.com/ksh93/ksh/commit/18e4cbc1

    Observed in ksh version AJM 93u+m/1.0.8 2024-01-01.

    FYI, you're two point releases behind.

    --
    || modernish -- harness the shell
    || https://github.com/modernish/modernish
    ||
    || KornShell lives!
    || https://github.com/ksh93/ksh

  • From Janis Papanagnou@3:633/10 to All on Wednesday, February 25, 2026 03:06:35
    Thanks for fixing the select issue!

    On 2026-02-25 01:39, Martijn Dekker wrote:
    On 11-01-2026 at 04:51, Janis Papanagnou wrote:
    [...]
    Observed in ksh version AJM 93u+m/1.0.8 2024-01-01.

    FYI, you're two point releases behind.

    Thanks. That's probably because my distro now ships "u+m", so I'm
    just using what's provided.[*]

    But maybe I should get the newest one again and ignore what comes out
    of the box with my system? (I think if either of the two other issues I
    have with "u+m" got fixed then I'd do that, but I recall you
    didn't intend to change these[**], so there's [at the moment] no
    pressing need to change to the latest release.)

    Janis

    [*] If it were still a "u+" I'd certainly get "u+m" from a more
    up-to-date source, since I prefer that. I'm glad that "u+m" made it
    into the distros.

    [**] One was the vi-mode expansion of $(cmd) when typing a '*', which
    was already _existing_ in original "u+" (and which, sadly, you removed
    from "u+m"); I had used that very often and I'm really missing it. :-/


  • From Richard Harnden@3:633/10 to All on Tuesday, March 17, 2026 10:55:21
    On 25/02/2026 00:39, Martijn Dekker wrote:
    On 11-01-2026 at 04:51, Janis Papanagnou wrote:
    Kornshell doesn't seem to handle umlauts or other non-ASCII Unicode
    characters correctly with the 'select' statement;

    The code for showing the menu is not aware of multibyte locales. Thanks
    for the report. I've fixed it for the next release. See: https://github.com/ksh93/ksh/commit/18e4cbc1

    Observed in ksh version AJM 93u+m/1.0.8 2024-01-01.

    FYI, you're two point releases behind.


    Same for printf "%-*s" - counts octets, not printable characters.

    I'm on 93u+m/1.0.6, so way behind.




  • From Geoff Clare@3:633/10 to All on Tuesday, March 17, 2026 13:31:17
    Richard Harnden wrote:

    Same for printf "%-*s" - counts octets, not printable characters.

    That's what printf is supposed to do, as POSIX requires that field
    widths specify the number of bytes (I assume for compatibility with
    the printf() C function).
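
    To illustrate the byte-based field width, here is a small sketch.
    field4 is a hypothetical helper name, and the C locale is forced inside
    it as an assumption, so that the result does not depend on how a
    particular shell treats multibyte locales.

```shell
# Print STRING left-aligned in a field of width 4, in the C locale,
# where the field width is counted in bytes.
field4() (
    LC_ALL=C; export LC_ALL     # subshell body: does not leak to the caller
    printf '[%-4s]' "$1"
)

field4 'a'; echo                       # 1 byte  -> 3 padding spaces
field4 "$(printf '\303\244')"; echo    # "ä" is 2 bytes -> only 2 padding spaces
```

    On a UTF-8 terminal the second line comes out one column narrower than
    the first, because the two bytes of the umlaut occupy a single column.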

    --
    Geoff Clare <netnews@gclare.org.uk>

  • From Richard Harnden@3:633/10 to All on Tuesday, March 17, 2026 14:09:08
    On 17/03/2026 13:31, Geoff Clare wrote:
    Richard Harnden wrote:

    Same for printf "%-*s" - counts octets, not printable characters.

    That's what printf is supposed to do, as POSIX requires that field
    widths specify the number of bytes (I assume for compatibility with
    the printf() C function).


    Ah, okay, thanks.

    I have a function that pads by ignoring UTF-8 continuation bytes and
    ANSI escape sequences. Not exactly pretty, but it works for me.
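
    The function itself was not posted. A rough sketch of the same idea
    could look like the following, where visible_width and pad_right are
    hypothetical names. It assumes UTF-8 input, one terminal column per
    character (so no wide or combining characters), and only CSI-style
    escapes of the form ESC [ ... letter, such as colour codes.

```shell
# visible_width STRING: count the characters a terminal would display,
# skipping ANSI CSI escapes and UTF-8 continuation bytes (128..191).
visible_width() (
    esc=$(printf '\033')
    # A newline is appended so sed always sees a complete line; the
    # awk END block subtracts that one extra byte again.
    printf '%s\n' "$1" |
        LC_ALL=C sed "s/${esc}\[[0-9;]*[A-Za-z]//g" |
        od -An -v -t u1 |
        awk '{ for (i = 1; i <= NF; i++) if ($i < 128 || $i > 191) n++ }
             END { print n - 1 }'
)

# pad_right WIDTH STRING: print STRING followed by enough spaces to
# fill WIDTH visible columns (no truncation if STRING is longer).
pad_right() (
    width=$1 text=$2
    pad=$(( width - $(visible_width "$text") ))
    printf '%s' "$text"
    if [ "$pad" -gt 0 ]; then
        printf "%${pad}s" ''
    fi
)

red=$(printf '\033[31m') reset=$(printf '\033[0m')
name=$(printf 'M\303\274ller')              # "Müller" as UTF-8
printf '[%s]\n' "$(pad_right 8 "$name")"
printf '[%s]\n' "$(pad_right 8 "${red}ab${reset}")"
```

    Both lines should end up the same visible width, since neither the
    extra byte of the umlaut nor the colour codes are counted as columns.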

