On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0]
through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
Summary: Some systems guarantee that argc>=1 and argv[0] points to
a valid string, but software that's intended to be portable should
tolerate argc==0 and argv[0]==NULL.
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
On Wed, 31 Dec 2025 03:10:52 -0000 (UTC), Lawrence D'Oliveiro wrote:
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
Some windows snippets:
Michael Sanders <porkchop@invalid.foo> writes:
On Tue, 30 Dec 2025 18:42:30 GMT, Scott Lurndal wrote:
What if 'argv[0]' is NULL (and argc == 0)?
Well, seems we have to make a choice, ISO vs. POSIX:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
[...]
*POSIX.1-2017 (and later)*
POSIX execve() specification:
The argument argv is an array of character pointers
to null-terminated strings.
The application shall ensure that argv[0] points to a filename
string that is associated with the process being started.
[...]
What say you?
It happens that I recently spent some time looking into this.
As you say, POSIX requires argc >= 1, but ISO C only guarantees
argc >= 0.
If argc == 0, a program that assumes argv[0] is non-null
can run into serious problems if that assumption is invalid.
In particular, a program called "pkexec" would try to traverse
arguments starting with argv[1], which logically doesn't
exist if argc==0. Due to the way program arguments are laid
out in memory, argv[1] is also envp[0]. Frivolity ensued.
See <https://nvd.nist.gov/vuln/detail/cve-2021-4034>.
The Linux kernel updated execve to ensure that the invoked program
has argc>=1. It was patched in early 2022. NetBSD still has this vulnerability.
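The defensive pattern implied here can be sketched in a few lines. This is only an illustration of bounding every argv access by argc, not pkexec's actual code; the function name first_argument is hypothetical.

```c
#include <stddef.h>

/* Return argv[1] if it really exists, otherwise NULL.
 * Checking argc first means argv[1] is never read when argc == 0,
 * which is the case where it may alias envp[0] on some systems. */
const char *first_argument(int argc, char **argv)
{
    if (argc < 2 || argv == NULL)
        return NULL;
    return argv[1];
}
```

A caller that treats a NULL result as "no argument given" never walks past the end of the argument list.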
Summary: Some systems guarantee that argc>=1 and argv[0] points to
a valid string, but software that's intended to be portable should
tolerate argc==0 and argv[0]==NULL.
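A minimal sketch of what "tolerating" that case might look like, relying only on what ISO C guarantees. The helper name program_name and the caller-supplied fallback string are assumptions made for illustration.

```c
#include <stddef.h>

/* Return a printable program name without assuming argv[0] is usable.
 * fallback is a default chosen by the caller (hypothetical parameter). */
const char *program_name(int argc, char **argv, const char *fallback)
{
    if (argc < 1 || argv == NULL || argv[0] == NULL || argv[0][0] == '\0')
        return fallback;
    return argv[0];
}
```

In main() this would typically be used for diagnostics, e.g. fprintf(stderr, "%s: ...", program_name(argc, argv, "demo")).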
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0]
through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
That's not clear. Linux (since 2022) guarantees argc>=1. I don't
know whether Windows makes a similar guarantee, but it's entirely
plausible that it could.
But not all systems guarantee argc>=1, so any portable code shouldn't
make assumptions beyond what's guaranteed by ISO C.
On Tue, 12/30/2025 10:37 PM, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Wed, 31 Dec 2025 02:01:55 -0000 (UTC), Michael Sanders wrote:
*ISO C (C17 / C23)*:
C17, 5.1.2.2.1 "Program startup"
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0]
through argv[argc-1] inclusive shall contain pointers to strings
which are given implementation-defined values.
...
What say you?
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
That's not clear. Linux (since 2022) guarantees argc>=1. I don't
know whether Windows makes a similar guarantee, but it's entirely
plausible that it could.
But not all systems guarantee argc>=1, so any portable code shouldn't
make assumptions beyond what's guaranteed by ISO C.
The Windows guarantee is with MSVC, according to this answer.
********** CoPilot Question **********
Does the Windows operating system guarantee that argc >= 1 for an EXE program ?
argv[0] usually contains the program absolute path.
********** CoPilot Answer ************
Here's the clear, standards-grounded, Windows-specific answer you're looking for - with citations from the search results.
# **Short Answer**
**Yes. On Windows, when using the Microsoft C runtime (MSVCRT), argc
is always guaranteed to be >= 1**, and **argv[0] always exists**.
However, **Windows itself does *not* guarantee what argv[0] contains** -
only that the C runtime *treats* it as the program name.
---
# **What Windows Guarantees vs. What the C Runtime Guarantees**
### **Microsoft C Runtime Guarantee**
Microsoft's documentation explicitly states:
- *"Since the program name is considered an argument, the value of argc is at least one."*
So if your EXE is built with MSVC and uses the standard main or wmain, you can rely on:
- argc >= 1
- argv[0] is a valid null-terminated string
- argv[argc] == NULL
### **MSVC Startup Code Behavior**
MSVC startup code parses the command line and
**always inserts the program name as argv[0] ** (quoted if needed).
---
# **What Windows *Does Not* Guarantee**
Windows itself does **not** enforce that the first token of the command line is the executable name. This is a convention, not a rule:
- * By convention the module name ... is the first argument ... Note that this is
only a convention and is not enforced in any way. *
This means:
- A caller can technically pass *anything* as the first token.
- A process created via CreateProcess can have an arbitrary lpCommandLine .
- argv[0] may not be the absolute path.
- It may not even be a valid path at all.
This is why portable code should not assume argv[0] is the executable path.
---
# **Cross-Platform Note**
The C standard does **not** guarantee that argv[0] contains the program name - only that it exists.
This is echoed in the search results:
- * In short: No... this is more of a common convention than a strict rule. *
---
# **Final Summary**
| Question | Answer |
|------------------------------------------------------|------------------------------------------|
| **Does Windows guarantee argc >= 1 ?**               | **Yes - if using MSVC's C runtime.**     |
| **Is argv[0] guaranteed to exist?**                  | **Yes - with MSVC.**                     |
| **Is argv[0] guaranteed to be the absolute path?**   | **No.**                                  |
| **Does Windows enforce any of this?**                | **No - it's the C runtime, not the OS.** |
If you need the *actual* executable path, use:
- GetModuleFileName(NULL, ...) (Windows API)
---
https://learn.microsoft.com/en-us/cpp/c-language/argument-description?view=msvc-170&utm_source=copilot.com
https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=msvc-170&utm_source=copilot.com
https://github-wiki-see.page/m/johnstevenson/winbox-args/wiki/How-Windows-parses-the-command-line?utm_source=copilot.com
https://www.codegenes.net/blog/is-argv-0-name-of-executable-an-accepted-standard-or-just-a-common-convention/?utm_source=copilot.com
When argv[0] Isn't the Executable Name
4.1 Invocation via exec Functions
4.2 Symbolic Links
4.3 Shell Scripts and Aliases
4.4 Debuggers, Emulators, and Special Environments
********** End CoPilot Answer ************
On 31/12/2025 17:30, Paul wrote:
The Windows guarantee is with MSVC, according to this answer.
[CoPilot question and answer snipped]
So is that a Yes or No?
My C compiler calls __getmainargs() in msvcrt.dll to get argc/argv.
__getmainargs() is also imported by programs compiled with Tiny C,
and also with gcc 14.x from winlibs.com. (I assume it is actually
called for the same purpose.)
The specs for __getmainargs() say that the returned argc value is
always >= 1.
(I doubt whether msvcrt.dll, which is present because so many
programs rely on it, is what is used by MSVC-compiled apps, but
you'd have to look inside such an app to check. EXEs inside
\windows\system tend to import DLLs with names like "api-ms-win...".)
In any case, it is easy enough to do a check on argc's value in your
applications. (And on Windows, if it is 0 and you really need the
path, you can get it with GetModuleFileNameA().)
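The check suggested above could be sketched roughly as follows. The helper name program_path is hypothetical; the Windows branch uses GetModuleFileNameA(NULL, ...) and is compiled only when _WIN32 is defined, so on other systems the function simply reports failure when argv[0] is unavailable.

```c
#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Copy a best-effort program path into buf (capacity n).
 * Returns 1 on success, 0 if nothing usable was found. */
int program_path(int argc, char **argv, char *buf, size_t n)
{
    if (n == 0)
        return 0;
    if (argc >= 1 && argv != NULL && argv[0] != NULL) {
        snprintf(buf, n, "%s", argv[0]);   /* truncates if argv[0] is too long */
        return 1;
    }
#ifdef _WIN32
    {
        /* A return value equal to n means the path was truncated. */
        DWORD len = GetModuleFileNameA(NULL, buf, (DWORD)n);
        if (len > 0 && len < n)
            return 1;
    }
#endif
    buf[0] = '\0';
    return 0;
}
```

Note that the two sources are not interchangeable: argv[0] is whatever the caller passed, while GetModuleFileNameA() reports the path of the loaded module.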
On Wed, 31 Dec 2025 18:42:45 +0000, bart wrote:
In any case, it is easy enough to do a check on argc's value in your
applications. (And on Windows, if it is 0 and you really need the
path, you can get it with GetModuleFileNameA().)
Remember that, on *nix systems, the contents of argv are arbitrary and caller-specified. And none of them need bear any relation to the
actual filename of the invoked executable.
In fact, it is quite common for utilities to behave differently based
on the name, as passed in argv[0], by which they are invoked.
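A small sketch of that kind of name-based dispatch, in the style of multi-call binaries. The applet names "compress" and "decompress" and both function names are made up for illustration; accepting a backslash as a directory separator is an assumption for Windows-style paths.

```c
#include <string.h>

/* Strip any leading directory part of path. */
const char *base_name(const char *path)
{
    const char *s = strrchr(path, '/');
    const char *b = strrchr(path, '\\');
    if (b != NULL && (s == NULL || b > s))
        s = b;
    return s != NULL ? s + 1 : path;
}

/* Pick a mode from the invoked name; 0 means unknown or missing argv[0]. */
int select_mode(int argc, char **argv)
{
    const char *name;
    if (argc < 1 || argv == NULL || argv[0] == NULL)
        return 0;
    name = base_name(argv[0]);
    if (strcmp(name, "compress") == 0)
        return 1;
    if (strcmp(name, "decompress") == 0)
        return 2;
    return 0;
}
```

Because the caller controls argv[0], a program written this way should treat the selected mode as a convenience, not as a trustworthy statement about which file is running.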
On Tue, 30 Dec 2025 19:35:12 -0800[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
If you are interested in behavior on non-POSIX systems, primarily
Windows, but possibly others as well (e.g. VMS) then using exec() in
caller sounds like a bad idea. It's just not how these systems work and
not how people write programs on them.
Even when exec() *appears* to work in some environments (like
msys2) it is likely emulated by spawn() followed by exit().
I'd implement caller with spawn(). I suppose that even on POSIX it is
more idiomatic.
On Wed, 31 Dec 2025 03:10:52 -0000 (UTC)[...]
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
How did you come to this conclusion?
Keith's test appears to show the opposite - he was not able to get
the Windows system to call an application with an empty argv list.
Of course, he tried only one way out of many, but knowing how the native
Windows system call works, it appears extremely likely that on Windows
argc < 1 is impossible.
On Wed, 31 Dec 2025 15:29:09 +0200, Michael S wrote:
On Wed, 31 Dec 2025 03:10:52 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
Clearly on Windows, there are no guarantees about what argc contains, so
you shouldn't be relying on it.
How did you come to this conclusion?
The fact that the C spec says so.
Is there any standard on Windows for
how different C compilers are supposed to handle argc/argv?
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:[...]
That's not clear. Linux (since 2022) guarantees argc>=1.
Does it? That seems to be up to the shell, since the exec()
manual pages on the latest Fedora Core release don't indicate
that argv[0] must be initialized or that argc be greater than zero.
Paul <nospam@needed.invalid> writes:
[...]
# **Cross-Platform Note**
The C standard does **not** guarantee that argv[0] contains the
program name - only that it exists.
That isn't quite correct, or is at least misleading. ISO C guarantees
that argv[0] exists, but not that it points to a string. On some
systems, it can be a null pointer.
Michael S <already5chosen@yahoo.com> writes:
On Tue, 30 Dec 2025 19:35:12 -0800[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
For more information, see
<https://github.com/Keith-S-Thompson/argv0>.
If you are interested in behavior on non-POSIX systems, primarily
Windows, but possibly others as well (e.g. VMS) then using exec()
in caller sounds like a bad idea. It's just not how these systems
work and not how people write programs on them.
Even when exec() *appears* to work in some environments (like
msys2) it is likely emulated by spawn() followed by exit().
I'd implement caller with spawn(). I suppose that even on POSIX it
is more idiomatic.
If I were going to look into the behavior on Windows, I'd probably
want to use Windows native features. (I tried my test on Cygwin,
and the callee wasn't invoked.)
Apparently the Windows way to invoke a program is CreateProcessA().
But it takes the command line as a single string. There might not
be a Windows-native way to exercise the kind of control over argc
and argv provided by POSIX execve().
argv[0] merely returns what was typed on the command line to invoke the application.
[...]
So if someone types:
C:\abc> prog
it may run a prog.exe found in, say, c:\programs\myapp, and GetModuleFileName() will return the full path as
"c:\programs\myapp\prog.exe".
argv[0] will give you only "prog"; good luck with that!
On 1/1/26 12:29 AM, Keith Thompson wrote:
Paul <nospam@needed.invalid> writes:
That isn't quite correct, or is at least misleading. ISO C guarantees
that argv[0] exists, but not that it points to a string. On some
systems, it can be a null pointer.
I heard of this before.
Is it just theoretical, or do we have actual systems where
argv[0]==NULL? I never saw it happen in any modern operating system.
On Wed, 31 Dec 2025 15:00:24 -0800[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
If I were going to look into the behavior on Windows, I'd probably
want to use Windows native features. (I tried my test on Cygwin,
and the callee wasn't invoked.)
That's likely because under Windows callee is named callee.exe.
I didn't try on cygwin, but that was the reason of failure under msys2.
Also, I am not sure if a slash in the name is allowed. Maybe a backslash
is required.
But, there is a difference between argv[0] and GetModuleFileName().
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
Apparently the Windows way to invoke a program is CreateProcessA().
But it takes the command line as a single string. There might not
be a Windows-native way to exercise the kind of control over argc
and argv provided by POSIX execve().
On Windows there is a command-line provided to applications via GetCommandLine() api. This is a single zero-terminated string.
Windows views parsing of the command-line string to be in the
application domain, I guess.
[...]
It creates executables with a ".exe" suffix,
but plays some tricks so that "foo.exe" also looks like "foo".
Are there any standards for how C argc/argv are supposed to behave on Windows?
On Wed, 31 Dec 2025 22:57:55 +0000, bart wrote:
But, there is a difference between argv[0] and GetModuleFileName().
So the latter cannot be used as a simple substitute for the former, as
you might have led us to believe.
On Wed, 31 Dec 2025 09:37:08 -0000 (UTC), Lawrence D'Oliveiro wrote:
Are there any standards for how C argc/argv are supposed to behave
on Windows?
Good question, some more ways to open things (that I know of), see
2nd example for 'sort of' argc/argv...
[examples omitted]
On 01/01/2026 01:03, Lawrence D'Oliveiro wrote:
On Wed, 31 Dec 2025 22:57:55 +0000, bart wrote:
But, there is a difference between argv[0] and
GetModuleFileName().
So the latter cannot be used as a simple substitute for the former,
as you might have led us to believe.
It depends on your needs.
All those are at the sending end. But what would C code see at the
receiving end?
On Thu, 1 Jan 2026 07:32:34 -0000 (UTC), Michael Sanders wrote:
On Wed, 31 Dec 2025 09:37:08 -0000 (UTC), Lawrence D'Oliveiro wrote:
Are there any standards for how C argc/argv are supposed to behave
on Windows?
Good question, some more ways to open things (that I know of), see
2nd example for 'sort of' argc/argv...
[examples omitted]
All those are at the sending end. But what would C code see at the
receiving end?
On Thu, 1 Jan 2026 14:05:29 +0000, bart wrote:
On 01/01/2026 01:03, Lawrence D'Oliveiro wrote:
On Wed, 31 Dec 2025 22:57:55 +0000, bart wrote:
But, there is a difference between argv[0] and
GetModuleFileName().
So the latter cannot be used as a simple substitute for the former,
as you might have led us to believe.
It depends on your needs.
You neglected to mention that when offering the substitute before
though, didn't you?
On Thu, 1 Jan 2026 19:02:49 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
All those are at the sending end. But what would C code see at the
receiving end?
The first three cases look very simple.
On Thu, 1 Jan 2026 21:53:20 +0200, Michael S wrote:
On Thu, 1 Jan 2026 19:02:49 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
All those are at the sending end. But what would C code see at the
receiving end?
The first three cases look very simple.
Is there some spec in Windows which describes how that works?
A more interesting and meaningful question is how to do the reverse.
I.e. how to convert an argv[] array into flat form in a way that
guarantees that CommandLineToArgvW() parses it back into the original
form? Is it even possible in the general case, or do limitations exist
(ignoring, for the sake of brevity, the 2**15-1 size limit)?
Microsoft certainly has the reverse conversion implemented, e.g. here:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/spawnv-wspawnv
Through the years you were told so, by different people, and shown
the spec maybe 100 times. But being the troll you are, you
continue to ask.
Still, for the benefit of more sincere readers and also for myself,
in order to have both pieces in one place:
https://learn.microsoft.com/en-us/windows/win32/api/shellapi/nf-shellapi-commandlinetoargvw
https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments
Experimenting with _spawnv() shows that Microsoft made no effort in
the direction of invertible serialization/de-serialization of argv[] lists.
That is, as long as there are no double quotes, everything works as
expected. But when there are double quotes in the original argv[] then
more often than not they can't be passed exactly.
On Thu, 1 Jan 2026 23:50:00 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Thu, 1 Jan 2026 21:53:20 +0200, Michael S wrote:
On Thu, 1 Jan 2026 19:02:49 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
All those are at the sending end. But what would C code see at the
receiving end?
The first three cases look very simple.
Is there some spec in Windows which describes how that works?
There is a spec that describes how that works in Microsoft's
implementation. That implementation is available free of charge to
other Windows compilers.
If the vendor of a Windows 'C' compiler decided to implement a different
algorithm then nobody can stop him.
Through the years you were told so, by different people, and shown
the spec maybe 100 times. But being the troll you are, you continue
to ask.
Still, for the benefit of more sincere readers and also for myself, in
order to have both pieces in one place:
https://learn.microsoft.com/en-us/windows/win32/api/shellapi/nf-shellapi-commandlinetoargvw
https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments
A more interesting and meaningful question is how to do the reverse.
I.e. how to convert an argv[] array into flat form in a way that
guarantees that CommandLineToArgvW() parses it back into the original form?
Is it even possible in the general case, or do limitations exist
(ignoring, for the sake of brevity, the 2**15-1 size limit)?
Microsoft certainly has the reverse conversion implemented, e.g. here:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/spawnv-wspawnv
But I am not aware of command line serialization part available as a
library call in isolation from process creation part.
I binged around and googled around, but all I was able to find was the
name of the function that performs the work: __acrt_pack_wide_command_line_and_environment
I was not able to find the source code of the function.
[O.T.]
I am sure that 15, 10 or even 5 years ago Google would give me a link to
the source in a second. Or, maybe, 5 years ago Google already
wouldn't, but Bing still would.
But today both search engines are hopelessly crippled with AI and do
not appear to actually search the web. Instead, they try to guess the
answer I likely want to hear.
[/O.T.]
The argc/argv problem seemed easy enough in practice if we only need
to handle the "real" arguments argv[n] with n>0. (Involving CMD.EXE introduced much worse complications, as you might imagine. But
generally I always thought that MS wasn't really interested in
/documenting/ how programmers should do things like this, in the
same way they never bothered explaining exactly how CMD processing
worked. Probably because it was forever changing!... Put another
way, for many years they were really more focussed on admins
clicking buttons in some GUI!)
After years -- decades -- of conditioning its users to be allergic to
the command line, now suddenly the rise of Linux has made command
lines cool again. Leaving Microsoft in an awkward position ...
On 02/01/2026 12:32, Michael S wrote:
On Thu, 1 Jan 2026 23:50:00 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Thu, 1 Jan 2026 21:53:20 +0200, Michael S wrote:
On Thu, 1 Jan 2026 19:02:49 -0000 (UTC)
Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
All those are at the sending end. But what would C code see at
the receiving end?
The first three cases look very simple.
Is there some spec in Windows which describes how that works?
There is a spec that describes how that works in Microsoft's
implementation. That implementation is available free of charge to
other Windows compilers.
If the vendor of a Windows 'C' compiler decided to implement a different
algorithm then nobody can stop him.
Through the years you were told so, by different people, and shown
the spec maybe 100 times. But being the troll you are, you
continue to ask.
Still, for the benefit of more sincere readers and also for myself,
in order to have both pieces in one place:
https://learn.microsoft.com/en-us/windows/win32/api/shellapi/nf-shellapi-commandlinetoargvw
https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments
In the long distant past I investigated how MSVC converts a
command-line to its argc/argv input. There was an internal routine in
the CRT startup code that did pretty much what we would expect, and I
reverse engineered that for my code (or did I just copy the code?
surely the former!). The MSVC code did not call CommandLineToArgvW
in those days, but reading the description of that api it all sounds
very familiar - the state flags for controlling "quoted/unquoted"
text, even vs odd numbers of backslashes and all that.
I didn't find it difficult to create command-line strings to call C
programs, given what I wanted those programs to see as argv[n] with
n > 0. I think it was just a case of quoting all arguments, then
applying quoting rules as documented for CommandLineToArgvW to
handle nested quotes/backslashes.
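As a rough sketch of the "quote everything, then escape" approach for arguments other than argv[0]: the function below always wraps the argument in double quotes, doubles any run of backslashes that precedes a double quote (adding one more to escape the quote itself), doubles a trailing run of backslashes, and leaves other backslashes alone. The name quote_arg and the caller-provided output buffer are assumptions for illustration; this is not Microsoft's implementation.

```c
#include <stddef.h>
#include <string.h>

/* Quote one argument for a Windows-style command line.
 * out must hold at least 2 * strlen(arg) + 3 bytes.
 * Returns the number of characters written (excluding the NUL). */
size_t quote_arg(const char *arg, char *out)
{
    size_t o = 0;
    const char *p;
    out[o++] = '"';
    for (p = arg; ; p++) {
        size_t backslashes = 0, i;
        while (*p == '\\') { p++; backslashes++; }
        if (*p == '\0') {
            /* Trailing backslashes sit before the closing quote: double them. */
            for (i = 0; i < 2 * backslashes; i++) out[o++] = '\\';
            break;
        } else if (*p == '"') {
            /* Double the backslashes, then one more to escape the quote. */
            for (i = 0; i < 2 * backslashes + 1; i++) out[o++] = '\\';
            out[o++] = '"';
        } else {
            /* Backslashes not followed by a quote are literal. */
            for (i = 0; i < backslashes; i++) out[o++] = '\\';
            out[o++] = *p;
        }
    }
    out[o++] = '"';
    out[o] = '\0';
    return o;
}
```

A full command line would then be the program token followed by each quoted argument, separated by single spaces. Involving CMD.EXE adds a further escaping layer that this sketch does not attempt.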
But I can see a sticky problem - the MSVC parsing for argv[0] was
completely separate from the main loop handling other arguments. The
logic was considerably simplified, assuming that argv[0] was the path
for the module being invoked. Since that is expected to be a valid
file system path, the logic did not handle nested quotes etc. I
think the logic was just:
- if 1st char is a DQUOTE, copy chars for argv[0] up to next DQUOTE
or null terminator. (enclosing DQUOTE chars are not included)
- else copy chars for argv[0] up to next whitespace or null
terminator. (all chars are included, I think including DQUOTE should
it occur)
Given this, it would not be possible to create certain argv[0]
strings containing quotes etc., and I understand that the likes of
execve() allow that possibility. So I don't know what should happen
for this case. E.g. I don't see there is a command-line that gives
argv[0] the string "\" ". This was never a problem for me in
practice.
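The two rules recalled above can be sketched as follows. This is a reconstruction from the description in this post, not the actual CRT source; the function name parse_argv0 and its buffer parameters are hypothetical, and "whitespace" is taken to mean space or tab.

```c
#include <stddef.h>

/* Copy the program-name token of cmdline into out (capacity cap, cap > 0).
 * Returns a pointer to the first character after the token. */
const char *parse_argv0(const char *cmdline, char *out, size_t cap)
{
    const char *p = cmdline;
    size_t o = 0;
    if (*p == '"') {
        p++;                                  /* skip the opening quote */
        while (*p != '\0' && *p != '"') {     /* copy up to next quote or end */
            if (o + 1 < cap) out[o++] = *p;
            p++;
        }
        if (*p == '"') p++;                   /* skip the closing quote */
    } else {
        while (*p != '\0' && *p != ' ' && *p != '\t') {
            if (o + 1 < cap) out[o++] = *p;   /* quotes are copied as-is here */
            p++;
        }
    }
    out[o] = '\0';
    return p;
}
```

Even an empty command line yields an (empty) argv[0] string under these rules, and no backslash escaping is applied, which is why an argv[0] containing a double quote followed by a space cannot be expressed.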
There would always be at least an argv[0] with this logic, so MSVC
ensures argc>0 and argv[0] != NULL. (Of course, MSVC is not
"Windows". Various posters in this thread seem to be asking "what
does /Windows/ do regarding argc/argv?" as though the OS is
responsible for setting them.)
A more interesting and meaningful question is how to do the reverse.
I.e. how to convert an argv[] array into flat form in a way that
guarantees that CommandLineToArgvW() parses it back into the original
form? Is it even possible in the general case, or do limitations exist
(ignoring, for the sake of brevity, the 2**15-1 size limit)?
Yes, programmers need this if they need to create a process to invoke
some utility program which will see particular argv parameters.
Users are used to typing in command-lines as a string, e.g. at a
console, so I suppose they don't normally need to think about the
argv[] parsing; they can just build the required command-line and use
that. (But it's a problem in the general case.)
The argc/argv problem seemed easy enough in practice if we only need
to handle the "real" arguments argv[n] with n>0. (Involving CMD.EXE
introduced much worse complications, as you might imagine. But
generally I always thought that MS wasn't really interested in
/documenting/ how programmers should do things like this, in the same
way they never bothered explaining exactly how CMD processing worked.
Probably because it was forever changing!... Put another way, for
many years they were really more focussed on admins clicking buttons
in some GUI!)
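For the "real" arguments (n>0), the usual approach can be sketched like this. It relies on the documented CommandLineToArgvW rules: 2n backslashes before a DQUOTE give n backslashes and a delimiter, 2n+1 backslashes before a DQUOTE give n backslashes and a literal DQUOTE, and backslashes elsewhere are literal. quote_arg and its helpers are hypothetical names; the sketch does not cover argv[0] or CMD.EXE metacharacters, and the caller is assumed to start with out[0] = '\0' and *len = 0.

```c
#include <stddef.h>
#include <string.h>

/* Append n copies of c to out, keeping it NUL-terminated.
   Returns 0 on success, -1 if the buffer would overflow. */
static int put_n(char *out, size_t cap, size_t *len, char c, size_t n)
{
    while (n-- > 0) {
        if (*len + 1 >= cap)
            return -1;
        out[(*len)++] = c;
        out[*len] = '\0';
    }
    return 0;
}

/* Hypothetical helper: append one argument to out, quoted so that
   CommandLineToArgvW-style parsing should give back the same string. */
int quote_arg(const char *arg, char *out, size_t cap, size_t *len)
{
    /* Nothing to escape: emit the argument unchanged. */
    if (*arg != '\0' && strpbrk(arg, " \t\n\v\"") == NULL)
        return put_n(out, cap, len, 0, 0) ||
               (strlen(arg) + *len + 1 > cap) ? -1
               : (memcpy(out + *len, arg, strlen(arg) + 1),
                  *len += strlen(arg), 0);

    if (put_n(out, cap, len, '"', 1))
        return -1;
    for (;;) {
        size_t slashes = 0;
        while (*arg == '\\') {
            slashes++;
            arg++;
        }
        if (*arg == '\0') {
            /* Backslashes before the closing quote must be doubled. */
            if (put_n(out, cap, len, '\\', 2 * slashes))
                return -1;
            break;
        } else if (*arg == '"') {
            /* Double the backslashes, then escape the quote itself. */
            if (put_n(out, cap, len, '\\', 2 * slashes + 1) ||
                put_n(out, cap, len, '"', 1))
                return -1;
        } else {
            /* Backslashes not followed by a quote stay as they are. */
            if (put_n(out, cap, len, '\\', slashes) ||
                put_n(out, cap, len, *arg, 1))
                return -1;
        }
        arg++;
    }
    return put_n(out, cap, len, '"', 1);
}
```

A full command line would then be built by joining the quoted arguments with single spaces after a separately handled program name.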
It's entirely possible that Windows goes beyond the ISO C
requirements and explicitly or implicitly guarantees argc>0.
It's also entirely possible that it doesn't. Do you have any
concrete information one way or the other?
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
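The two-line fragment above leaves out opening the device and checking for errors. A fuller sketch might look like the following; urandom_int is a hypothetical name, and the code assumes a POSIX system that provides /dev/urandom.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fill *out with sizeof(int) bytes read from
   /dev/urandom. Returns 0 on success, -1 on any failure. */
int urandom_int(int *out)
{
    unsigned char *p = (unsigned char *)out;
    size_t got = 0;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd < 0)
        return -1;
    while (got < sizeof *out) {
        /* read() may return fewer bytes than requested, so loop. */
        ssize_t n = read(fd, p + got, sizeof *out - got);
        if (n <= 0) {           /* error or unexpected end of file */
            close(fd);
            return -1;
        }
        got += (size_t)n;
    }
    close(fd);
    return 0;
}
```

A caller would write something like `int s; if (urandom_int(&s) == 0) srand((unsigned)s);`. For brevity the sketch treats an interrupted read (EINTR) as a failure rather than retrying.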
Pay attention that the C Standard only requires the same seed to
always produce the same sequence. There is no requirement that
different seeds have to produce different sequences.
So, for generator in your example, implementation like below would be
fully legal. Personally, I wouldn't even consider it as particularly
poor quality:
void srand(unsigned seed ) { init = seed | 1;}
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O'Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
pass tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, has large state and needs more execution time
than alternatives.
Michael S <already5chosen@yahoo.com> writes:
[regarding rand() and srand()]
Pay attention that the C Standard only requires the same seed to
always produce the same sequence. There is no requirement that
different seeds have to produce different sequences.
So, for generator in your example, implementation like below would
be fully legal. Personally, I wouldn't even consider it as
particularly poor quality:
void srand(unsigned seed ) { init = seed | 1;}
It seems better to do, for example,
void srand(unsigned seed ) { init = seed - !seed;}
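To make the difference between the two one-liners concrete, here is a small sketch that isolates just the seed-to-state mapping. The function names are made up for illustration, and `init` is assumed to be an unsigned internal state that must not be zero.

```c
#include <limits.h>

/* Mapping used by the first variant: force the low bit on.
   Every even seed collides with the odd seed just above it. */
unsigned state_from_seed_or1(unsigned seed)
{
    return seed | 1;
}

/* Mapping used by the second variant: only seed 0 is changed.
   !seed is 1 for seed == 0 and 0 otherwise, so 0 wraps to UINT_MAX
   and every other seed maps to itself. */
unsigned state_from_seed_nonzero(unsigned seed)
{
    return seed - !seed;
}
```

With the first mapping, seeds 2 and 3 start the same sequence, and in general half of all seeds are duplicates. With the second, the only collision is between 0 and UINT_MAX. Both are conforming, since the standard only requires equal seeds to give equal sequences.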
On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)[...]
antispam@fricas.org (Waldek Hebisch) wrote:
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O'Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
pass tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, has large state and needs more execution time
than alternatives.
I don't know. Testing randomness is complicated matter.
How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
things that I personally care about and that it does not test
things that are of no interest for me? Especially, the latter.
Also, the TestU01 suite is made for generators with 32-bit output.
M. O'Neill used an ad hoc technique to make it applicable to
generators with 64-bit output. Is this technique right? Or maybe
it puts 64-bit PRNGs at an unfair disadvantage?
Besides, I strongly disagree with at least one assertion made by
O'Neill: "While security-related applications should use a secure
generator, because we cannot always know the future contexts in
which our code will be used, it seems wise for all applications to
avoid generators that make discovering their entire internal state
completely trivial."
No, I know exactly what I am doing. I know exactly that for my
application easy discovery of complete state of PRNG is not a
defect.
Anyway, even if I am skeptical about her criticism of popular PRNGs,
intuitively I agree with the constructive part of the article -
medium-quality PRNG that feeds medium quality hash function can
potentially produce very good fast PRNG with rather small internal
state.
On related note, I think that even simple counter fed into high
quality hash function (not cryptographically high quality, far
less than that) can produce excellent PRNG with even smaller
internal state. But not very fast one. Although the speed
depends on specifics of used computer. I can imagine computer
that has low-latency Rijndael128 instruction. On such computer,
running counter through 3-4 rounds of Rijndael will produce very
good PRNG that is only 2-3 times slower than, for example, LCG
128/64.
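The "counter fed into a hash" idea can be sketched without any special hardware. The code below is not the Rijndael construction described above; it swaps in the well-known SplitMix64 scheme, which advances a 64-bit counter by a fixed odd constant and passes it through a non-cryptographic mixing function. The type and function names are made up for this sketch.

```c
#include <stdint.h>

/* The entire generator state is one 64-bit counter. */
typedef struct {
    uint64_t counter;
} ctr_rng;

/* Advance the counter, then hash it (SplitMix64 constants). */
uint64_t ctr_rng_next(ctr_rng *g)
{
    uint64_t z = (g->counter += UINT64_C(0x9E3779B97F4A7C15));

    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}
```

Seeding is just `ctr_rng g = { seed };`. Because the counter visits every 64-bit value once and the mixing steps are invertible, each output appears exactly once per period of 2^64, which is one way this kind of generator differs from a truly random stream.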
John McCue <jmclnx@gmail.com.invalid> writes:
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
Michael S <already5chosen@yahoo.com> writes:
On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)[...]
antispam@fricas.org (Waldek Hebisch) wrote:
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O'Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in tests. If you have enough (128) bits
LCGs do pass tests. A bunch of generators with 64-bit state also
pass tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good, has large state and needs more execution time
than alternatives.
I don't know. Testing randomness is complicated matter.
How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
things that I personally care about and that it does not test
things that are of no interest for me? Especially, the latter.
Do you think any of the tests in the TestU01 suite are actually
counter-indicated? As long as you don't think any TestU01 test
makes things worse, there is no reason not to use all of them.
You are always free to disregard tests you don't care about.
Also, the TestU01 suite is made for generators with 32-bit output.
M. O'Neill used an ad hoc technique to make it applicable to
generators with 64-bit output. Is this technique right? Or maybe
it puts 64-bit PRNGs at an unfair disadvantage?
As long as the same mapping is applied to all 64-bit PRNGs under
consideration I don't see a problem. The point of the test is to
compare PRNGs, not to compare test methods. If someone thinks a
different set of tests is called for they are free to run them.
Besides, I strongly disagree with at least one assertion made by
O'Neill: "While security-related applications should use a secure
generator, because we cannot always know the future contexts in
which our code will be used, it seems wise for all applications to
avoid generators that make discovering their entire internal state
completely trivial."
No, I know exactly what I am doing. I know exactly that for my
application easy discovery of complete state of PRNG is not a
defect.
You and she are talking about different things. You are talking
about choosing a PRNG to be used only by yourself. She is talking
about choosing a PRNG to be made available to other people without
knowing who they are or what their needs are. In the second case
it's reasonable to raise the bar for the set of criteria that need
to be met.
Anyway, even if I am skeptical about her criticism of popular PRNGs,
intuitively I agree with the constructive part of the article -
medium-quality PRNG that feeds medium quality hash function can
potentially produce very good fast PRNG with rather small internal
state.
After looking at one of the example PCG generators, I would
describe it as a medium-quality PRNG that feeds a low-quality
hash. The particular combination I looked at produced good
results, but it isn't clear which combinations of PRNG and
hash would do likewise.
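For readers who have not seen one, the sketch below shows the small 32-bit PCG variant usually called XSH-RR, following the widely circulated minimal reference code. It is not necessarily the generator examined in this post. The "PRNG" part is a 64-bit LCG and the "hash" part is a xorshift followed by a data-dependent rotate. The names here are illustrative.

```c
#include <stdint.h>

typedef struct {
    uint64_t state;   /* LCG state */
    uint64_t inc;     /* stream selector, always odd */
} pcg32;

uint32_t pcg32_next(pcg32 *g)
{
    uint64_t old = g->state;

    /* PRNG step: 64-bit linear congruential generator. */
    g->state = old * UINT64_C(6364136223846793005) + g->inc;

    /* Hash step: xorshift the high bits down, then rotate by an
       amount taken from the top five bits of the old state. */
    uint32_t xorshifted = (uint32_t)(((old >> 18) ^ old) >> 27);
    uint32_t rot = (uint32_t)(old >> 59);
    return (xorshifted >> rot) | (xorshifted << ((0u - rot) & 31u));
}

/* Seed with an initial state and a stream id. */
void pcg32_seed(pcg32 *g, uint64_t initstate, uint64_t initseq)
{
    g->state = 0;
    g->inc = (initseq << 1) | 1u;
    (void)pcg32_next(g);
    g->state += initstate;
    (void)pcg32_next(g);
}
```

The output function is cheap and would be weak on its own; the question raised above is which pairings of base generator and output function hold up, and the sketch takes no position on that.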
On related note, I think that even simple counter fed into high
quality hash function (not cryptographically high quality, far
less than that) can produce excellent PRNG with even smaller
internal state. But not very fast one. Although the speed
depends on specifics of used computer. I can imagine computer
that has low-latency Rijndael128 instruction. On such computer,
running counter through 3-4 rounds of Rijndael will produce very
good PRNG that is only 2-3 times slower than, for example, LCG
128/64.
I think the point of her paper where she talks about determining
how much internal state is needed is to measure the efficacy of
the PRNG, not to try to reduce the amount of state needed. Based
on my own experience with various PRNGs I think it's a mistake to
try to minimize the amount of internal state needed. My own rule
of thumb is to allow at least a factor of four: for example, a
PRNG with a 32-bit output should have at least 128 bits of state.
My latest favorite has 256 bits of state to produce 32-bit
outputs (and so might also do well to produce 64-bit outputs, but
I haven't tested that).
[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
John McCue <jmclnx@gmail.com.invalid> writes:
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
And neither is the non-conforming rand() on OpenBSD.
The rand(3) man page on OpenBSD 7.8 says:
Standards insist that this interface return deterministic
results. Unsafe usage is very common, so OpenBSD changed the
subsystem to return non-deterministic results by default.
To satisfy portable code, srand() may be called to initialize
the subsystem. In OpenBSD the seed variable is ignored,
and strong random number results will be provided from
arc4random(3). In other systems, the seed variable primes a
simplistic deterministic algorithm.
It does provide an srand_deterministic() function that behaves the way
srand() is supposed to.
Michael S <already5chosen@yahoo.com> wrote:[...]
Anyway, even if I am skeptical about her criticism of popular PRNGs,
intuitively I agree with the constructive part of the article -
medium-quality PRNG that feeds medium quality hash function can
potentially produce very good fast PRNG with rather small internal
state.
She seems to care very much about having the minimal possible state.
That may be nice on embedded systems, but in general I would
happily accept a slightly bigger state (say 256 bits). But if
we can get good properties with very small state, then why not?
[...]
On Wed, 07 Jan 2026 13:54:21 -0800, Keith Thompson wrote:[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
And neither is the non-conforming rand() on OpenBSD.
The rand(3) man page on OpenBSD 7.8 says:
Standards insist that this interface return deterministic
results. Unsafe usage is very common, so OpenBSD changed the
subsystem to return non-deterministic results by default.
To satisfy portable code, srand() may be called to initialize
the subsystem. In OpenBSD the seed variable is ignored,
and strong random number results will be provided from
arc4random(3). In other systems, the seed variable primes a
simplistic deterministic algorithm.
It does provide an srand_deterministic() function that behaves the way
srand() is supposed to.
So then clang would use:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But I don't know (yet) that gcc does as well under OpenBSD.
Michael Sanders <porkchop@invalid.foo> writes:
On Wed, 07 Jan 2026 13:54:21 -0800, Keith Thompson wrote:[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
And neither is the non-conforming rand() on OpenBSD.
The rand(3) man page on OpenBSD 7.8 says:
Standards insist that this interface return deterministic
results. Unsafe usage is very common, so OpenBSD changed the
subsystem to return non-deterministic results by default.
To satisfy portable code, srand() may be called to initialize
the subsystem. In OpenBSD the seed variable is ignored,
and strong random number results will be provided from
arc4random(3). In other systems, the seed variable primes a
simplistic deterministic algorithm.
It does provide an srand_deterministic() function that behaves the way
srand() is supposed to.
So then clang would use:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But I don't know (yet) that gcc does as well under OpenBSD.
I don't know what you mean when you say that clang "would use"
that code.
I'm not aware that either clang or gcc uses random numbers
internally. I don't know why they would.
You could certainly write the above code and compile it with either
gcc or clang (or any other C compiler on OpenBSD). I've confirmed
that gcc on OpenBSD does predefine the symbol __OpenBSD__. There
should be no relevant difference between gcc and clang; random
number generation is implemented in the library, not in the compiler.
If your point is that a programmer using either gcc or clang could
use the above code to get the required deterministic behavior
for rand(), I agree. (Though it shouldn't be necessary; IMHO the
OpenBSD folks made a very bad decision.)
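As a sketch of how the #ifdef might be wrapped up and checked, assuming only what the quoted man page says about srand_deterministic(): seed_rand and rand_is_reproducible are hypothetical helper names, and the check replays at most 16 values.

```c
#include <stdlib.h>

/* Seed rand() so that the following sequence is reproducible,
   including on OpenBSD where plain srand() ignores its argument. */
void seed_rand(unsigned seed)
{
#ifdef __OpenBSD__
    srand_deterministic(seed);
#else
    srand(seed);
#endif
}

/* Returns 1 if reseeding with the same value replays the same first
   n results (n is capped at 16), 0 otherwise. */
int rand_is_reproducible(unsigned seed, int n)
{
    int first[16];
    int i;

    if (n > 16)
        n = 16;
    seed_rand(seed);
    for (i = 0; i < n; i++)
        first[i] = rand();
    seed_rand(seed);
    for (i = 0; i < n; i++)
        if (rand() != first[i])
            return 0;
    return 1;
}
```

The same source compiles with gcc or clang, since the choice is made by the predefined macro and the library, not by the compiler.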
Relatedly, the NetBSD implementation of rand() is conforming, but
of very low quality. The low-order bit alternates between 0 and
1 on successive rand() calls, the two low-order bits repeat with
a cycle of 4, and so on. Larry Jones wrote about it here in 2010:
The even/odd problem was caused at Berkeley by a well meaning
but clueless individual who increased the range of the generator
(which originally matched the sample implementation) by returning
the *entire* internal state rather than just the high-order
bits of it. BSD was very popular, so that defective generator
got around a lot, unfortunately.
And I've just discovered that the OpenBSD rand() returns alternating
odd and even results after a call to srand_deterministic().
It's disturbing that this has never been fixed.
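The even/odd pattern is easy to reproduce with a toy generator. The sketch below is not the NetBSD or OpenBSD source; it uses the multiplier and increment from the sample rand() in the C standard with a 32-bit state, and compares returning the high-order bits against returning the whole state. The function names are made up.

```c
#include <stdint.h>

/* One step of a 32-bit LCG (constants from the C standard's sample
   rand() implementation); arithmetic wraps modulo 2^32. */
static uint32_t lcg_step(uint32_t *state)
{
    *state = *state * UINT32_C(1103515245) + UINT32_C(12345);
    return *state;
}

/* Sample-implementation style: return only high-order bits. */
int rand_high_bits(uint32_t *state)
{
    return (int)((lcg_step(state) >> 16) & 0x7fff);
}

/* Defective style: return (almost) the entire state, low bits included. */
int rand_whole_state(uint32_t *state)
{
    return (int)(lcg_step(state) & UINT32_C(0x7fffffff));
}
```

With an odd multiplier and an odd increment, bit 0 of the state flips on every step, and the low k bits repeat with period 2^k. Returning the whole state exposes that directly, which matches the alternating low-order bit and the 4-cycle in the two low-order bits described above; discarding the low 16 bits hides the shortest cycles.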
On Thu, 08 Jan 2026 14:44:27 -0800, Keith Thompson wrote:[...]
Michael Sanders <porkchop@invalid.foo> writes:
So then clang would use:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But I don't know (yet) that gcc does as well under OpenBSD.
I don't know what you mean when you say that clang "would use"
that code.
I'm not aware that either clang or gcc uses random numbers
internally. I don't know why they would.
Well, I meant the macro itself is (I'm guessing) probably defined
by clang since it's the default compiler.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
John McCue <jmclnx@gmail.com.invalid> writes:
Michael Sanders <porkchop@invalid.foo> wrote:
Is it incorrect to use 0 (zero) to seed srand()?
int seed = (argc >= 2 && strlen(argv[1]) == 9)
? atoi(argv[1])
: (int)(time(NULL) % 900000000 + 100000000);
srand(seed);
I like to just read /dev/urandom when I need a random
number. Seem easier and more portable across Linux &
the *BSDs.
int s;
read(fd, &s, sizeof(int));
Apples and oranges. Many applications that use random numbers
need a stream of numbers that is deterministic and reproducible,
which /dev/urandom is not.
And neither is the non-conforming rand() on OpenBSD.
The rand(3) man page on OpenBSD 7.8 says:
Standards insist that this interface return deterministic
results. Unsafe usage is very common, so OpenBSD changed the
subsystem to return non-deterministic results by default.
To satisfy portable code, srand() may be called to initialize
the subsystem. In OpenBSD the seed variable is ignored,
and strong random number results will be provided from
arc4random(3). In other systems, the seed variable primes a
simplistic deterministic algorithm.
But your original statement implied that clang would *use* that
particular piece of code, which didn't make much sense. Were you
just asking about how the __OpenBSD__ macro is defined, without
reference to srand?
On Thu, 08 Jan 2026 22:46:42 -0800, Keith Thompson wrote:
But your original statement implied that clang would *use* that
particular piece of code, which didn't make much sense. Were you
just asking about how the __OpenBSD__ macro is defined, without
reference to srand?
Well, under OpenBSD I plan on using:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But what I was asking is whether or not gcc would recognize
the __OpenBSD__ macro (I'm assuming it would; why wouldn't it?),
since clang is the default compiler.
But also about srand()... you've got me really wondering why
OpenBSD would deviate from the standard as they have. I get
that those folks disagree because it's deterministic, but
the accepted standard is for srand() to be deterministic.
Only speaking for myself here, rather than srand_deterministic()
and srand() (that's not deterministic under OpenBSD) it
would've made more sense to've implemented srand_non_deterministic()
and left srand() alone. That design decision on their part only
muddies the waters in my thinking. Live & learn =)