Going all the way back to the earliest C compilers on DOS. There was a decision made to make “\n” just work on DOS for portability of Unix programs, and to make the examples from the C programming book just work.
But in Unix “\n” is a single byte, and in DOS it is 2. So they introduced text and binary modes for files on DOS. Behind the scenes the library will handle the extra byte. This is not necessary in Unix.
I used to have to be careful about importing files to DOS. Did the file come from Unix?
In binary mode. In text mode if you printf(“Hello World\n”) you get CRLF because that’s how text works on DOS. Unix had the convention of only requiring the LF for text. And Unix didn’t have text/binary modes. That’s the compatibility hack on DOS.
>These control codes go back to line printers.
Back to teletypes even. Believe me, I go back to line printers.
Annoyingly I actually think '\r\n' is the correct line ending here - advance the paper and return the carriage, but I suppose unix took the simpler implementation which makes looping over characters, words (split by ' ') and lines (split by '\n') simpler as each loop only has a single comparison
Yep, on Unixen the translation of CRLF to LF when printing to the terminal (and from CR to CRLF when reading input from the terminal) is done in the kernel, it's called "line discipline".
It's the other way around. It's the C runtime that treats text ("t") mode differently, because the C standard specifies \n as a line delimiter but the Windows convention is \r\n. In text mode C stdio translates between \n and \r\n. In binary mode it does no translation.
The article seems to be taking the position that the C runtime library is not part of "Windows", which feels like a rather odd view to me. What is the stable API that Windows offers to application developers if not that?
There is a very unfortunate situation in Unix systems in that the library named 'libc' is serving several simultaneous different roles. One of those roles--what it is named for--is serving as the C standard library. The more important role is that the library also provides the implementation of a different standard API, the POSIX API, which is the main API used to access system details. There's also yet another role of providing the stable system interface to the kernel in most Unix implementations. On Windows, these roles are provided by different libraries: ucrt (what used to be msvcrt), kernel32, and ntdll, respectively.
And for what it's worth, the actual C standard library tends to be fairly rarely used, especially if you consider the malloc/free interface to be part of the system library rather than the C standard library. The C stdio functionality, for example, is extremely underpowered compared to the capabilities of all major operating systems' I/O libraries, and so most applications--even those written entirely in C--will choose to avoid the C standard library and instead use the more direct primitives of the system API layer instead.
It wasn't until fairly recently that the C runtime was stably shipped with Windows. Previously you had to install the correct version of the C library alongside your application.
Which is called from what, if not C? Does windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
Calling it from C does not mean you need a full C standard library to exist. For example, much of the C standard library is itself written in C. But it's a "freestanding" C which assumes only a minimal set of library functions exist (e.g. functions for copying memory from one place to another, filling memory with zeroes, etc).
And you can of course use non-C languages to call the Win32 API. Or even directly using assembly code.
> you can of course use non-C languages to call the Win32 API. Or even directly using assembly code.
Is that a supported/official API though? On Linux you "can" put your arguments in registers and trigger the system call interrupt directly, and I think Go programs even do this, but it's not the official interface and they reserve the right to break your program in future updates, at least in theory.
Sure. C has never been the only language supported on Windows.
For instance, Delphi had a period of popularity for Windows application development, and AFAIK it has always used its own runtime library which is completely independent of the C runtime.
Go does not trigger low-level system call interrupts on Windows. (It does that on Linux, but Windows syscall numbers are not stable even across minor Windows updates, so if Go did that, its Windows binaries would be incredibly fragile.)
On Windows NT, Go uses the userspace wrappers provided in Windows system libraries such as NTDLL.DLL and KERNEL32.DLL. But those too are entirely separate from the C runtime.
Calling win32 from other languages is supported, calling it from assembly is supported (as long as you use the calling convention properly, obviously), using ntdll to bypass the win32 API is not supported.
Basically on Linux the syscalls are the equivalent of Win32 except much narrower in scope.
> The whole issue is specific to C and languages that copied C or use its runtime underneath in implementations (like Python)
So it's "specific to" almost all programming languages in actual use. That's a rather esoteric point.
> For reference, Unix has no API other than bytes either.
Unix does offer an API for writing C-standard in-memory text strings to Unix-standard on-disk text files, it just happens to be the same one as the API for writing in-memory binary strings to on-disk binary files.
From literally any language. The WriteFile function comes from kernel32.dll shared library, and follows the certain calling convention. You don't need to use this calling convention inside your own binary (and indeed, MinGW and MSYS use SysV ABI for everything except when calling Win32 API), or ask a random C runtime coming from God knows where to do this for you if you write something other than C.
In the UNIX world there is this strange notion that C language is somehow special and that the OS itself should provide its runtime (a single global version of it) for every program, even those written in other languages, to interact with the OS but... it's just silly.
> Does windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
No it doesn't. That logic belongs in the OS-specific layer in the runtimes/standard libraries of the implementations of the different programming languages. They may decide to re-use each other libraries, of course, or they may decide not to.
I believe that it technically belongs to Visual C++, not the operating system, but it needs to ship with the OS because the user space binaries are compiled with MSVC.
It's both. Originally Visual C++ binaries built for DLL-based C runtime relied on MSVCRT.DLL and that was installed by the redist. Starting with Visual Studio .NET 2002, separate CRT DLLs starting with MSVCR70.DLL were used. MSVCRT.DLL is now part of Windows to support parts of the OS itself and for compatibility with programs that still use it. I think some versions of MinGW also use MSVCRT.
Current versions of the OS ship with functions in MSVCRT.DLL that weren't in the last VC6 version, such as the updated C++ exception handler (__CxxFrameHandler4). AFAIK, there is no redistributable version of it, it's unique to the OS.
Indeed, C runtime is not part of windows API, and it's normal to have a program include few different copies of C runtime library due to different modules compiled with different compilers/options.
C runtime library being part of OS is accidental thing in Unix, 16bit and 32bit Windows API even does not use C-compatible ABI (instead, Pascal-compatible one is present)
Going all the way back to the earliest C compilers on DOS. There was a decision made to make “\n” just work on DOS for portability of Unix programs, and to make the examples from the C programming book just work.
But in Unix “\n” is a single byte, and in DOS it is 2. So they introduced text and binary modes for files on DOS. Behind the scenes the library will handle the extra byte. This is not necessary in Unix.
I used to have to be careful about importing files to DOS. Did the file come from Unix?
Linefeed (\n) is a single byte in DOS as well.
I think you are talking about carriage return linefeed pair (CRLF or \r\n),
These control codes go back to line printers. Linefeed advances the paper one line and carriage return moves the print head to the left.
>Linefeed (\n) is a single byte in DOS as well.
In binary mode. In text mode if you printf(“Hello World\n”) you get CRLF because that’s how text works on DOS. Unix had the convention of only requiring the LF for text. And Unix didn’t have text/binary modes. That’s the compatibility hack on DOS.
>These control codes go back to line printers.
Back to teletypes even. Believe me, I go back to line printers.
Annoyingly I actually think '\r\n' is the correct line ending here - advance the paper and return the carriage, but I suppose unix took the simpler implementation which makes looping over characters, words (split by ' ') and lines (split by '\n') simpler as each loop only has a single comparison
Yep, on Unixen the translation of CRLF to LF when printing to the terminal (and from CR to CRLF when reading input from the terminal) is done in the kernel, it's called "line discipline".
Interestingly, the IETF has several published RFCs for text protocols, all of which require \r\n line endings.
<https://www.rfc-editor.org/old/EOLstory.txt>
Note this does not apply to file formats (except for RFCs).
fopen(..., "wb") ?
It's C library taking care of the "b" part for you according to the article.
It's the other way around. It's the C runtime that treats text ("t") mode differently, because the C standard specifies \n as a line delimiter but the Windows convention is \r\n. In text mode C stdio translates between \n and \r\n. In binary mode it does no translation.
The article seems to be taking the position that the C runtime library is not part of "Windows", which feels like a rather odd view to me. What is the stable API that Windows offers to application developers if not that?
There is a very unfortunate situation in Unix systems in that the library named 'libc' is serving several simultaneous different roles. One of those roles--what it is named for--is serving as the C standard library. The more important role is that the library also provides the implementation of a different standard API, the POSIX API, which is the main API used to access system details. There's also yet another role of providing the stable system interface to the kernel in most Unix implementations. On Windows, these roles are provided by different libraries: ucrt (what used to be msvcrt), kernel32, and ntdll, respectively.
And for what it's worth, the actual C standard library tends to be fairly rarely used, especially if you consider the malloc/free interface to be part of the system library rather than the C standard library. The C stdio functionality, for example, is extremely underpowered compared to the capabilities of all major operating systems' I/O libraries, and so most applications--even those written entirely in C--will choose to avoid the C standard library and instead use the more direct primitives of the system API layer instead.
The Win32 API. E.g. using WriteFile to write files (https://learn.microsoft.com/en-us/windows/win32/api/fileapi/...)
It wasn't until fairly recently that the C runtime was stably shipped with Windows. Previously you had to install the correct version of the C library alongside your application.
> The Win32 API. E.g. using WriteFile to write files (https://learn.microsoft.com/en-us/windows/win32/api/fileapi/...)
Which is called from what, if not C? Does windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
Calling it from C does not mean you need a full C standard library to exist. For example, much of the C standard library is itself written in C. But it's a "freestanding" C which assumes only a minimal set of library functions exist (e.g. functions for copying memory from one place to another, filling memory with zeroes, etc).
And you can of course use non-C languages to call the Win32 API. Or even directly using assembly code.
> you can of course use non-C languages to call the Win32 API. Or even directly using assembly code.
Is that a supported/official API though? On Linux you "can" put your arguments in registers and trigger the system call interrupt directly, and I think Go programs even do this, but it's not the official interface and they reserve the right to break your program in future updates, at least in theory.
Sure. C has never been the only language supported on Windows.
For instance, Delphi had a period of popularity for Windows application development, and AFAIK it has always used its own runtime library which is completely independent of the C runtime.
Go does not trigger low-level system call interrupts on Windows. (It does that on Linux, but Windows syscall numbers are not stable even across minor Windows updates, so if Go did that, its Windows binaries would be incredibly fragile.)
On Windows NT, Go uses the userspace wrappers provided in Windows system libraries such as NTDLL.DLL and KERNEL32.DLL. But those too are entirely separate from the C runtime.
Calling win32 from other languages is supported, calling it from assembly is supported (as long as you use the calling convention properly, obviously), using ntdll to bypass the win32 API is not supported.
Basically on Linux the syscalls are the equivalent of Win32 except much narrower in scope.
The whole issue is specific to C and languages that copied C or use its runtime underneath in implementations (like Python)
For reference, Unix has no API other than bytes either.
> The whole issue is specific to C and languages that copied C or use its runtime underneath in implementations (like Python)
So it's "specific to" almost all programming languages in actual use. That's a rather esoteric point.
> For reference, Unix has no API other than bytes either.
Unix does offer an API for writing C-standard in-memory text strings to Unix-standard on-disk text files, it just happens to be the same one as the API for writing in-memory binary strings to on-disk binary files.
From literally any language. The WriteFile function comes from kernel32.dll shared library, and follows the certain calling convention. You don't need to use this calling convention inside your own binary (and indeed, MinGW and MSYS use SysV ABI for everything except when calling Win32 API), or ask a random C runtime coming from God knows where to do this for you if you write something other than C.
In the UNIX world there is this strange notion that C language is somehow special and that the OS itself should provide its runtime (a single global version of it) for every program, even those written in other languages, to interact with the OS but... it's just silly.
> Does windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
No it doesn't. That logic belongs in the OS-specific layer in the runtimes/standard libraries of the implementations of the different programming languages. They may decide to re-use each other libraries, of course, or they may decide not to.
It wasn't until fairly recently
By "recently" you mean Win95? MSVCRT.DLL has been there for at least that long.
I believe that it technically belongs to Visual C++, not the operating system, but it needs to ship with the OS because the user space binaries are compiled with MSVC.
It's both. Originally Visual C++ binaries built for DLL-based C runtime relied on MSVCRT.DLL and that was installed by the redist. Starting with Visual Studio .NET 2002, separate CRT DLLs starting with MSVCR70.DLL were used. MSVCRT.DLL is now part of Windows to support parts of the OS itself and for compatibility with programs that still use it. I think some versions of MinGW also use MSVCRT.
Current versions of the OS ship with functions in MSVCRT.DLL that weren't in the last VC6 version, such as the updated C++ exception handler (__CxxFrameHandler4). AFAIK, there is no redistributable version of it, it's unique to the OS.
It was there but mystery meat vs whatever version you might need for your binary.
They are backwards-compatible. I've written many tiny (few KB) utilities that work from Win95 through Win11, and of course WINE.
Indeed, C runtime is not part of windows API, and it's normal to have a program include few different copies of C runtime library due to different modules compiled with different compilers/options.
C runtime library being part of OS is accidental thing in Unix, 16bit and 32bit Windows API even does not use C-compatible ABI (instead, Pascal-compatible one is present)