The zero-terminated string is, I think, computing's biggest mistake. Pascal-style strings were much safer.
In addition to having to pick a size for the length counter and then, later, having to differentiate between lengths in bytes, codepoints, and glyphs, you can't subdivide a Pascal string using pointer arithmetic. To pass just the end of a string into a function, you have to either copy the tail of one Pascal-style string to another with a smaller size value, or your string has to be a struct with an integer and a pointer to the actual data instead of just an integer stuck on the beginning of the string. The first is a lot of copying in some cases, the second raises the specter of structs with invalid pointers. That's not to mention the potential problems that would cause with caches.
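To make the two options concrete, here is a minimal C sketch of the "integer plus pointer" variant next to plain pointer arithmetic on a C string. The names `str_slice`, `slice_from_cstr`, `slice_tail` and `cstr_tail` are made up for illustration and do not come from any real library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "fat pointer" string: a length plus a pointer to bytes
   that are owned by someone else. */
typedef struct {
    size_t len;
    const char *data;
} str_slice;

/* Wrap a NUL-terminated C string without copying it. */
static str_slice slice_from_cstr(const char *s) {
    str_slice out = { strlen(s), s };
    return out;
}

/* Tail starting at byte `offset`. Nothing is copied, but the result is
   only valid while the original buffer is alive: this is the
   "struct with an invalid pointer" risk. */
static str_slice slice_tail(str_slice s, size_t offset) {
    if (offset > s.len) {
        offset = s.len;            /* clamp instead of running past the end */
    }
    str_slice out = { s.len - offset, s.data + offset };
    return out;
}

/* With a zero-terminated string the same thing is one addition, and the
   caller has to know that offset <= strlen(s). */
static const char *cstr_tail(const char *s, size_t offset) {
    return s + offset;
}
```

With a length byte stuck directly in front of the data, neither trick works: the tail has no length in front of it, so you are back to copying.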
I think it was NULL itself. It took a long time before we realised that we don't want invalid values and that we could use the type system to help us use special values safely.
Meh, I think NULL is fine in C. It's an extra, valid state to represent pointers at no cost. Unlike the more hand holdy languages, it's quite rare for a pointer in C to have the ability to be NULL since, more often than not, it's pointing at something known. It's actually quite rare to see NULL checks unless it's API code or something like that. I can see this being more of a problem in a managed language where anything can be NULL at any time.
It was definitely an interesting way to allocate pointers. I once had a very large project where the devs didn't understand this, and I resolved hundreds or more off-by-one errors and memory overwrites in C caused by this feature.
But at the same time, I think blaming the software was kind of a cop-out. Devs were in a hurry and simply didn't respect the rules. Given today's software engineers at large, nerfing programming languages so they can't destroy things might not be a bad idea. But AI will nerf everything.
Why is AI gonna nerf everything? Sure, it could be used as the easy button, but I just spent two hours this morning learning about the neuroscience of how memory works in the brain, which I didn't mean to do, and now I want to run studies on how memory works.
Why do you assume that AI is gonna nerf everything?
> Pascal style strings were much safer.
The limitations were brutal. Initially you could only have 255 bytes in a string. The length of a string and the size of the allocation are now separate and you may need to think about that unused memory in your design. The problem now doubles with the introduction of UTF-8. Your string size is in bytes and you need to track characters separately.
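A rough C sketch of those two points, assuming the classic layout of one length byte followed by a fixed buffer. `pstr255`, `pstr255_set` and `utf8_codepoints` are hypothetical names, and the code point counter assumes the bytes are already valid UTF-8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classic layout: one length byte, then a fixed buffer. */
typedef struct {
    uint8_t len;        /* bytes in use, so never more than 255 */
    char    data[255];  /* allocation size, independent of len */
} pstr255;

/* Copy `src` in, keeping at most 255 bytes. Returns how many bytes were
   dropped. Truncation may split a multi-byte UTF-8 sequence. */
static size_t pstr255_set(pstr255 *p, const char *src) {
    size_t n = strlen(src);
    size_t kept = n > 255 ? 255 : n;
    memcpy(p->data, src, kept);
    p->len = (uint8_t)kept;
    return n - kept;
}

/* Count code points by skipping UTF-8 continuation bytes (10xxxxxx).
   Assumes valid UTF-8; no validation is done. */
static size_t utf8_codepoints(const char *bytes, size_t len) {
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (((unsigned char)bytes[i] & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For "naïve" encoded as UTF-8, `len` ends up as 6 while `utf8_codepoints` reports 5, so the length byte alone no longer tells you how many characters you have.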
If you want to create an array of strings you either need to specify the length of all strings and accept the memory overhead or have an array of pointers to strings. If you use an array of pointers you may end up choosing to use the 'nil' value as a sentinel that means "end of list." So we're right back where we started.
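As a sketch of that last point in C: an `argv`-style list is exactly an array of pointers with a null pointer marking the end. `list_count` is a hypothetical helper name, and `NAME_WIDTH` is an arbitrary width picked for the fixed-size alternative.

```c
#include <stddef.h>

/* Fixed-width alternative: every entry costs NAME_WIDTH bytes, used or not. */
enum { NAME_WIDTH = 16 };
typedef char fixed_name[NAME_WIDTH];

/* Pointer-list alternative: count entries in an argv-style list, where a
   NULL pointer marks the end of the list. */
static size_t list_count(const char *const *list) {
    size_t n = 0;
    while (list[n] != NULL) {
        n++;
    }
    return n;
}
```

A list such as `{ "ada", "grace", "linus", NULL }` counts as 3 entries, found the same way `strlen` finds the end of a string: by scanning for the sentinel.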
I worked on a Win32 app that used space-padded strings, i.e. the destination string was padded with spaces, but there was still a null on the last byte. You had to use special versions of the string functions for length, copy etc.
I’m not sure why this was - the source base was so old it might have had its origins in Pascal struct behaviour.
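For anyone who hasn't seen this style, here is a guess at what such "special versions" might look like in C. `padded_len` and `padded_copy` are hypothetical names, not the functions from that code base, and the field width is assumed to include the final null byte.

```c
#include <stddef.h>
#include <string.h>

/* Logical length of a space-padded, NUL-terminated field: the strlen
   minus any trailing spaces. A value that really ends in spaces cannot
   be told apart from padding. */
static size_t padded_len(const char *field) {
    size_t n = strlen(field);
    while (n > 0 && field[n - 1] == ' ') {
        n--;
    }
    return n;
}

/* Copy `src` into a field of `width` bytes (including the final NUL),
   truncating if it is too long and padding the rest with spaces. */
static void padded_copy(char *dst, size_t width, const char *src) {
    if (width == 0) {
        return;
    }
    size_t n = padded_len(src);
    if (n > width - 1) {
        n = width - 1;
    }
    memcpy(dst, src, n);
    memset(dst + n, ' ', width - 1 - n);
    dst[width - 1] = '\0';
}
```

Copying "abc" into a 9-byte field gives "abc" followed by five spaces and a null, so plain `strlen` says 8 while `padded_len` says 3.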
Perhaps to prevent reallocation when the string size changes? Or to align with CPU cache lines?
I think this behavior has its roots in COBOL, not Pascal.
I wonder when someone is going to be brave enough to fork the Linux kernel and try to fast-forward it with automatic programming.
Why would you start there instead of creating something from scratch? If you can port drivers just as easily, meaning you don't especially give a shit about the hardware you're running on in the first place, why even deal with Linux? The battle-tested LRU cache system?
It's much easier to use something with all the edge cases already handled as a starting point.
I've seen several workalike kernels in various stages of completion. At least one of them was able to run some pretty substantial applications (Postgres, nginx, that kind of thing), and that is still, I guess, around 250 kLOC. But it only really has drivers to support hypervisor devices.
Unfortunately, as time goes by, the Linux API surface gets larger and more convoluted, so there's going to be some coverage you're just never going to get.
But in the abstract, definitely. Linux is so bloated at this point that it's not clear that it can ever be 'made safe'.