I've been learning Zig, and needed a refresher on memory layout (@sizeOf and @alignOf).
Wrote this blog post to summarize what I think are the right ways to understand alignment and size for various data types in Zig, just through experimentation.
Let me know any and all feedback!
> CPUs fetch data from memory in fixed-size blocks of so-many bytes, and performance degrades when data is misaligned.
A memory bus supports memory transactions of various sizes, with the largest size supported being a function of how many data lines there are. The following two statements are true of every memory bus with which I'm familiar, and probably of every bus in popular use: (1) only power-of-two sizes are supported; (2) only aligned transactions are supported.
Arm, x86, and RISC-V are relatively unusual among the multitude of CPU architectures in that, if they are asked to make an unaligned memory transaction, they will compose that transaction from multiple aligned transactions. Or they may service it from cache, so that it never has to hit a memory bus.
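To make the "composed from multiple aligned transactions" point concrete, here is a minimal C sketch that counts how many aligned bus-width blocks a given access touches. It assumes a simplified model in which every transaction is exactly one full bus width; real buses also support smaller power-of-two sizes, and the name `aligned_transactions` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model (an assumption, not a description of any real bus):
 * every transaction transfers one aligned block of bus_width bytes.
 * Returns how many such blocks cover the byte range [addr, addr + size). */
static unsigned aligned_transactions(uintptr_t addr, size_t size, size_t bus_width)
{
    /* bus_width must be a nonzero power of two, and the access nonempty. */
    assert(bus_width != 0 && (bus_width & (bus_width - 1)) == 0);
    assert(size != 0);

    uintptr_t mask  = ~(uintptr_t)(bus_width - 1);
    uintptr_t first = addr & mask;               /* block holding the first byte */
    uintptr_t last  = (addr + size - 1) & mask;  /* block holding the last byte  */

    return (unsigned)((last - first) / bus_width) + 1;
}
```

Under this model a 4-byte read at 0x1000 on an 8-byte bus needs one transaction, while the same read at 0x1006 straddles two blocks and needs two.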
Most CPU architectures, including PPC, MIPS, Sparc, and ColdFire/68k, will raise an exception when asked to perform a misaligned memory transaction.
The tradition of aligning data originated at a time when, on popular CPU architectures, you needed many CPU instructions to simulate misaligned access in software if you couldn't assume that data was aligned. It continued in compilers for Arm and x86 because, even though those CPUs could make multiple bus transactions in response to a single misaligned memory read, doing so takes time and was much slower.
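For a sense of what "simulating misaligned access in software" looks like, here is a small C sketch. The first helper builds a 32-bit little-endian value from four byte loads and shifts, which is roughly the instruction sequence a strict-alignment CPU needs. The second uses `memcpy`, the usual portable idiom, and leaves the choice of instructions to the compiler. Both helper names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a little-endian 32-bit value from ANY address using only byte loads.
 * Four loads, three shifts, three ORs: the cost of not assuming alignment. */
static uint32_t load_u32_le_bytewise(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Portable idiom: copy the bytes into an aligned local variable.
 * The result is in host byte order, unlike the helper above. */
static uint32_t load_u32_native(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Dereferencing a cast pointer such as `*(const uint32_t *)(buf + 1)` instead is undefined behavior in C, even on CPUs that tolerate the access in hardware.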
I don't know for sure, but I would expect that on modern x86 and high performance Arm, the performance penalty is quite small, if there's any at all.
It's small, but not unnoticeable... depending on the exact size of the workload and the amount of computation per element. In fact, for huge arrays it may be beneficial to have structs packed if that leads to less memory traffic.
[0] https://jordivillar.com/blog/memory-alignment
[1] https://lemire.me/blog/2012/05/31/data-alignment-for-speed-m...
[2] https://lemire.me/blog/2025/07/14/dot-product-on-misaligned-...
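A rough C sketch of the memory-traffic trade-off mentioned above: the same two fields stored padded and packed. It assumes a GCC/Clang-style `__attribute__((packed))` extension and a common ABI where `uint32_t` is 4-byte aligned; the struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: 3 padding bytes follow `tag` so that `value` is aligned
 * (8 bytes total on common ABIs). */
struct padded_sample {
    uint8_t  tag;
    uint32_t value;
};

/* Packed layout (compiler extension): no padding, so `value` is misaligned. */
struct __attribute__((packed)) packed_sample {
    uint8_t  tag;
    uint32_t value;
};

/* Bytes an array of n elements occupies, i.e. the memory a full scan touches. */
static size_t padded_bytes(size_t n) { return n * sizeof(struct padded_sample); }
static size_t packed_bytes(size_t n) { return n * sizeof(struct packed_sample); }
```

On such a target a million-element array shrinks from 8 MB to 5 MB when packed, which is the kind of saving that can outweigh a small per-access misalignment penalty in a bandwidth-bound loop.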
i could be wrong but i believe the zig compiler reserves the right to lay things out differently depending on compilation mode? especially debug. unless it's extern or packed, in which case the layout will be defined.
`extern` and `packed` container types have well defined layouts. a regular `struct` is an "auto" layout - and the compiler can and will rearrange whenever it wants.
if you need a well defined layout, use `extern`. if your struct makes sense to represent as an integer, use `packed`. I think it is often ill-advised to use `packed` otherwise.
you can explore this yourself on the Type info returned from @typeInfo(T):
https://ziglang.org/documentation/master/std/#std.builtin.Ty...
https://ziglang.org/documentation/master/std/#std.builtin.Ty...
https://ziglang.org/documentation/master/std/#std.builtin.Ty...
To wit: https://ziglang.org/documentation/master/#extern-struct
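To illustrate why a compiler that is free to rearrange fields might want to, here is a C sketch of the fixed, declaration-order layout that the C ABI prescribes (and that an `extern struct` is documented to match), next to the same fields hand-sorted by descending alignment. The second struct is only one layout a reordering compiler could pick; it is not a claim about what Zig actually chooses. Sizes in the comments assume a common ABI where `uint32_t` is 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Declaration order is binding in C: u8, u32, u8.
 * Typical layout: a@0, 3 pad, b@4, c@8, 3 pad -> 12 bytes. */
struct declared_order {
    uint8_t  a;
    uint32_t b;
    uint8_t  c;
};

/* Same fields sorted by descending alignment.
 * Typical layout: b@0, a@4, c@5, 2 pad -> 8 bytes. */
struct sorted_order {
    uint32_t b;
    uint8_t  a;
    uint8_t  c;
};

/* These hold on any conforming implementation. */
static_assert(offsetof(struct declared_order, a) == 0, "first member sits at offset 0");
static_assert(offsetof(struct declared_order, b) < offsetof(struct declared_order, c),
              "C keeps members in declaration order");
static_assert(sizeof(struct sorted_order) <= sizeof(struct declared_order),
              "sorting by alignment does not grow this struct");
```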
> An extern struct has in-memory layout matching the C ABI for the target.
Zig is really good at speaking the C ABI of the target, but the upshot seems to be that there is no stable Zig-native ABI.
If I'm correct, I wonder if there are plans to settle on a stable ABI at some point in the future. I do know that in other languages the lack of a stable ABI is brought up as a downside, and although I've been burned by C++ ABI stability too many times to agree, I can understand why people would want one.
I doubt Zig will have a stable ABI any time soon. It may have some sort of "zig extern" when it gets mature. But a stable ABI isn't very useful if no one else can speak it. I have a project that uses codegen to effectively implement a Zig-like ABI on top of the C ABI.
Here's the kind of code it generates: https://zigbin.io/6dba68
It can also generate JavaScript; here's Doom running in the browser: https://cloudef.pw/sorvi/#doom.wasm
Andrew Kelley has said relatively recently that there are no plans to introduce a Zig ABI: https://github.com/ziglang/zig/issues/3786#issuecomment-2646...
What's interesting is that the scope of the proposal isn't a Zig-specific ABI, but a codified way of expressing certain Zig concepts using the existing C ABI.
That could be an interesting middle ground.
Yeah the new translate-c package already kind of does that.
in practice, as long as you match the version and release mode, it's fine (though you are playing with fire). I pass raw pointers to zig structs/unions/etc from the zig compiler into a dynamically loaded .so file (via dlopen) and as long as my .so file is compiled with the same compiler as the parent (both LLVM, in my case) it's peachy keen.
You are still playing with fire, as the data behind those pointers may be laid out differently even if they are the same type. Zig is free to optimize them in any way it likes depending on the code that touches them (i.e., it's free to assume they never leave the program).
I know this is a bit cursed; but, I always wanted a bitfield-on-steroids construct:
It is a bit cursed, but you can do this in C/C++.
https://godbolt.org/z/vPKEdnjan
The member types don't actually matter here so we can have a little fun and macro it without having to resort to templates to get "correct" types. Highly recommend not doing this in production code. If nothing else, there's no compiler protection against offset+size being > total size, but one could add it with a static assert! (I've done so in the godbolt link)
Edit: if you're talking about Zig, sorry!
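For readers who don't want to open the link, here is an independent C sketch of the static-assert idea. It is not the code from the godbolt link: the macro names, the 32-bit storage word, and the three example fields are all invented for illustration, with two fields overlapping on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `width` bits set; computed in 64 bits so width == 32 works. */
#define FIELD_MASK(width) ((uint32_t)(((uint64_t)1 << (width)) - 1))

/* Hypothetical helper: defines get_<name>/set_<name> for a field occupying
 * `width` bits starting at bit `offset` of a 32-bit word, and rejects at
 * compile time any field that would run past the end of the storage. */
#define DEFINE_FIELD(name, offset, width)                                   \
    static_assert((width) > 0 && (offset) + (width) <= 32,                  \
                  #name ": offset + width exceeds the 32-bit storage");     \
    static inline uint32_t get_##name(uint32_t word)                        \
    {                                                                       \
        return (word >> (offset)) & FIELD_MASK(width);                      \
    }                                                                       \
    static inline uint32_t set_##name(uint32_t word, uint32_t value)        \
    {                                                                       \
        return (word & ~(FIELD_MASK(width) << (offset)))                    \
             | ((value & FIELD_MASK(width)) << (offset));                   \
    }

DEFINE_FIELD(tag,   0, 8)   /* bits 0..7                              */
DEFINE_FIELD(mode,  4, 8)   /* bits 4..11, deliberately overlaps tag  */
DEFINE_FIELD(flags, 24, 8)  /* bits 24..31                            */
```

Changing the last line to `DEFINE_FIELD(flags, 28, 8)` fails to compile, which is exactly the protection the comment says is otherwise missing.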
You might want to have a look at the unboxing and packing annotations that are proposed for Virgil. The unboxing mechanism is implemented and there was a prototype of the packing mechanism implemented by Bradley for his thesis. I am working on making a more robust implementation that I can land.
https://arxiv.org/abs/2410.11094
I'm not sure I understand your example; if I am looking at it right, it has overlapping bitfields.
But supposing you didn't want overlapping fields, you could write:
And the compiler would smash the bits together (highest order bits first).
If you wanted more control, you can specify where every bit of every field goes using a bit pattern:
Where each of T, b, z, and r represent a bit of each respective field.
Overlapping. I have my needs.
Bitfields are kind of a fake feature because they can't be individually addressed like variables can. So they just turn into inlined getters and setters. Old compilers could not inline arbitrary short functions so bitfields were required as an extra hack, but this is no longer the case today.
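A small C sketch of that equivalence: a bitfield member next to hand-written accessors over a plain byte. The bit positions in the accessors are chosen by hand, whereas the compiler's placement of the bitfield members is implementation-defined, so the two are not guaranteed to share a layout. All names here are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Language-level bitfields. `&f.mode` does not compile: a bitfield has no
 * address, so every access is a compiler-generated shift-and-mask. */
struct flags_bitfield {
    unsigned ready : 1;
    unsigned mode  : 3;
    unsigned count : 4;
};

/* The same 3-bit field as explicit accessors, fixed at bits 1..3 of a byte. */
static inline unsigned get_mode(uint8_t raw)
{
    return (raw >> 1) & 0x7u;
}

static inline uint8_t set_mode(uint8_t raw, unsigned mode)
{
    /* Clear bits 1..3, then insert the low 3 bits of `mode`. */
    return (uint8_t)((raw & ~(0x7u << 1)) | ((mode & 0x7u) << 1));
}
```

With inlining, `f.mode = 5` and `raw = set_mode(raw, 5)` typically compile to the same kind of mask-and-or sequence; the accessor version additionally pins down exactly which bits are used.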
Are you saying you want foo and bar to completely overlap? And baz and foo / bar to partially overlap? And have lots of unused bits in there too?
C# can do this with structs. It's very nice for unpacking wire data.
I think you can do this with Virgil, but I'm having trouble finding the exact doc page at the moment: https://github.com/titzer/virgil
The description is in the paper, but not all of it is implemented.
https://arxiv.org/abs/2410.11094
Bradley implemented a prototype of the packing solver, but it doesn't do the full generality of what is proposed in the paper.
You can kinda do this with Zig’s packed structs and arbitrary-width integers
Look at Erlang bit syntax: https://www.erlang.org/doc/system/bit_syntax.html
It can even be used for pattern matching.
I don't know whether Gleam or Elixir inherited it.
> I imagine just about any computer science major would have learned the rules of memory layout according to some kind of C-like compiler.
I have worked with a number of fresh grads over the last ten years. I can think of one who may have had a good handle on this. At best the rest range from “vague memory recall about this” to a blank stare.
On the flip side, it’s something someone can pick up pretty quickly if motivated.
Memory layout of a data structure in various programming languages: https://rosettacode.org/wiki/Memory_layout_of_a_data_structu...
I also had to learn struct alignment the hard way, working on a WebGPU path tracer and struggling to understand why struct fields were not aligning (ironically).
useful!