I think some of the more academic discussions of CPUs are often stuck in the late-1980s classic 5-stage RISC paradigm. Those designs really would cripple your performance on misaligned access, since they raise an exception and make the OS kernel fix things up rather than fixing it up for you in hardware.
x86 has always been forgiving of misaligned access, and high-end ARM has become more forgiving over time. So you get into far more complex trade-offs: how much of a penalty the CPU pays fixing up alignment vs. the whopping penalty a cache miss gives you at modern memory speeds.
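To make that concrete, here is a minimal C sketch, not a benchmark. `load_u32_unaligned` is the usual portable way to read from an arbitrary byte address: compilers typically turn the `memcpy` into a single load on CPUs that tolerate misalignment. `crosses_cache_line` is a hypothetical helper of my own, and the 64-byte line size is an assumption (common, but not universal). It flags the case where a misaligned access straddles two cache lines, which is where I would expect any hardware fix-up cost to be most visible.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from any byte address, aligned or not.
 * memcpy avoids the undefined behavior of casting p to uint32_t*
 * and dereferencing it when p is not suitably aligned. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: returns 1 if an access of `size` bytes starting
 * at `addr` touches more than one cache line, 0 otherwise.
 * ASSUMPTION: 64-byte cache lines; check your target CPU. */
static int crosses_cache_line(uintptr_t addr, size_t size)
{
    const uintptr_t line = 64;
    return (addr / line) != ((addr + size - 1) / line);
}
```

For example, a 4-byte load at offset 62 within a line straddles two lines, while the same load at offset 60 stays inside one. Whether that matters next to a cache miss is exactly the kind of thing you have to measure on the actual hardware.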