Hello,

Jessica Clarke, on Tue 30 Jan 2024 16:07:29 +0000, wrote:
> On 30 Jan 2024, at 09:02, Samuel Thibault <samuel.thiba...@gnu.org> wrote:
> >
> > Jessica Clarke, on Tue 30 Jan 2024 02:32:07 +0000, wrote:
> >> On 29 Jan 2024, at 10:20, Samuel Thibault <samuel.thiba...@gnu.org> wrote:
> >>>
> >>> Damien Zammit, on Mon 29 Jan 2024 10:07:30 +0000, wrote:
> >>>> -     ljmp    $BOOT_CS, $M(0f)
> >>>> +     xorl    %eax, %eax
> >>>> +     mov     %cs, %ax
> >>>> +     shll    $4, %eax
> >>>> +     addl    $M(0f), %eax
> >>>> +     movl    %eax, M(ljmp_offset32)
> >>>
> >>> This won't work with pipelined processors, which assume a complete
> >>> separation between code and data, and will thus have already loaded
> >>> the jmp instruction before you modify it.
> >>
> >> That’s true of most architectures, but not x86. It architecturally
> >> guarantees that self-modifying code works,
> >
> > ?? It was a very common way to detect pentium processors, back in the
> > day.
>
> Ok, so I went and read 12.6 Self-Modifying Code of the Intel SDM Volume
> 3A (from December 2023), and it has this to say:

I also had a look at my (a bit older) volume 3 (dated November 2020), and
it says (section 8.1.3 Handling Self- and Cross-Modifying Code)

“
IA-32 processors exhibit model-specific behavior when executing
self-modified code, depending upon how far ahead of the current
execution pointer the code has been modified. As processor
microarchitectures become more complex and start to speculatively
execute code ahead of the retirement point (as in P6 and more recent
processor families), the rules regarding which code should execute, pre-
or post-modification, become blurred.
”

It also mentions the jmp trick as a way that is compliant with current
and future versions of IA-32. But it also says that it “is not required
for programs intended to run on the Pentium or Intel486 processors, but
are recommended to ensure compatibility with the P6 and more recent
processor families” (which is contrary to what we could infer from the
piece you quoted...)

> > A write to a memory location in a code segment that is currently cached
> > in the processor causes the associated cache line (or lines) to be
> > invalidated.

That looks like significantly expensive snooping. I wouldn't be
surprised if some cheap x86 clones don't implement this, given that
self-modifying code is frowned upon nowadays anyway.

> >>> Rather either perform the relocation from the C code,
> >>
> >> Were your statement true, that wouldn’t fix the problem,
> >
> > Isn't an IPI a synchronizing thing?
>
> Oh that’s true,

Ok :)

I'd really rather depend on that (since it does make sense for the
happens-before questions).

Samuel