Hi Ludovic,

Ludovic Courtès <l...@gnu.org> writes:

> Though an immediate, like a fixnum or an iflo, is still something
> different from a tagged heap object like a pair, right?  So I would
> expect SCM_THOB_P to be a different test, not a drop-in replacement for
> SCM_NIMP, is that correct?

That's right.  It's not possible to create a drop-in replacement for
SCM_NIMP, because it is being used to answer two different questions
which used to be effectively equivalent, but no longer are:

(1) Is X a pointer to a heap object with a heap tag in the first word?
(2) Is X a reference to a heap object?

Test (1) needs to be done before checking the heap tag, to implement
type predicates for heap objects.  Test (2) is needed in relatively few
places, e.g. to decide whether to register disappearing links when
adding an entry to a weak hash table.

Actually, in my current branch I've removed the SCM_IMP and SCM_NIMP
macros outright, because it seems to me they are likely to be misused.
SCM_THOB_P implements test (1) and SCM_HEAP_OBJECT_P implements test (2).

>> (2) Our existing VM instructions almost invariably specify offsets with
>>     a granularity of whole words.  To support tagged pair pointers with
>>     good performance, I think we need a few new instructions that
>>     specify byte offsets, to avoid the expensive extra step of removing
>>     the tag before accessing the CAR or CDR of a pair.
>
> So instead of a pointer dereference, SCM_CAR becomes mask + dereference,
> right?

There's no masking involved.  Rather, the tag is subtracted from the
pointer, which allows the tag to be fused with the field offset.  For
example, on x86-64, whereas CAR and CDR were previously:

  1c0:   48 8b 07                mov    (%rdi),%rax        ;old car

and:

  1d0:   48 8b 47 08             mov    0x8(%rdi),%rax     ;old cdr

Now they become:

  1e0:   48 8b 47 fa             mov    -0x6(%rdi),%rax    ;new car

and:

  1f0:   48 8b 47 02             mov    0x2(%rdi),%rax     ;new cdr

In the VM, I've added four new instructions:

  make-tagged-non-immediate dst:12 tag:12 offset:32
  tagged-allocate-words/immediate dst:8 count:8 tag:8
  tagged-scm-ref/immediate dst:8 obj:8 byte-offset:8
  tagged-scm-set!/immediate obj:8 byte-offset:8 val:8

The last two instructions above are like 'scm-ref/immediate' and
'scm-set!/immediate' except that they accept byte offsets instead of
word offsets.
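
To make the tag/offset fusion concrete, here is a minimal C sketch.  It
is only an illustration, not Guile's actual code: the names 'word',
PAIR_TAG, CAR_OFFSET, CDR_OFFSET, tag_pair, pair_car and pair_cdr are
made up for this example, and the tag value 6 is the pair tag used in
this proposal.  Since the tag is a compile-time constant, removing it
folds into the field displacement, so no separate masking step is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not Guile's definitions.  */
typedef uintptr_t word;

#define PAIR_TAG 6   /* pair tag value assumed from this proposal */

/* Byte displacements from a tagged pair reference to each field.  */
#define CAR_OFFSET (-(ptrdiff_t) PAIR_TAG)
#define CDR_OFFSET ((ptrdiff_t) sizeof (word) - PAIR_TAG)

/* CAR is always at -6; CDR is at +2 when words are 8 bytes wide.  */
static_assert (CAR_OFFSET == -6, "CAR displacement");
static_assert (sizeof (word) != 8 || CDR_OFFSET == 2,
               "CDR displacement on 64-bit targets");

/* Turn the address of a two-word cell into a tagged pair reference.  */
static word
tag_pair (word *cell)
{
  return (word) cell + PAIR_TAG;
}

/* Field access is a single load at a constant displacement: the tag
   subtraction and the field offset are one combined constant.  */
static word
pair_car (word x)
{
  return *(const word *) (x + (word) CAR_OFFSET);
}

static word
pair_cdr (word x)
{
  return *(const word *) (x + (word) CDR_OFFSET);
}
```

Given a two-word cell, pair_car (tag_pair (cell)) reads cell[0] and
pair_cdr (tag_pair (cell)) reads cell[1]; a compiler can emit each as
one load with a constant displacement, as in the x86-64 listings above.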

CAR and CDR become:

  (tagged-scm-ref/immediate DST SRC -6)   ;CAR

and:

  (tagged-scm-ref/immediate DST SRC 2)    ;CDR

(although at present the -6 prints as 250 in the disassembler).

> I think we disable GC “interior pointer” scanning.

I agree.

> With this scheme, an SCM for a pair would actually point in the middle
> of a pair; could this be an issue for GC?

It is already the case that Guile has tagged pointers in the first word
of every struct.  The first word of a struct contains a pointer to the
vtable, with scm_tc3_struct added.

Fortunately, BDW-GC provides GC_REGISTER_DISPLACEMENT, which allows us
to register a small offset K, such that BDW-GC should recognize pointers
that point K bytes into a heap block.  We've been using this in both 2.0
and 2.2 from scm_init_struct (), and in 'master' it's done in
scm_storage_prehistory ().

This new approach entails registering one additional displacement.

In 2.0 and 2.2, we register displacements 0, 16, and 17.  The last two
are for struct vtables, which point 2 words into a heap block, even
before adding scm_tc3_struct (which is 1).

In current master, we register 0 and 1 (scm_tc3_struct).  With this new
approach, we would register 0, 1, and 6 (scm_pair_tag).

What do you think?

    Thanks,
      Mark