rte_ring move head question for machines with relaxed MO (arm/ppc)
Wathsala Wathawana Vithanage
wathsala.vithanage at arm.com
Wed Oct 9 19:27:01 CEST 2024
> >
> > > > 1. rte_ring_generic_pvt.h:
> > > > =====================
> > > >
> > > > pseudo-c-code // related armv8 instructions
> > > > -------------------- --------------------------------------
> > > > head.load() // ldr [head]
> > > > rte_smp_rmb() // dmb ishld
> > > > opposite_tail.load() // ldr [opposite_tail]
> > > > ...
> > > > rte_atomic32_cmpset(head, ...) // ldrex[head];... stlex[head]
> > > >
> > > >
> > > > 2. rte_ring_c11_pvt.h
> > > > =====================
> > > >
> > > > pseudo-c-code // related armv8 instructions
> > > > -------------------- --------------------------------------
> > > > head.atomic_load(relaxed) // ldr[head]
> > > > atomic_thread_fence(acquire) // dmb ish
> > > > opposite_tail.atomic_load(acquire) // lda[opposite_tail]
> > > > ...
> > > > head.atomic_cas(..., relaxed) // ldrex[head]; ... strex[head]
> > > >
> > > >
> > > > 3. rte_ring_hts_elem_pvt.h
> > > > ==========================
> > > >
> > > > pseudo-c-code // related armv8 instructions
> > > > -------------------- --------------------------------------
> > > > head.atomic_load(acquire) // lda [head]
> > > > opposite_tail.load() // ldr [opposite_tail]
> > > > ...
> > > > head.atomic_cas(..., acquire) // ldaex[head]; ... strex[head]
> > > >
> > > > The questions that arose from these observations:
> > > > a) are all 3 approaches equivalent in terms of functionality?
> > > No, they are different: lda (load with acquire semantics) and ldr (plain load) do not have the same ordering semantics.
> >
> > I understand that, my question was:
> > lda [head]; ldr [tail]
> > vs
> > ldr [head]; dmb ishld; ldr [tail];
> >
> > Is there any difference in terms of functionality (memory ops
> > ordering/observability)?
>
> To be more precise:
>
> lda [head]; ldr [tail]
> vs
> ldr [head]; dmb ishld; ldr [tail];
> vs
> ldr [head]; dmb ishld; lda [tail];
>
> what would be the difference between these 3 cases?
Case A: lda [head]; ldr [tail]
The load of the head will be observed by the memory subsystem
before the load of the tail.
Case B: ldr [head]; dmb ishld; ldr [tail];
The load of the head will be observed by the memory subsystem
before the load of the tail.
Case C: ldr [head]; dmb ishld; lda [tail];
The load of the head will be observed by the memory subsystem
before the load of the tail. In addition, any load or store that
follows lda [tail] in program order will not be observed by the
memory subsystem before the load of the tail.
Cases A and B are essentially equivalent for this sequence.
Both preserve the following program orders between the load of the
head and the accesses that follow it:
LOAD-LOAD
LOAD-STORE
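
For reference, the three cases can be sketched with C11 atomics roughly as follows. This is only an illustration: struct idx_pair, struct snapshot and the read_case_*() helpers are hypothetical names, not DPDK code, and the instruction noted in each comment is what a compiler would typically be expected to emit for armv8, not something the C standard guarantees.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a ring's head and opposite tail (not a DPDK type). */
struct idx_pair {
    _Atomic uint32_t head;
    _Atomic uint32_t tail;
};

struct snapshot {
    uint32_t head;
    uint32_t tail;
};

/* Case A: acquire load of head, relaxed load of tail (roughly lda; ldr). */
static struct snapshot read_case_a(struct idx_pair *p)
{
    struct snapshot s;
    s.head = atomic_load_explicit(&p->head, memory_order_acquire);
    s.tail = atomic_load_explicit(&p->tail, memory_order_relaxed);
    return s;
}

/* Case B: relaxed load, acquire fence, relaxed load (roughly ldr; dmb ishld; ldr). */
static struct snapshot read_case_b(struct idx_pair *p)
{
    struct snapshot s;
    s.head = atomic_load_explicit(&p->head, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    s.tail = atomic_load_explicit(&p->tail, memory_order_relaxed);
    return s;
}

/* Case C: as case B, but the tail load is itself an acquire, so accesses
 * after it in program order are also ordered after the tail load. */
static struct snapshot read_case_c(struct idx_pair *p)
{
    struct snapshot s;
    s.head = atomic_load_explicit(&p->head, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    s.tail = atomic_load_explicit(&p->tail, memory_order_acquire);
    return s;
}

/* Single-threaded check: without concurrent writers all three variants
 * must return the same values; they differ only in ordering guarantees. */
static void check_single_threaded(void)
{
    struct idx_pair p;
    atomic_init(&p.head, 7);
    atomic_init(&p.tail, 3);
    struct snapshot a = read_case_a(&p);
    struct snapshot b = read_case_b(&p);
    struct snapshot c = read_case_c(&p);
    assert(a.head == b.head && b.head == c.head);
    assert(a.tail == b.tail && b.tail == c.tail);
}
```

Comparing the code generated for each helper (for example with objdump -d) on the target toolchain is the way to confirm which instructions are actually used.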