On Thu, Aug 28, 2014 at 4:39 PM, Jarno Rajahalme <jrajaha...@nicira.com> wrote:
>
> On Aug 28, 2014, at 9:57 AM, Gurucharan Shetty <shet...@nicira.com> wrote:
>> diff --git a/lib/ovs-atomic-msvc.h b/lib/ovs-atomic-msvc.h
>> new file mode 100644
>> index 0000000..f357545
>> --- /dev/null
>> +++ b/lib/ovs-atomic-msvc.h
>> @@ -0,0 +1,370 @@
>> +/*
>> + * Copyright (c) 2014 Nicira, Inc.
>> + *
>> + * Licensed under the Apache License, Version 2.0 (the "License");
>> + * you may not use this file except in compliance with the License.
>> + * You may obtain a copy of the License at:
>> + *
>> + *     http://www.apache.org/licenses/LICENSE-2.0
>> + *
>> + * Unless required by applicable law or agreed to in writing, software
>> + * distributed under the License is distributed on an "AS IS" BASIS,
>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> + * See the License for the specific language governing permissions and
>> + * limitations under the License.
>> + */
>> +
>> +/* This header implements atomic operation primitives for MSVC. */
>> +#ifndef IN_OVS_ATOMIC_H
>> +#error "This header should only be included indirectly via ovs-atomic.h."
>> +#endif
>> +
>> +/* From MSDN documentation: With Visual Studio 2003, volatile to volatile
>> + * references are ordered; the compiler will not re-order volatile variable
>> + * access. With Visual Studio 2005, the compiler also uses acquire semantics
>> + * for read operations on volatile variables and release semantics for write
>> + * operations on volatile variables (when supported by the CPU). */
>
> Is this still the same for MSVC 2012 or 2013?
There is no explicit documentation stating that, but looking at MSVC's
C++11 atomics implementation, it looks clear that it is still the case.
I will add an explicit comment about this.


>> +
>> +static inline void
>> +atomic_compiler_barrier(memory_order order OVS_UNUSED)
>> +{
>> +    _ReadWriteBarrier();
>> +}
>
> You could do “if (order > memory_order_consume) {“ to avoid unnecessary 
> barriers.
Okay
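
Something like this, then (untested sketch; relaxed and consume need no
compiler barrier, since MSVC already orders volatile accesses):

    static inline void
    atomic_compiler_barrier(memory_order order)
    {
        /* Orderings above memory_order_consume need the compiler barrier;
         * weaker orderings get by on MSVC's volatile semantics alone. */
        if (order > memory_order_consume) {
            _ReadWriteBarrier();
        }
    }
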
>
>> +
>> +static inline void
>> +atomic_thread_fence(memory_order order)
>> +{
>> +    if (order == memory_order_seq_cst) {
>> +        MemoryBarrier();
>> +    }
>> +}
>
> This needs an “else { atomic_compiler_barrier(order); }"
Yes. I will fix it up.
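
Roughly (sketch):

    static inline void
    atomic_thread_fence(memory_order order)
    {
        if (order == memory_order_seq_cst) {
            /* Only seq_cst needs a full hardware barrier on x86. */
            MemoryBarrier();
        } else {
            /* Weaker orderings only need to constrain the compiler. */
            atomic_compiler_barrier(order);
        }
    }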

>
>> +
>> +static inline void
>> +atomic_signal_fence(memory_order order)
>> +{
>> +    atomic_compiler_barrier(order);
>> +}
>> +
>> +/* 1, 2 and 4 byte loads and stores are atomic on aligned memory. In
>> + * addition, since the compiler automatically takes acquire and release
>> + * semantics on volatile variables, for any order weaker than
>> + * 'memory_order_seq_cst', we can directly assign or read values. */
>> +
>> +#define atomic_store32(DST, SRC, ORDER)                                 \
>> +    if (ORDER == memory_order_seq_cst) {                                \
>> +        InterlockedExchange((int32_t volatile *) (DST),                 \
>> +                            (int32_t) (SRC));                           \
>> +    } else {                                                            \
>> +        *(DST) = (SRC);                                                 \
>> +    }
>> +
>> +/* 64 bit reads and writes are not atomic on x86.
>
> Just noticed this:
>
> “8.1.1 Guaranteed Atomic Operations
>
> The Intel486 processor (and newer processors since) guarantees that the
> following basic memory operations will always be carried out atomically:
> - Reading or writing a byte
> - Reading or writing a word aligned on a 16-bit boundary
> - Reading or writing a doubleword aligned on a 32-bit boundary
>
> The Pentium processor (and newer processors since) guarantees that the
> following additional memory operations will always be carried out atomically:
> - Reading or writing a quadword aligned on a 64-bit boundary
> - 16-bit accesses to uncached memory locations that fit within a 32-bit
>   data bus
>
> The P6 family processors (and newer processors since) guarantee that the
> following additional memory operation will always be carried out atomically:
> - Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within
>   a cache line”
>
> So, it might be worth it to limit the support to i586 and newer, and check
> alignment at run time to avoid the lock on 64-bit loads and stores. Or it
> may be that InterlockedExchange64() already does that?
Okay. I will include this file only for >= i586. It is unlikely that
anyone still uses machines older than that, so it shouldn't be a
problem. I also see with sample programs that 64 bit variables are 64
bit aligned even in 32 bit builds, but to make sure that is always the
case, I will add an abort() if the address is not 64 bit aligned.
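
Something along these lines for the store (a sketch; it assumes the
compiler emits a single 64 bit move for an aligned non-seq_cst access,
which I will verify):

    /* 64 bit writes are atomic on i586 and newer if the destination is
     * 64 bit aligned; abort if that assumption does not hold. */
    #define atomic_store64(DST, SRC, ORDER)                             \
        if ((size_t) (DST) % 8) {                                       \
            abort();                                                    \
        } else if (ORDER == memory_order_seq_cst) {                     \
            InterlockedExchange64((int64_t volatile *) (DST),           \
                                  (int64_t) (SRC));                     \
        } else {                                                        \
            *(DST) = (SRC);                                             \
        }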

>> +
>> +#define atomic_read32(SRC, DST, ORDER)                        \
>> +    if (ORDER == memory_order_seq_cst) {                      \
>> +        *DST = InterlockedOr((int32_t volatile *) SRC, 0);    \
>> +    } else {                                                  \
>> +        *DST = *SRC;                                          \
>> +    }
>> +
>
> On x86, since seq_cst stores are locked, the corresponding reads don’t need 
> to be:
>
> “Locked operations are atomic with respect to all other memory operations and 
> all externally visible events. Only instruction fetch and page table accesses 
> can pass locked instructions. Locked instructions can be used to synchronize 
> data written by one processor and read by another processor. For the P6 
> family processors, locked operations serialize all outstanding load and store 
> operations (that is, wait for them to complete). This rule is also true for 
> the Pentium 4 and Intel Xeon processors, with one exception. Load operations 
> that reference weakly ordered memory types (such as the WC memory type) may 
> not be serialized."
Okay. I will use this.
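
I.e., since the seq_cst store is already locked, the seq_cst read can
become a plain volatile read (sketch):

    /* The seq_cst stores above use locked instructions, which serialize
     * all outstanding loads and stores, so a plain volatile read is
     * sufficient here for every memory order. */
    #define atomic_read32(SRC, DST, ORDER)                              \
        *(DST) = *(SRC);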

>
>> +/* 64 bit reads and writes are not atomic on x86. */
>> +#define atomic_read64(SRC, DST, ORDER)                        \
>> +    *DST = InterlockedOr64((int64_t volatile *) SRC, 0);
>> +
>
> On 586 this could be just an assignment, if SRC is quad-word (64-bit) aligned.
Okay. I changed it.
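
For reference, what I have now (a sketch, with the run-time alignment
check discussed above; it assumes the compiler emits a single 64 bit
load for the aligned access):

    /* 64 bit reads are atomic on i586 and newer if the source is
     * 64 bit aligned. */
    #define atomic_read64(SRC, DST, ORDER)                              \
        if ((size_t) (SRC) % 8) {                                       \
            abort();                                                    \
        } else {                                                        \
            *(DST) = *(SRC);                                            \
        }
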
>
>> +/* For 8 and 16 bit variations. */
>> +#define atomic_readX(X, SRC, DST, ORDER)                                   \
>> +    if (ORDER == memory_order_seq_cst) {                                   \
>> +        *DST = InterlockedOr##X((int##X##_t volatile *) SRC, 0);           \
>> +    } else {                                                               \
>> +        *DST = *SRC;                                                       \
>> +    }
>> +
>
> Same here.
>
Okay.
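
With the locked reads gone, the 8, 16 and 32 bit cases can probably
collapse into a single macro (sketch):

    /* For 8, 16 and 32 bit variations: plain volatile reads of aligned
     * memory are atomic and carry acquire semantics under MSVC. */
    #define atomic_readX(SRC, DST, ORDER)                               \
        *(DST) = *(SRC);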

Thanks for the review. I will send in a v2.