On Thu, Aug 28, 2014 at 4:39 PM, Jarno Rajahalme <jrajaha...@nicira.com> wrote:
>
> On Aug 28, 2014, at 9:57 AM, Gurucharan Shetty <shet...@nicira.com> wrote:
>> diff --git a/lib/ovs-atomic-msvc.h b/lib/ovs-atomic-msvc.h
>> new file mode 100644
>> index 0000000..f357545
>> --- /dev/null
>> +++ b/lib/ovs-atomic-msvc.h
>> @@ -0,0 +1,370 @@
>> +/*
>> + * Copyright (c) 2014 Nicira, Inc.
>> + *
>> + * Licensed under the Apache License, Version 2.0 (the "License");
>> + * you may not use this file except in compliance with the License.
>> + * You may obtain a copy of the License at:
>> + *
>> + *     http://www.apache.org/licenses/LICENSE-2.0
>> + *
>> + * Unless required by applicable law or agreed to in writing, software
>> + * distributed under the License is distributed on an "AS IS" BASIS,
>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> + * See the License for the specific language governing permissions and
>> + * limitations under the License.
>> + */
>> +
>> +/* This header implements atomic operation primitives using pthreads. */
>> +#ifndef IN_OVS_ATOMIC_H
>> +#error "This header should only be included indirectly via ovs-atomic.h."
>> +#endif
>> +
>> +/* From msdn documentation: With Visual Studio 2003, volatile to volatile
>> +references are ordered; the compiler will not re-order volatile variable
>> +access. With Visual Studio 2005, the compiler also uses acquire semantics
>> +for read operations on volatile variables and release semantics for write
>> +operations on volatile variables (when supported by the CPU). */
>
> Is this still the same for MSVC 2012 or 2013?

There is no explicit documentation stating that, but looking at MSVC's
C++11 atomics implementation, it seems clear that it is still the case.
I will add an explicit comment saying so.

>> +
>> +static inline void
>> +atomic_compiler_barrier(memory_order order OVS_UNUSED)
>> +{
>> +    _ReadWriteBarrier();
>> +}
>
> You could do “if (order > memory_order_consume) {” to avoid unnecessary
> barriers.

Okay.

>> +
>> +static inline void
>> +atomic_thread_fence(memory_order order)
>> +{
>> +    if (order == memory_order_seq_cst) {
>> +        MemoryBarrier();
>> +    }
>> +}
>
> This needs an “else { atomic_compiler_barrier(order); }"

Yes. I will fix it up.
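For v2, I am thinking of something like this for both fences (an untested
sketch of the two suggestions above; note that 'order' becomes used, so
OVS_UNUSED goes away):

    static inline void
    atomic_compiler_barrier(memory_order order)
    {
        /* Only orders stronger than memory_order_consume need to
         * constrain the compiler; volatile reads and writes already
         * get acquire and release semantics from MSVC. */
        if (order > memory_order_consume) {
            _ReadWriteBarrier();
        }
    }

    static inline void
    atomic_thread_fence(memory_order order)
    {
        if (order == memory_order_seq_cst) {
            MemoryBarrier();    /* Full hardware memory barrier. */
        } else {
            atomic_compiler_barrier(order);
        }
    }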
>> +
>> +static inline void
>> +atomic_signal_fence(memory_order order)
>> +{
>> +    atomic_compiler_barrier(order);
>> +}
>> +
>> +/* 1, 2 and 4 bytes loads and stores are atomic on aligned memory. In
>> +addition, since the compiler automatically takes acquire and release
>> +semantics on volatile variables, for any order lesser than
>> +'memory_order_seq_cst', we can directly assign or read values. */
>> +
>> +#define atomic_store32(DST, SRC, ORDER)                         \
>> +    if (ORDER == memory_order_seq_cst) {                        \
>> +        InterlockedExchange((int32_t volatile *) DST,           \
>> +                            (int32_t) SRC);                     \
>> +    } else {                                                    \
>> +        *DST = SRC;                                             \
>> +    }
>> +
>> +/* 64 bit reads and writes are not atomic on x86.

> Just noticed this:
>
> “8.1.1 Guaranteed Atomic Operations
>
> The Intel486 processor (and newer processors since) guarantees that the
> following basic memory operations will always be carried out atomically:
> - Reading or writing a byte
> - Reading or writing a word aligned on a 16-bit boundary
> - Reading or writing a doubleword aligned on a 32-bit boundary
>
> The Pentium processor (and newer processors since) guarantees that the
> following additional memory operations will always be carried out
> atomically:
> - Reading or writing a quadword aligned on a 64-bit boundary
> - 16-bit accesses to uncached memory locations that fit within a 32-bit
>   data bus
>
> The P6 family processors (and newer processors since) guarantee that the
> following additional memory operation will always be carried out
> atomically:
> - Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within
>   a cache line”
>
> So, it might be worth it to limit the support for i586, and check alignment
> at run time to avoid the lock on 64-bit load and store. Or it may be that
> InterlockedExchange64() already does that?

Okay. I will include this file only for >= i586. It is unlikely that anyone
still uses machines older than that, so it shouldn't be a problem. I also
see with sample programs that 64-bit variables are 64-bit aligned even in
32-bit builds, but to make sure that is always the case, I will add an
abort() if a variable is not 64-bit aligned.

>> +
>> +#define atomic_read32(SRC, DST, ORDER)                          \
>> +    if (ORDER == memory_order_seq_cst) {                        \
>> +        *DST = InterlockedOr((int32_t volatile *) SRC, 0);      \
>> +    } else {                                                    \
>> +        *DST = *SRC;                                            \
>> +    }
>> +
>
> On x86, since seq_cst stores are locked, the corresponding reads don’t need
> to be:
>
> “Locked operations are atomic with respect to all other memory operations
> and all externally visible events. Only instruction fetch and page table
> accesses can pass locked instructions. Locked instructions can be used to
> synchronize data written by one processor and read by another processor.
> For the P6 family processors, locked operations serialize all outstanding
> load and store operations (that is, wait for them to complete). This rule
> is also true for the Pentium 4 and Intel Xeon processors, with one
> exception. Load operations that reference weakly ordered memory types
> (such as the WC memory type) may not be serialized.”

Okay. I will use this.

>> +/* 64 bit reads and writes are not atomic on x86. */
>> +#define atomic_read64(SRC, DST, ORDER)                          \
>> +    *DST = InterlockedOr64((int64_t volatile *) SRC, 0);
>> +
>
> On 586 this could be just an assignment, if SRC is quad-word (64-bit)
> aligned.

Okay. I changed it.
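Concretely, something like this for the reads (an untested sketch; it
assumes, per the discussion above, that this file is only built for i586
and newer):

    /* On i586 and newer, a 64-bit aligned 64-bit read is atomic, so a
     * plain volatile read is enough; abort() if the variable is not
     * 64-bit aligned. */
    #define atomic_read64(SRC, DST, ORDER)                          \
        if (((size_t) (SRC)) % 8 == 0) {                            \
            *(DST) = *(SRC);                                        \
        } else {                                                    \
            abort();                                                \
        }

    /* Since a seq_cst store already uses a locked instruction, a
     * seq_cst read does not need InterlockedOr(); a plain volatile
     * read is sufficient for every memory order. */
    #define atomic_read32(SRC, DST, ORDER)                          \
        *(DST) = *(SRC);

If a 64-bit variable ever turns out not to be 8-byte aligned, the abort()
will catch it early instead of silently losing atomicity.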
>> +/* For 8 and 16 bit variations. */
>> +#define atomic_readX(X, SRC, DST, ORDER)                        \
>> +    if (ORDER == memory_order_seq_cst) {                        \
>> +        *DST = InterlockedOr##X((int##X##_t volatile *) SRC, 0); \
>> +    } else {                                                    \
>> +        *DST = *SRC;                                            \
>> +    }
>> +
>
> Same here.

Okay. Thanks for the review. I will send in a v2.

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev