On Mon, Jun 11, 2012 at 1:31 PM, Harsh Bora <ha...@linux.vnet.ibm.com> wrote: > On 06/07/2012 08:02 PM, Stefan Hajnoczi wrote: >> >> On Thu, May 24, 2012 at 10:50 AM, Harsh Prateek Bora >> <ha...@linux.vnet.ibm.com> wrote: >>> @@ -75,16 +96,22 @@ static char *trace_file_name = NULL; >>> * >>> * Returns false if the record is not valid. >>> */ >>> -static bool get_trace_record(unsigned int idx, TraceRecord *record) >>> +static bool get_trace_record(unsigned int idx, TraceRecord **recordptr) >>> { >>> - if (!(trace_buf[idx].event& TRACE_RECORD_VALID)) { >>> >>> + uint8_t temp_rec[ST_REC_HDR_LEN]; >>> + TraceRecord *record = (TraceRecord *) temp_rec; >>> + read_from_buffer(idx, temp_rec, ST_REC_HDR_LEN); >>> + >>> + if (!(record->event& TRACE_RECORD_VALID)) { >>> >>> return false; >>> } >>> >>> __sync_synchronize(); /* read memory barrier before accessing record >>> */ >> >> >> The need for the memory barrier is no longer clear. Previously we >> were directly accessing the trace ring buffer, and therefore needed to >> ensure fields were settled before accessing them. Now we use >> read_from_buffer() which copies the data into our temporary struct on >> the stack. >> >> I think the best way of doing it is to read the event field first in a >> separate step, then do the read memory barrier, and then read the rest >> of the record. This ensures that the event field is at least as "old" >> as the other fields we read. > > > Currently, it does a two step read: > > read header (which includes event field and length of record as well) > sync > read rest of record (using length from header) > > Are you suggesting this: > > read event field only > sync (if event valid) > read header (for length) > sync > read rest of record (using length) > > or > > read event field only > sync (if event valid) > read header (for length) > read rest of record
header, sync, rest of header + payload The reason for the read barrier is to ensure that we don't read stale header fields (e.g. length) together with an up-to-date "valid" event field. >>> - *record = trace_buf[idx]; >>> - record->event&= ~TRACE_RECORD_VALID; >>> >>> + *recordptr = g_malloc(record->length); >>> + /* make a copy of record to avoid being overwritten */ >>> + read_from_buffer(idx, (uint8_t *)*recordptr, record->length); >>> + (*recordptr)->event&= ~TRACE_RECORD_VALID; >>> return true; >>> } >>> > > [....] > > >>> >>> -void trace3(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3) >>> +int trace_alloc_record(TraceBufferRecord *rec, TraceEventID event, >>> uint32_t datasize) >>> { >>> - trace(event, x1, x2, x3, 0, 0, 0); >>> + unsigned int idx, rec_off; >>> + uint32_t rec_len = ST_REC_HDR_LEN + datasize; >>> + uint64_t timestamp_ns = get_clock(); >>> + >>> + if ((rec_len + trace_idx - writeout_idx)> TRACE_BUF_LEN) { >>> + /* Trace Buffer Full, Event dropped ! */ >>> + dropped_events++; >>> + return 1; >> >> >> -ENOSPC? A negative errno is clearer than 0 - success, 1 - failure. >> >>> + } >>> + idx = g_atomic_int_exchange_and_add((gint *)&trace_idx, rec_len) % >>> TRACE_BUF_LEN; >> >> >> How does this deal with the race condition of two or more threads >> running trace_alloc_record() concurrently when the buffer reaches max >> size? I think two or more threads could think they have enough space >> because trace_idx hasn't been updated yet between the buffer length >> check and the atomic update. > > > Are you suggesting to use locking here ? I couldnt find a test-and-set > alternative in glib2. Does qemu have access to any such interface ? How about: http://developer.gnome.org/glib/2.32/glib-Atomic-Operations.html#g-atomic-int-compare-and-exchange I think this becomes: while True: old_idx = trace_idx rmb() new_idx = old_idx + rec_len if new_idx - write_idx > TRACE_BUF_LEN: dropped_events++ return if g_atomic_int_compare_and_exchange((gint *)&trace_idx, old_idx, new_idx): break Now we've reserved [new_idx, new_idx + rec_len). >>> +void trace_record_finish(TraceBufferRecord *rec); >>> +uint32_t safe_strlen(const char* str); >> >> >> Given that safe_strlen() is only used once and a generic utility (not >> specific to the simple trace backend), I suggest dropping it an just >> open coding the tristate operator: s ? strlen(s) : 0. >> > > safe_strlen is used multiple times in auto-generated code by tracetool in > expression for calculating the sum of size of trace arguments which as of > now looks like: > > 8 + 8 + 4 + safe_strlen(str1) + 8 + 4 + safe_strlen(str2) ... > > for tracing events with string arguments. For trace events with multiple > strings, this expression is more readable as compared to having an > expression with tristate operator like this: > > 8 + 8 + 4 + (str1 ? strlen(str1) : 0) + 8 + 4 + (str2 ? strlen(str2) : 0) > ... > > > I agree that its a generic function and need not be placed in tracing code, > let me know the right place where to put it if you think its worth keeping > it. The generic place to put it is cutils.c. It's up to you if you want to keep it. Stefan