I can't speak for others, but I do normal development with a debug build and UBSAN turned on. I haven't had any problems using gdb in this setup. Usually, if I get a release-only bug, it's because of timing or because memory is being reused more aggressively, in which case I would at least start with ASAN and TSAN and move on to print statements from there. However, release-only bugs have been quite rare in my experience.

If you are going to use gdb then you should read [1], as there is a very helpful gdb extension for debugging Arrow code.

[1] https://arrow.apache.org/docs/cpp/gdb.html

On Tue, Apr 19, 2022 at 8:34 AM Li Jin <ice.xell...@gmail.com> wrote:
>
> I see. Thanks Weston. This is a nice tracing util. I will give it a shot.
> Although it might be more information than I actually want, I might just
> use a print statement.
>
> As a side question - what do most Arrow devs use for debugging compute
> related code? I am new to this and tried gdb but ended up seeing
> incorrect data (I observed an ExecBatch with negative length in gdb, but
> couldn't observe it when using print statements). Someone suggested that it
> could be because compiling with optimization can lead to anomalous gdb
> behavior, so I am just curious what other people do.
>
> Li
>
> On Tue, Apr 19, 2022 at 12:22 PM Weston Pace <weston.p...@gmail.com> wrote:
>
> > The EVENT macro is specific to open telemetry tracing. So if `side`
> > is only used to populate the event then I think you will need to
> > surround the entire block with:
> >
> > ```
> > #ifdef ARROW_WITH_OPENTELEMETRY
> > int side = ...
> > EVENT(span_, "InputReceived", {{"batch.length", batch.length}, {"side", side}});
> > #endif
> > ```
> >
> > If you want to see it in action then you can enable open telemetry by
> > turning on ARROW_WITH_OPENTELEMETRY in the cmake options. However, to
> > actually get output you will need to tell OT where to send output.
> > The simplest way to do this is to use the ARROW_TRACING_BACKEND
> > environment variable. You can see all the options we have at the
> > moment in src/arrow/util/tracing_internal.cc but a simple choice is
> > "ostream" which dumps everything to (I think) stdout.
> >
> > Example:
> >
> > ```
> > ARROW_TRACING_BACKEND=ostream ./debug/arrow-dataset-scanner-test \
> >   --gtest_filter=TestScannerThreading/TestScanner.FilteredScanNested/2Threaded1d1b1024r
> > ```
> >
> > Yields something like:
> >
> > ```
> > {
> >   name          : SinkNode:
> >   trace_id      : 1e39508fe9fe74bfc1c39cfbe9b63d55
> >   span_id       : afb1b87450748124
> >   tracestate    :
> >   parent_span_id: 4b46e64fb1469f90
> >   start         : 1650385199799132739
> >   duration      : 9878711
> >   description   :
> >   span kind     : Internal
> >   status        : Unset
> >   attributes    :
> >         thread.id: 140287607824768
> >         node.detail: :SinkNode{}
> >         node.kind: SinkNode
> >         node.label:
> >   events        :
> >         {
> >           name          : InputFinished
> >           timestamp     : 1650385199804563001
> >           attributes    :
> >                 batches.length: 1
> >         }
> >         {
> >           name          : InputReceived
> >           timestamp     : 1650385199807960461
> >           attributes    :
> >                 batch.length: 512
> >         }
> >   links         :
> >   resources     :
> >         service.name: unknown_service
> >         telemetry.sdk.version: 1.3.0
> >         telemetry.sdk.name: opentelemetry
> >         telemetry.sdk.language: cpp
> >   instr-lib     : arrow
> > }
> > ```
> >
> > To get more complete output from OT you will eventually want to use
> > the http exporter and export the data to some kind of tool like Jaeger
> > which can do visualizations of the data and offer flame charts.
> >
> > On Tue, Apr 19, 2022 at 5:39 AM Li Jin <ice.xell...@gmail.com> wrote:
> > >
> > > Hello!
> > >
> > > I am trying to implement a new type of join in the Arrow Compute engine
> > > (asof join). I have been looking at the code of HashJoinNode and found
> > > some debug code that seems to be useful:
> > >
> > > e.g.:
> > >   EVENT(span_, "InputReceived", {{"batch.length", batch.length}, {"side", side}});
> > >
> > > But when I try to use similar code in my ExecNode, I get an error:
> > >
> > > /home/icexelloss/workspace/arrow/cpp/src/arrow/compute/exec/asof_join_node.cc:67:9:
> > > error: unused variable ‘side’ [-Werror=unused-variable]
> > >    67 |     int side = (input == inputs_[0]) ? 0 : 1;
> > >       |         ^~~~
> > >
> > > (here is my code):
> > >   void InputReceived(ExecNode* input, ExecBatch batch) override {
> > >     int side = (input == inputs_[0]) ? 0 : 1;
> > >     EVENT(span_, "InputReceived", {{"batch.length", batch.length}, {"side", side}});
> > >   }
> > >
> > > I wonder:
> > > (1) Is there a special cmake flag I need to pass in to enable the EVENT
> > > macro?
> > > (2) What does the EVENT macro do and where does it output to?
> > >
> > > Thanks!
> > > Li
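
For reference, the `#ifdef` guard suggested in the quoted reply can be tried outside of Arrow with a small self-contained sketch. Everything here is a hypothetical stand-in rather than Arrow code: `MY_WITH_TRACING` plays the role of `ARROW_WITH_OPENTELEMETRY`, `MY_EVENT` plays the role of `EVENT`, and `Node` / `Batch` are toy types. It also assumes the tracing macro drops its arguments when tracing is compiled out, which is what the unused-variable error above suggests.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Hypothetical stand-ins, not Arrow's definitions: MY_WITH_TRACING mimics
// ARROW_WITH_OPENTELEMETRY and MY_EVENT mimics EVENT.
#ifdef MY_WITH_TRACING
#define MY_EVENT(name, key, value) \
  (std::cout << (name) << ": " << (key) << "=" << (value) << "\n")
#else
// Tracing disabled: the arguments are discarded, so a variable that is only
// ever passed to MY_EVENT would be reported as unused.
#define MY_EVENT(name, key, value) ((void)0)
#endif

struct Batch {
  std::int64_t length;
};

struct Node {
  const void* inputs[2] = {nullptr, nullptr};
  std::int64_t total_rows = 0;

  void InputReceived(const void* input, const Batch& batch) {
#ifdef MY_WITH_TRACING
    // `side` is declared only in tracing builds, so non-tracing builds never
    // see a variable without a use.
    int side = (input == inputs[0]) ? 0 : 1;
    MY_EVENT("InputReceived", "side", side);
#else
    static_cast<void>(input);  // only consulted when tracing is enabled
#endif
    // batch.length is used by the node itself, so this call needs no guard.
    MY_EVENT("InputReceived", "batch.length", batch.length);
    total_rows += batch.length;
  }
};
```

Compiling this with `-Wall -Werror` should succeed both with and without `-DMY_WITH_TRACING`; moving the `int side = ...` line outside the guard should reproduce the unused-variable error in the non-tracing build.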