I’m trying to understand how to parse a Buffer into a Schema, but stepping through the Python implementation with pdb and reading the TS/Python/C++ Arrow source hasn’t really cleared much up for me. Nor has studying https://arrow.apache.org/docs/ipc.html
Here are the steps I’ve tried (the code is Julia, but only because I’m trying to do this natively, rather than wrap the Arrow C code):

# Thrift API method returning a struct (sm_buf, sm_size, df_buf, df_size) (works as expected)
julia> tdf = sql_execute_df(conn, "select * from flights_2008_7m limit 1000", 0, 0, 1000)
MapD.TDataFrame(UInt8[0xba, 0x58, 0x1b, 0x3d], 93856, UInt8[0xab, 0xd7, 0x7e, 0x50], 188880)

# Wrap shared memory into julia array, based on handle and size (works as expected)
julia> sm_buf = MapD.load_buffer(tdf.sm_handle, tdf.sm_size) #wrapper using shmget/shmat
93856-element Array{UInt8,1}:
 0x2c 0x16 0x00 0x00 0x14 0x00 0x00 0x00 0x00 0x00 ⋮ 0x20 0x74 0x6f 0x20 0x4d 0x66 0x72 0x00 0x00

At this point, having walked through a similar process in Python, I know that sm_buf represents:

- type: Schema
- metadata length: 5676
- body_length: 0

Where I’m confused is how to proceed. I am getting metadata_length by reinterpreting the first 4 bytes as an Int32:

julia> mlen = reinterpret(Int32, sm_buf[1:4])[1]
5676

I then assumed that I could start at byte 5 and take the next `mlen` bytes (indices 5 through 5+mlen-1):

julia> metadata = sm_buf[5:5+mlen-1]
5676-element Array{UInt8,1}:
 0x14 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x0c 0x00 ⋮ 0x79 0x65 0x61 0x72 0x00 0x00 0x00 0x00 0x00

Am I on the right track here? I *think* that my `metadata` variable above is a FlatBuffer, but how do I know what its structure is?

Additionally, what am I supposed to do with all of the bytes that haven’t been read from `sm_buf` yet? `sm_buf` is 93856 bytes and I’ve only read the first 4 bytes plus the metadata length, leaving some 88,000 bytes not processed yet.

Any help would be greatly appreciated here. Please note that I’m not asking for Julia coding help, but rather what the Arrow bytes actually mean, how they are structured, and how to process them further.

Thanks,
Randy Zwitch
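
P.S. In case it makes the question easier to answer, here is a minimal sketch of the slicing I described above, pulled into one function. It assumes (and this is only my working assumption) that the buffer starts with a 4-byte little-endian Int32 length prefix followed immediately by that many metadata bytes. The name `split_prefix` is made up for illustration and is not part of any Arrow library; the example runs on a synthetic buffer rather than the real shared-memory segment.

```julia
# Hypothetical helper (not an Arrow API): split a buffer that is ASSUMED to
# begin with a 4-byte little-endian Int32 length prefix.
function split_prefix(buf::Vector{UInt8})
    length(buf) >= 4 || throw(ArgumentError("buffer shorter than the 4-byte prefix"))
    # ltoh converts little-endian to host byte order (a no-op on x86).
    mlen = Int(ltoh(reinterpret(Int32, buf[1:4])[1]))
    0 <= mlen <= length(buf) - 4 ||
        throw(ArgumentError("length prefix $mlen does not fit in the buffer"))
    metadata = buf[5:4 + mlen]           # exactly mlen bytes, starting at byte 5
    remaining = length(buf) - 4 - mlen   # bytes after the metadata, not yet interpreted
    return mlen, metadata, remaining
end

# Synthetic stand-in for sm_buf: the same 4 prefix bytes, then zero-filled bytes.
buf = vcat(UInt8[0x2c, 0x16, 0x00, 0x00], zeros(UInt8, 6000))
mlen, metadata, remaining = split_prefix(buf)
```

Applied to the real `sm_buf`, the same arithmetic leaves 93856 - 4 - 5676 = 88176 bytes after the metadata, which are the bytes I don’t know how to interpret.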