Thanks for your help, indeed duplicating the result sounds cleaner than
duplicating before escaping. However, you patch still fails, when
allocating in "pool_data_stack_malloc" this time. I get the following
stack trace:

Panic: pool_data_stack_malloc(): stack frame changed
Error: Raw backtrace: /usr/local/lib/dovecot/libdovecot.so.0(+0x91662)
[0x7f4106ba1662] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x916e9)
[0x7f4106ba16e9] -> /usr/local/lib/dovecot/libdovecot.so.0(i_fatal+0)
[0x7f4106b3aae1] -> /usr/local/lib/dovecot/libdovecot.so.0(+0xac14e)
[0x7f4106bbc14e] ->
/usr/local/lib/dovecot/libdovecot.so.0(p_strdup+0x28) [0x7f4106bcbd88]
-> /usr/local/lib/dovecot/libdovecot.so.0(+0x5ef0c) [0x7f4106b6ef0c] ->
/usr/local/lib/dovecot/libdovecot.so.0(+0x5f30f) [0x7f4106b6f30f] ->
/usr/local/lib/dovecot/libdovecot.so.0(+0x5ebef) [0x7f4106b6ebef] ->
/usr/local/lib/dovecot/libdovecot.so.0(connection_input_default+0xb1)
[0x7f4106b9ee81] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_call_io+0x5f)
[0x7f4106bb5fdf] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run_internal+0x109)
[0x7f4106bb7499] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_handler_run+0x25)
[0x7f4106bb6085] ->
/usr/local/lib/dovecot/libdovecot.so.0(io_loop_run+0x38)
[0x7f4106bb6228] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x5ef6c)
[0x7f4106b6ef6c] -> /usr/local/lib/dovecot/libdovecot.so.0(+0x5fafc)
[0x7f4106b6fafc] ->
/home/kaiyou/projects/dovecot/src/doveadm/.libs/doveadm(+0x1b4b0)
[0x55f11cd3e4b0] ->
/home/kaiyou/projects/dovecot/src/doveadm/.libs/doveadm(doveadm_cmd_ver2_to_cmd_wrapper+0x23a)
[0x55f11cd5ae6a] ->
/home/kaiyou/projects/dovecot/src/doveadm/.libs/doveadm(doveadm_cmd_run_ver2+0x555)
[0x55f11cd5bc15] ->
/home/kaiyou/projects/dovecot/src/doveadm/.libs/doveadm(doveadm_cmd_try_run_ver2+0x37)
[0x55f11cd5bc67] ->
/home/kaiyou/projects/dovecot/src/doveadm/.libs/doveadm(main+0x1da)
[0x55f11cd3c31a] -> /usr/lib/libc.so.6(__libc_start_main+0xf1)
[0x7f4106792291] ->
/home/kaiyou/projects/dovecot/src/doveadm/.libs/doveadm(_start+0x2a)
[0x55f11cd3c6fa]

The trace is missing some symbols, I will debug tomorrow and see where
the call comes from exactly.

Regards,


On 10/17/2016 06:23 PM, Aki Tuomi wrote:
> Hi!
> 
> Looking at the code, I think the bug is that it just copies pointer to the 
> value, it should, instead, duplicate the value to some memory region. Can you 
> see if this following patch fixes it?
> 
> Aki
> 
>> On October 17, 2016 at 4:14 PM Pierre Jaury <pie...@jaury.eu> wrote:
>>
>>
>> Okay, it seems to me that the bug is due to "t_str_tabunescape" using
>> the unsafe datastack ("t_strdup_noconst") while the string is actually
>> returned in an async callback.
>>
>> Before it is handled by "client_dict_lookup", "client_dict_wait"
>> actually fires some IO loops that are more than likely to call "t_pop"
>> and free/flush the result string (I checked, it does call "t_pop" a
>> couple times indeed). Maybe it would be safer to use a standard
>> datastack pool when unescaping a string in that context.
>>
>> I could work on a patch that would:
>>
>>  - add an optional "pool" attribute to the "client_dict_cmd" structure,
>>  - pass the pool to the async lookup function,
>>  - use the pool when escaping strings that should survive the callback
>>    chain
>>
>> What do you think?
>>
>> Regards,
>> kaiyou
>>
>>
>> On 10/17/2016 09:52 AM, Pierre Jaury wrote:
>>> While trying to isolate properly and reproduce, I was able to trigger
>>> the same bug with the following code:
>>>
>>>   struct dict *dict;
>>>   char* dict_uri = "proxy::sieve";
>>>   char* key = "priv/key";
>>>   char* username = "ad...@domain.tld";
>>>   char* value, error;
>>>
>>>   dict_drivers_register_builtin();
>>>   dict_init(dict_uri, DICT_DATA_TYPE_STRING, username,
>>>     doveadm_settings->base_dir, &dict, &error);
>>>   dict_lookup(dict, pool_datastack_create(), key, &value);
>>>   printf(">%s\n", value); // outputs an empty string
>>>   dict_deinit(&dict);
>>>
>>> I trimmed it to the bare minimal string manipulation functions involved
>>> but cannot reproduce in that case:
>>>
>>>   pool_t pool = pool_datastack_create();
>>>
>>>   char* s1 = "test\001n\001rtest";
>>>   char* s2 = t_str_tabunescape(s1);
>>>   char* s3 = p_strdup(pool, s2);
>>>
>>>   printf("1>%s\n", s1);
>>>   printf("2>%s\n", s2);
>>>   printf("3>%s\n", s3); // all three output the string with NL and CR
>>>
>>> Maybe I am missing a function call in the process or maybe the issue is
>>> related to the way unescaping is performed in the async callback
>>> function in "dict-client.c", or maybe even some other edge case.
>>>
>>> Finally, I was able to run the first snippet without bug by removing the
>>> string duplication in "t_str_tabunescape" (which I realize is not a
>>> proper solution), or by explicitely using the following pool:
>>>
>>>   return str_tabunescape(p_strdup(pool_datastack_create(), str));
>>>
>>>
>>> Hope this helps.
>>>
>>> kaiyou
>>>
>>>
>>> On 10/17/2016 07:51 AM, Aki Tuomi wrote:
>>>> Hi!
>>>>
>>>> This does sound like a bug, we'll have look.
>>>>
>>>> Aki
>>>>
>>>>
>>>> On 17.10.2016 01:26, Pierre Jaury wrote:
>>>>> I dived a little bit further into the rabbit hole, up to the point where
>>>>> debugging has become unpracticle but I still haven't found the root
>>>>> cause for sure.
>>>>>
>>>>> I read most of the code for "p_strdup" based on datastack memory pools
>>>>> (which are used for dictionary lookups both with doveadm and by extdata)
>>>>> and it seems ok. Still, after "t_malloc_real" is called in "t_malloc0",
>>>>> the allocated buffer has the same address as the source string.
>>>>>
>>>>> The only sensible explanation I can come up with is that during
>>>>> unescaping, strings are not allocated properly, leading to the memory
>>>>> pool reusing the string address and zeroing it in the process before the
>>>>> string copy operation.
>>>>>
>>>>> I will follow on this path tomorrow, any lead is more than welcome.
>>>>>
>>>>> kaiyou.
>>>>>
>>>>> On 10/16/2016 11:16 PM, Pierre Jaury wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am using a dict proxy for my sieve extdata plugin to access some
>>>>>> fields from an SQLite database (autoreply text and other
>>>>>> database-configured items).
>>>>>>
>>>>>> All tests are performed against version 2.2.25.
>>>>>>
>>>>>>   $ dovecot --version
>>>>>>   2.2.25 (7be1766)
>>>>>>
>>>>>> My configuration looks like:
>>>>>>
>>>>>>   dict {
>>>>>>     sieve = sqlite:/etc/dovecot/pigeonhole-sieve.dict
>>>>>>   }
>>>>>>
>>>>>>   [...]
>>>>>>   sieve_extdata_dict_uri = proxy::sieve
>>>>>>
>>>>>> I am able to read pretty much any attribute without any issue, except
>>>>>> when the value contains a special character like "\r" or "\n". By using
>>>>>> the doveadm dict client, I narrowed it down to the dictionary management
>>>>>> part (either server, protocol or client).
>>>>>>
>>>>>> I was suspecting escaping functions from "lib/strescape.c" (mostly
>>>>>> str_tabescape and its counterpart, used by "lib-dict/client.c"), so I
>>>>>> monitored socket communications. It seems that escaping is done properly
>>>>>> on the server and the socket is not an issue either.
>>>>>>
>>>>>> The following strace dump results from running "doveadm dict get"
>>>>>> against the dict socket:
>>>>>>
>>>>>>   connect(8, {sa_family=AF_UNIX, sun_path="..."}, 110) = 0
>>>>>>   fstat(8, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
>>>>>>   [...]
>>>>>>   write(8, "H2\t0\t0\tad...@domain.tld\tsieve\n", 30) = 30
>>>>>>   [...]
>>>>>>   read(8, "Otest\1r\1ntest\n", 8192)      = 14
>>>>>>
>>>>>> Indeed "\1r" and "\1n" are the escape sequences used by
>>>>>> "lib/strescape.c". I went deeped and debugged the call to "dict_lookup"
>>>>>> performed by doveadm. Indeed the client gets the proper string from the
>>>>>> socket and to my surprise, it is properly unescaped.
>>>>>>
>>>>>> Then, in "client_dict_lookup" ("lib-dict/dict-client.c"), the call to
>>>>>> "p_strdup" returns an empty string (null byte set at the target address).
>>>>>>
>>>>>> Before the call to the dict "->lookup" attribute (client_dict_lookup):
>>>>>>
>>>>>>    RAX: 0x7ffff73a37c0 (push   r14)
>>>>>>    RBX: 0x6831b8 ("priv/reply_body")
>>>>>>    RCX: 0x7fffffffe240 --> 0x682a60 --> 0x6831b8 ("priv/reply_body")
>>>>>>    RDX: 0x6831b8 ("priv/reply_body")
>>>>>>    RSI: 0x683288 --> 0x7ffff7653120 --> 0x7ffff73ea620 ([...])
>>>>>>    RDI: 0x690ad0 --> 0x7ffff7400713 --> 0x75250079786f7270 ('proxy')
>>>>>>
>>>>>>    0x7ffff73a1f10 <dict_lookup+32>:      mov    rcx,r11 (value_r)
>>>>>>    0x7ffff73a1f13 <dict_lookup+35>:      mov    rdx,r8 (key)
>>>>>>    0x7ffff73a1f16 <dict_lookup+38>:      mov    rsi,r10 (pool)
>>>>>>    0x7ffff73a1f19 <dict_lookup+41>:      mov    rdi,r9 (dict)
>>>>>>    0x7ffff73a1f1c <dict_lookup+44>:      add    rsp,0x8
>>>>>> => 0x7ffff73a1f20 <dict_lookup+48>:      jmp    rax
>>>>>>
>>>>>> Before the call to p_strdup in "client_dict_lookup":
>>>>>>
>>>>>>    RSI: 0x6832d8 ("test\r\ntest") (lookup.result.value)
>>>>>>    RDI: 0x683288 --> 0x7ffff7653120 --> [...] (pool)
>>>>>>    RAX: 0x0 (result)
>>>>>>
>>>>>>    0x7ffff73a384f:       nop
>>>>>>    0x7ffff73a3850:       mov    rsi,QWORD PTR [rsp+0x8]
>>>>>>    0x7ffff73a3855:       mov    rdi,r14
>>>>>> => 0x7ffff73a3858:       call   0x7ffff736d3c0 <p_strdup@plt>
>>>>>>    0x7ffff73a385d:       mov    QWORD PTR [r13+0x0],rax
>>>>>>    0x7ffff73a3861:       mov    rsi,QWORD PTR [rsp+0x18]
>>>>>>    0x7ffff73a3866:       xor    rsi,QWORD PTR fs:0x28
>>>>>>    0x7ffff73a386f:       mov    eax,ebx
>>>>>>
>>>>>> After the call:
>>>>>>
>>>>>>    0x7ffff73a3850:       mov    rsi,QWORD PTR [rsp+0x8]
>>>>>>    0x7ffff73a3855:       mov    rdi,r14
>>>>>>    0x7ffff73a3858:       call   0x7ffff736d3c0 <p_strdup@plt>
>>>>>> => 0x7ffff73a385d:       mov    QWORD PTR [r13+0x0],rax
>>>>>>    0x7ffff73a3861:       mov    rsi,QWORD PTR [rsp+0x18]
>>>>>>    0x7ffff73a3866:       xor    rsi,QWORD PTR fs:0x28
>>>>>>    0x7ffff73a386f:       mov    eax,ebx
>>>>>>    0x7ffff73a3871:       jne    0x7ffff73a38da
>>>>>>
>>>>>>    RSI: 0x0
>>>>>>    RDI: 0x6832d8 --> 0x0
>>>>>>    RAX: 0x6832d8 --> 0x0 (result)
>>>>>>
>>>>>> It is worth noting that I can reproduce the exact same execution flow
>>>>>> with a non-multiline result string (lookup.result.value) that is
>>>>>> properly copied by "p_strdup" and returned in RAX, then displayed by
>>>>>> doveadm.
>>>>>>
>>>>>> I am not familiar with the pooling mechanism hidden behind the call to
>>>>>> p_strdump and not quite sure why this behaviour is emerging. Maybe I am
>>>>>> even miles away from an understanding of the issue here, but it sounds
>>>>>> to me like something is wrong in the way "p_strdup" performs the copy.
>>>>>>
>>>>>> Hope this helps,
>>>>>> kaiyou.
>>>>>>
>>>>>>
>>>>> >

Reply via email to