On Sun, Feb 12, 2023 at 03:31:08PM +0200, Yonatan Shtarkman wrote:
> Hey,
> When downloading a file whose path contains multi-byte utf-8, libguestfs
> sometimes crashes.
> This reproduces when using python, and not when using guestfish.
> 
> Code to reproduce:
> for i in range(2000):
>     g.download ('/xxxó', '/tmp/1')

'i' is not used inside the loop?  Or is this error intermittent?

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007ffff7fac140 in <signal handler called> ()
>     at /lib/x86_64-linux-gnu/libpthread.so.0
> #2  0x00007ffff6f77701 in _Py_INCREF (op=<optimized out>)
>     at /usr/include/python3.9/object.h:408
> #3  guestfs_int_py_event_callback_wrapper
>     (g=<optimized out>, flags=<optimized out>, array_len=0, array=0x0,
>      buf_len=47,
>      buf=0x113b8a0 "gs=0x0\r\ncommandrvf: udevadm --debug settle -E \303by",
>      event_handle=0, event=16, callback=0x7ffff2516600) at handle.c:137
> #4  guestfs_int_py_event_callback_wrapper
>     (g=<optimized out>, callback=0x7ffff2516600, event=16, event_handle=0,
>      flags=<optimized out>,
>      buf=0x113b8a0 "gs=0x0\r\ncommandrvf: udevadm --debug settle -E \303by",
>      buf_len=47, array=0x0, array_len=0) at handle.c:104
> #5  0x00007ffff6e0076a in guestfs_int_call_callbacks_message
>     (g=0xf31290, event=16,
>      buf=0x113b8a0 "gs=0x0\r\ncommandrvf: udevadm --debug settle -E \303by",
>      buf_len=47) at events.c:117
> #6  0x00007ffff6e1702e in guestfs_int_log_message_callback
>     (g=g@entry=0xf31290,
>      buf=0x113b8a0 "gs=0x0\r\ncommandrvf: udevadm --debug settle -E \303by",
>      len=len@entry=47) at proto.c:145
> #7  0x00007ffff6dfb759 in handle_log_message
>     (g=g@entry=0xf31290, conn=conn@entry=0x110e280) at conn-socket.c:395
> #8  0x00007ffff6dfbd63 in read_data
>     (len=4, bufv=<optimized out>, connv=<optimized out>, g=<optimized out>)
>     at conn-socket.c:179
> #9  read_data (g=0xf31290, connv=0x110e280, bufv=<optimized out>, len=4)
>     at conn-socket.c:142
> #10 0x00007ffff6e1764a in recv_from_daemon
>     (buf_rtn=0x7fffffffd858, size_rtn=0x7fffffffd854, g=0xf31290)
>     at proto.c:545
> #11 guestfs_int_recv_from_daemon
>     (g=g@entry=0xf31290, size_rtn=size_rtn@entry=0x7fffffffd854,
>      buf_rtn=buf_rtn@entry=0x7fffffffd858) at proto.c:623
> #12 0x00007ffff6e17a5a in guestfs_int_recv
>     (g=g@entry=0xf31290, fn=fn@entry=0x7ffff6e3b3e8 "download",
>      hdr=hdr@entry=0x7fffffffd920, err=err@entry=0x7fffffffd8f0,
>      xdrp=xdrp@entry=0x0, ret=ret@entry=0x0) at proto.c:668
> 
> I debugged this issue and noticed that the appliance logs from commandrvf are
> truncated, leading to parse failure (missing utf-8 additional bytes):
> https://github.com/libguestfs/libguestfs/blob/master/python/handle.c#L134
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0x84 in position 0:
> invalid start byte
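
To illustrate the failure mode described above, here is a minimal Python
sketch of what happens when a log buffer is cut in the middle of a
multi-byte character.  It does not use libguestfs at all: the log line, the
split point and the helper names (decode_strict, decode_lenient, describe)
are assumptions made up for the example, and the lenient decoder is only one
possible way a consumer could cope, not necessarily what the bindings do.

```python
# Hypothetical log line ending in a two-byte UTF-8 character (ó is 0xC3 0xB3).
line = "commandrvf: udevadm --debug settle -E /xxxó".encode("utf-8")

# Assume the message is split in the middle of that character.
head, tail = line[:-1], line[-1:]


def decode_strict(buf: bytes) -> str:
    """Decode as strict UTF-8; raises UnicodeDecodeError on a split character."""
    return buf.decode("utf-8")


def decode_lenient(buf: bytes) -> str:
    """Decode, substituting U+FFFD for bytes that are not valid UTF-8."""
    return buf.decode("utf-8", errors="replace")


def describe(buf: bytes) -> str:
    """Return the decoded text, or the reason strict decoding failed."""
    try:
        return decode_strict(buf)
    except UnicodeDecodeError as exc:
        return f"UnicodeDecodeError: {exc.reason}"
```

Here describe(tail) reports "invalid start byte", the same kind of error as
in the report, because the second chunk begins with a continuation byte.
Decoding head + tail together succeeds, which is consistent with the
truncation being the trigger rather than the path itself.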

So I thought we'd fixed this in:

https://github.com/libguestfs/libguestfs/commit/0ee02e0117527b86a31b2a88a14994ce7f15571f

Is this specifically a Python API problem, or would it affect
the C API too?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org