On Apr 9, 2014, at 2:06 PM, "John Dill" <john.d...@greenfieldeng.com> wrote:
> I have several character data fields that happen to contain sections of
> non-ascii binary data including nul characters. I'd like to get a string
> display that shows all of the characters according to the length of the
> field, i.e.
>
> 20 20 20 20 20 20 01 00 01 00 48 31 20 20 20 20
>
> produces
>
> "      \001\000\001\000H1    "
>
> In proto.c, I see that all of the format_text calls use strlen(bytes) as the
> length.
>
>     case FT_STRING:
>     case FT_STRINGZ:
>     case FT_UINT_STRING:
>         bytes = (guint8 *)fvalue_get(&fi->value);
>         label_fill(label_str, hfinfo, format_text(bytes, strlen(bytes)));
>
> What is the recommended way of creating a text string that uses the octal
> encoding '\xxx' for non-ASCII data including nul characters that uses the
> 'length' field of 'proto_tree_add_item'?

The right short-term way would be to use
proto_tree_add_string_format_value() to add the field, and format the
string's value yourself, using format_text() with a byte count rather than
strlen().

The right long-term way is to modify Wireshark so that this works. The way
we handle strings should probably be changed so that we:

    store the raw string octets as a counted array, along with the string
    encoding;

    convert the octets from the encoding to UTF-8 *with invalid octets and
    sequences shown as escapes* when displaying the strings;

    convert the octets from the encoding to UTF-8 with invalid octets and
    sequences shown as Unicode REPLACEMENT CHARACTERS when making the
    string available for processing by other software (e.g., "-T fields",
    etc.) (or somehow say "this isn't a valid string in this encoding");

    somehow arrange that strings with invalid octets or sequences are
    *always* unequal to any character string in packet-matching
    expressions (display/read filters, color "filters", etc.), and perhaps
    allow strings to be compared against octet sequences (e.g.
    "foobar.name = 20:20:20:20:20:20:01:00:01:00:48:31:20:20:20:20"
    matches the raw octets of the string), and use that with "Prepare As
    Filter" etc.

Alternatively, if they're *not* really character strings, display them as a
set of subfields, with the text part shown as strings and the binary data
shown as whatever it is, e.g.

    Frobozz text 1: {blanks}
    Frobozz count 1: 1
    Frobozz count 2: 1
    Frobozz text 2: H1{and more blanks}

or whatever it is.
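
To illustrate the short-term approach: the essential change is to drive the
escaping from an explicit byte count instead of strlen(), so that an embedded
nul does not end the string early. The sketch below is a stand-alone stand-in
for that idea, not Wireshark's format_text(). The name escape_counted(), its
parameters, and its return convention are hypothetical, and escaping the
backslash itself is an assumption added only to keep the output unambiguous.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of Wireshark.
 *
 * Escapes exactly `len` bytes from `bytes` into `out`: printable ASCII is
 * copied as-is, a backslash is doubled, and every other byte (including
 * nul) becomes a three-digit octal escape such as \001.
 *
 * Returns the number of characters written, not counting the terminating
 * nul, or -1 if `out` might be too small (in which case `out` is set to
 * the empty string when it has room for one).
 */
static int
escape_counted(const unsigned char *bytes, size_t len,
               char *out, size_t out_size)
{
    size_t i, pos = 0;

    for (i = 0; i < len; i++) {
        unsigned char c = bytes[i];

        /*
         * Conservative check: the longest escape is 4 characters, plus
         * 1 for the terminating nul, so require 5 free bytes per step.
         */
        if (out_size < 5 || out_size - pos < 5) {
            if (out_size > 0)
                out[0] = '\0';
            return -1;
        }

        if (c == '\\') {
            out[pos++] = '\\';
            out[pos++] = '\\';
        } else if (c >= 0x20 && c <= 0x7e) {
            out[pos++] = (char)c;
        } else {
            out[pos++] = '\\';
            out[pos++] = (char)('0' + ((c >> 6) & 03));
            out[pos++] = (char)('0' + ((c >> 3) & 07));
            out[pos++] = (char)('0' + (c & 07));
        }
    }

    if (out_size == 0)
        return -1;
    out[pos] = '\0';
    return (int)pos;
}
```

Given the 16 octets from the question and a buffer of 4 * 16 + 1 bytes, this
yields the text "      \001\000\001\000H1    ". In a dissector, the same
idea would mean passing the field's length, rather than strlen(), to
format_text() and handing the result to
proto_tree_add_string_format_value().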