On Apr 4, 2014, at 7:30 AM, Hadriel Kaplan <hadriel.kap...@oracle.com> wrote:

> I might be overlooking something, but I don’t see a tvb_get_* function to get 
> a uint8/16/32/64 that was encoded as an ASCII or UTF-8 string in the packet. 
> Is there such a thing?

No.

I've occasionally also thought there should be such a routine.

Note, though, that, whilst tvb_get_guint8() and tvb_get_{n,le}tohXXX() can 
never fail, because every possible sequence of octets is a valid 2's complement 
integral value, routines to get a number encoded as a string *can* fail, e.g. 
0123xyzw is not a valid number in bases 8, 10, or 16.
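
To make the "can fail" point concrete, here's a rough sketch in plain C, with no Wireshark APIs involved. parse_ascii_uint32() is a made-up name, not an existing routine, and it assumes an ASCII-compatible encoding. It copies the text into a small stack buffer, so it does nothing to avoid the temporary string; the point is only that the caller gets a success/failure result along with the value.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a Wireshark API): parse an unsigned number
 * stored as ASCII text in buf[0..len) in the given base.
 * Returns true and stores the value in *out on success; returns false
 * if the text is empty, too long, has invalid characters for the base,
 * or doesn't fit in 32 bits.
 */
static bool
parse_ascii_uint32(const uint8_t *buf, size_t len, int base, uint32_t *out)
{
    char tmp[32];
    char *end;
    unsigned long val;

    assert(buf != NULL && out != NULL);

    if (len == 0 || len >= sizeof tmp)
        return false;

    /* strtoul() would accept leading white space and a sign; reject them. */
    if (!isxdigit(buf[0]))
        return false;

    memcpy(tmp, buf, len);
    tmp[len] = '\0';

    errno = 0;
    val = strtoul(tmp, &end, base);

    /* Fail on overflow or if anything was left unparsed. */
    if (errno != 0 || end != tmp + len)
        return false;
    if (val > UINT32_MAX)
        return false;

    *out = (uint32_t)val;
    return true;
}
```

With that, "0123xyzw" is rejected in bases 8, 10, and 16, while "0123" is accepted in all three, with a different value in each.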

There are other cases where a tvb_get_ routine can return "you lose", e.g. 
tvb_get_string_enc() can fail if there are invalid octet sequences (about the 
only encodings I know of where *every* octet sequence is a valid string are 
some of the ISO 8859-n encodings), and at least some floating-point formats 
probably have invalid values. (I guess an IEEE NaN is "valid", at least to the 
extent that if we try to format it it'll show up as "NaN", but if we try to do 
calculations with it we might get a floating-point exception.)

> Instead, it seems the dissectors that deal with string messages do a 
> tvb_get_string_enc() or tvb_format_text(), and then a strtol() or atoi(). But 
> in my way of thinking, the fact that it’s in a string-encoded form in the tvb 
> isn’t that much different from it being encoded as little-endian vs. 
> network-order.
> 
> Likewise, it’s not clear if there’s a way to define a protocol field that is 
> encoded as a string in the packet but is internally a uint8/16/32/64 (e.g., 
> for filtering purposes, val_string lookup, etc.). For example such that 
> proto_tree_add_item() would work. Instead, it seems some dissectors use the 
> returned strtol/atoi to then add the field to the tree as a true uint type, 
> or add it as a FT_STRING field type.

One advantage of having a field type like that is that, if the routine that 
fetches the value also adds an item to the protocol tree, it could, when the 
value is invalid, also add an expert item saying so.

And I'd like to see proto_tree_add_XXX_item() routines that add an item with a 
particular type *and* take a pointer argument and return the value for the item 
through that pointer; that could replace

        xxx = tvb_get_XXX();
        proto_tree_add_XXX(..., xxx);

combinations and

        xxx = tvb_get_XXX();
        proto_tree_add_item(...);       /* re-fetches the item value */

with

        proto_tree_add_XXX_item(..., &xxx);
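
As a toy model of that shape, here's a self-contained sketch. None of these types or functions are Wireshark's: toy_tree, toy_item, and toy_tree_add_uint16_item() are invented purely for illustration, and there's no bounds checking on the buffer, which a real routine would need. It only shows the value being fetched once, recorded in the tree, and handed back through the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for a protocol tree; NOT Wireshark types. */
struct toy_item {
    int      field_id;
    size_t   offset;
    uint32_t value;
};

struct toy_tree {
    struct toy_item items[8];
    size_t          count;
};

/* Fetch a 16-bit big-endian value; the caller guarantees offset + 2 is in range. */
static uint16_t
toy_get_ntohs(const uint8_t *buf, size_t offset)
{
    return (uint16_t)((buf[offset] << 8) | buf[offset + 1]);
}

/*
 * Hypothetical combined routine: add a 16-bit big-endian field to the
 * tree *and* return its value through value_out (which may be NULL).
 * Returns the new item, or NULL if the toy tree is full.
 */
static struct toy_item *
toy_tree_add_uint16_item(struct toy_tree *tree, int field_id,
                         const uint8_t *buf, size_t offset,
                         uint16_t *value_out)
{
    struct toy_item *item;
    uint16_t value;

    assert(tree != NULL && buf != NULL);

    if (tree->count >= sizeof tree->items / sizeof tree->items[0])
        return NULL;

    value = toy_get_ntohs(buf, offset);     /* fetched exactly once */

    item = &tree->items[tree->count++];
    item->field_id = field_id;
    item->offset   = offset;
    item->value    = value;

    if (value_out != NULL)
        *value_out = value;
    return item;
}
```

A caller that needs the value for later decisions passes a pointer; one that doesn't passes NULL and gets the same behavior as a plain "add item" call.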

> And if we had common functions handle ascii and utf-8 string-encoded numbers, 
> they could avoid creating temporary strings as well.

The only real encoding issues are "ASCII superset" (so that "0123456789", for 
example, is encoded the same as in ASCII) vs. "2 or more bytes per ASCII 
character" (e.g., UCS-2, UTF-16, and UCS-4) vs. "one of those 7-bit GSM 
character encodings" vs. "EBCDIC".
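
For what it's worth, here's a sketch of how decimal digits could be read straight out of the packet bytes, with no temporary string, for three of those families. The enum and function names are invented for this example, it handles only unsigned decimal, and the packed 7-bit GSM encodings are left out because they'd need unpacking first. It relies on the digits being contiguous in each encoding: 0x30 to 0x39 in ASCII and its supersets and in UTF-16/UCS-2, 0xF0 to 0xF9 in EBCDIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding families, for this sketch only. */
enum num_text_enc {
    NUM_ENC_ASCII_SUPERSET, /* ASCII, UTF-8, ISO 8859-n: one byte per digit */
    NUM_ENC_UTF16_BE,       /* UTF-16/UCS-2, big-endian: two bytes per digit */
    NUM_ENC_UTF16_LE,       /* UTF-16/UCS-2, little-endian */
    NUM_ENC_EBCDIC          /* one byte per digit, '0' is 0xF0 */
};

/*
 * Hypothetical helper: parse an unsigned decimal number directly from
 * buf[0..len) without building a temporary string.  Returns false on
 * empty input, a truncated code unit, a non-digit, or 32-bit overflow.
 */
static bool
parse_decimal_in_place(const uint8_t *buf, size_t len,
                       enum num_text_enc enc, uint32_t *out)
{
    bool two_byte = (enc == NUM_ENC_UTF16_BE || enc == NUM_ENC_UTF16_LE);
    size_t unit = two_byte ? 2 : 1;
    uint16_t zero = (enc == NUM_ENC_EBCDIC) ? 0xF0 : 0x30;
    uint64_t acc = 0;
    size_t i;

    assert(buf != NULL && out != NULL);

    if (len == 0 || len % unit != 0)
        return false;

    for (i = 0; i < len; i += unit) {
        uint16_t cu;

        /* Assemble one code unit according to the encoding. */
        if (enc == NUM_ENC_UTF16_BE)
            cu = (uint16_t)((buf[i] << 8) | buf[i + 1]);
        else if (enc == NUM_ENC_UTF16_LE)
            cu = (uint16_t)(buf[i] | (buf[i + 1] << 8));
        else
            cu = buf[i];

        if (cu < zero || cu > zero + 9)
            return false;

        acc = acc * 10 + (uint64_t)(cu - zero);
        if (acc > UINT32_MAX)
            return false;
    }

    *out = (uint32_t)acc;
    return true;
}
```

The per-encoding differences end up confined to assembling a code unit and knowing where '0' lives; the accumulation loop is shared.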
