[Harbour] Re: About HB_TokenGet() function

Przemyslaw Czerpak Wed, 17 Dec 2008 07:41:54 -0800

Hi Juan,

> Is there some technical reason why the behavior of HB_TokenGet() is
> different when the delimiter is the space character?


The only reason is backward compatibility. I wrote the current code for
the hb_token*() functions, and when I was working on it I replicated the
existing behavior for the " " delimiter.

> If yes ... Could it be changed to be an optional behavior ?
> ? HB_TokenGet( ";2;3;4", 1, ";" )    // It shows ""
> ? HB_TokenGet( ";2;3;4", 2, ";" )    // It shows "2"
> ? HB_TokenGet( " 2 3 4", 1, " " )    // It shows "2"
> ? HB_TokenGet( " 2 3 4", 2, " " )    // It shows "3"

It seems you are not the first person to be confused by this, so we should
probably change it. But if we are going to touch it, then I suggest making
some other modifications as well.
Currently the hb_[a]Token*() functions accept the following parameters:

   hb_[a]Token*( <cStr> [, <nPos> ], [ <cDelim> ], ;
                 [ <lQuotedString> ], [ <lOnlyDblQuoting> ] ) -> <xResult>

The default <cDelim> is a single space " ".
When <cDelim> is a single space it has a special meaning: only non-empty
substrings count as tokens, e.g. "  What   a   nice  day  " is
divided into { "What", "a", "nice", "day" }
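As an illustration only, the difference between the two behaviors can be modeled in a few lines of Python. This is not Harbour code and not the actual hb_token*() implementation; the names tokens() and token_get() are made up for the sketch, and returning "" for a position past the last token is an assumption.

```python
def tokens(text, delim=" "):
    """Model of the current splitting rules described above."""
    parts = text.split(delim)
    if delim == " ":
        # Special case: a single space as delimiter drops empty
        # substrings, so runs of spaces act as one separator.
        return [p for p in parts if p]
    # Any other delimiter is strict: empty tokens are kept.
    return parts


def token_get(text, pos, delim=" "):
    """Return the pos-th token (1-based), or "" if out of range (assumed)."""
    found = tokens(text, delim)
    return found[pos - 1] if 1 <= pos <= len(found) else ""


print(repr(token_get(";2;3;4", 1, ";")))   # ''
print(repr(token_get(";2;3;4", 2, ";")))   # '2'
print(repr(token_get(" 2 3 4", 1, " ")))   # '2'
print(repr(token_get(" 2 3 4", 2, " ")))   # '3'
print(tokens("  What   a   nice  day  "))  # ['What', 'a', 'nice', 'day']
```

The leading ";" produces an empty first token, while the leading " " does not, which is exactly the asymmetry asked about.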

<lQuotedString>, when it is .T., makes the tokenizer respect quoted strings, so
the string [1,"2,3","4",'5,6'] with "," as the delimiter is divided into:
   [1]
   ["2,3"]
   ["4"]
   ['5,6']
<lOnlyDblQuoting>, when it is .T., restricts quoting to the (") character
only, so (') has no special meaning. For the above string, instead of the
fourth and last token ['5,6'] we get two:
   ['5]
   [6']
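The quoting rules can be sketched the same way. The Python function below is a hypothetical model, not the Harbour implementation: split_quoted() and its only_double flag are invented names, it assumes a single-character delimiter, and the example string is written without spaces after the commas so that the tokens match the ones listed above.

```python
def split_quoted(text, delim=",", only_double=False):
    """Split text on delim, ignoring delimiters inside quoted parts."""
    # With only_double set, (') is an ordinary character.
    quote_chars = '"' if only_double else "\"'"
    result, current, open_quote = [], [], None
    for ch in text:
        if open_quote is not None:
            # Inside quotes: only the matching quote character closes them.
            if ch == open_quote:
                open_quote = None
            current.append(ch)
        elif ch in quote_chars:
            open_quote = ch
            current.append(ch)
        elif ch == delim:
            result.append("".join(current))
            current = []
        else:
            current.append(ch)
    result.append("".join(current))
    return result


sample = "1,\"2,3\",\"4\",'5,6'"
print(split_quoted(sample))
# ['1', '"2,3"', '"4"', "'5,6'"]
print(split_quoted(sample, only_double=True))
# ['1', '"2,3"', '"4"', "'5", "6'"]
```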

I suggest making the following modifications:

1. Keep the current behavior of <cDelim> == " " only when the delimiter is not
   passed as a function parameter, or is the empty string "", so that the
   default value is used. Anyone who wants the current behavior can write:
      HB_TokenGet( " 2 3 4", 2 ) // "3"
   or:
      HB_TokenGet( " 2 3 4", 2, "" ) // "3"
   and anyone who wants strict delimiter behavior, as with any other
   delimiter value, can write:
      HB_TokenGet( " 2 3 4", 2, " " ) // "2"
   In other words, word delimiting is reserved for the empty string
   delimiter (""), and the default delimiter, when none is given, is the
   empty string.

2. Replace the two parameters <lQuotedString> and <lOnlyDblQuoting> with a
   single one, <lQuotedString>, with the following meaning:
      .T. - tokens can be quoted by (") and (') characters
      .F. - tokens can be quoted only by (") character
      NIL or any other value - no quoting

3. Add a new parameter <lStripQuoting>: when it is .T., leading and trailing
   quote characters are stripped from the returned tokens, e.g. in the above
   example ["2,3"] gives [2,3], ["4"] -> [4] and ['5,6'] -> [5,6]
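Taken together, the three points could behave roughly like the following Python model. It is only a sketch of the proposal as worded above, not committed Harbour code: proposed_tokens() and proposed_token_get() are hypothetical names, Python's None stands in for NIL, a single-character delimiter is assumed, and stripping only the quote characters that are currently enabled is also an assumption.

```python
def proposed_tokens(text, delim=None, quoted=None, strip_quotes=False):
    """Model of the proposed hb_[a]Token*() semantics (illustration only)."""
    # Point 1: no delimiter or "" selects word splitting on spaces;
    # an explicit " " is strict, like any other delimiter.
    word_mode = delim is None or delim == ""
    sep = " " if word_mode else delim

    # Point 2: True -> (") and ('), False -> (") only, anything else -> none.
    if quoted is True:
        quote_chars = "\"'"
    elif quoted is False:
        quote_chars = '"'
    else:
        quote_chars = ""

    result, current, open_quote = [], [], None
    for ch in text:
        if open_quote is not None:
            # Inside quotes: only the matching quote character closes them.
            if ch == open_quote:
                open_quote = None
            current.append(ch)
        elif ch in quote_chars:
            open_quote = ch
            current.append(ch)
        elif ch == sep:
            result.append("".join(current))
            current = []
        else:
            current.append(ch)
    result.append("".join(current))

    if word_mode:
        # Word splitting never yields empty tokens.
        result = [t for t in result if t]

    if strip_quotes:
        # Point 3: drop one matching pair of enclosing quote characters.
        result = [
            t[1:-1]
            if len(t) >= 2 and t[0] in quote_chars and t[-1] == t[0]
            else t
            for t in result
        ]
    return result


def proposed_token_get(text, pos, delim=None, quoted=None, strip_quotes=False):
    """Return the pos-th token (1-based), or "" if out of range (assumed)."""
    found = proposed_tokens(text, delim, quoted, strip_quotes)
    return found[pos - 1] if 1 <= pos <= len(found) else ""


print(repr(proposed_token_get(" 2 3 4", 2)))       # '3'
print(repr(proposed_token_get(" 2 3 4", 2, "")))   # '3'
print(repr(proposed_token_get(" 2 3 4", 2, " ")))  # '2'
sample = "1,\"2,3\",\"4\",'5,6'"
print(proposed_tokens(sample, ",", quoted=True, strip_quotes=True))
# ['1', '2,3', '4', '5,6']
```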

If other developers have no objections to the above, I'll make and commit
these modifications.

best regards,
Przemek
_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour
