g_once_init*() (Re: Performance implications of GRegex structure)

2007-03-20 Thread Tim Janik
On Sat, 17 Mar 2007, Owen Taylor wrote: > On Sat, 2007-03-17 at 16:14 +0100, Marco Barisione wrote: >> Owen: what should do exactly G_STATIC_REGEX_INIT? > > I was imagining: > > struct _GStaticRegex { >GOnce once; >GStaticRegex *regex; >const gchar *pattern; >GRegexCompileFlags fl
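
The once-initialized static pattern imagined above can be sketched roughly as follows. This is only an illustration under stated assumptions: it swaps in POSIX `pthread_once` and `regex.h` for GLib's `GOnce` and the proposed `GStaticRegex`, so the names `digits_once`, `digits_regex`, `digits_ok` and `compile_digits` are hypothetical, and the pattern `^[0-9]+` stands in for `^\d+`, which POSIX extended syntax does not support.

```c
#include <pthread.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the "once" and "regex" fields of the proposed struct. */
static pthread_once_t digits_once = PTHREAD_ONCE_INIT;
static regex_t digits_regex;
static int digits_ok;

/* Runs exactly once, even if several threads reach pthread_once() together. */
static void compile_digits(void)
{
    digits_ok = (regcomp(&digits_regex, "^[0-9]+", REG_EXTENDED) == 0);
}

/* Returns a newly allocated copy of the leading digits, or NULL if there are none. */
char *get_leading_digits(const char *str)
{
    regmatch_t m[1];   /* per-call match state lives on the caller's stack */
    size_t len;
    char *out;

    pthread_once(&digits_once, compile_digits);
    if (!digits_ok || regexec(&digits_regex, str, 1, m, 0) != 0)
        return NULL;

    len = (size_t)(m[0].rm_eo - m[0].rm_so);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, str + m[0].rm_so, len);
    out[len] = '\0';
    return out;
}
```

The caller frees the returned string with `free()`. The compiled pattern is shared and never freed in this sketch, which mirrors a static initializer rather than a reference-counted object.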

Re: Performance implications of GRegex structure

2007-03-18 Thread Yevgen Muntyan
Gustavo J. A. M. Carneiro wrote: > I can't resist to not state my opining on this :P > > I think it's OK to have a single GRegex object, with no separate match > or matcher, IF g_regex_copy is basically a lightweight copy[1]. > It is. > I think this matches well with the rest of the GLib A

Re: Performance implications of GRegex structure

2007-03-18 Thread Gustavo J. A. M. Carneiro
I can't resist to not state my opining on this :P I think it's OK to have a single GRegex object, with no separate match or matcher, IF g_regex_copy is basically a lightweight copy[1]. I think this matches well with the rest of the GLib APIs wrt. thread safety. None[2] of the other GLib da

Re: Performance implications of GRegex structure

2007-03-18 Thread Freddie Unpenstein
> When you evaluate an API, you have to look at a number of things: > - Is the API complete? Can it do what is needed > - Does the API allow getting common things done in a few lines of > code? > - Is the API easy to figure out? > - Is the resulting code legible and easy to read? > - Does the AP

Re: Performance implications of GRegex structure

2007-03-17 Thread Yevgen Muntyan
Owen Taylor wrote: > On Sat, 2007-03-17 at 16:08 -0500, Yevgen Muntyan wrote: > >> Yevgen Muntyan wrote: >> >>> [snip] >>> To me here the only good argument in favor of separate Match objects is >>> multi-thread uses. >>> Simply because we already have Match object, just hidden. If the bes

Re: Performance implications of GRegex structure

2007-03-17 Thread Owen Taylor
On Sat, 2007-03-17 at 16:08 -0500, Yevgen Muntyan wrote: > Yevgen Muntyan wrote: > > [snip] > > To me here the only good argument in favor of separate Match objects is > > multi-thread uses. > > Simply because we already have Match object, just hidden. If the best > > way to fix GRegex > > for mu

Re: Performance implications of GRegex structure

2007-03-17 Thread Owen Taylor
On Sat, 2007-03-17 at 15:45 -0500, Yevgen Muntyan wrote: > Owen Taylor wrote: [...] > > If we can identify the most common patterns of usage, I think we can > > add convenience functions that make usage of an immutable pattern object > > almost as convenient as the current GRegex. > > > > You can

Re: Performance implications of GRegex structure

2007-03-17 Thread Yevgen Muntyan
Yevgen Muntyan wrote: > [snip] > To me here the only good argument in favor of separate Match objects is > multi-thread uses. > Simply because we already have Match object, just hidden. If the best > way to fix GRegex > for multi-threading is a separate match object, then it should be a > separa

Re: Performance implications of GRegex structure

2007-03-17 Thread Yevgen Muntyan
Owen Taylor wrote: > On Fri, 2007-03-16 at 21:30 +0100, Marco Barisione wrote: > >> Il giorno gio, 15/03/2007 alle 10.18 -0400, Owen Taylor ha scritto: >> >>> But looking over the header file, there is something that puzzles me >>> about the way that it's set up: there is no distinction bet

Re: Performance implications of GRegex structure

2007-03-17 Thread Owen Taylor
On Sat, 2007-03-17 at 16:19 +0100, Marco Barisione wrote: > Il giorno sab, 17/03/2007 alle 10.07 -0400, Matthias Clasen ha scritto: > > Btw, one thing we might want to consider doing (regardless if we go > > with separate pattern and matcher objects) is to make the pattern > > optimization an optio

Re: Performance implications of GRegex structure

2007-03-17 Thread Owen Taylor
On Sat, 2007-03-17 at 16:14 +0100, Marco Barisione wrote: > I opened bug #419368[1] to track this issue, the API used by Owen in the > examples could be inefficient in some cases, so in the next days I'm > going to think to a usable and efficient API. > How can I call the match object? GRegexMatche

Re: Performance implications of GRegex structure

2007-03-17 Thread Marco Barisione
Il giorno sab, 17/03/2007 alle 10.07 -0400, Matthias Clasen ha scritto: > Btw, one thing we might want to consider doing (regardless if we go > with separate pattern and matcher objects) is to make the pattern > optimization an optional part of the constructor rather than a > separate > function. T

Re: Performance implications of GRegex structure

2007-03-17 Thread Marco Barisione
I opened bug #419368[1] to track this issue, the API used by Owen in the examples could be inefficient in some cases, so in the next days I'm going to think to a usable and efficient API. How can I call the match object? GRegexMatcher? GMatcher? GMatch? GRegexMatch? Owen: what should do exactly G_

Re: Performance implications of GRegex structure

2007-03-17 Thread Owen Taylor
On Fri, 2007-03-16 at 21:15 -0500, Yevgen Muntyan wrote: > Matthias Clasen wrote: > > On 3/16/07, Marco Barisione <[EMAIL PROTECTED]> wrote: > > > > > >> BTW if you want I can split GRegex in two separate objects. > >> > > > > Since that seems to be the overwhelming preference, > "overwhelm

Re: Performance implications of GRegex structure

2007-03-17 Thread Owen Taylor
On Fri, 2007-03-16 at 21:30 +0100, Marco Barisione wrote: > Il giorno gio, 15/03/2007 alle 10.18 -0400, Owen Taylor ha scritto: > > But looking over the header file, there is something that puzzles me > > about the way that it's set up: there is no distinction between a > > "pattern/regular express

Re: Performance implications of GRegex structure

2007-03-17 Thread mark
On Sat, Mar 17, 2007 at 12:48:15AM -0500, Yevgen Muntyan wrote: > I am suggesting something which is currently used in real code. > Simple, nice, and working. *If* it's not as good as it should be, > *then* it should be changed. It's not as good as it should be. :-) 1) In terms of interface,

Re: Performance implications of GRegex structure

2007-03-17 Thread Matthias Clasen
On 3/17/07, Yevgen Muntyan <[EMAIL PROTECTED]> wrote: > > Why should I > > have to search the entire search before I can display the first > > match? > > > You can't do the contrary - find all matches and display them. > (I guess Marco should know better, I've never done stuff like > this) > > In

Re: Performance implications of GRegex structure

2007-03-16 Thread Yevgen Muntyan
Hey, I looked at gtksourceview and its patterns, the syntax highlighting engine uses regular expressions with up to 56 subpatterns (length of patterns was the reason for egg_regex_ref()), which amounts to 670 bytes array to store offsets. The match structure in this case is some 40 bytes + those 6

Re: Performance implications of GRegex structure

2007-03-16 Thread Yevgen Muntyan
[Mark, I apologize, I accidentally sent it to you in private] [EMAIL PROTECTED] wrote: > On Fri, Mar 16, 2007 at 09:15:37PM -0500, Yevgen Muntyan wrote: > >> I do understand that a separate match object is a good idea. >> But "separate match object in C API is a good idea" is questionable. >> W

Re: Performance implications of GRegex structure

2007-03-16 Thread mark
On Fri, Mar 16, 2007 at 09:15:37PM -0500, Yevgen Muntyan wrote: > I do understand that a separate match object is a good idea. > But "separate match object in C API is a good idea" is questionable. > While thread-safety is important, it doesn't sound feasible a single > GRegex object will be used f

Re: Performance implications of GRegex structure

2007-03-16 Thread Yevgen Muntyan
Matthias Clasen wrote: > On 3/16/07, Marco Barisione <[EMAIL PROTECTED]> wrote: > > >> BTW if you want I can split GRegex in two separate objects. >> > > Since that seems to be the overwhelming preference, "overwhelming"? > that might > be a good idea. I hope this shouldn't be too bad, sin

Re: Performance implications of GRegex structure

2007-03-16 Thread Matthias Clasen
On 3/16/07, Marco Barisione <[EMAIL PROTECTED]> wrote: > BTW if you want I can split GRegex in two separate objects. Since that seems to be the overwhelming preference, that might be a good idea. I hope this shouldn't be too bad, since GRegex is already split into pattern and match objects, inter

Re: Performance implications of GRegex structure

2007-03-16 Thread Marco Barisione
Il giorno gio, 15/03/2007 alle 10.18 -0400, Owen Taylor ha scritto: > But looking over the header file, there is something that puzzles me > about the way that it's set up: there is no distinction between a > "pattern/regular expression" object and a match/matcher object. The internal code in GReg

Re: Performance implications of GRegex structure

2007-03-16 Thread mark
On Fri, Mar 16, 2007 at 08:20:11AM +0100, Mathieu Lacage wrote: > On Thu, 2007-03-15 at 10:56 -0400, Owen Taylor wrote: > > Well, I could imagine (maybe, barely) that someone could show me numbers > > that showed that with a variety of long and complicated regular > > expressions, compiling them wa

Re: Performance implications of GRegex structure

2007-03-16 Thread mark
On Fri, Mar 16, 2007 at 02:18:23PM -0400, Owen Taylor wrote: > On Fri, 2007-03-16 at 10:57 -0700, David Moffatt wrote: > > char * > > get_leading_digits(const char *str) > > { > > static GRegex *regex = NULL; > > char *result = NULL; > > > > if (!regex) > > regex = g_

RE: Performance implications of GRegex structure

2007-03-16 Thread Owen Taylor
On Fri, 2007-03-16 at 10:57 -0700, David Moffatt wrote: > char * > get_leading_digits(const char *str) > { > static GRegex *regex = NULL; > char *result = NULL; > > if (!regex) > regex = g_regex_new("^\\d+", 0, 0, NULL); > > if (g_regex_match(str, 0)) >

RE: Performance implications of GRegex structure

2007-03-16 Thread David Moffatt
char * get_leading_digits(const char *str) { static GRegex *regex = NULL; char *result = NULL; if (!regex) regex = g_regex_new("^\\d+", 0, 0, NULL); if (g_regex_match(str, 0)) result = g_regex_fetch(regex, 0); return result; } That code bothers

Re: Performance implications of GRegex structure

2007-03-16 Thread Behdad Esfahbod
On Fri, 2007-03-16 at 09:36 -0400, Morten Welinder wrote: > Is there a guarantee that for GRegex (unlike, say, GDate) multiple > threads can use > the same object at the same time? > > I.e., two threads cannot call g_date_get_weekday on the same date, so why are > we > expect that two threads can

Re: Performance implications of GRegex structure

2007-03-16 Thread Dimi Paun
On Fri, March 16, 2007 08:28, Owen Taylor wrote: > char * > get_leading_digits(const char *str) > { > GStaticRegex regex = G_STATIC_REGEX_INIT("^\\d+", 0); > GMatcher *matcher; > char *result = NULL; > > matcher = g_matcher_new_static(&regex, str, 0); > if (g_matcher_ma

Re: Performance implications of GRegex structure

2007-03-16 Thread Morten Welinder
Is there a guarantee that for GRegex (unlike, say, GDate) multiple threads can use the same object at the same time? I.e., two threads cannot call g_date_get_weekday on the same date, so why are we expect that two threads can call g_regex_copy or anything like it? Morten

Re: Performance implications of GRegex structure

2007-03-16 Thread Owen Taylor
On Thu, 2007-03-15 at 14:16 -0500, Yevgen Muntyan wrote: > [Owen, I apologize, I hit Reply instead of Reply All] > > Owen Taylor wrote: > > So, the regular expression code has been committed to CVS finally. Yay! > > > > But looking over the header file, there is something that puzzles me > > about

Re: Performance implications of GRegex structure

2007-03-16 Thread Nikolai Weibull
On 3/16/07, Mathieu Lacage <[EMAIL PROTECTED]> wrote: > On Thu, 2007-03-15 at 10:56 -0400, Owen Taylor wrote: > > > Well, I could imagine (maybe, barely) that someone could show me numbers > > that showed that with a variety of long and complicated regular > > expressions, compiling them was still

Re: Performance implications of GRegex structure

2007-03-15 Thread Mathieu Lacage
On Thu, 2007-03-15 at 10:56 -0400, Owen Taylor wrote: > Well, I could imagine (maybe, barely) that someone could show me numbers > that showed that with a variety of long and complicated regular > expressions, compiling them was still 10x as fast as matching them > against very short strings. > >

Re: Performance implications of GRegex structure

2007-03-15 Thread Yevgen Muntyan
[Owen, I apologize, I hit Reply instead of Reply All] Owen Taylor wrote: > So, the regular expression code has been committed to CVS finally. Yay! > > But looking over the header file, there is something that puzzles me > about the way that it's set up: there is no distinction between a > "pattern

Re: Performance implications of GRegex structure

2007-03-15 Thread mark
On Thu, Mar 15, 2007 at 10:56:57AM -0400, Owen Taylor wrote: > The compiled form of a regular expression is not altered during matching, > so the same compiled pattern can safely be used by several threads at once. > ... > Well, I could imagine (maybe, barely) that someone could show me numbers >
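
The split between a read-only compiled pattern and caller-owned match state, as described in the quoted text, can be sketched like this. It is a minimal sketch that assumes POSIX `regex.h` in place of PCRE or GRegex; the `MatchSpan` type and the `match_span` helper are hypothetical names introduced only for illustration, and the sketch makes no claim about the thread safety of any particular `regexec` implementation.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical per-match result: owned by the caller, not by the pattern. */
typedef struct {
    int matched;    /* non-zero if the pattern matched */
    size_t start;   /* offset of the first matched byte */
    size_t end;     /* offset one past the last matched byte */
} MatchSpan;

/* Matches text against an already compiled pattern. The pattern is passed
 * as const; everything this call produces is written to *out. */
static void match_span(const regex_t *pattern, const char *text, MatchSpan *out)
{
    regmatch_t m[1];

    out->matched = (regexec(pattern, text, 1, m, 0) == 0);
    out->start = out->matched ? (size_t)m[0].rm_so : 0;
    out->end = out->matched ? (size_t)m[0].rm_eo : 0;
}
```

Two callers can hold separate `MatchSpan` values obtained from the same compiled pattern, and neither result overwrites the other. That is the property a separate match object would give the C API.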

Re: Performance implications of GRegex structure

2007-03-15 Thread Owen Taylor
On Thu, 2007-03-15 at 10:38 -0400, Morten Welinder wrote: > [Re PCRE] > > > (There is no match[er] object here, but the equivalent is in all the in > > and out parameters ...) > > Is it? If PCRE is as glibc, there is lots of state in the compiled expression > and you cannot use it threaded. How

Re: Performance implications of GRegex structure

2007-03-15 Thread Morten Welinder
[Re PCRE] > (There is no match[er] object here, but the equivalent is in all the in > and out parameters ...) Is it? If PCRE is as glibc, there is lots of state in the compiled expression and you cannot use it threaded. However, once the match call is done, another thread can use the compiled r

Performance implications of GRegex structure

2007-03-15 Thread Owen Taylor
So, the regular expression code has been committed to CVS finally. Yay! But looking over the header file, there is something that puzzles me about the way that it's set up: there is no distinction between a "pattern/regular expression" object and a match/matcher object. GRegex*g_regex_ne