Re: [fpc-pascal] Fast HTML Parser
On 2014-08-06 21:54, Marcos Douglas wrote:
> I know the tokens to search, but the HTML could be very different from each other.
> I can't use an external tool. It needs to be an application (that already exists).

Take a look at POWtils (aka PWU or PSP or Pascal Server Pages), created by
somebody known as Z505. There have been various locations for the source
code, but I think the latest is at:

  https://code.google.com/p/powtils/

It has (or at least had) a very simple-to-use HTML parser that was very
fast. If you don't come right with the above URL, I have some release
archives that I know contain the code. Just let me know and I can make
them available.

Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Re: [fpc-pascal] Fast HTML Parser
On 2014-08-06 21:54, Marcos Douglas wrote:
> I know the tokens to search, but the HTML could be very different from each other.
> I can't use an external tool. It needs to be an application (that already exists).

It seems a copy of the Fast HTML Parser unit I spoke of has made its way
into the FPC source code tree. See

  /packages/chm/src/fasthtmlparser.pas

Attached is the original one I got from a powtils release. It includes the
parser, a utility unit and a demo program showing the parser in action
with some stats output.

Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

Attachment: fasthtmlparser.tar.gz (GNU Zip compressed data)
Re: [fpc-pascal] Fast HTML Parser
On 08/06/2014 07:54 PM, Rainer Stratmann wrote:
> It's not that difficult to write yourself.

In fact, my son once wrote (using Delphi) a parser that creates a list of
hierarchically linked objects from HTML code and can also write an HTML
file from this structure. So you can read a file, use straightforward
programming to modify the content, and write it back.

As the HTML format is not very strict and is a moving target, the parser
unit is far from perfect, but it is in daily use and does a rather nice
job. OTOH, I would not say it's fast, anyway :-( .

-Michael
[fpc-pascal] why dynamic array created in constructor not automatically free in destructor
I have a class like this:

  type
    TMyClass = class
    public
      ar : array of integer;
      constructor Create;
      destructor Destroy; override;
    end;

  constructor TMyClass.Create;
  begin
    inherited;
    SetLength(ar, 100);
  end;

  destructor TMyClass.Destroy;
  begin
    ar := nil; //<--- this is needed otherwise a memory leak is reported!
    inherited;
  end;

I would expect the compiler to insert this ar := nil automatically on my
behalf, because it seems to do so for strings.

Am I missing some compiler directives?

Dennis
Re: [fpc-pascal] best? safest? fastest?
waldo kitty wrote:
> On 8/6/2014 4:08 AM, Mark Morgan Lloyd wrote:
>> waldo kitty wrote:
>>> i suspect this is going to be like the long-standing joke of cheap,
>>> fast, stable: choose two
>>>
>>> over the years, i've seen two schools of code for dealing with
>>> dates... years, specifically... one school is string based and the
>>> other is math based... both have their faults and pluses...
>>>
>>> eg:
>>> string based fault : prepend '19' to single digit year value
>>> math based fault   : 2003 - 1900 = 103 (3 is intended result)
>>
>> I'd be inclined to start off using your method 1, i.e. text
>> manipulation until the format is consistent.
>
> i don't understand "until the format is consistent"... the format has
> been in use since the 60s at least (AFAIK) ;)

What I mean is: use text manipulation while you're doing the initial
processing, e.g. to add century digits to the date and possibly to check
the number of decimal places.

> FWIW: i have taken some time and reworked things to be math based while
> still taking the required text format into account... i've seen a very
> nice increase in processing speed and now need to just make sure that i
> don't run into any of the basic and well known flaws that math
> processing of date strings seem to have ;)

>> Flatten the original record and save it in a database, create a new
>> flat text
>
> this appears that you are speaking of a sql database or similar? that
> may be a later feature but for now, everything is/has to be done with
> the raw TLE files...

Databases, even for plain-text records, can be incredibly useful.

> i'm not sure what you mean by "flatten", either... currently i break
> down the TLEs into their major records for storage in the in-memory
> ""database""... the processing i posted is done before that storage
> takes place...

I was thinking that the first thing you could do was convert the two lines
into a single one for processing, but on reflection it would be better to
save the original with as little modification as possible, possibly
together with any accession info you had (i.e. what body had provided
that particular TLE).

> the goal of the program is to build the in-memory database from all
> specified TLE files and then to write out new TLE files which may be
> filtered on a selection property so that only certain matching TLE
> records are saved...

The problem there is that once the program stops, you have to rebuild the
database the next time it runs.

What I normally do when handling large bodies of tabular info is to have
either a series of database tables or a series of text files, where
ideally the text files are absolutely predictable (all fields a known
length and appropriately padded). What I'm normally looking for is
rate-of-change over multiple records with irregular timestamps, which is
an awkward job however it's done.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
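The two year-handling faults quoted above (prepending '19' to a short year
string, and 2003 - 1900 giving 103 instead of 3) can both be avoided with
plain integer arithmetic. The following is only a minimal sketch: the
function names are hypothetical, and the pivot value of 57 is an
assumption taken from the commonly documented TLE convention (two-digit
years 57..99 mean 19xx, 00..56 mean 20xx), which is not stated anywhere
in this thread.

```pascal
program tleyear;
{$mode objfpc}

{ Reduce a four-digit year to its last two digits. Using "mod 100"
  instead of "- 1900" avoids the 2003 - 1900 = 103 fault. }
function TwoDigitYear(FullYear: Integer): Integer;
begin
  Result := FullYear mod 100;
end;

{ Expand a two-digit year to four digits. The pivot is an ASSUMPTION
  (common TLE convention), not something given in the thread. }
function ExpandYear(YY: Integer): Integer;
const
  Pivot = 57;
begin
  if YY >= Pivot then
    Result := 1900 + YY
  else
    Result := 2000 + YY;
end;

begin
  WriteLn(TwoDigitYear(2003));  // prints 3
  WriteLn(TwoDigitYear(1998));  // prints 98
  WriteLn(ExpandYear(3));       // prints 2003
  WriteLn(ExpandYear(98));      // prints 1998
end.
```

Keeping the pivot in one named constant means the century rule lives in a
single place, whichever of the string-based or math-based approaches the
rest of the program uses.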
Re: [fpc-pascal] why dynamic array created in constructor not automatically free in destructor
On 2014-08-07 10:21, Dennis Poon wrote:
> type
>   TMyClass = class
>   public
>     ar : array of integer;
>     constructor Create;
>     destructor Destroy; override;
>   end;
>
> constructor TMyClass.Create;
> begin
>   inherited;
>   SetLength(ar, 100);
> end;
>
> destructor TMyClass.Destroy;
> begin
>   ar := nil; //<--- this is needed otherwise a memory leak is reported!
>   inherited;
> end;
>
> I would expect the compiler to insert this ar := nil automatically on my
> behalf, because it seems to do so for strings.
> Am I missing some compiler directives?

I think strings are a very special case because they are treated
differently in many cases. Strings can be freed without harm because they
are well defined and no parts of them point to other objects on the heap.
They have a reference counter in which the compiler records how many
references point to the string, and the string is only removed when the
counter has reached zero.

If you have a dynamic array, its elements may point to other structures
(which in turn may point to further structures) that also need to be
freed. If you have allocated memory yourself, the compiler does not know
how (and when) to free these objects. They may be used in other arrays or
elsewhere.
Re: [fpc-pascal] why dynamic array created in constructor not automatically free in destructor
On Thu, 7 Aug 2014, Jürgen Hestermann wrote:
> On 2014-08-07 10:21, Dennis Poon wrote:
>> type
>>   TMyClass = class
>>   public
>>     ar : array of integer;
>>     constructor Create;
>>     destructor Destroy; override;
>>   end;
>>
>> constructor TMyClass.Create;
>> begin
>>   inherited;
>>   SetLength(ar, 100);
>> end;
>>
>> destructor TMyClass.Destroy;
>> begin
>>   ar := nil; //<--- this is needed otherwise a memory leak is reported!
>>   inherited;
>> end;
>>
>> I would expect the compiler to insert this ar := nil automatically on
>> my behalf, because it seems to do so for strings.
>> Am I missing some compiler directives?
>
> I think strings are a very special case because they are treated
> differently in many cases. Strings can be freed without harm because
> they are well defined and no parts of them point to other objects on
> the heap. They have a reference counter in which the compiler records
> how many references point to the string, and the string is only removed
> when the counter has reached zero.
>
> If you have a dynamic array, its elements may point to other structures
> (which in turn may point to further structures) that also need to be
> freed. If you have allocated memory yourself, the compiler does not
> know how (and when) to free these objects. They may be used in other
> arrays or elsewhere.

The compiler frees dynamic arrays. The following program:

  Type
    TMyClass = class
    public
      ar : array of integer;
      constructor Create;
      destructor Destroy; override;
    end;

  Constructor TMyClass.Create;
  begin
    inherited;
    SetLength(ar, 100);
  end;

  Destructor TMyClass.Destroy;
  begin
    //ar := nil; //<--- this is needed otherwise a memory leak is reported!
    inherited;
  end;

  begin
    With TMyClass.Create do
      Free;
  end.

reports 2 allocated blocks and 2 freed blocks, exactly as you would expect:

  home: >fpc -S2 -gh ta.pp
  /usr/bin/ld: warning: link.res contains output sections; did you forget -T?
  home: >./ta
  Heap dump by heaptrc unit
  2 memory blocks allocated : 432/432
  2 memory blocks freed     : 432/432
  0 unfreed memory blocks : 0
  True heap size : 294912
  True free heap : 294912

What can happen is that somewhere you did a

  B := MyClass.ar;

and as long as B is in scope, the reference count of the array is not 0
when the object is freed, so the array is not freed either (the ar := nil
will not change that).

Another possibility is a forgotten 'inherited' somewhere in a destructor.

Michael.
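The second-reference case described above can be sketched as a small
program. This is an illustration only, not code from the thread: the
named type TIntArray, the variable names and the program name are
hypothetical, and a named array type is used so that the assignment
between the field and the variable is certain to be type compatible.

```pascal
program refdemo;
{$mode objfpc}

type
  TIntArray = array of Integer;  // hypothetical named type for this sketch

  TMyClass = class
  public
    ar: TIntArray;
    constructor Create;
  end;

constructor TMyClass.Create;
begin
  inherited;
  SetLength(ar, 100);  // the array now has one reference (the field)
end;

var
  Obj: TMyClass;
  B: TIntArray;
begin
  Obj := TMyClass.Create;
  B := Obj.ar;         // second reference to the same array
  Obj.Free;            // the field is released, but B still holds the array
  WriteLn(Length(B));  // prints 100: the array outlived the object
  B := nil;            // last reference dropped, the array is freed here
end.
```

Seen this way, an array that is still allocated after the object has been
destroyed points to another live reference somewhere, rather than to a
missing ar := nil in the destructor.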
Re: [fpc-pascal] why dynamic array created in constructor not automatically free in destructor
On 2014-08-07 11:32, Michael Van Canneyt wrote:
> The compiler frees dynamic arrays.

Does this also happen if you declare a dynamic array locally in a function
and then leave the function?
Re: [fpc-pascal] why dynamic array created in constructor not automatically free in destructor
> Does this happen too if you declare a dynamic array locally in a function and leave this function?

The general rule: as soon as the reference count reaches 0, the array is
deallocated, wherever it is declared, as long as it is accessed normally
and without any hack.
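For the local-variable case asked about above, a minimal sketch could look
like the following. The function and program names are hypothetical;
compiling it with -gh, as in the heaptrc run shown earlier in the thread,
is one way to check the resulting heap report on your own system.

```pascal
program localarr;
{$mode objfpc}

{ Fills a local dynamic array and returns the sum of its elements.
  The array is never freed explicitly. }
function SumOfSquares(N: Integer): Int64;
var
  Squares: array of Int64;  // local dynamic array, one reference
  I: Integer;
begin
  SetLength(Squares, N);
  Result := 0;
  for I := 0 to N - 1 do
  begin
    Squares[I] := Int64(I) * I;
    Result := Result + Squares[I];
  end;
  // No "Squares := nil" here: when the function exits, the local
  // variable goes out of scope, the reference count drops to 0 and
  // the array is released.
end;

begin
  WriteLn(SumOfSquares(100));  // prints 328350
end.
```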
Re: [fpc-pascal] Fast HTML Parser
On Wed, Aug 6, 2014 at 6:51 PM, Graeme Geldenhuys wrote:
> On 2014-08-06 21:54, Marcos Douglas wrote:
>> I know the tokens to search, but the HTML could be very different from each other.
>> I can't use an external tool. It needs to be an application (that already exists).
>
> Take a look at POWtils (aka PWU or PSP or Pascal Server Pages), created
> by somebody known as Z505. There have been various locations for the
> source code, but I think the latest is at:
>
>   https://code.google.com/p/powtils/
>
> It has (or at least had) a very simple-to-use HTML parser that was very
> fast. If you don't come right with the above URL, I have some release
> archives that I know contain the code. Just let me know and I can make
> them available.

But fasthtmlparser, which you suggested earlier, comes from the powtils
source, doesn't it? I have had the powtils code for many years, but I did
not know about fasthtmlparser. It's very simple. I did not find
everything I want, but it is a good start.

Best regards,
Marcos Douglas

PS: Like you, I use FPC in real applications in production, so I always
have a (short) deadline to meet. Finding good code to help in our projects
is very welcome because it saves us time. Thanks.
Re: [fpc-pascal] Fast HTML Parser
In our previous episode, Marcos Douglas said:
> > It has (or at least had) a very simple-to-use HTML parser that was very
> > fast. If you don't come right with the above URL, I have some release
> > archives that I know contain the code. Just let me know and I can make
> > them available.
>
> But fasthtmlparser, which you suggested earlier, comes from the powtils
> source, doesn't it? I have had the powtils code for many years, but I did
> not know about fasthtmlparser. It's very simple. I did not find
> everything I want, but it is a good start.

Yes, it is. The CHM parser is also based on it, but there z505 is listed
not as author but as contributor:

  AUTHOR       : James Azarja
                 http://www.jazarsoft.com/

  CONTRIBUTORS : L505
                 http://z505.com
[fpc-pascal] Scheduled Downtime of main FPC server
Hello,

On Saturday and Sunday, the main FPC server will be out for a short time
while the mail, HTML (main FPC website and bugtracker) and SVN services
are moved to a new machine.

The plan is to move SVN/HTML first (this should take little time) and to
move SMTP (mail) later in the weekend.

The Wiki, forum and mailing lists are located on other machines; they
will not be affected.

As soon as a service is moved, the appropriate DNS records will be
updated. I will post the new IP address as soon as the move is completed.

When the new SVN/HTML machine is up, we'll see about setting up an
official git mirror on the new machine.

Michael.
Re: [fpc-pascal] Fast HTML Parser
You can try http://www.benibela.de/sources_en.html#internettools

Luiz

2014-08-07 10:20 GMT-03:00 Marco van de Voort:
> In our previous episode, Marcos Douglas said:
> > > It has (or at least had) a very simple-to-use HTML parser that was very
> > > fast. If you don't come right with the above URL, I have some release
> > > archives that I know contain the code. Just let me know and I can make
> > > them available.
> >
> > But fasthtmlparser, which you suggested earlier, comes from the powtils
> > source, doesn't it? I have had the powtils code for many years, but I did
> > not know about fasthtmlparser. It's very simple. I did not find
> > everything I want, but it is a good start.
>
> Yes, it is. The CHM parser is also based on it, but there z505 is listed
> not as author but as contributor:
>
>   AUTHOR       : James Azarja
>                  http://www.jazarsoft.com/
>
>   CONTRIBUTORS : L505
>                  http://z505.com