Re: [fpc-pascal] Fast HTML Parser

2014-08-07 Thread Graeme Geldenhuys
On 2014-08-06 21:54, Marcos Douglas wrote:
> I know the tokens to search, but the HTML could be very different each other.
> I can't use a external tool. Need to be a application (that already exists).

Take a look at POWtils (aka PWU or PSP or Pascal Server Pages) created
by somebody known as Z505. There have been various locations for the
source code, but I think the latest is at:

  https://code.google.com/p/powtils/

It has (or at least had) a very simple to use HTML parser that was very
fast. If you don't come right with the above URL, I have some release
archives that I know contain the code. Just let me know and I can make
them available.


Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/
___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal


Re: [fpc-pascal] Fast HTML Parser

2014-08-07 Thread Graeme Geldenhuys
On 2014-08-06 21:54, Marcos Douglas wrote:
> I know the tokens to search, but the HTML could be very different each other.
> I can't use a external tool. Need to be a application (that already exists).

It seems a copy of the Fast HTML Parser unit I spoke of has made its way
into the FPC source code tree.

See /packages/chm/src/fasthtmlparser.pas

Attached is the original one I got from a powtils release. It includes the
parser, a utility unit and a demo program showing the parser in action
with some stats output.
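
For readers without the attachment, here is a minimal sketch of how the unit in the FPC tree can be driven. It is not the demo program from the archive. The names THTMLParser, OnFoundTag, OnFoundText and Exec, and the two-string / one-string callback signatures, are assumed to match the copy in packages/chm/src/fasthtmlparser.pas; check them against your copy. The TTagCounter class and the sample HTML are made up for illustration.

```pascal
program parsedemo;

{$mode objfpc}{$H+}

uses
  fasthtmlparser;  // assumed: the copy shipped in the FPC "chm" package

type
  // The callbacks are "procedure ... of object", so the handlers
  // have to be methods of some class. TTagCounter is hypothetical.
  TTagCounter = class
  public
    TagCount, TextCount: Integer;
    procedure FoundTag(NoCaseTag, ActualTag: string);
    procedure FoundText(Text: string);
  end;

procedure TTagCounter.FoundTag(NoCaseTag, ActualTag: string);
begin
  Inc(TagCount);
  WriteLn('tag : ', ActualTag);
end;

procedure TTagCounter.FoundText(Text: string);
begin
  Inc(TextCount);
  WriteLn('text: ', Text);
end;

var
  Html: string;
  Counter: TTagCounter;
  Parser: THTMLParser;
begin
  // Keep the source in a variable so it stays alive while parsing.
  Html := '<html><body><p>Hello <b>world</b></p></body></html>';
  Counter := TTagCounter.Create;
  Parser := THTMLParser.Create(Html);
  try
    Parser.OnFoundTag := @Counter.FoundTag;
    Parser.OnFoundText := @Counter.FoundText;
    Parser.Exec;  // walks the input and fires the callbacks
    WriteLn(Counter.TagCount, ' tags, ', Counter.TextCount, ' text runs');
  finally
    Parser.Free;
    Counter.Free;
  end;
end.
```

The parser only reports tags and text runs as it meets them; building a tree or matching open and close tags is left to the handlers.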


Regards,
  - Graeme -



fasthtmlparser.tar.gz
Description: GNU Zip compressed data

Re: [fpc-pascal] Fast HTML Parser

2014-08-07 Thread Michael Schnell

On 08/06/2014 07:54 PM, Rainer Stratmann wrote:
> It's not that difficult to write yourself.

In fact, my son once wrote (using Delphi) a parser that creates a list
of hierarchically linked objects from HTML code and can also write an
HTML file from this structure.

So you can read a file, use straightforward programming to modify the
content, and write it back.

As the HTML format is not very strict and is a moving target, the parser
unit is far from perfect, but it is in daily use and does a rather nice
job.

OTOH, I would not say it's fast, anyway :-( .

-Michael


[fpc-pascal] why dynamic array created in constructor not automatically free in destructor

2014-08-07 Thread Dennis Poon

I have a class like this:

TMyClass = class
public
  ar : array of integer;
  constructor Create;
  destructor Destroy; override;
end;

constructor TMyClass.Create;
begin
  inherited;
  SetLength(ar, 100);
end;

destructor TMyClass.Destroy;
begin
  ar := nil; //<--- this is needed otherwise a memory leak is reported!
  inherited;
end;
--

I would expect the compiler would automatically insert this ar := nil on 
my behalf because it seems like it does it for strings.

Am I missing some compiler directives?

Dennis





Re: [fpc-pascal] best? safest? fastest?

2014-08-07 Thread Mark Morgan Lloyd

waldo kitty wrote:
> On 8/6/2014 4:08 AM, Mark Morgan Lloyd wrote:
>> waldo kitty wrote:
>>> i suspect this is going to be like the long-standing joke of
>>>
>>>   cheap, fast, stable: choose two
>>>
>>> over the years, i've seen two schools of code for dealing with
>>> dates... years, specifically... one school is string based and the
>>> other is math based... both have their faults and pluses...
>>>
>>> eg: string based fault : prepend '19' to single digit year value
>>>     math based fault : 2003 - 1900 = 103 (3 is intended result)
>>
>> I'd be inclined to start off using your method 1, i.e. text
>> manipulation until the format is consistent.
>
> i don't understand "until the format is consistent"... the format has
> been in use since the 60s at least (AFAIK) ;)

What I mean is, while you're doing the initial processing to e.g. add
century digits to the date and possibly to check number of decimal
places etc.

> FWIW: i have taken some time and reworked things to be math based while
> still taking the required text format into account... i've seen a very
> nice increase in processing speed and now need to just make sure that i
> don't run into any of the basic and well known flaws that math
> processing of date strings seem to have ;)
>
>> Flatten the original record and save it in a database, create a new
>> flat text
>
> this appears that you are speaking of a sql database or similar? that
> may be a later feature but for now, everything is/has to be done with
> the raw TLE files...

Databases, even for plain-text records, can be incredibly useful.

> i'm not sure what you mean by "flatten", either... currently i break
> down the TLEs into their major records for storage in the in-memory
> ""database""... the processing i posted is done before that storage
> takes place...

I was thinking that the first thing you could do was convert the two
lines into a single one for processing, but on reflection it would be
better to save the original with as little modification as possible -
possibly with any accession info you had (i.e. what body had provided
that particular TLE).

> the goal of the program is to build the in-memory database from all
> specified TLE files and then to write out new TLE files which may be
> filtered on a selection property so that only certain matching TLE
> records are saved...

The problem there being that once the program stops you've then got to
rebuild the next time. What I normally do when handling large bodies of
tabular info is to either have a series of database tables or a series
of text files, where ideally the text files are absolutely predictable
(all fields a known length and appropriately padded).

What I'm normally looking for is rate-of-change over multiple records
with irregular timestamps, which is an awkward job however it's done.
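
On the two year-handling faults quoted earlier in this message (prepending '19', and 2003 - 1900 = 103): a minimal math-based sketch that avoids both. It assumes the usual TLE convention for two-digit epoch years, where 57..99 means 1957..1999 and 00..56 means 2000..2056. ExpandTleYear and ShortYear are hypothetical helper names, not taken from waldo's program.

```pascal
program tleyear;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Expand a two-digit TLE epoch year to four digits.
// Assumed pivot: 57..99 -> 1957..1999, 00..56 -> 2000..2056.
function ExpandTleYear(YY: Integer): Integer;
begin
  if (YY < 0) or (YY > 99) then
    raise ERangeError.CreateFmt('two-digit year expected, got %d', [YY]);
  if YY >= 57 then
    Result := 1900 + YY
  else
    Result := 2000 + YY;
end;

// Reduce a four-digit year to two digits for writing a TLE back out.
// "mod 100" sidesteps the "2003 - 1900 = 103" fault.
function ShortYear(Year: Integer): Integer;
begin
  Result := Year mod 100;
end;

begin
  WriteLn(ExpandTleYear(StrToInt('03')));  // 2003
  WriteLn(ExpandTleYear(StrToInt('98')));  // 1998
  WriteLn(ShortYear(2003));                // 3
end.
```

When writing the year back into the fixed-width field it still has to be padded to two characters, e.g. with Format('%.2d', [ShortYear(Y)]).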


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


Re: [fpc-pascal] why dynamic array created in constructor not automatically free in destructor

2014-08-07 Thread Jürgen Hestermann

On 2014-08-07 10:21, Dennis Poon wrote:
> TMyClass=class
> public
>   ar : array of integer;
>   constructor Create;
>   destructor destroy;override;
> end;
> TMyClass.Create;
> begin
>    inherited;
>    SetLength(ar, 100);
> end;
> TMyClass.Destroy;
> begin
>    ar := nil;//<--- this is needed otherwise a memory leak is reported!
>   inherited;
> end;
> I would expect the compiler would automatically insert this ar := nil on my
> behalf because it seems like it does it for strings.
> Am I missing some compiler directives?


I think strings are a very special case because they are treated differently
in many cases. Strings can be freed without harm because they are well
defined and no parts of them point to other objects on the heap. They have a
reference counter where the compiler logs how many instances are pointing to
the string, and it only removes the string when the counter has reached zero.

If you have a dynamic array then elements may point to other structures
(which in turn may point to further structures) that also need to be freed.
If you have allocated memory yourself then the compiler does not know
how (and when) to free these objects. They may be used in other arrays or
elsewhere.




Re: [fpc-pascal] why dynamic array created in constructor not automatically free in destructor

2014-08-07 Thread Michael Van Canneyt



On Thu, 7 Aug 2014, Jürgen Hestermann wrote:

> On 2014-08-07 10:21, Dennis Poon wrote:
>> TMyClass=class
>> public
>>   ar : array of integer;
>>   constructor Create;
>>   destructor destroy;override;
>> end;
>> TMyClass.Create;
>> begin
>>    inherited;
>>    SetLength(ar, 100);
>> end;
>> TMyClass.Destroy;
>> begin
>>    ar := nil;//<--- this is needed otherwise a memory leak is reported!
>>   inherited;
>> end;
>> I would expect the compiler would automatically insert this ar := nil on my
>> behalf because it seems like it does it for strings.
>> Am I missing some compiler directives?
>
> I think strings are a very special case because they are treated differently
> in many cases. Strings can be freed without harm because they are well
> defined and no parts of them point to other objects on the heap. They have a
> reference counter where the compiler logs how many instances are pointing to
> the string, and it only removes the string when the counter has reached zero.
>
> If you have a dynamic array then elements may point to other structures
> (which in turn may point to further structures) that also need to be freed.
> If you have allocated memory yourself then the compiler does not know
> how (and when) to free these objects. They may be used in other arrays or
> elsewhere.


The compiler frees dynamic arrays.

The following program:

Type
  TMyClass=class
  public
    ar : array of integer;
    constructor Create;
    destructor destroy;override;
  end;

Constructor TMyClass.Create;
begin
  inherited;
  SetLength(ar, 100);
end;

Destructor TMyClass.Destroy;
begin
  //ar := nil;//<--- this is needed otherwise a memory leak is reported!
  inherited;
end;


begin
  With TMyClass.Create do
    Free;
end.

Reports 2 allocated blocks, and 2 free blocks. Exactly as you would expect:
home: >fpc -S2 -gh ta.pp
/usr/bin/ld: warning: link.res contains output sections; did you forget -T?
home: >./ta
Heap dump by heaptrc unit
2 memory blocks allocated : 432/432
2 memory blocks freed : 432/432
0 unfreed memory blocks : 0
True heap size : 294912
True free heap : 294912


What can happen is that somewhere you did a
  B:=MyClass.ar;

and as long as B is in scope, the reference count of the array is not 0 when
the object is freed, so the array is not freed either.
(the ar:=nil will not change that)

Another possibility is a forgotten 'inherited' somewhere in a destructor.
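
A minimal sketch of the first case, assuming nothing beyond the class from this thread. The TIntArray type name and the variable names are made up for illustration. It shows that a second reference keeps the array data alive after the object is gone:

```pascal
program arrref;

{$mode objfpc}

type
  TIntArray = array of Integer;  // named type so the field can be assigned to B

  TMyClass = class
  public
    ar: TIntArray;
    constructor Create;
  end;

constructor TMyClass.Create;
begin
  inherited Create;
  SetLength(ar, 100);
end;

var
  Obj: TMyClass;
  B: TIntArray;
begin
  Obj := TMyClass.Create;
  Obj.ar[0] := 42;
  B := Obj.ar;        // second reference to the same array data
  Obj.Free;           // the field's reference is released here ...
  WriteLn(B[0]);      // ... but the data lives on through B: prints 42
  WriteLn(Length(B)); // prints 100
  B := nil;           // last reference dropped, the array memory is released
end.
```

Built with -gh as above, the heaptrc summary can be used to check what is still allocated when the program ends.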

Michael.

Re: [fpc-pascal] why dynamic array created in constructor not automatically free in destructor

2014-08-07 Thread Jürgen Hestermann

On 2014-08-07 11:32, Michael Van Canneyt wrote:
> The compiler frees dynamic arrays.

Does this happen too if you declare a dynamic array
locally in a function and leave this function?



Re: [fpc-pascal] why dynamic array created in constructor not automatically free in destructor

2014-08-07 Thread leledumbo
> Does this happen too if you declare a dynamic array locally in a function
> and leave this function?

The global rule: as soon as the reference count reaches 0, the array gets
deallocated, wherever it's declared, as long as it's accessed normally
without any hack.
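
A small sketch of the local case, for illustration only (SumFirst is a made-up function). The local array holds the only reference, so it is released when the function returns:

```pascal
program localarr;

{$mode objfpc}

function SumFirst(N: Integer): Int64;
var
  Tmp: array of Integer;  // local dynamic array, managed by the compiler
  I: Integer;
begin
  SetLength(Tmp, N);
  Result := 0;
  for I := 0 to High(Tmp) do
  begin
    Tmp[I] := I + 1;
    Result := Result + Tmp[I];
  end;
  // No "Tmp := nil" needed: the reference count drops to 0 on exit.
end;

begin
  WriteLn(SumFirst(100));  // prints 5050
end.
```

Compiling with -gh, as in Michael's example, should show 0 unfreed memory blocks for this program.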





Re: [fpc-pascal] Fast HTML Parser

2014-08-07 Thread Marcos Douglas
On Wed, Aug 6, 2014 at 6:51 PM, Graeme Geldenhuys wrote:
> On 2014-08-06 21:54, Marcos Douglas wrote:
>> I know the tokens to search, but the HTML could be very different each other.
>> I can't use a external tool. Need to be a application (that already exists).
>
> Take a look at POWtils (aka PWU or PSP or Pascal Server Pages) created
> by somebody known as Z505. There have been various locations for the
> source code, but I think the latest is at:
>
>   https://code.google.com/p/powtils/
>
> It has (or at least had) a very simple to use HTML parser that was very
> fast. If you don't come right with the above URL, I have some release
> archives that I know contain the code. Just let me know and I can make
> them available.

But the fasthtmlparser from your earlier tip is part of the powtils
source, isn't it? I have had the code for many years, but I did not know
about fasthtmlparser. It's very simple. I did not find everything I want,
but it is a good start.

Best regards,
Marcos Douglas

PS: Like you I use FPC in real applications in production. So I have a
deadline - always short - to fulfill. So finding good code to help in
our projects is very good because it makes us save time. Thanks.


Re: [fpc-pascal] Fast HTML Parser

2014-08-07 Thread Marco van de Voort
In our previous episode, Marcos Douglas said:
> > It has (or at least had) a very simple to use HTML parser that was very
> > fast. If you don't come right with the above URL, I have some release
> > archives that I know contain the code. Just let me know and I can make
> > them available.
>
> But the fasthtmlparser from your earlier tip is part of the powtils
> source, isn't it? I have had the code for many years, but I did not know
> about fasthtmlparser. It's very simple. I did not find everything I want,
> but it is a good start.

Yes it is. The CHM parser is also based on it, but there z505 is not listed
as author but as contributor:

 AUTHOR   : James Azarja
http://www.jazarsoft.com/

 CONTRIBUTORS : L505
http://z505.com




[fpc-pascal] Scheduled Downtime of main FPC server

2014-08-07 Thread Michael Van Canneyt


Hello,

On Saturday and Sunday, the main FPC server will be down for a short time
while the mail, HTML (main FPC website and bugtracker) and SVN services are
moved to a new machine.

The plan is to move SVN/HTML first (this should take little time) and to
move SMTP (mail) later in the weekend.

The wiki, forum and mailing lists are located on other machines, so they
will not be affected.

As soon as a service is moved, the appropriate DNS records will be updated.

I will post the new IP address as soon as the move is completed.

When the new SVN/HTML machine is up, we'll see about setting up an official
git mirror on the new machine.

Michael.



Re: [fpc-pascal] Fast HTML Parser

2014-08-07 Thread luiz americo pereira camara
You can try http://www.benibela.de/sources_en.html#internettools

Luiz


2014-08-07 10:20 GMT-03:00 Marco van de Voort:

> In our previous episode, Marcos Douglas said:
> > > It has (or at least had) a very simple to use HTML parser that was very
> > > fast. If you don't come right with the above URL, I have some release
> > > archives that I know contain the code. Just let me know and I can make
> > > them available.
> >
> > But the fasthtmlparser from your earlier tip is part of the powtils
> > source, isn't it? I have had the code for many years, but I did not know
> > about fasthtmlparser. It's very simple. I did not find everything I want,
> > but it is a good start.
>
> Yes it is. The CHM parser is also based on it, but there z505 is not listed
> as author but as contributor:
>
>  AUTHOR   : James Azarja
> http://www.jazarsoft.com/
>
>  CONTRIBUTORS : L505
> http://z505.com