Since you are only trying to index your clients' pages, I think it should be
doable to extract the meta info with regular expressions or similar. Or you can
ask your clients to expose some XML or RSS that you can process more easily.
But still, accessing the database directly will save you tons of time to
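
To make the two page-based options above a bit more concrete, here is a rough sketch in Python. It is only an illustration: the function names are made up, the regular expression assumes quoted attributes with name before content (real pages may well need a proper HTML parser), and the feed is assumed to be plain RSS 2.0.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: matches <meta name="..." content="...">.
# Assumes quoted attributes, with name appearing before content.
META_RE = re.compile(
    r'<meta\s+name=(["\'])(.*?)\1\s+content=(["\'])(.*?)\3',
    re.IGNORECASE | re.DOTALL,
)


def extract_meta(html):
    """Return a dict mapping lowercased meta names to their content."""
    return {m.group(2).lower(): m.group(4) for m in META_RE.finditer(html)}


def parse_rss_items(rss_text):
    """Return a list of (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items
```

The RSS route is usually less fragile, since the client decides exactly which fields to publish and nothing has to be guessed from the page layout.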
>Why not index their database directly?
I should have mentioned this in my first mail. Anyway, the clients are ready
to allow indexing of their DB, but they have some confidential data as well as
information about their own clients, and all the data is so tightly coupled
that it is difficult for them
Why not index their database directly?
--
Chris Lu
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Databa
I was just looking into a couple of search engines like indeed.com and
bixee.com, and I was really surprised by the accuracy of the information they
have built into their indexes and return in their search results.
I have the same sort of requirement: to build indexes for all my clients' sites
and provide