Re: DataImportHandler not indexing all the records

2008-11-15 Thread Ahmed Hammad
Thanks Shalin, I have added a new field type to my schema as follows: and added my field. After restarting Solr and running a full-import, everything works just fine. Many thanks. Regards, Ahmed Hammad On Sat, Nov 1

Re: DataImportHandler not indexing all the records

2008-11-15 Thread Jon Baer
I've also had the same issues here, but when trying to switch to HTMLStripWhitespaceTokenizerFactory I found that it only removes the tags; when it comes to all forms of JavaScript includes in a document, it keeps them all intact, so I ended up w/ scripts in the document text. Is there any easy

Re: DataImportHandler not indexing all the records

2008-11-15 Thread Shalin Shekhar Mangar
I think the problem is that DIH catches Exception but not Error, so a StackOverflowError will slip past it. Normally, the SolrDispatchFilter will log such errors, but the import is performed in a new thread, so the error is not logged anywhere. However, DIH will not commit documents in this case (and
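To illustrate the point about Exception versus Error, here is a minimal, self-contained Java sketch. It is not DIH's actual code: the class name, `importRows`, and the recorded strings are hypothetical, and the overflow is simulated by throwing a StackOverflowError directly. It only shows that a `catch (Exception e)` block in a worker thread does not see an Error, while `catch (Throwable t)` does.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ImportThreadSketch {

    // Hypothetical stand-in for the import work; simulates the failure
    // by throwing the Error directly instead of really exhausting the stack.
    static void importRows() {
        throw new StackOverflowError("simulated overflow during import");
    }

    // Runs importRows() on a new thread and reports what the catch block recorded.
    static String run(boolean catchThrowable) {
        AtomicReference<String> seen = new AtomicReference<>("nothing recorded");
        Thread worker = new Thread(() -> {
            if (catchThrowable) {
                try {
                    importRows();
                } catch (Throwable t) {
                    // Throwable covers both Exception and Error.
                    seen.set("logged " + t.getClass().getSimpleName());
                }
            } else {
                try {
                    importRows();
                } catch (Exception e) {
                    // Never reached: StackOverflowError is an Error, not an Exception.
                    seen.set("logged " + e.getClass().getSimpleName());
                }
            }
        });
        // Swallow the uncaught Error so this sketch stays quiet on stderr;
        // in this sketch it mimics the "not logged anywhere" situation.
        worker.setUncaughtExceptionHandler((thread, error) -> { });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // nothing recorded
        System.out.println(run(true));  // logged StackOverflowError
    }
}
```

Whether catching Throwable is the right fix for a given code base is a design decision; the sketch only demonstrates why the error can go unnoticed.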

Re: DataImportHandler not indexing all the records

2008-11-15 Thread Ahmed Hammad
I had a problem similar to Giri's. I have 17,000 records in one table and DIH can import only 12464. After some investigation, I found my problem. I have a regular expression to strip HTML tags from the input text, as follows: The DIH regex has a stack overflow on record 17,000 due to erro
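The regular expression from the original message is not preserved in this archive, so the following is only a hedged Java sketch of one common way to strip tags: a negated character class, `<[^>]*>`, rather than a quantified group. The class and method names are hypothetical. It does not remove the contents of script elements and does not handle a literal `>` inside attribute values.

```java
import java.util.regex.Pattern;

public class StripTagsSketch {

    // Matches "<", then any run of characters other than ">", then ">".
    // Assumption: tags never contain a literal ">" inside attribute values.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Returns the input with every tag-like span removed.
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<p>Hello <b>world</b></p>")); // Hello world
    }
}
```

For real HTML, a tokenizer or parser built for the purpose is generally more robust than any single regular expression.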

Re: DataImportHandler not indexing all the records

2008-11-14 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is no obvious problem. I can be reasonably sure that the query select * from climatedata.ws_record limit 100 would have fetched only 615360 rows. This is a very reliable piece of information: 615360 On Sat, Nov 15, 2008 at 12:41 AM, Giri <[EMAIL PROTECTED]> wrote: > Hi Noble, > thanks f

Re: DataImportHandler not indexing all the records

2008-11-14 Thread Giri
Hi Noble, thanks for the help, here are the details: the field "id" is unique, when I did a select distinct(id), it returned 1 million rows. --- db-data-config.xml note: I limit the resultset to 1 million in the select query -

Re: DataImportHandler not indexing all the records

2008-11-12 Thread Noble Paul നോബിള്‍ नोब्ळ्
The fact that it got committed in the end suggests there was no error in between. Look at the status URL and see the number of rows returned etc. It gives a clue as to what would have really happened. Or you can paste your dataconfig and status XMLs and we may be able to suggest something. On Thu, Nov

Re: DataImportHandler not indexing all the records

2008-11-12 Thread Giri
Hi Noble, thanks for the reply, my comments are below. >>why is the id field multivalued? I was just trying various options. Yes, this ID is unique, and I checked for duplicates: when I did a distinct (id) query to the MySQL database, it returned almost 2 million. >> look at the status host:port/dataim

Re: DataImportHandler not indexing all the records

2008-11-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
Why is the id field multivalued? Is there a uniqueKey in the schema? Are you sure there are no duplicates? Look at the status: host:port/dataimport gives you the status, and it can give you some clue. --Noble On Wed, Nov 12, 2008 at 4:53 AM, Giri <[EMAIL PROTECTED]> wrote: > Hi, > > I have about ~ 2

DataImportHandler not indexing all the records

2008-11-11 Thread Giri
Hi, I have about 2 million records in a MySQL database table (about 9 fields from a single table), and I am trying to load them into Solr with DataImportHandler using the command=full-import option. It only indexed about 615360 records out of 2 million. Here is my db-data-config.xml