input XSLT

2009-03-09 Thread CIF Search
Just as you have an xslt response writer to convert Solr xml response to make it compatible with any application, on the input side do you have an xslt module that will parse xml documents to solr format before posting them to solr indexer. I have gone through dataimporthandler, but it works in dat

Re: Querying DB indexed data

2009-03-09 Thread Ashish P
Hi Shalin, Got the answer. I had uniquekey defined in schema.xml but that was not present in any columns hence problem for indexing. Thanks a lot for your help buddy. Cheers, Ashish Ashish P wrote: > > yes I did full import. so previous docs are gone as you said. > But when I do http://localhos

Re: Querying DB indexed data

2009-03-09 Thread Ashish P
yes I did full import. so previous docs are gone as you said. But when I do http://localhost:8080/solr/dataimport I get following response - 0 0 - - my-dataConfig.xml idle - 1 119 0 2009-03-10 14:49:58 Indexing completed. Added/Updated: 0 documents. D

Re: indexing multiple schemas Vs extending existing schema

2009-03-09 Thread Otis Gospodnetic
Hi, If you don't need to search all of the data in a single query, use separate indices. You don't need to run separate Solr instances - you can simply use Solr multi-core functionality. Sticking different types of data into a single index and then searching only a subset could mess with you

Re: Querying DB indexed data

2009-03-09 Thread Shalin Shekhar Mangar
On Tue, Mar 10, 2009 at 11:01 AM, Ashish P wrote: > > Oh looks like some other big problem, Now I am not able to see other text > data I indexed before adding DB data to index... > Can not search any data...But I am sure I was able to search before adding > DB to index > Any pointers??? > > So yo

Re: Querying DB indexed data

2009-03-09 Thread Ashish P
Oh looks like some other big problem, Now I am not able to see other text data I indexed before adding DB data to index... Can not search any data...But I am sure I was able to search before adding DB to index Any pointers??? Shalin Shekhar Mangar wrote: > > On Tue, Mar 10, 2009 at 10:48 AM, A

Re: Querying DB indexed data

2009-03-09 Thread Shalin Shekhar Mangar
On Tue, Mar 10, 2009 at 10:48 AM, Ashish P wrote: > > > In schema xml, I have defined following... > > stored="true" /> > stored="true" /> > Thanks, > Ashish > > If you search for *:* from the admin, do you see documents with user_name name1 present? -- Regards, Shalin Shekhar

Re: Querying DB indexed data

2009-03-09 Thread Ashish P
In schema xml, I have defined following... Thanks, Ashish Shalin Shekhar Mangar wrote: > > On Tue, Mar 10, 2009 at 10:31 AM, Ashish P > wrote: > >> now I am able to view data that is indexed using URL >> http://localhost:8080/solr/admin/dataimport.jsp to see the data as >>

Re: Querying DB indexed data

2009-03-09 Thread Shalin Shekhar Mangar
On Tue, Mar 10, 2009 at 10:31 AM, Ashish P wrote: > now I am able to view data that is indexed using URL > http://localhost:8080/solr/admin/dataimport.jsp to see the data as > > - > user1 > > - > 0 > > - > CN=user1,OU=R&D > > > > But when I search user_name:user1 then the result is

Querying DB indexed data

2009-03-09 Thread Ashish P
Hi, I performed steps given in http://wiki.apache.org/solr/DataImportHandler to index data from database. the data-config.xml is now I am able to view data that is indexed using URL http://localhost:8080/solr/admi

Re: Embed my webapp in solr jetty

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is no harm in hosting Solr along with other webapps. On Tue, Mar 10, 2009 at 5:14 AM, jlist9 wrote: > Is it a bad idea to embed my webapp in solr jetty? Or is it always > better to use a separate web server if I'm serving the result from a > web server? > > Thanks > -- --Noble Paul

Re: a new DIH manifestEnityProcessor

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Fergus, open a JIRA issue anyway. Put in your thoughts and we can refine the requirements as a part of the discussion. Basically the requirements are: 1) read a file line by line, 2) filter out lines (include or exclude) based on a regex, 3) extract parts (named parts) from the line using another
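The three requirements listed in this message can be sketched in a few lines of Python. This is only an illustration of the idea, not the DIH EntityProcessor that the thread goes on to discuss; the function name `extract_entries`, its parameters, and the sample manifest lines in the usage note are all assumptions made for the example.

```python
import re
from typing import Dict, Iterable, Iterator


def extract_entries(
    lines: Iterable[str],
    filter_pattern: str,
    extract_pattern: str,
    exclude: bool = False,
) -> Iterator[Dict[str, str]]:
    """Yield the named groups of extract_pattern for every accepted line.

    A line is accepted when it matches filter_pattern (include mode), or
    when it does not match it (exclude=True).
    """
    line_filter = re.compile(filter_pattern)
    extractor = re.compile(extract_pattern)
    for line in lines:                      # 1) read line by line
        line = line.rstrip("\r\n")
        matched = line_filter.search(line) is not None
        if matched == exclude:              # 2) include or exclude by regex
            continue
        found = extractor.search(line)      # 3) pull out the named parts
        if found:
            yield found.groupdict()
```

With a hypothetical manifest whose lines look like `updated /docs/a.xml 2009-03-09`, calling `extract_entries(lines, r"^updated", r"(?P<path>\S+\.xml)\s+(?P<date>\d{4}-\d{2}-\d{2})")` would yield one dict per updated file with `path` and `date` keys.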

Organizing POJO's in a heirarchy in Solr

2009-03-09 Thread Praveen_Kumar_J
Hi I just upload simple POJOs into Solr by creating custom types and dynamic fields in Solr schema as shown below, ... But I need to organize these POJOs in a hierarchy which can be navigated easily (something like explorer). Am not sure whether this feature is supported by

Re: DataImportHandler Robustness For Imports That Take A Long Time

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
I recommend writing a simple transformer which can write an entry into the db after n documents (say 1000), and modifying your query to take that entry into account so that subsequent imports will start from there. DIH does not write the last_index_time unless the import completes successfully. On Tue, M
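The checkpoint idea in this reply can be illustrated outside of DIH. The sketch below is not a DIH Transformer; it is a plain Python and SQLite illustration of the same pattern, and the table names `docs` and `import_checkpoint`, the column names, and the `index_doc` callback are all assumptions made for the example.

```python
import sqlite3
from typing import Callable


def import_with_checkpoints(
    conn: sqlite3.Connection,
    index_doc: Callable[[int, str], None],
    batch_size: int = 1000,
) -> int:
    """Index rows of the (hypothetical) docs table, saving progress per batch.

    Returns the number of documents handed to index_doc in this run.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS import_checkpoint ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), last_doc_id INTEGER NOT NULL)"
    )
    row = conn.execute(
        "SELECT last_doc_id FROM import_checkpoint WHERE id = 1"
    ).fetchone()
    last_id = row[0] if row else 0          # resume point from an earlier run
    imported = 0
    while True:
        # The query itself takes the checkpoint into account.
        batch = conn.execute(
            "SELECT id, body FROM docs WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            return imported
        for doc_id, body in batch:
            index_doc(doc_id, body)
            last_id = doc_id
            imported += 1
        # Record progress only after the whole batch was indexed.
        conn.execute(
            "INSERT OR REPLACE INTO import_checkpoint (id, last_doc_id) "
            "VALUES (1, ?)",
            (last_id,),
        )
        conn.commit()
```

If a run dies partway through a batch, the next run starts from the last saved checkpoint and re-sends at most `batch_size` documents, which is harmless as long as re-indexing a document simply overwrites it.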

Re: DocSet implementation for around 300K documents - clarification regarding the memory size

2009-03-09 Thread Chris Hostetter
: I am curious what docset implementation would be chosen to store the : docset result. (Does it automatically select the right one based on : the density of the docset , for eg - if the number of set bits in : bitset is > 1/8th then may be storing as a BitDocSet might be ok - : but for storing a
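As a rough aid to the memory question in this thread, the arithmetic behind "bitset versus list of document ids" can be written out. This is only back-of-the-envelope arithmetic under stated assumptions (one bit per document in the index for a bitset, four bytes per stored id for an int list, no object overhead); it does not describe how Solr actually picks a DocSet implementation, and the index size of 10,000,000 used in the usage note is a made-up figure.

```python
def bitset_bytes(max_doc: int) -> int:
    """Bytes for a bitset with one bit per document in the index."""
    return (max_doc + 7) // 8


def int_list_bytes(num_matches: int) -> int:
    """Bytes for a list that stores each matching doc id as a 32-bit int."""
    return 4 * num_matches


def break_even_matches(max_doc: int) -> int:
    """Number of matches at which the int list grows as large as the bitset."""
    return bitset_bytes(max_doc) // 4
```

For 300,000 matching documents the int list needs about 1.2 MB regardless of index size, while a bitset over a hypothetical 10,000,000-document index needs about 1.25 MB, so under these assumptions the two are close to break-even at roughly 1/32 of the index.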

Re: Solr and Zend Lucene

2009-03-09 Thread Chris Hostetter
: We will be using sqllite for db.This can be used for a cd version where we : need to provide search i'm not really sure how that answers grant's question -- what is it about running Solr that seems problematic and makes you want to put the index in a database? http://people.apache.org/~hoss

Re: Spatial search using R-tree for indexed bounding boxes

2009-03-09 Thread Chris Hostetter
: Patrick (of local lucene fame) thinks it is possible to do extent queries with : the cartesian grid method -- essentially you select the "best fit" level and : cell, and that should be set for anything within the extent. The advantage of : this approach is that it is super-fast and scaleable.

RE: Query Boosting using both BQ and BF

2009-03-09 Thread Dean Missikowski (Consultant), CLSA
I found similar results when trying to use negative boost with values < 1. Chris mentioned in this thread http://markmail.org/message/d2dc4oocrynx7wj2 a way to implement negative boost is to add positive boost to everything except the case you want to demote. So, -type:story^2.0. -Original

Re: up/down sides to using compound file format for index?

2009-03-09 Thread Yonik Seeley
The compound file format used to have more issues with multi-threaded contention when searching, but these have been fixed. When the underlying directory implementation is non-blocking (NIO), there shouldn't be a difference for searching. I know that improvements have been made on the indexing si

indexing multiple schemas Vs extending existing schema

2009-03-09 Thread Deo, Shantanu
Hi, We have had some success in indexing our catalog proof-of-concept project, Now we want to add information such as packages and accessories which have a slightly different schema. I am wondering if its possible to add multiple indexes in the same solr instance or should we think about redef

Embed my webapp in solr jetty

2009-03-09 Thread jlist9
Is it a bad idea to embed my webapp in solr jetty? Or is it always better to use a separate web server if I'm serving the result from a web server? Thanks

Re: DIH with a list of changed documents?

2009-03-09 Thread Otis Gospodnetic
Re file vs. URL - can't both be hidden behind a URL object (file:// vs. http:// scheme)? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Fergus McMenemie > To: solr-user@lucene.apache.org > Sent: Monday, March 9, 2009 7:00:43 PM > Subje
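The suggestion here, treating a local file and a remote resource uniformly through a URL, can be sketched with Python's standard library, whose `urlopen` accepts both `file://` and `http://` URLs. The function name `read_manifest_lines` and the idea that the resource is a line-oriented manifest are assumptions taken from the surrounding threads, not something this message specifies.

```python
from typing import List
from urllib.request import urlopen


def read_manifest_lines(url: str, encoding: str = "utf-8") -> List[str]:
    """Return the lines of a text resource addressed by a file:// or http:// URL."""
    with urlopen(url) as response:
        # The caller does not need to know which scheme was used.
        return response.read().decode(encoding).splitlines()
```

A local file can be passed as `pathlib.Path("/some/manifest.txt").as_uri()` (a hypothetical path), and a remote one as an ordinary `http://` URL, with no change to the calling code.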

Re: DIH with a list of changed documents?

2009-03-09 Thread Fergus McMenemie
>Le 09-mars-09 à 22:29, Fergus McMenemie a écrit : >>> how would I implement entity-processor if I were able to get the list >>> of recently changed documents of our sites? >> >> H, this sounds like a job for my manifestEnityProcessor >> see if you can find the thread titled:- >> >> "a new DI

Re: DIH with a list of changed documents?

2009-03-09 Thread Paul Libbrecht
Le 09-mars-09 à 22:29, Fergus McMenemie a écrit : how would I implement entity-processor if I were able to get the list of recently changed documents of our sites? H, this sounds like a job for my manifestEnityProcessor see if you can find the thread titled:- "a new DIH manifestEnityPro

Re: DIH with a list of changed documents?

2009-03-09 Thread Fergus McMenemie
>Hello List, > >how would I implement entity-processor if I were able to get the list >of recently changed documents of our sites? > >thanks for hints. > >paul > >Attachment converted: OSX:smime 65.p7s (/) (00213A09) H, this sounds like a job for my manifestEnityProcessor see if yo

DIH with a list of changed documents?

2009-03-09 Thread Paul Libbrecht
Hello List, how would I implement entity-processor if I were able to get the list of recently changed documents of our sites? thanks for hints. paul smime.p7s Description: S/MIME cryptographic signature

DataImportHandler Robustness For Imports That Take A Long Time

2009-03-09 Thread Chris Harris
I have a dataset (7M-ish docs each of which is maybe 1-100K) that, with my current indexing process, takes a few days or maybe a week to put into Solr. I'm considering maybe switching to indexing with the DataImportHandler, but I'm concerned about the impact of this on indexing robustness: If I u

Re: a new DIH manifestEnityProcessor

2009-03-09 Thread Fergus McMenemie
>Hi Fergus, >The idea is that we have something generic which can be applicable to >a large set of users. If the manifest is a text file it can be read in >somestandard way (say line by line). So we can have an EntityProcessor >which reads a text file line and filer it by a regex like the way >'gre

Re: Verbose(r) logging in DIH?

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
it is really not available. probably we can have a LogTransformer which can Log using slf4j On Mon, Mar 9, 2009 at 11:55 PM, Jon Baer wrote: > Hi, > > Is there currently anything in DIH to allow for more verbose logging? >  (something more than status) ... was there a way to hook in your own

Re: a new DIH manifestEnityProcessor

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Fergus, The idea is that we have something generic which can be applicable to a large set of users. If the manifest is a text file it can be read in some standard way (say line by line). So we can have an EntityProcessor which reads a text file line by line and filters it by a regex like the way 'grep' wor

Verbose(r) logging in DIH?

2009-03-09 Thread Jon Baer
Hi, Is there currently anything in DIH to allow for more verbose logging? (something more than status) ... was there a way to hook in your own for debugging purposes? I can't seem to locate the options in the Wiki or remember if it was available. Thanks. - Jon

Re: a new DIH manifestEnityProcessor

2009-03-09 Thread Fergus McMenemie
>manifest processing has a very limited usecase. Why can't it be >processed using a PlainTextEntityProcessor and write a Tranformer to >read lines using regex? > Ehmmm Ok. The PlainTextEntityProcessor docs do not give me enough insight to see how this could be used to index each of the files listed

Re: Query Boosting using both BQ and BF

2009-03-09 Thread Peter Wolanin
This doesn't seem to match what I'm seeing in terms of using bq - using any value > 0 increases the score. For example, with no bq: solr title,score,type 2.2 1.6885357 Building a killer search for Drupal wikipage 1.5547959 New Solr module available for testing story

up/down sides to using compound file format for index?

2009-03-09 Thread Peter Wolanin
Trying to set up a server to host multiple Solr cores, we have run into the issue of too many open files a few times. The 2nd ed "Lucene in Action" book suggests using the compound file format to reduce the required number of files when having multiple indexes, but mentions a possible ~10% slow-do

Re: a new DIH manifestEnityProcessor

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
Manifest processing has a very limited use case. Why can't it be processed using a PlainTextEntityProcessor, with a Transformer written to read lines using regex? --Noble On Mon, Mar 9, 2009 at 8:30 PM, Fergus McMenemie wrote: > Hello, > > I have almost finished a new DIH EntityProcessor which > I a

Re: Re[2]: the time factor

2009-03-09 Thread sunnyfr
Hi Hoss, How come if bq doesn't influence what matches -- that's q -- bq only influence the scores of existing matches if they also match the bq when I put : as bq=(country:FR)^2 (status_official:1 status_new:1)^2.5 Ive no result if I put just bq=(country:FR)^2 Or bq=(status_official:1 stat

a new DIH manifestEnityProcessor

2009-03-09 Thread Fergus McMenemie
Hello, I have almost finished a new DIH EntityProcessor which I am calling the manifestEnityProcessor. It is designed around the idea that whatever demon is used to maintain your set of a few 100,000 xml documents it is likely to drop a report or log file explaining what has been changed within yo

Re: passing parameters into the XSLTResponseWriter: particularly hostname

2009-03-09 Thread Fergus McMenemie
>: I was wondering if there was a way of passing parameters into >: the XSLTResponseWriter writer. > >I don't think there's anyway to pass input in the traditional >sense, but you can set default/invariant params along with echoParams=all >to get the values you want into the XML doc itself wher

custom hitcollector example

2009-03-09 Thread Ron Chan
Hi Can someone point to or provide an example of how to incorporate a custom hitcollector when using Solr? Thanks Ron

Re: Really weird behabiour with the indexer with cronjobs

2009-03-09 Thread Marc Sturlese
Could this have something to do with the hardlinks of the snapshots? I mean... the snapcleaner removes the snapshots but... maybe something remains in there until tomcat is restarted? Marc Sturlese wrote: > > Hey there, > Something really weird happened with my indexer... I have an index of 2G >

multicore file path

2009-03-09 Thread Gargate, Siddharth
I am trying out multicore environment with single schema and solrconfig file. Below is the folder structure Solr/ conf/ schema.xml solrconfig.xml core0/ data/ core1/ data/ tomcat/ The solrhome property is set in tomcat as -Dsolr.solr.home=../.. And the solr.

Re: How can I configure different types in Solr?

2009-03-09 Thread Praveen_Kumar_J
Hi, Thanks for the reply. I did not get your idea. Am very new to Solr. Please give me some sample schema. Regards, Praveen Walter Underwood wrote: > > Or you can add a "type" field and filter on that. I do that with > type:movie and type:people. --wunder > > > On 3/6/09 9:10 AM, "Cheng Zha

muticore setup with tomcat

2009-03-09 Thread revas
Hi, I am trying to do a multicore setup. I added the following from the 1.3 solr download to new dir called multicore: core0, core1, solr.xml and solr.war in the tomcat context fragment i have defined as http://localhost:8080/multicore/admin http://localhost:8080/multicore/admin/core0 Th

Really weird behabiour with the indexer with cronjobs

2009-03-09 Thread Marc Sturlese
Hey there, Something really weird happened with my indexer... I have an index of 2G more or less and I am running cron jobs to keep it updated every 15 min (the size of the index keeps always the same more or less as I just update docs). Every time the update is done I optimize the index and send

Re: DataImportHandler that uses JNDI lookup

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
You can subclass JdbcDataSource, say as JndiJdbcDataSource, and override the init() method, and use that like this " ensure that you initialize the field private Callable factory; we will have to make this field protected. Can you raise an issue and we shall fix it soon --Noble

Re: Distributed search

2009-03-09 Thread Shalin Shekhar Mangar
On Mon, Mar 9, 2009 at 2:32 PM, Gargate, Siddharth wrote: > Hi, >I am trying distributed search and multicore but not able to fire a > query. I tried > > http://localhost:8080/solr/select/?shards=localhost:8080/solr/core0,localhost:8080/solr/core1&q=solr > I am getting following error: "M

RE: Distributed search

2009-03-09 Thread Gargate, Siddharth
Hi, I am trying distributed search and multicore but not able to fire a query. I tried http://localhost:8080/solr/select/?shards=localhost:8080/solr/core0,localhost:8080/solr/core1&q=solr I am getting following error: "Missing solr core name in path". Should I use particular core to fir
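Given the "Missing solr core name in path" error, one plausible reading is that the request itself has to be addressed to one of the cores, with the `shards` parameter still listing every core to search. The sketch below builds such a URL; that reading is an assumption, since the reply in this thread is cut off, and the helper name `distributed_select_url` is made up for the example.

```python
from typing import Sequence
from urllib.parse import urlencode


def distributed_select_url(
    base: str, query_core: str, shard_cores: Sequence[str], q: str
) -> str:
    """Build a select URL on one core that fans out to all listed cores.

    base is the Solr webapp URL, e.g. "http://localhost:8080/solr".
    """
    # Shard entries are written without the scheme, as in the original request.
    host_path = base.split("://", 1)[1]
    shards = ",".join(f"{host_path}/{core}" for core in shard_cores)
    # Keep ':', '/' and ',' readable instead of percent-encoding them.
    params = urlencode({"shards": shards, "q": q}, safe=":/,")
    return f"{base}/{query_core}/select?{params}"
```

For the setup described in the question, `distributed_select_url("http://localhost:8080/solr", "core0", ["core0", "core1"], "solr")` produces the same request as before, except that the path names `core0` before `/select`.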

DataImportHandler that uses JNDI lookup

2009-03-09 Thread The Flight Captain
I am using the DataImporterHandler to get database connection details from a data-config.xml file. I would like to do a JNDI lookup to get database connection details from a container. So that when I deploy my embedded Solr Instance I don't have to deploy a data-config.xml for each environment. A

allowLeadingWildcard is possible or?

2009-03-09 Thread Julian Davchev
Hi, I am reading http://issues.apache.org/jira/browse/SOLR-218 and see leading wildcard search is not possible. Any pointers on how I can implement or use that kind of search? Using Solr 1.3.

hi. allowLeadingWildcard is it possible or not yet?

2009-03-09 Thread Julian Davchev
Hi folks, I am reading this issue and from what I see it's not possible yet to search with first char wildcard. http://issues.apache.org/jira/browse/SOLR-218 Are there any workarounds or anyway at all I could allow such search. I looked into whole 2008,2009 mail archive but couldn't find anything.

Re: problem using dataimporthandler

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
this is what I got when I googled it http://forums.sun.com/thread.jspa?threadID=465472 are you sure it is not the same On Mon, Mar 9, 2009 at 10:59 AM, Ashish P wrote: > > Hi > Following is the data-config.xml > > >     driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" > url="jdbc:sqlserver