An UpdateHandler to run following a MySql DataImport

2013-11-14 Thread Dileepa Jayakody
Hi All, I have written a custom update request handler to do some custom processing of documents and configured the /update handler to use my custom handler in the default: update.chain. The same requestHandler should be configured for the data-import-handler when it loads documents to solr index
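
One way to try the same chain on an import is to pass `update.chain` as a request parameter when triggering the DataImportHandler. The sketch below only builds such a URL; the base URL, the `/dataimport` handler path, and the chain name `mychain` are assumptions for illustration, and whether DIH honors the parameter should be checked against the Solr version in use.

```python
from urllib.parse import urlencode


def full_import_url(base_url, chain):
    """Build a DIH full-import URL that names an update chain.

    base_url and chain are hypothetical; the handler is assumed to be
    registered at /dataimport.
    """
    params = {"command": "full-import", "update.chain": chain}
    return f"{base_url}/dataimport?{urlencode(params)}"


url = full_import_url("http://localhost:8983/solr/collection1", "mychain")
print(url)
```

The same parameter could instead be placed in the handler's defaults in solrconfig.xml, which is closer to what the message describes for /update.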

Re: exceeded limit of maxWarmingSearchers ERROR

2013-11-14 Thread Loka
Hi Erickson, Thanks for your reply, basically, I used commitWithin tag as below in solrconfig.xml file dedupe true id false name,features,cat org.apache.solr.update.processor.Lookup3Signature

Solr spatial search within the polygon

2013-11-14 Thread Dhanesh Radhakrishnan
Hi, I'm experimenting with Solr spatial search, with plotting points in the map (latitude and longitude) and based on the value I need to get the result. As the first step I've defined the field type as And then added the field *location* as type *location_rpt* Indexed the location field as $
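
A minimal sketch of how a polygon filter for such a field might be assembled. It assumes the Solr 4 spatial syntax `field:"Intersects(POLYGON((...)))"` with WKT coordinates in "longitude latitude" order, and that polygon support (which, as far as I know, needs the JTS library on the classpath) is available. The sample points are made up.

```python
def polygon_filter(field, points):
    """Return a filter query matching documents inside a polygon.

    points is a list of (lat, lon) tuples. WKT expects "lon lat" pairs
    and a closed ring, so the first point is repeated at the end if
    the caller did not close the ring.
    """
    ring = list(points)
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    coords = ", ".join(f"{lon} {lat}" for lat, lon in ring)
    return f'{field}:"Intersects(POLYGON(({coords})))"'


print(polygon_filter("location", [(10, 20), (10, 30), (15, 30)]))
```

The resulting string would typically be sent as an `fq` parameter.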

Re: Document Security Model Question

2013-11-14 Thread Rajani Maski
Hi, For the case: "it requires constant reindexing if a value in this field changes": if the ACL for documents keeps changing, a Solr PostFilter is one of the options. We use it in our system. We have close to a billion documents and approximately 5000 users. But it is important to check whether the

Re: queries including time zone

2013-11-14 Thread Eric Katherman
We're still not seeing the proper result. I've included a gist of the query and its debug result. This was run on a clean index running 4.4.0 with just one document. That document has a date of 11/15/2013, yet in the included TZ it is the 14th, but I still get that document returned.

Re: SOLR DIH not indexing NFS share

2013-11-14 Thread tegryan
Hi Erick, I appreciate the answer. I just found out that it's failing on a .mov file with that error. I also noticed that I load the log4j.jar's twice, so I'm wondering if the wrong class loader is loading the logging and that's why it's giving me an unhelpful message. I've excluded .mov files

Re: SOLR DIH not indexing NFS share

2013-11-14 Thread Erick Erickson
At a quick glance at the very first error: java.lang.ClassCastException: java.lang.NoClassDefFoundError cannot be cast to java.lang.Exception Looks like you have some weird jars in your classpath and/or are using a strange version of Java. But that's just a guess. Erick On Thu, Nov 14, 2013 at

Document Security Model Question

2013-11-14 Thread kchellappa
I had earlier posted a similar discussion in LinkedIn and David Smiley rightly advised me that solr-user is a better place for technical discussions -- Our product which is hosted supports searching on educational resources. Our customers can choose to make specifi

Re: Date range faceting with various gap sizes?

2013-11-14 Thread Chris Hostetter
: I'm experimenting with date range faceting, and would like to use : different gaps depending on how old the date is. But I am not sure on : how to do that. What you are trying to do is possible, but the SolrJ helper methods you are using predate the ability and don't currently work the w
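
One generic way to get buckets of different widths is to issue one `facet.query` per bucket using Solr date math. This is a sketch of that idea, not necessarily the approach the truncated reply goes on to recommend; the field name `timestamp` and the bucket boundaries are assumptions.

```python
from urllib.parse import urlencode


def date_bucket_params(field, buckets):
    """Build request parameters with one facet.query per date bucket.

    buckets is a list of (start, end) Solr date-math expressions.
    """
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    for start, end in buckets:
        params.append(("facet.query", f"{field}:[{start} TO {end}]"))
    return params


# Hypothetical buckets: last day, last week, last year.
params = date_bucket_params("timestamp", [
    ("NOW/DAY-1DAY", "NOW"),
    ("NOW/DAY-7DAYS", "NOW/DAY-1DAY"),
    ("NOW/DAY-1YEAR", "NOW/DAY-7DAYS"),
])
print(urlencode(params))
```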

Group and Field Collapsing in SOLR "More like this"

2013-11-14 Thread balaji
Hi, I have two types of profile, Shadow and DO, and I am trying to use MLT to bring related recommendations for a userID. In the result I get both types, but I want to restrict the resulting documents through a field (type) that I pass in. Currently grouping and field collapsing does not seem to wo

RE: My setup - init script and other info

2013-11-14 Thread Boogie Shafer
it's worth pointing out there are init scripts for jetty which can be pulled from its regular distribution site and added to a solr installation with only minor modifications i do this with my rpm build process (i just pushed the updates for 4.5.1 release) https://github.com/boogieshafer/jetty-

Re: queries including time zone

2013-11-14 Thread Chris Hostetter
I've beefed up the ref guide page on dates to include more info about all of this... https://cwiki.apache.org/confluence/display/solr/Working+with+Dates -Hoss

Re: My setup - init script and other info

2013-11-14 Thread Shawn Heisey
On 11/14/2013 7:43 AM, Erick Erickson wrote: Shawn: Would you be willing to put this on the Wiki? I think it'd be really useful to have it there... I'm pretty sure you have edit rights to the wiki, but they're free for the asking if not... Done. To make it more obvious that it's not an offic

Re: Boosting documents by categorical preferences

2013-11-14 Thread Chris Hostetter
: I have a question around boosting. I wanted to use the &boost= to write a : nested query that will boost a document based on categorical preferences. You have no idea how stoked I am to see you working on this in a real world application. : Currently I have the weights set to the z-score equi

SOLR DIH not indexing NFS share

2013-11-14 Thread tegryan
I have SOLR with DIH using TIKA running fine on a local directory. It imports the data fine. I need it to work on an NFS mounted directory however, and it fails when I change it to use that. The tomcat6 user has access to the NFS mount (ls returns all files any way). The mount is NFS v3, if that ma

Re: facet method=enum and uninvertedfield limitations

2013-11-14 Thread Yonik Seeley
On Thu, Nov 14, 2013 at 12:03 PM, Lemke, Michael SZ/HZA-ZSW wrote: > I am running into performance problems with faceted queries. > If I do a > > q=word&facet.field=CONTENT&facet=true&facet.limit=10&facet.mincount=1&facet.method=fc&facet.prefix=a&rows=0 > > I am getting an exception: > org.apache

Re: Using data-config.xml from DIH in SolrJ

2013-11-14 Thread P Williams
Hi, I just discovered UpdateProcessorFactory in a big way. How did this completely slip by me? Working on two ideas. 1. I have used the DIH in a local EmbeddedSolrServer previously. I could writ

Re: queries including time zone

2013-11-14 Thread Chris Hostetter
: Can anybody provide any insight about using the tz param? The behavior : of this isn't affecting date math and /day rounding. What format does : the tz variables need to be in? Not finding any documentation on this. it's not "tz" it's "TZ" The input/output format is always in UTC, but TZ w
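
To illustrate why the time zone matters for /DAY rounding even though values are stored and returned in UTC, here is a small sketch that rounds the same instant down to the start of the day in two zones. It uses a fixed UTC-8 offset as a stand-in for a real zone (an assumption that ignores daylight saving) and does not call Solr at all.

```python
from datetime import datetime, timedelta, timezone


def start_of_day_utc(now_utc, tz):
    """Round now_utc down to local midnight in tz, expressed in UTC."""
    local = now_utc.astimezone(tz)
    local_midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return local_midnight.astimezone(timezone.utc)


now = datetime(2013, 11, 15, 3, 0, tzinfo=timezone.utc)
utc_minus_8 = timezone(timedelta(hours=-8))  # hypothetical fixed offset

# In UTC the day starts at 2013-11-15T00:00Z ...
print(start_of_day_utc(now, timezone.utc).isoformat())
# ... but at UTC-8 it is still the 14th, so the day starts at 2013-11-14T08:00Z.
print(start_of_day_utc(now, utc_minus_8).isoformat())
```

The two boundaries differ by most of a day, which is the kind of shift that decides whether a document dated the 15th falls inside a "today" range.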

facet method=enum and uninvertedfield limitations

2013-11-14 Thread Lemke, Michael SZ/HZA-ZSW
I am running into performance problems with faceted queries. If I do a q=word&facet.field=CONTENT&facet=true&facet.limit=10&facet.mincount=1&facet.method=fc&facet.prefix=a&rows=0 I am getting an exception: org.apache.solr.common.SolrException: Too many values for UnInvertedField faceting on fie
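
Since the thread title contrasts `facet.method=fc` with `facet.method=enum`, the following sketch rewrites the query string from the message to use a different facet method. It only manipulates the parameters; whether `enum` performs acceptably for this field is exactly what the thread is asking.

```python
from urllib.parse import parse_qsl, urlencode

ORIGINAL = ("q=word&facet.field=CONTENT&facet=true&facet.limit=10"
            "&facet.mincount=1&facet.method=fc&facet.prefix=a&rows=0")


def with_facet_method(query_string, method):
    """Return query_string with facet.method replaced by method."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    swapped = [(k, method if k == "facet.method" else v) for k, v in pairs]
    return urlencode(swapped)


print(with_facet_method(ORIGINAL, "enum"))
```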

Re: Query on multi valued field

2013-11-14 Thread Jack Krupansky
s/work/word/ "word delimiter filter" -- Jack Krupansky -Original Message- From: Jack Krupansky Sent: Thursday, November 14, 2013 11:34 AM To: solr-user@lucene.apache.org Subject: Re: Query on multi valued field I suppose you could define the field as tokenized text with the work deli

Re: Query on multi valued field

2013-11-14 Thread Jack Krupansky
I suppose you could define the field as tokenized text with the work delimiter filter and with autogeneratePhraseQueries="false" and the default query operator set to OR, and then queries might just work close enough to what you want. Otherwise... You could do a custom update processor that p

Re: Optimizing cores in SolrCloud

2013-11-14 Thread Walter Underwood
Earlier, you said that optimize is the only way that deleted documents are expunged. That is false. They are expunged when the segment they are in is merged. A forced merge (optimize) merges all segments, so will expunge all deleted documents. But those documents will be expunged by merges eventu

Re: Solr xml img parsing exception

2013-11-14 Thread Jack Krupansky
The actual error appears to be: Caused by: org.xml.sax.SAXParseException; lineNumber: 91; columnNumber: 105; The element type "img" must be terminated by the matching end-tag "</img>". So, check the input document at line 91, column 105. There should be an tag there, but SAX is complaining that ther
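
The same class of failure can be reproduced outside Solr. This sketch uses Python's standard XML parser (not the SAX parser from the stack trace) to show that an unclosed `img` element is rejected and that the parser reports a line and column; the sample markup is made up.

```python
import xml.etree.ElementTree as ET


def first_xml_error(text):
    """Return (line, column) of the first parse error, or None if well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position
    return None


# HTML-style <img> without a closing tag is not well-formed XML.
print(first_xml_error('<p><img src="a.png"></p>'))
# The self-closing form parses cleanly.
print(first_xml_error('<p><img src="a.png"/></p>'))
```

Running a check like this over crawled content before it reaches the XML loader would narrow down which pages carry HTML-style markup.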

Re: Solr xml img parsing exception

2013-11-14 Thread Marcello Lorenzi
Hi Erik, but in this case the custom loader receives an HTTP Error 500 by SOLR? Thanks, Marcello On 11/14/2013 04:29 PM, Erik Hatcher wrote: Also there's a custom loader here that is the culprit: com.lsegroup.solr.handler.CwsExtractingDocumentLoader On Nov 14, 2013, at 10:20, Erick Erickson

Re: Query on multi valued field

2013-11-14 Thread Upayavira
On Thu, Nov 14, 2013, at 03:45 PM, giridhar wrote: > Hi, > > I want to search in a multivalued field. > > For example, my field FormIds contains (1,2,3) as comma separated. > > If i search for 1 or (1,2) or (1,3) or (2,3) or (1,2,3) any combination > like > this should work. > > How to define t

Document routing question.

2013-11-14 Thread yriveiro
Hi, I read this post http://searchhub.org/2013/06/13/solr-cloud-document-routing and I have some questions. When a tenant is too large to fit on one shard, we can specify the number of bits from the shardKey that we want to use. If we set a doc's key as "tenant1/4!docXXX" we are saying to spread
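
A small sketch of how a composite ID such as "tenant1/4!docXXX" breaks down. It only parses the string and, on the assumption (my reading of the linked post) that N bits from the shard key spread a tenant over 1/2^N of the hash range, computes that fraction. It does not reproduce Solr's actual hashing.

```python
def parse_composite_id(doc_id):
    """Split 'shardKey/bits!localId' into (shard_key, bits, local_id).

    bits is None when no '/N' suffix is given; shard_key is None when
    the ID has no '!' separator at all.
    """
    if "!" not in doc_id:
        return None, None, doc_id
    shard_part, local_id = doc_id.split("!", 1)
    if "/" in shard_part:
        key, bits_text = shard_part.split("/", 1)
        return key, int(bits_text), local_id
    return shard_part, None, local_id


def hash_range_fraction(bits):
    """Fraction of the hash range covered when `bits` bits come from the key."""
    return 1 / (2 ** bits)


key, bits, local_id = parse_composite_id("tenant1/4!docXXX")
print(key, bits, local_id, hash_range_fraction(bits))
```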

Re: Query on multi valued field

2013-11-14 Thread giridhar
Hi, I want to search in a multivalued field. For example, my field FormIds contains (1,2,3) as comma separated. If I search for 1 or (1,2) or (1,3) or (2,3) or (1,2,3) any combination like this should work. How to define this multivalued integer field type. Thank you. -- View this message in
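
If FormIds were indexed as a true multiValued field with one value per ID (an assumption; the message describes a single comma-separated value), a query for any of several IDs could be built as below. Whether the intended semantics is "any of" (OR) or "all of" (AND) is not clear from the question, so the operator is a parameter.

```python
def match_ids(field, values, operator="OR"):
    """Build a query like 'FormIds:(1 OR 3)' for the given values."""
    joined = f" {operator} ".join(str(v) for v in values)
    return f"{field}:({joined})"


print(match_ids("FormIds", [1, 3]))
print(match_ids("FormIds", [1, 2, 3], operator="AND"))
```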

Re: Optimizing cores in SolrCloud

2013-11-14 Thread michael.boom
Thanks Erick! That's a really interesting idea, I'll try it! Another question would be, when does the merging actually happen? Is it triggered or conditioned by something? Currently I have a core with ~13M maxDocs and ~3M deleted docs, and although I see a lot of merges in SPM, deleted documents
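
For scale, the figures quoted above work out to roughly 23% of the index being deleted documents, assuming maxDocs counts deleted documents as well as live ones:

```python
def deleted_fraction(max_docs, deleted_docs):
    """Share of the index occupied by deleted documents."""
    return deleted_docs / max_docs


# Approximate figures from the message: ~13M maxDocs, ~3M deleted.
print(round(deleted_fraction(13_000_000, 3_000_000), 3))
```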

Re: Solr xml img parsing exception

2013-11-14 Thread Erik Hatcher
Also there's a custom loader here that is the culprit: com.lsegroup.solr.handler.CwsExtractingDocumentLoader On Nov 14, 2013, at 10:20, Erick Erickson wrote: > It looks like bad data. The XML you're sending to Solr looks mal-formed, so > I > suspect this is completely outside of Solr's purview

Re: Solr xml img parsing exception

2013-11-14 Thread Erick Erickson
It looks like bad data. The XML you're sending to Solr looks mal-formed, so I suspect this is completely outside of Solr's purview. Best, Erick On Thu, Nov 14, 2013 at 9:26 AM, Marcello Lorenzi wrote: > Hi, > I have installed a Solr 4.3 instance and we have configured manifoldcf to > pass web c

Re: solrcloud - forward update to a shard failed

2013-11-14 Thread Erick Erickson
Here's a writeup on the interactions between a number of the parameters for soft/hard commits, NRT, and transaction logs. FWIW. http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ Best, Erick On Thu, Nov 14, 2013 at 8:22 AM, Aileen wrote: > Thank

Re: Optimizing cores in SolrCloud

2013-11-14 Thread Erick Erickson
I'm going to answer with something completely different First, though, optimization happens in the background, so it shouldn't have too big an impact on query performance outside of I/O contention. There also "shouldn't" be any problem with one shard being optimized and one not. Second, have

Re: exceeded limit of maxWarmingSearchers ERROR

2013-11-14 Thread Erick Erickson
CommitWithin is either configured in solrconfig.xml for the <autoCommit> or <autoSoftCommit> tags as the maxTime tag. I recommend you do use this. The other way you can do it is if you're using SolrJ, one of the forms of the server.add() method takes a number of milliseconds to force a commit. You really, really do NOT want
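
For clients that post XML update messages rather than using SolrJ, commitWithin can also be given as an attribute on the `<add>` element. A minimal sketch that builds such a message follows; the document fields and the 10-second value are made up, and sending the message to Solr is left out.

```python
import xml.etree.ElementTree as ET


def build_add_message(doc, commit_within_ms):
    """Serialize one document as an <add> message with commitWithin set."""
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", {"name": name})
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


print(build_add_message({"id": "1", "name": "example"}, 10000))
```

With this in place the client never issues an explicit commit, which is what avoids piling up warming searchers.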

Re: queries including time zone

2013-11-14 Thread Erick Erickson
IMO you will save yourself endless grief just biting the bullet and working with UTC at all times. The instant you have uses in even adjacent but different time zones, you'll have to deal with this anyway. FWIW, Erick On Thu, Nov 14, 2013 at 12:26 AM, Jack Krupansky wrote: > I believe it is the

Re: field collapsing performance in sharded environment

2013-11-14 Thread Erick Erickson
bq: Of the 10k docs, most have a unique near duplicate hash value, so there are about 10k unique values for the field that I'm grouping on. I suspect (but don't know the grouping code well) that this is the issue. You're getting the top N groups, right? But in the general case, you can't ensure

Re: Using data-config.xml from DIH in SolrJ

2013-11-14 Thread Erick Erickson
There's nothing that I know of that takes a DIH configuration and uses it through SolrJ. You can use Tika directly in SolrJ if you need to parse structured documents though, see: http://searchhub.org/2012/02/14/indexing-with-solrj/ Yep, you're going to be kind of reinventing the wheel a bit I'm af

Re: Atomic Update at Solrj For a Newly Added Schema Field

2013-11-14 Thread Erick Erickson
I don't think this is a problem, what are you seeing? Have you tried it and gotten an error? The only reason you need to have fields stored is so _existing_ documents with _existing_ data get into the new doc. Since you've just added a field, you should be fine. It's just that updating documents alr
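
For reference, an atomic update that sets a value on a newly added field can be expressed as a JSON update body using the `set` modifier. The sketch below only builds the payload; the document ID, field name, and value are hypothetical, and the thread itself concerns doing the equivalent through SolrJ.

```python
import json


def atomic_set(doc_id, field, value, id_field="id"):
    """Build a JSON update body that sets one field on one document."""
    return json.dumps([{id_field: doc_id, field: {"set": value}}])


print(atomic_set("doc1", "new_field", "hello"))
```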

Re: My setup - init script and other info

2013-11-14 Thread Erick Erickson
Shawn: Would you be willing to put this on the Wiki? I think it'd be really useful to have it there... I'm pretty sure you have edit rights to the wiki, but they're free for the asking if not... Erick On Wed, Nov 13, 2013 at 1:07 PM, Shawn Heisey wrote: > In the hopes that it will help someo

Solr xml img parsing exception

2013-11-14 Thread Marcello Lorenzi
Hi, I have installed a Solr 4.3 instance and we have configured manifoldcf to pass web content to the shard collection, but during the crawling we have noticed a lot of this exception: ERROR - 2013-11-14 15:13:57.954; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException:

Solr Release Management Process

2013-11-14 Thread Furkan KAMACI
Hi; I've asked the same question on the dev list but I could not get an answer. This question is related to Solr contributors too, so I wanted to ask it here on the solr-user list. My question was: "I've resolved 2 issues last week. One of them was created by me and one of them was an existing issue.

Re: Updating Document Score With Payload of Multivalued Field?

2013-11-14 Thread Furkan KAMACI
Any ideas? 2013/11/13 Furkan KAMACI > PS: I use Solr 4.5.1 > > > 2013/11/13 Furkan KAMACI > >> Here is my case; >> >> I have a field at my schema named *elmo_field*. I want that *elmo_field* >> should >> have multiple values and multiple payloads. i.e. >> >> dorothy|0.46 >> sesame|0.37 >> big
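
The `term|payload` values quoted above can be split on the delimiter as in the sketch below. This only shows the input format; how the payloads would then influence scoring inside Solr is the open question of the thread and is not addressed here.

```python
def parse_payloads(values, delimiter="|"):
    """Map each 'term|weight' string to {term: float(weight)}."""
    result = {}
    for value in values:
        term, _, weight = value.partition(delimiter)
        result[term] = float(weight)
    return result


print(parse_payloads(["dorothy|0.46", "sesame|0.37"]))
```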

Re: solrcloud - forward update to a shard failed

2013-11-14 Thread Aileen
Thanks Michael. Followed your advice - no commits from indexing clients; let auto commit take care of things. It worked, so far no errors. The config params need some more tweaking to get the right balance, specifically maxTime, maxDocs and the soft commit interval, but otherwise sold is a

Re: Solr Synonym issue

2013-11-14 Thread Rafał Kuć
Hello! Could you please describe the issue you are having? -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ Hi Team, I had implemented Solr with my Magento Enterprise Edition. I am trying to implement syno

RE: distributed search is significantly slower than direct search

2013-11-14 Thread Elran Dvir
Hi, We tried returning just the id field and got exactly the same performance. Our system is distributed but all shards are on a single machine so network issues are not a factor. The code we found where Solr is spending its time is on the shard and not on the routing core, again all shards are

Re: Thought exercise: features for Solr client

2013-11-14 Thread Michael Sokolov
I think there is a place for a client-side query hierarchy. It would be nice if you could build a Lucene Query and the Solr client would serialize it for you. If there were a general-purpose query serialization library then you could support a similar programming model for Lucene-only and wit

Re: Thought exercise: features for Solr client

2013-11-14 Thread Alvaro Cabrerizo
Here goes my wishlist: - Transaction management - Access control at document level Regards. On Thu, Nov 14, 2013 at 10:35 AM, Alexandre Rafalovitch wrote: > Hello, > > I am trying to imagine what would a new, fresh, Solr client library look > like. There has been a number of features add

Optimizing cores in SolrCloud

2013-11-14 Thread michael.boom
A few weeks ago optimization in SolrCloud was discussed in this thread: http://lucene.472066.n3.nabble.com/SolrCloud-optimizing-a-core-triggers-optimization-of-another-td4097499.html#a4098020 The thread was covering the distributed optimization inside a collection. My use case requires manually run

Configure maxConnectionsPerHost

2013-11-14 Thread yriveiro
Hi, Where can I configure the maxConnectionsPerHost on Solr? I'm using Solr 4.5.1 with the old style of solr.xml (I have a lot of collections and switching to the new style of solr.xml is too much work) - Best regards -- View this message in context: http://lucene.472066.n3.nabble.com/Config

Re: exceeded limit of maxWarmingSearchers ERROR

2013-11-14 Thread Loka
Hi Naveen, I am also getting a similar problem where I do not know how to use the commitWithin tag. Can you help me with how to use the commitWithin tag? Can you give me an example? -- View this message in context: http://lucene.472066.n3.nabble.com/exceeded-limit-of-maxWarmingSearchers-ERROR-tp3252844

Thought exercise: features for Solr client

2013-11-14 Thread Alexandre Rafalovitch
Hello, I am trying to imagine what a new, fresh Solr client library would look like. There have been a number of features added to Solr recently, so some of the older libraries do not necessarily support them as well (e.g. multi-collections, soft commits, multiple handler end-points, schema auto-d