Solr Error

2012-01-09 Thread Rohit
Hi, I am getting the following error when I try to execute a query in my Solr, and I am not able to figure out how to fix the issue. SEVERE: java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.lucene.util.packed.Packed64.get(Packed64.java:186) at org.apach

Re: Solr core as a dispatcher

2012-01-09 Thread shlomi java
If you want to randomly distribute requests across shards, then I think it's a case for Replication. In a Replication setup, all cores have the same schema AND data, so querying any core should return the same result. It is used to support heavy load. Of course such a setup will require some kind of load

Re: Too many connections in CLOSE_WAIT state on master solr server

2012-01-09 Thread Ranveer
Hi, I am facing the same problem. Did -Dhttp.maxConnections resolve the problem? Please let us know! regards Ranveer On Thursday 15 December 2011 11:30 AM, samarth s wrote: Thanks Erick and Mikhail. I'll try this out. On Wed, Dec 14, 2011 at 7:11 PM, Erick Erickson wrote: I'm guessing (and

Getting started with indexing a database

2012-01-09 Thread Mike O'Leary
I am trying to index the contents of a database for the first time, and I am only getting the primary key of the table represented by the top level entity in my data-config.xml file to be indexed. The database I am starting with has three tables: The table called docs has columns called doc_id,

Re: Doing url search in solr is slow

2012-01-09 Thread yu shen
Hi Erick, I only added debugQuery=on to the URL, and did not do any configuration with regard to DebugComponent. It seems the 'string' type should be replaced with the 'text' type. I will paste the result here after I have done some experiments. Spark 2012/1/9 Erick Erickson > Do you by chance have t

Re: ignoreTikaException value

2012-01-09 Thread Koji Sekiguchi
(12/01/10 6:31), TRAN-NGOC Minh wrote: Last year a patch with an IgnoreTikaexception was developed. My question is: how can I change the IgnoreTikaexception flag value? Just setting the ignoreTikaException=true request parameter should work when you call ExtractingRequestHandler. Or you c

Solr core as a dispatcher

2012-01-09 Thread Hector Castro
Hi, Has anyone had success with multicore single node Solr configurations that have one core acting solely as a dispatcher for the other cores? For example, say you had 4 populated Solr cores – configure a 5th to be the definitive endpoint with `shards` containing cores 1-4. Is there any ad
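A minimal sketch of the request such a dispatcher core would receive, assuming Solr's distributed-search `shards` parameter (a comma-separated list of `host:port/path` entries without the URL scheme). The host `localhost:8983` and the core names `dispatcher` and `core1` to `core4` are hypothetical, chosen only to mirror the four-plus-one layout described above.

```python
from urllib.parse import urlencode

def dispatcher_url(host, dispatcher_core, shard_cores, query):
    """Build a select URL on the dispatcher core that fans out to the shard cores."""
    # Each shard entry is host:port/path without the http:// scheme.
    shards = ",".join(f"{host}/solr/{core}" for core in shard_cores)
    params = urlencode({"q": query, "shards": shards})
    return f"http://{host}/solr/{dispatcher_core}/select?{params}"

# Hypothetical layout: four populated cores plus one dispatcher core.
url = dispatcher_url(
    "localhost:8983", "dispatcher",
    ["core1", "core2", "core3", "core4"], "*:*",
)
print(url)
```

The same `shards` value could instead be placed in the dispatcher core's request-handler defaults so that clients never have to send it; whether that suits a given setup is a separate question.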

ignoreTikaException value

2012-01-09 Thread TRAN-NGOC Minh
Hello everybody, I'm setting up a manifoldcf-solr system in order to index documents. I was able to configure Solr and ManifoldCF to work together. I'm now facing a problem which was solved last year, but I was not able to apply the solution given on the mailing list. Many documents in my repository

Re: querying all data

2012-01-09 Thread Leonardo Souza
Thanks Emmanuel! -- Leonardo S Souza 2012/1/9 Emmanuel Espina > *:* is parsed as a MatchAllDocsQuery and * is a wildcard query on the > default search field. The MatchAllDocsQuery does just that, and the * > has to resolve the wildcard (that is, building an automaton query in > newer versions of

Shards and (distributed) external file field

2012-01-09 Thread Markus Jelsma
Hi, Are there plans for bringing distributed capabilities to the external file field? I've not seen any hints of this in the work on distributed indexing, nor on the wiki or elsewhere. Will we be able to send a very large file and have it sliced up and have the values sent to the designated s

Re: querying all data

2012-01-09 Thread Emmanuel Espina
*:* is parsed as a MatchAllDocsQuery and * is a wildcard query on the default search field. The MatchAllDocsQuery does just that, and the * has to resolve the wildcard (that is, building an automaton query in newer versions of Lucene). Also if a document has the default field empty that document will n
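The difference can be illustrated with a toy in-memory model. This is not how Lucene implements either query; it only assumes that a bare `*` is a wildcard over the terms of the default search field, so a document with no terms in that field cannot match. The field name `text` and the sample documents are made up for illustration.

```python
# Toy documents; "text" stands in for the default search field (an assumption).
docs = [
    {"id": 1, "text": "solr wildcard query"},
    {"id": 2, "text": ""},                     # default field present but empty
    {"id": 3, "title": "no default field"},    # default field missing entirely
]

def match_all(docs):
    """Model of *:* : every document matches, no terms are inspected."""
    return [d["id"] for d in docs]

def wildcard_star(docs, default_field="text"):
    """Model of * : only documents with at least one term in the default field match."""
    return [d["id"] for d in docs if d.get(default_field, "").split()]

print(match_all(docs))      # all three ids
print(wildcard_star(docs))  # only the document with terms in "text"
```

The second function also has to look at terms for every candidate document, which is in line with the extra cost reported for `*` in this thread.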

Re: Match raw query string

2012-01-09 Thread Emmanuel Espina
No, the omitTermFreqAndPositions and omitNorms parameters must be set in the definition of the field in schema.xml (this is shown in the example config). Have you analyzed the scoring information produced by debugQuery=true? Add that to the query parameters. That will produce information for each docum

Re: How do I go about adding a score attribute to a field

2012-01-09 Thread Erick Erickson
Have you looked at payloads? Best Erick On Mon, Jan 9, 2012 at 2:10 PM, wrote: > Hi All: > I have been using Solr for a few months now. However, I have run into a > situation where I now need to attach additional values (like a score) to a > multivalued field. > For example: > field def : >

RE: Multiple dataimport processes to same core?

2012-01-09 Thread Dyer, James
We do this in production and haven't had any issues. This is a 1.4.1 installation, back when there was no "threads" option in DIH. We divide the index into 8 parts and then run 8 DIH handlers at the same time, indexing simultaneously. While Lucene itself is a bottleneck, we have a lot of data
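One way to picture the eight-way split is sketched below. The source does not say how the index is divided, so partitioning by `id % 8` is an assumption, as are the handler paths `/dataimport0` to `/dataimport7`, the base URL, and the custom `part` request parameter (which each handler's SQL would have to read, for example via a `${dataimporter.request.part}` placeholder). `clean=false` is passed so that one import does not wipe what the others have already indexed.

```python
from urllib.parse import urlencode

NUM_PARTS = 8  # the thread mentions 8 parts; the modulo scheme itself is an assumption

def part_for(doc_id, num_parts=NUM_PARTS):
    """Assign a numeric primary key to one of the partitions."""
    return doc_id % num_parts

def import_urls(base, num_parts=NUM_PARTS):
    """Build one full-import URL per (hypothetical) DIH handler."""
    urls = []
    for part in range(num_parts):
        params = urlencode({"command": "full-import", "clean": "false", "part": part})
        urls.append(f"{base}/dataimport{part}?{params}")
    return urls

# Hypothetical core URL; each URL would be requested concurrently.
urls = import_urls("http://localhost:8983/solr/core0")
for u in urls:
    print(u)
```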

querying all data

2012-01-09 Thread Leonardo Souza
What's the difference between *:* and * when querying a Solr core? Using * takes longer and does not match all documents. Is that right? thanks! -- Leonardo S Souza

RE: Match raw query string

2012-01-09 Thread McCarroll, Robert
The query comes off of the search page looking like: :/solr_/select?q=Budget%20Examiner%2FBudget%20Examiner%20%28Public%20Finance%29&hl=true&hl.fragsize=200&wt=json&start=0 And the solrconfig section for the parser in use looks like: dismax explicit 0.01 titl

Multiple dataimport processes to same core?

2012-01-09 Thread Shawn Heisey
Is it safe or advisable to run multiple dataimport handler requests on one Solr core simultaneously? Thanks, Shawn

How do I go about adding a score attribute to a field

2012-01-09 Thread ramdev.wudali
Hi All: I have been using Solr for a few months now. However, I have run into a situation where I now need to attach additional values (like a score) to a multivalued field. For example: field def : For each of the values, there is a corresponding score that I need to keep track of. The b

Re: Match raw query string

2012-01-09 Thread Emmanuel Espina
How are you building your query? For your case it appears that the edismax query parser should solve it. A good solution to this kind of problem involves: storing norms (omitNorms=false) in the fields to search; storing the position of the terms (omitTermFreqAndPositions=false) in the fields to sear

Match raw query string

2012-01-09 Thread McCarroll, Robert
We're in the process of implementing solr to search our web site, and have run into a response tuning issue. When a user searches for a string which is an exact match of a document title, for example "Budget Examiner/Budget Examiner(Public Finance)", the number of hits in the body of much longer

Re: issues with WordDelimiterFilter

2012-01-09 Thread Steven Fuchs
Thanks for the reply On Dec 30, 2011, at 6:04 PM, Chris Hostetter wrote: > > : I'm having an issue with the way the WordDelimiterFilter parses compound > : words. My field declaration is simple, looks like this: > : > : > : > : preserveOriginal="1"/> > : > :

Re: xpathentityprocessor with flattern true

2012-01-09 Thread vrpar...@gmail.com
Am I making a mistake with XPathEntityProcessor? I am using Solr 1.4. Please help me solve this problem. Thanks & Regards, Vishal Parekh -- View this message in context: http://lucene.472066.n3.nabble.com/xpathentityprocessor-with-flattern-true-tp3637928p3645013.html Sent from the Solr

Re: Doing url search in solr is slow

2012-01-09 Thread Erick Erickson
Do you by chance have the debugQuery on by default? Because if you look down in the "timing" section, you can see the times the various components took to do their work, there are two sections "prepare" and "process". The cumulative time is 17.156 seconds. Of which 17.156 seconds is reported to be

Re: complex keywords, hierarchical data, Solr representation problem

2012-01-09 Thread Erick Erickson
Be a little careful when looking at the index files on disk, see: http://lucene.apache.org/java/3_5_0/fileformats.html#file-names One issue is that you can pretty much ignore the *.fdt and *.fdx files when thinking about the amount of RAM you need. These files have to do with stored data and reall

Re: Doing url search in solr is slow

2012-01-09 Thread yu shen
Hi Erick, Thanks for your reply. Actually I did the following search: survey_url:http\://www.someurl.com/sch/i.html* referal_url:http\://www.someurl.com/sch/i.html* page_url:http\://www.someurl.com/sch/i.html* I did not prepend an asterisk to the field values, but only appended one to them. I analyze

Re: Highlight with multi word synonyms

2012-01-09 Thread O. Klein
Thanx! Not looking at Lucene project I totally missed that. Keep up the good work. -- View this message in context: http://lucene.472066.n3.nabble.com/Highlight-with-multi-word-synonyms-tp3610466p3644729.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Highlight with multi word synonyms

2012-01-09 Thread Koji Sekiguchi
(12/01/09 21:51), O. Klein wrote: Koji, maybe you missed my confirmation due to the hijacking of the thread. I am using Solr 4.0 and after reindexing with LUCENE_33 I got the behaviour for highlighting I want. So yeah, I can confirm this is a bug. Looking forward to a fix :) The fix of the p

Re: complex keywords, hierarchical data, Solr representation problem

2012-01-09 Thread jimmy
Thanks for the fast reply. I went with your suggestion and saved the full category path as well as the category_id as an integer. I also tested the index space consumption and it was less than I thought. So, if I only store the category_id as an integer I have a full index size of 246MB. With the full c

Re: Missing query operators?

2012-01-09 Thread Tomás Fernández Löbbe
Hi Mike, > - exact match (disabling stemming): Ideally, users need a way of turning > this on or off for terms in their query (e.g. [ =walking running ] would > stem the word running, but not walking). > Correct, there is no way to do this with Solr just by activating/deactivating one parameter.

Re: Doing url search in solr is slow

2012-01-09 Thread Erick Erickson
Yu Shen & Arian: We can't help much without more information. In particular, how are the fields in question analyzed? What is the result of looking at the admin/analysis page? What do you get when you attach &debugQuery=on to the query? You might review: http://wiki.apache.org/solr/UsingMailingLi

CoreAware FilterFactory

2012-01-09 Thread Wojciech Gruszczyk
Hello, I need to write a custom Solr FilterFactory which needs information about the core against which it is registered (I assume a multicore environment). For some reason I'm not allowed to implement SolrCoreAware in a FilterFactory. Is it somehow possible to obtain the core from constructor/init me

Re: Highlight with multi word synonyms

2012-01-09 Thread O. Klein
Koji, maybe you missed my confirmation due to the hijacking of the thread. I am using Solr 4.0 and after reindexing with LUCENE_33 I got the behaviour for highlighting I want. So yeah, I can confirm this is a bug. Looking forward to a fix :) Koji Sekiguchi wrote > > (11/12/24 21:20), O. Klein

Re: Doing url search in solr is slow

2012-01-09 Thread François Schiettecatte
About the search 'referal_url:*www.someurl.com*', having a wildcard at the start will cause a dictionary scan for every term you search on unless you use ReversedWildcardFilterFactory. That could be the cause of your slowdown if you are I/O bound, and even if you are CPU bound for that matter.
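A simplified sketch of why indexing reversed terms helps with a leading wildcard. It does not reproduce what ReversedWildcardFilterFactory does internally; it only assumes a sorted term dictionary, where a suffix match needs a full scan but becomes a prefix range lookup once the terms are stored reversed. The sample terms are made up, and the sketch covers only a pure leading wildcard such as `*someurl.com`, not a pattern with wildcards on both ends.

```python
from bisect import bisect_left

# Made-up term dictionary.
terms = ["www.someurl.com", "shop.someurl.com", "www.other.org", "someurl.com"]

def suffix_scan(terms, suffix):
    """Leading wildcard without help: every term has to be inspected."""
    return sorted(t for t in terms if t.endswith(suffix))

# Index-time step: store each term reversed, kept in sorted order.
reversed_terms = sorted(t[::-1] for t in terms)

def suffix_via_reversed(reversed_terms, suffix):
    """Leading wildcard as a prefix lookup on the reversed terms."""
    prefix = suffix[::-1]
    start = bisect_left(reversed_terms, prefix)  # jump straight to the first candidate
    hits = []
    for rt in reversed_terms[start:]:
        if not rt.startswith(prefix):
            break  # sorted order: no later term can match
        hits.append(rt[::-1])
    return sorted(hits)

print(suffix_scan(terms, "someurl.com"))
print(suffix_via_reversed(reversed_terms, "someurl.com"))
```

Both functions return the same terms; the second one touches only the matching range of the dictionary instead of every entry.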

Re: stopwords as privacy measure

2012-01-09 Thread Erik Hatcher
Mike - Indeed users won't be able to *search* for things removed by the stop filter at index time (the terms literally aren't in the index then). But be careful with the stored value. Analysis does not affect stored content. Are you anonymizing before sending to Solr (if so, why stop-word blo

Re:Re: how to avoid OOM while merge index

2012-01-09 Thread James
Since the Hadoop task monitor checks each task and kills any task that it finds consuming too much memory, I currently want to find a way to decrease the memory usage on the Solr side. Any ideas? At 2012-01-09 17:07:09,"Tomas Zerolo" wrote: >On Mon, Jan 09, 2012 at 01:29:39PM +08

Re: how to avoid OOM while merge index

2012-01-09 Thread Tomas Zerolo
On Mon, Jan 09, 2012 at 01:29:39PM +0800, James wrote: > I am building the Solr index on Hadoop, and at the reduce step I run the task > that merges the indexes. Each part of the index is about 1G, and I have 10 indexes to > merge together. I always get Java heap memory exhausted; the heap > size i

Re: how to avoid OOM while merge index

2012-01-09 Thread Ralf Matulat
A quick guess: If you are using Tomcat, for example, be sure to grant unlimited virtual memory to that process, e.g. by putting "ulimit -v unlimited" in your tomcat-init script (if you're using Linux). Am 09.01.2012 06:29, schrieb James: I am building the Solr index on Hadoop, and at the reduce step

Missing query operators?

2012-01-09 Thread Michael Lissner
Hi, I'm setting up a search system that I expect lawyers to use, and I know they're demanding about the query operators they want. I've been looking around a bit, and while some of these are possible on the backend, I can't see how to enable them on the front end since they lack operators:

how to avoid OOM while merge index

2012-01-09 Thread James
I am building the Solr index on Hadoop, and at the reduce step I run the task that merges the indexes. Each part of the index is about 1G, and I have 10 indexes to merge together. I always get Java heap memory exhausted; the heap size is also about 2G. I wonder which part uses so much memory.