I'm pretty sure that order doesn't matter. Again, though, don't
worry about this level of trick until you can demonstrate
performance issues; your time is usually best spent in other
places.
Best
Erick
On Thu, Apr 1, 2010 at 11:54 PM, wrote:
> Hello Erick,
>
> I was trying to optimise the se
I have this and the heap dump is 63 MB zipped. The info stream is much smaller
(31 KB zipped), but I don't know how to get them to you.
We are not using the NRT readers.
-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Thursday, April 01, 2010 5:21 P
This code is in fact working. I had an error in my test case. Things seem
to work as advertised.
sorry / thanks -
C>T>
On Fri, Apr 2, 2010 at 10:20 AM, Christopher Tignor wrote:
> Hello,
>
> I'm having a hard time implementing / understanding a very simple custom
> scoring situation.
>
> I ha
Hello,
I'm having a hard time implementing / understanding a very simple custom
scoring situation.
I have created my Similarity class for testing which overrides all the
relevant (I think) methods below, returning 1 for all but coord(int, int)
which returns q / maxOverlap so scores are scaled bet
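A minimal sketch of the coord factor described above, written as plain Java so it runs without Lucene on the classpath. The class name CoordSketch is hypothetical; only the coord(int, int) shape comes from the message. The cast matters because dividing two ints in Java truncates to 0 for every partial match.

```java
final class CoordSketch {
    /**
     * Fraction of query terms that matched, in the range [0, 1].
     * Stand-in for a Similarity.coord(int, int) override; not Lucene code.
     */
    static float coord(int overlap, int maxOverlap) {
        if (maxOverlap <= 0) {
            return 0f; // nothing to match against; avoid division by zero
        }
        // Cast to float first: int / int would truncate partial matches to 0.
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        System.out.println("1 of 2 terms matched: " + coord(1, 2));
        System.out.println("3 of 3 terms matched: " + coord(3, 3));
    }
}
```

With every other factor fixed at 1, this is the only term that should vary between documents, which makes it easy to check in a test case.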
On Apr 1, 2010, at 11:13 PM, Michel Nadeau wrote:
> My big question is how do you loop 1M records, sum up field(s), and then
> sort on that field... all in memory (could use too much ram) ? In a
> temporary index (could take a while to re-write a lot of documents in a new
> index) ?
>
You're g
Pig generally takes CSV-type flat files as input. You then do
join/group-by/sum/count etc. on the variables (aka relations).
For Michael's example with the following data:
*Affiliate / SaleDate / SaleAmount*
* mike / 2010-03-01 / 10.00
* john / 2010-03-01 / 10.00
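For readers without Pig at hand, the group-by/sum idea over rows like the ones above can be sketched in plain Java. This is not a Pig script and is only an illustration; the class and method names (SalesByAffiliate, totals) are hypothetical, and it assumes each row has exactly the three slash-separated fields shown.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SalesByAffiliate {
    /** Sums SaleAmount per Affiliate from "affiliate / date / amount" rows. */
    static Map<String, BigDecimal> totals(List<String> rows) {
        Map<String, BigDecimal> sums = new TreeMap<>(); // sorted by affiliate name
        for (String row : rows) {
            String[] fields = row.split("/");
            String affiliate = fields[0].trim();
            BigDecimal amount = new BigDecimal(fields[2].trim());
            // merge() inserts the amount or adds it to the running total
            sums.merge(affiliate, amount, BigDecimal::add);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "mike / 2010-03-01 / 10.00",
                "john / 2010-03-01 / 10.00");
        System.out.println(totals(rows));
    }
}
```

BigDecimal is used instead of double so that currency amounts add up exactly.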
One can write following pig-scri
OS-level tools (top, ps, Activity Monitor, Task Manager) aren't great
ways to measure Java's memory usage, since they only see how much heap
Java has allocated from the OS. Within that heap, Java can have lots
of free space that it knows about but the OS does not (this is
Runtime.freeMemory()).
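To see the difference from inside the JVM, something along these lines should work. It only uses the standard java.lang.Runtime methods; the class name HeapReport and the helper usedBytes are made up for the example.

```java
final class HeapReport {
    /** Heap actually in use: what the JVM has reserved minus what it knows is free. */
    static long usedBytes(long totalBytes, long freeBytes) {
        return totalBytes - freeBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;

        long total = rt.totalMemory(); // heap currently reserved from the OS
        long free = rt.freeMemory();   // free space inside that reserved heap
        long max = rt.maxMemory();     // ceiling the heap may grow to (e.g. -Xmx)

        System.out.println("reserved (totalMemory): " + total / mb + " MB");
        System.out.println("free within heap:       " + free / mb + " MB");
        System.out.println("in use:                 " + usedBytes(total, free) / mb + " MB");
        System.out.println("max heap:               " + max / mb + " MB");
    }
}
```

The "in use" figure still includes garbage that has not been collected yet, so treat it as an upper bound on live data rather than an exact number.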
Y
I agree that if you don't know the "source" language - or can't determine it -
there is a lot of uncertainty in trying to transmogrify the query from one
language to another! Tika and Nutch do have language determination tools
though (n-gram profiles, if I'm not mistaken). And you also can interact
On 01 Apr 2010, at 16:29, henrib wrote:
By issuing multiple queries, one against each localized index,
results being clustered by locale.
You can further refine by translating the end-user input query terms
for each locale and issue "translated" queries against the respective
indices.
I've