Well that's my problem: we have a lot of records of all types (afiiliates, sales) so looping tons of records each time isn't possible.
- Mike aka...@gmail.com On Thu, Apr 1, 2010 at 2:11 PM, prasenjit mukherjee <prasen....@gmail.com>wrote: > If the number of documents ( in this case "Affiliates" ) aren't huge, > sorting can probably be done as a post-process. > > Still dont see any need of joins here. > > > On Thu, Apr 1, 2010 at 7:16 PM, Michel Nadeau <aka...@gmail.com> wrote: > > Hi, > > > > Here's an example of raw data that would be in my Sales index: > > > > *Affiliate / SaleDate / SaleAmount* > > * mike / 2010-03-01 / 10.00 > > * john / 2010-03-01 / 10.00 > > * mike / 2010-03-02 / 15.00 > > * john / 2010-03-02 / 5.00 > > * mike / 2010-03-03 / 20.00 > > * john / 2010-03-03 / 1.00 > > * mike / 2010-03-04 / 10.00 > > * john / 2010-03-04 / 10.00 > > * mike / 2010-03-05 / 15.00 > > * john / 2010-03-05 / 5.00 > > * mike / 2010-03-06 / 20.00 > > * john / 2010-03-06 / 1.00 > > > > So our 2 affiliates mike and john made 6 sales each between 2010-03-01 > and > > 2010-03-06. My ultimate query should return this for a query between > > 2010-03-01 and 2010-03-06 : > > > > *Affiliate / TotalSales* > > * mike / 90.00 > > * john / 32.00 > > > > So it's exactly like - > > > > SELECT Affiliate, sum(SaleAmount) as TotalSales FROM Sales > > WHERE SaleDate >= '2010-03-01' AND SaleDate <= '2010-03-06' > > GROUP BY Affiliate > > ORDER BY TotalSales DESC; > > > > - Mike > > aka...@gmail.com > > > > > > On Thu, Apr 1, 2010 at 8:11 AM, prasenjit mukherjee < > prasen....@gmail.com>wrote: > > > >> Not sure what you mean by "joining" in lucene , since conceptually > >> there is only 1 table ( with many field aka columns ) in lucene. A > >> representative query would be good to know the use case. > >> > >> Again didn't get the "sorting" part. SUM() will return only 1 > >> aggregated value, so what do you want to sort it on ? > >> > >> -Prasen > >> > >> On Thu, Apr 1, 2010 at 7:44 AM, Michel Nadeau <aka...@gmail.com> wrote: > >> > Are you planning to be able to sort by these SUMs? A SpanQuery would > work > >> > great to get the integers... then you would loop and sum up... but > what > >> > about "joining" with your other data and sorting? > >> > > >> > - Mike > >> > aka...@gmail.com > >> > > >> > > >> > On Wed, Mar 31, 2010 at 9:23 PM, prasenjit mukherjee > >> > <prasen....@gmail.com>wrote: > >> > > >> >> I too am trying to achieve something. > >> >> > >> >> I am thinking of storing the integer values in payloads and then > >> >> using spanquery classes to compute the respective SUMs > >> >> > >> >> -Prasen > >> >> > >> >> On Thu, Apr 1, 2010 at 6:47 AM, Michel Nadeau <aka...@gmail.com> > wrote: > >> >> > Hi, > >> >> > > >> >> > We're currently in the process of switching many of our screens > from > >> >> MySQL > >> >> > to Lucene because MySQL simply dies because we have too much data > and > >> >> it's > >> >> > becoming too long to generate the stats we need. > >> >> > > >> >> > So here's one MySQL query that we use to find out our Top 10 > >> Affiliates : > >> >> > > >> >> > SELECT SUM(sale_amount) AS total_sales, affialiate_id FROM sales > WHERE > >> >> > sale_date>='2010-03-01' AND sale_date<='2010-03-31' GROUP BY > >> >> affialiate_id > >> >> > ORDER BY total_sales DESC LIMIT 10; > >> >> > > >> >> > We currently have our "sales" index, containing all sales and all > >> fields > >> >> - > >> >> > and it's one big index (over 10M records). We could fetch all > >> documents > >> >> > within the date range, loop them and add up the total_sales, but it > >> would > >> >> be > >> >> > just crazy to do this all the time (we have a high volume of > search). > >> >> > > >> >> > We made several tests with Solr (Facets, and even the beta > >> >> CollapseFields), > >> >> > but nothing is really helping us. We could pre-generate the > >> total_sales > >> >> for > >> >> > all possible date ranges... but that would be quite crazy too as > the > >> date > >> >> > range possibilities quickly become endless. > >> >> > > >> >> > So - is there any known way to efficiently do SUM(), COUNT() (and > even > >> >> AVG() > >> >> > ) using Lucene/Solr/others? I also checked Bobo Browse but it > doesn't > >> >> seem > >> >> > to offer what I need either. > >> >> > > >> >> > Thanks for any hints!!! > >> >> > > >> >> > - Mike > >> >> > aka...@gmail.com > >> >> > > >> >> > >> >> --------------------------------------------------------------------- > >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org > >> >> > >> >> > >> > > >> > >> --------------------------------------------------------------------- > >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: java-user-h...@lucene.apache.org > >> > >> > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >