I want to speed things up by running multiple instances of a job that
fetches data from a table, so that, for example, if I need to process
10,000 rows across 4 instances, the query run on each instance returns
one of 4 sets of 2500 rows, with no duplication between instances.

My first thought in SQL was to add something like this to the WHERE clause:

AND MOD(ID, INSTANCE_COUNT) = INSTANCE_ID

so that if the instance count was 4, the instance IDs would run 0, 1, 2, 3.
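As a sanity check on the arithmetic, here's a quick sketch in plain Python (not the query API) of how that MOD predicate divides consecutive IDs across 4 instances:

```python
# Simulate the WHERE clause: MOD(ID, INSTANCE_COUNT) = INSTANCE_ID
INSTANCE_COUNT = 4
ids = range(1, 10001)  # 10,000 consecutive row IDs

# One bucket of rows per instance ID (0, 1, 2, 3)
partitions = {i: [row for row in ids if row % INSTANCE_COUNT == i]
              for i in range(INSTANCE_COUNT)}

for instance_id, rows in partitions.items():
    # With consecutive IDs, each instance gets exactly 2500 rows,
    # and no row appears in more than one bucket.
    print(instance_id, len(rows))
```

With a dense, gap-free ID sequence this gives exactly the 4 x 2500 split described above.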

I'm not quite sure how you would structure that using the query API. Any
suggestions about that?

There are some problems with this idea, though: you have to be certain
your IDs increase in a manner that aligns with your math, so that the
partitions come out equal in size. For example, if your sequence
increments by 20, you would have to fiddle with the math to get the
right partitioning, and that is the problem with this technique: it's
brittle, because it depends on keeping several things in sync.
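To make the failure mode concrete, here's a small sketch (again plain Python, just modeling the SQL arithmetic) of what happens when the sequence increments by 20 and the instance count is 4: every ID is a multiple of 20, so MOD(ID, 4) is always 0, and one instance gets all the work while the other three get nothing.

```python
# IDs generated by a sequence that increments by 20: 20, 40, ..., 2000
INSTANCE_COUNT = 4
ids = range(20, 2001, 20)  # 100 IDs, all multiples of 20

# How many rows each instance would receive under MOD(ID, 4) = INSTANCE_ID
counts = [sum(1 for row in ids if row % INSTANCE_COUNT == i)
          for i in range(INSTANCE_COUNT)]
print(counts)  # all 100 rows land on instance 0; instances 1-3 sit idle
```

This is exactly the "getting things in sync" problem: the increment and the instance count share a common factor, so the modulo no longer spreads the rows.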

Does anyone have another idea for segmenting rows that would yield a
solution that's not quite so brittle?



Tony Giaccone
