I think the dates actually do matter in this case. You can think of this
problem as a regression problem: given the two coordinates, date and
price, we are trying to find a limited number of line segments (at most
your maximum number of classes) that fit the data best. By "best fit" I
mean exactly the criterion you are looking for: the line segments are
chosen so that the prices stay close to the average price of their date
range (that average is the line segment representing the range). A
problem known as sequence segmentation might be of use to you. See this
paper for a detailed discussion of a DP solution:

http://scholar.google.com.tr/scholar?cluster=806851568786670722&hl=tr&as_sdt=2000
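
To make the idea concrete, here is a minimal sketch of the kind of DP I
have in mind. It is not taken from the paper, and it rests on my own
assumptions: the prices are already sorted by date, each class is a
contiguous date range, and "deviation" means squared deviation from the
class average. The name segment_prices is just made up for illustration.

```python
def segment_prices(prices, k):
    """Split prices (assumed sorted by date) into at most k contiguous
    segments, minimizing the total squared deviation from each
    segment's mean. Returns (total_cost, segments), where segments is
    a list of (start, end) index pairs with end inclusive."""
    n = len(prices)
    if n == 0:
        return 0.0, []
    # Extra segments never increase the cost, so using exactly
    # min(k, n) segments is as good as any smaller number.
    k = min(k, n)

    # Prefix sums of values and squared values give O(1) segment costs.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, p in enumerate(prices):
        s1[i + 1] = s1[i] + p
        s2[i + 1] = s2[i] + p * p

    def cost(i, j):
        # Sum of squared deviations from the mean of prices[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of covering prices[:j] with m segments.
    # cut[m][j]: start index of the last segment in that solution.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    cut[m][j] = i

    # Walk the cut table backwards to recover the segment boundaries.
    segments = []
    j = n
    for m in range(k, 0, -1):
        i = cut[m][j]
        segments.append((i, j - 1))
        j = i
    segments.reverse()
    return best[k][n], segments
```

Each (start, end) pair maps back to a from-until date range through the
sorted date list. The triple loop is O(k * n^2), so for 10000 points and
50 classes plain Python like this will be slow; treat it as a
description of the recurrence rather than something to run at that size.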



On Nov 10, 5:03 pm, leoV <[email protected]> wrote:
> The dates are arbitrary subsets, not necessarily consecutive. You can
> see it as an initial set of dates and prices that needs to be
> published in a limited number of columns, each with a from-until date.
> The offset of the individual price values from the calculated
> average of a period must be minimized.
>
> leo
>
> On 9 nov, 20:59, Gene <[email protected]> wrote:
>
> > Should the dates in the classes be consecutive?  Or are the classes
> > arbitrary subsets?
>
> > On Nov 9, 11:04 am, leoV <[email protected]> wrote:
>
> > > I need to classify a number (1000 to 10000) of key-value pairs into a
> > > maximum number of classes (usually 25 to 50) in such a way that the
> > > deviations from the average of each class are minimized.
> > > The keys are unique dates and the value is a decimal numeric value.
> > > Any hints or samples or references to methods?
>
> > > thx
>
> > > leo
