With the urging and assistance of Omar AntolĂn Camarena, I will be adding two functions to data-ordlist: mergeAll and unionAll, which merge (or union) a potentially infinite list of potentially infinite ordered lists, under the assumption that the heads of the non-empty lists appear in a non-decreasing sequence.
Union takes two sorted lists and produces a new sorted list; an element occurs in the result as many times as the maximum number of occurrences in either list. The unionAll function generalizes this behavior to an infinite number of lists. A reasonable implementation of mergeAll is: > import Data.List.Ordered(merge, union) > mergeAll :: Ord a => [[a]] -> [a] > mergeAll = foldr (\(x:xs) ys -> x : merge xs ys) [] However, for many inputs, we can do better; the library implementation of mergeAll is based on H. Apfelmus's article "Implicit Heaps", which presents a simplification of Dave Bayer's "venturi" algorithm. The difference is that the foldr version uses a line of comparisons, whereas "venturi" uses a tree of comparisons. http://apfelmus.nfshost.com/articles/implicit-heaps.html http://www.mail-archive.com/haskell-cafe@haskell.org/msg27612.html However, as Omar pointed out to me, the following implementation of unionAll has a flaw: > unionAll :: Ord a => [[a]] -> [a] > unionAll = foldr (\(x:xs) ys -> x : union xs ys) [] Namely unionAll [[1,2],[1,2]] should return [1,2], whereas it actually returns [1,1,2]. After some work, I believe I have generalized H. Apfelmus's algorithm to handle this; however it seems a bit complicated. I would love feedback, especially with regard to simplifications, bugs, testing strategies, and optimizations: > unionAll' :: Ord a => [[a]] -> [a] > unionAll' = unionAllBy compare > data People a = VIP a (People a) | Crowd [a] > unionAllBy :: (a -> a -> Ordering) -> [[a]] -> [a] > unionAllBy cmp xss = loop [ (VIP x (Crowd xs)) | (x:xs) <- xss ] > where > loop [] = [] > loop ( VIP x xs : VIP y ys : xss ) > = case cmp x y of > LT -> x : loop ( xs : VIP y ys : xss ) > EQ -> loop ( VIP x (union' xs ys) : unionPairs xss ) > GT -> error "Data.List.Ordered.unionAll: assumption violated!" > loop ( VIP x xs : xss ) > = x : loop (xs:xss) > loop [Crowd xs] = xs > loop (xs:xss) = loop (unionPairs (xs:xss)) > > unionPairs [] = [] > unionPairs [x] = [x] > unionPairs (x:y:zs) = union' x y : unionPairs zs > > union' (VIP x xs) (VIP y ys) > = case cmp x y of > LT -> VIP x (union' xs (VIP y ys)) > EQ -> VIP x (union' xs ys) > GT -> error "Data.List.Ordered.unionAll: assumption violated!" > union' (VIP x xs) (Crowd ys) = VIP x (union' xs (Crowd ys)) > union' (Crowd []) ys = ys > union' (Crowd xs) (Crowd ys) = Crowd (unionBy cmp xs ys) > union' xs@(Crowd (x:xt)) ys@(VIP y yt) > = case cmp x y of > LT -> VIP x (union' (Crowd xt) ys) > EQ -> VIP x (union' (Crowd xt) yt) > GT -> VIP y (union' xs yt) -- Leon _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe