[Haskell-cafe] sort and lazyness (?)

Daniel Kraft Fri, 19 Dec 2008 06:15:24 -0800

Hi,

I'm just a beginner trying to learn a little about Haskell, and as such write some toy programs (e.g. for projecteuler.net) in Haskell.


Currently, I'm experiencing what I would call "strange behaviour":

I've got a data-type

data Fraction = Fraction Int Int

to hold rational numbers (maybe there's already some built-in type for this in Haskell, much like for instance Scheme has a rational type?), and then I compute a list of pairs of those numbers, that is

[(Fraction, Fraction)].  Fraction is declared an instance of Ord.
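On the question of a built-in type: the base library ships Data.Ratio, whose Rational type (Ratio Integer) keeps fractions in lowest terms and already has Eq and Ord instances. The following is only a minimal sketch of how it could stand in for Fraction; the name `half` and the sample values are made up for illustration and are not from the program below.

```haskell
import Data.List (sort)
import Data.Ratio ((%), numerator, denominator)

-- (%) builds a ratio reduced to lowest terms with a positive denominator.
-- `half` is a hypothetical sample value.
half :: Rational
half = 2 % 4

main :: IO ()
main = do
  -- The stored form is already normalized.
  print (numerator half, denominator half)
  -- Ord compares by numeric value, so sort needs no custom instance.
  print (sort [3 % 4, (-1) % 2, 1 % 3 :: Rational])
```

Since Rational is backed by Integer, the cross-multiplication done during comparison cannot overflow, at the price of some speed compared with Int.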

This list has up to 3 million elements.  If I do

main = print $ length $ points

then the program prints out the length fine and takes around 6s to finish (compiled with GHC -O3). Ok, but I acknowledge that length isn't quite an expensive function, as I can imagine that Haskell does not compute and hold the entire list for this but instead computes one element at a time and discards it afterwards.


Doing

main = print $ length $ map (\(x, _) -> x == Fraction 1 2) points

instead, gives a slightly longer runtime (6.6s), but in this case I'm sure that at least each element is computed; right?
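Whether each element really is computed here can be checked with a small experiment. The sketch below uses made-up names (`unforced`, `forced`) and a list whose elements would crash if they were ever evaluated; length only walks the list cells, so it still returns normally.

```haskell
-- Every element is an error thunk; evaluating any of them would abort.
-- `unforced` is a hypothetical name used only for this experiment.
unforced :: [Bool]
unforced = map (\n -> error ("element " ++ show n ++ " was forced")) [1 .. 5 :: Int]

-- Counting the True results has to look at each Bool, which forces it.
forced :: Int
forced = length (filter id (map (== 3) [1 .. 5 :: Int]))

main :: IO ()
main = do
  print (length unforced)  -- prints 5: only the spine of the list is evaluated
  print forced             -- prints 1: each comparison is actually computed
```

So `length $ map f points` forces the list structure of points but leaves every `f x` as an unevaluated thunk; something like the `filter id` variant is needed to make sure the comparisons run.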


main = print $ length $ reverse points

gives 11.9s, and here I guess (?) that for this to work, the entire list is computed and held in memory.


However, trying to do

import Data.List (sort)
main = print $ length $ sort points

makes memory usage go up and the program does not finish in 15m, also spending most time waiting for swapped out memory. What am I doing wrong, why is sort this expensive in this case? I would assume that computing and holding the whole list does not take too much memory, given its size and data type; doing the very same calculation in C should be straightforward. And sort should be O(n * log n) for time and also not much more expensive in memory, right?

Am I running into a problem with laziness? What can I do to avoid it? As far as I see it though, the reverse or map call above should do nearly the same as sort, except maybe that the list needs to be stored in memory as a whole and sort has an additional *log n factor, but neither of those should matter. What's the problem here?
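One experiment that could help separate laziness from other causes is to make the fraction fields strict and force every pair before it reaches sort. This is a sketch under assumptions, not a diagnosis of the program below: the strict field annotations, the helper `forcePairs`, and the widening to Integer inside compare are all additions for illustration, and the Ord instance assumes normalized fractions with positive denominators.

```haskell
import Data.List (sort)

-- Hypothetical strict variant of Fraction: the bangs make the constructor
-- evaluate both Ints as soon as the Fraction itself is evaluated.
data Fraction = Fraction !Int !Int
  deriving (Eq, Show)

-- Assumes normalized fractions with positive denominators.
-- Widening to Integer keeps the cross products from overflowing.
instance Ord Fraction where
  compare (Fraction a b) (Fraction c d) =
    compare (toInteger a * toInteger d) (toInteger c * toInteger b)

-- Evaluate both components of each pair before handing the pair on,
-- so no unevaluated thunks are carried into the consumer.
forcePairs :: [(a, b)] -> [(a, b)]
forcePairs = foldr (\p@(x, y) rest -> x `seq` y `seq` (p : rest)) []

main :: IO ()
main = do
  let pts = [ (Fraction 1 2, Fraction 3 4), (Fraction 1 3, Fraction 1 1) ]
  print (length (sort (forcePairs pts)))
```

If memory use stays the same with this change, the cause probably lies elsewhere, for example in how the list of candidates is accumulated.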

Is this something known with sort or similar functions? I couldn't find anything useful on Google, though. My code is below, and while I would of course welcome criticism, I do not want to persuade anyone to read through it.


Thanks a lot,
Daniel

----------------------------------------------------------

-- Problem 165: Intersections of lines.

import Data.List (sort)


-- The random number generator.

seeds :: [Integer]
seeds = 290797 : [ mod (x * x) 50515093 | x <- seeds ]

numbers :: [Int]
numbers = map fromInteger (map (\x -> mod x 500) (tail seeds))


-- Line segments, vectors and fractions.

data Segment = Segment Int Int Int Int
data Vector = Vector Int Int

data Fraction = Fraction Int Int
instance Eq Fraction where
  (Fraction a b) == (Fraction c d) = (a == c && b == d)
instance Ord Fraction where
  compare (Fraction a b) (Fraction c d)
    | (a == c && b == d) = EQ
    | b > 0     = if a * d < b * c then LT else GT
    | otherwise = if a * d > b * c then LT else GT


-- Build a normalized fraction and get its value.

normalize (Fraction a b) = let g = gcd a b;
                               aa = div a g;
                               bb = div b g in
                              if bb < 0
                                then Fraction (-aa) (-bb)
                                else Fraction aa bb


-- Find the inner product of two vectors.

innerProduct (Vector a b) (Vector c d) = a * c + b * d


-- Find the normal vector and direction of a line segment, as well as
-- the constant in the normal form of its line for a given normal vector.

normalVector (Segment x1 y1 x2 y2) = Vector (y1 - y2) (x2 - x1)
direction (Segment x1 y1 x2 y2) = Vector (x2 - x1) (y2 - y1)
nfConstant (Segment x1 y1 x2 y2) n = innerProduct n (Vector x1 y1)


-- Check if a point is between the ends of the segment times D.

betweenEndsTimes d (Segment x1 y1 x2 y2) xD yD
  = let x1D = x1 * d; x2D = x2 * d; y1D = y1 * d; y2D = y2 * d;
        xDMin = min x1D x2D; xDMax = max x1D x2D;
        yDMin = min y1D y2D; yDMax = max y1D y2D in
      (xDMin <= xD && xDMax >= xD && yDMin <= yD && yDMax >= yD
       && (x1D /= xD || y1D /= yD) && (x2D /= xD || y2D /= yD))


-- If they are not parallel, we can find their intersection point (at least,
-- the one it would be if both were infinite lines).  Then it is easy to check if
-- it is between the endpoints for both.
--
-- n1 * x + m1 * y = c1
-- n2 * x + m2 * y = c2
--
-- => x = (c1 * m2 - c2 * m1) / (n1 * m2 - n2 * m1)
-- => y = (n1 * c2 - n2 * c1) / (n1 * m2 - n2 * m1)
--
-- (Iff they are parallel, the determinant will be 0.)

trueIntersect s1 s2 = let (Vector n1 m1) = normalVector s1;
                          (Vector n2 m2) = normalVector s2;
                          c1 = nfConstant s1 (Vector n1 m1);
                          c2 = nfConstant s2 (Vector n2 m2);
                          d = n1 * m2 - n2 * m1;
                          xD = c1 * m2 - c2 * m1;
                          yD = n1 * c2 - n2 * c1 in
                        if d == 0
                          then Nothing
                          else
                            if (betweenEndsTimes d s1 xD yD)
                                && (betweenEndsTimes d s2 xD yD)
                              then Just ((normalize $ Fraction xD d),
                                         (normalize $ Fraction yD d))
                              else Nothing


-- Build list of segments.

takeEveryForth :: [Int] -> [Int]
takeEveryForth (a:_:_:_:t) = a : (takeEveryForth t)

n1 = numbers
n2 = tail n1
n3 = tail n2
n4 = tail n3

segments = [ Segment a b c d | ((a, b), (c, d))
                                <- zip (zip (takeEveryForth n1)
                                            (takeEveryForth n2))
                                       (zip (takeEveryForth n3)
                                            (takeEveryForth n4)) ]


-- For the first 5000 segments, calculate intersections.

firstSegments = take 5000 segments

intersects :: [Maybe (Fraction, Fraction)]
intersects = findInters firstSegments []
  where
    findInters [] l = l
    findInters (h:t) l = findInters t (addInters h t l)
      where
        addInters _ [] l = l
        addInters e (h:t) l = addInters e t ((trueIntersect e h) : l)

getPoints :: [Maybe (Fraction, Fraction)] -> [(Fraction, Fraction)]
getPoints [] = []
getPoints (Nothing : t) = getPoints t
getPoints ((Just v) : t) = v : (getPoints t)

points = getPoints intersects


-- Main program.

main :: IO ()
main = print $ length $ reverse points
--main = print $ length $ map (\(x, _) -> x == Fraction 1 2) points
--main = print $ length $ sort points
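The pairing logic in findInters and getPoints can also be expressed with library functions. The following is a minimal sketch, assuming that the order of the results does not matter (the accumulator version above builds its list in reverse); `pairsWith` is a hypothetical helper name, and the toy function passed to it in main merely stands in for trueIntersect.

```haskell
import Data.List (tails)
import Data.Maybe (catMaybes)

-- Apply f to every unordered pair of distinct list positions and keep
-- only the Just results. The list is produced lazily, pair by pair.
pairsWith :: (a -> a -> Maybe b) -> [a] -> [b]
pairsWith f xs = catMaybes [ f x y | (x:ys) <- tails xs, y <- ys ]

main :: IO ()
main = do
  -- Toy stand-in for trueIntersect: keep pairs whose sum is 5.
  let sumsToFive a b = if a + b == 5 then Just (a, b) else Nothing
  print (pairsWith sumsToFive [1, 2, 3, 4 :: Int])  -- prints [(1,4),(2,3)]
```

With such a helper, points would be roughly `pairsWith trueIntersect firstSegments`, and a consumer like length could start working before all pairs have been examined.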
