On Fri, 03 Oct 2014 16:13:35 -0400, stepharo <steph...@free.fr> wrote:
Looking at MessageTally, it seems at least half of that 278ms is spent
noodling around in WideStrings and other small things that the Spur
object format ought to help with, so it'd be interesting to see how much
of a speed boost that gives without any further work.
tell us more about your algo because we want to know.
It's not that exciting (it's a problem we give job applicants as a
take-home coding test), but I'll describe it, since it highlights a common
corner where Pharo does not perform terribly well.
Basically, we give you a pile of play lists (a text file where each line
is "Song1,Song2,Song3" and so on, representing a single play list), and
then ask you to tell us which songs appear frequently together (where
"frequently" is something we define--say, at least 100 times, for
example). While there are interesting complex algorithms, a simple one
that works well for small datasets is simply to walk all of play lists and
build up a hashtable that maps pairs of bands to a count, then walk the
hashtable and print out any keys whose counts are over the threshold. If
this sounds like "dump all the permutations of each play list into a Bag
and then walk Bag>>valuesAndCounts," you're right. As I said, it's not
that complex.
Python does very well on this because it hits on three things--hashtables,
sets, and string processing--for which Python has obscenely good
implementations in carefully optimized C. (In fact, the 80ms time I
quoted you is surprisingly close to the maximum speed you can get with
genuine C or C++ implementation on the sample dataset as well.)
Pharo struggles to match Python for the same three reasons, but
*especially* because its String functions are just much slower. In fact,
if you implement the algorithm I described above naïvely in Pharo, the
execution time is about 780ms--nearly ten times slower. After some
reflection, I chopped the time to 270ms by converting all the band names
to integers as the first step of the process (ArtistLister>>generateMap:
in the profile trace below). Not only does this avoid WideString>>hash
and friends; it also allows me to bring in IdentityDictionary and
IdentitySet in most locations due to Pharo's object format.
But there's a lot left. I'm attaching the profile output below. Notice
that Pharo spends more time reading and processing the file (about 120ms)
than the Python program spends executing total.
--Benjamin
- 279 tallies, 279 msec.
**Tree**
--------------------------------
Process: (40s) Morphic UI Process: nil
--------------------------------
69.2% {193ms} ArtistLister>>groupsIn:
|18.6% {52ms} FileReference(AbstractFileReference)>>contents
| |17.6% {49ms} MultiByteFileStream(FileStream)>>contents
| | 17.6% {49ms} MultiByteFileStream>>next:
| | 5.0% {14ms} ByteString>>at:put:
| | |5.0% {14ms} WideString class>>from:
| | 5.0% {14ms} WideString>>at:put:
| | 5.0% {14ms} primitives
| | 1.4% {4ms} MultiByteFileStream>>next
| | 1.4% {4ms} UTF8TextConverter>>nextFromStream:
|14.7% {41ms} WideString(String)>>lines
| |14.7% {41ms} WideString(String)>>linesDo:
| | 10.0% {28ms} WideString>>copyFrom:to:
| | |5.4% {15ms} WideString(String)>>isOctetString
| | | |3.2% {9ms} WideString>>at:
| | | | |2.2% {6ms} Character class>>value:
| | | |2.2% {6ms} primitives
| | |4.7% {13ms} WideString(String)>>asOctetString
| | | 3.6% {10ms} primitives
| | 4.7% {13ms} WideString(String)>>lineIndicesDo:
| | 4.7% {13ms} WideString(String)>>indexOf:startingAt:
| | 4.7% {13ms} WideString class(String
class)>>indexOfAscii:inString:startingAt:
|10.8% {30ms} ByteString(Object)>>split:
| |10.0% {28ms} ByteString(Object)>>split:do:
| | 5.7% {16ms} WideString>>copyFrom:to:
| | |3.2% {9ms} WideString(String)>>asOctetString
| | |2.5% {7ms} WideString(String)>>isOctetString
| | | 1.8% {5ms} primitives
| | 4.3% {12ms} ByteString(SequenceableCollection)>>split:indicesDo:
| | 2.9% {8ms}
WideString(SequenceableCollection)>>indexOfSubCollection:startingAt:
| | |2.9% {8ms}
WideString(String)>>indexOfSubCollection:startingAt:ifAbsent:
| | | 2.9% {8ms} WideString(String)>>findString:startingAt:
| | | 2.9% {8ms}
WideString(String)>>findString:startingAt:caseSensitive:
| | | 2.9% {8ms}
WideString(String)>>findSubstring:in:startingAt:matchTable:
| | 1.4% {4ms} primitives
|7.2% {20ms} Set(Collection)>>addAll:
| |5.0% {14ms} Set>>add:
| | |3.6% {10ms} Set>>scanFor:
| | | |3.6% {10ms} ByteString(String)>>=
| | |1.4% {4ms} Set(HashedCollection)>>atNewIndex:put:
| | | 1.4% {4ms} Set(HashedCollection)>>fullCheck
| | | 1.4% {4ms} Set>>grow
| |1.4% {4ms} OrderedCollection>>do:
|5.4% {15ms} ArtistLister>>generateMap:
| |5.4% {15ms} IdentityDictionary(Dictionary)>>at:put:
| | 3.9% {11ms} Dictionary(HashedCollection)>>atNewIndex:put:
| | |3.2% {9ms} Dictionary(HashedCollection)>>fullCheck
| | | 2.5% {7ms} Dictionary(HashedCollection)>>grow
| | | 1.4% {4ms} Dictionary>>noCheckAdd:
| | | 1.4% {4ms} Dictionary(HashedCollection)>>findElementOrNil:
| | 1.4% {4ms} IdentityDictionary(HashedCollection)>>findElementOrNil:
|4.3% {12ms} OrderedCollection>>collect:
| |4.3% {12ms} OrderedCollection>>addLast:
|4.3% {12ms} Dictionary>>at:
| |4.3% {12ms} Dictionary>>at:ifAbsent:
| | 2.2% {6ms} Dictionary(HashedCollection)>>findElementOrNil:
| | |2.2% {6ms} Dictionary>>scanFor:
| | | 1.4% {4ms} ByteString(String)>>=
| | 2.2% {6ms} primitives
|3.9% {11ms} ByteString(String)>>trimmed
| 3.9% {11ms} ByteString(String)>>trimBoth
| 3.2% {9ms} ByteString(String)>>trimBoth:
| 2.2% {6ms} primitives
19.4% {54ms} ArtistLister>>popularPairsIn:onlyKeeping:
|15.8% {44ms} OrderedCollection(Collection)>>asBag
| |15.8% {44ms} Bag class(Collection class)>>withAll:
| | 15.8% {44ms} Bag(Collection)>>addAll:
| | 13.6% {38ms} Bag>>add:
| | |13.6% {38ms} Bag>>add:withOccurrences:
| | | 10.0% {28ms} Dictionary>>at:put:
| | | |8.6% {24ms}
Dictionary(HashedCollection)>>findElementOrNil:
| | | | |7.9% {22ms} Dictionary>>scanFor:
| | | | | 4.7% {13ms} Array(SequenceableCollection)>>hash
| | | | | |3.2% {9ms} primitives
| | | | | 2.5% {7ms} Array(SequenceableCollection)>>=
| | | | | 2.5% {7ms}
Array(SequenceableCollection)>>hasEqualElements:
| | | | | 1.4% {4ms} primitives
| | | |1.4% {4ms} Dictionary(HashedCollection)>>atNewIndex:put:
| | | 3.6% {10ms} Dictionary>>at:ifAbsent:
| | | 2.2% {6ms} Dictionary(HashedCollection)>>findElementOrNil:
| | | |2.2% {6ms} Dictionary>>scanFor:
| | | | 1.4% {4ms} Array(SequenceableCollection)>>=
| | | | 1.4% {4ms}
Array(SequenceableCollection)>>hasEqualElements:
| | | 1.4% {4ms} primitives
| | 2.2% {6ms} OrderedCollection>>do:
|2.5% {7ms} ArtistLister>>uniquePairs:do:
4.3% {12ms} ArtistLister>>bandsAboveThresholdIn:
|4.3% {12ms} Bag(Collection)>>addAll:
| 4.3% {12ms} Bag>>add:
| 4.3% {12ms} Bag>>add:withOccurrences:
| 2.5% {7ms} Dictionary>>at:put:
| 1.8% {5ms} Dictionary>>at:ifAbsent:
3.2% {9ms} Set>>includes:
2.5% {7ms} Set(HashedCollection)>>findElementOrNil:
2.5% {7ms} Set>>scanFor:
**Leaves**
6.8% {19ms} WideString(String)>>asOctetString
6.5% {18ms} ByteString(String)>>=
5.4% {15ms} Dictionary>>at:ifAbsent:
5.0% {14ms} WideString>>at:put:
5.0% {14ms} WideString class>>from:
5.0% {14ms} MultiByteFileStream>>next:
5.0% {14ms} WideString(String)>>isOctetString
4.7% {13ms} WideString class(String
class)>>indexOfAscii:inString:startingAt:
4.3% {12ms} OrderedCollection>>addLast:
3.9% {11ms} WideString>>at:
3.6% {10ms} Array(SequenceableCollection)>>do:
3.6% {10ms} OrderedCollection>>do:
3.2% {9ms} Set>>scanFor:
3.2% {9ms} Dictionary(HashedCollection)>>atNewIndex:put:
3.2% {9ms} Array(SequenceableCollection)>>hash
2.5% {7ms} ArtistLister>>uniquePairs:do:
2.2% {6ms} Dictionary>>scanFor:
2.2% {6ms} Character class>>value:
2.2% {6ms} ByteString(String)>>trimBoth:
2.2% {6ms} Array(SequenceableCollection)>>hasEqualElements:
1.8% {5ms} Array class(Behavior)>>inheritsFrom:
1.4% {4ms} ByteString(SequenceableCollection)>>split:indicesDo:
1.4% {4ms} SmallInteger>>hashMultiply
1.4% {4ms} Dictionary(HashedCollection)>>findElementOrNil:
**Memory**
old +5,124,784 bytes
young -340,984 bytes
used +4,783,800 bytes
free +295,464 bytes
**GCs**
full 0 totalling 0ms (0.0% uptime)
incr 22 totalling 22ms (8.0% uptime), avg 1.0ms
tenures 15 (avg 1 GCs/tenure)
root table 0 overflows