Just thought I would report back a bit more on this - the Unicode change doesn’t work in my case (and possibly not for command-line Pharo either), as I get an error where OS filenames need Unicode support (actually I think this is where it’s trying to write to stdout, but I didn’t dig more into this):

Error: Instances of UndefinedObject are not indexable
UndefinedObject(Object)>>error:
UndefinedObject(Object)>>errorNotIndexable
UndefinedObject(Object)>>size
Unicode class>>isLetter:
Character>>isLetter
Path class>>isAbsoluteWindowsPath:
Path class>>from:delimiter:
MacStore(FileSystemStore)>>pathFromString:
FileSystem>>pathFromString:
ByteString(String)>>asPathWith:
FileSystem>>pathFromObject:
FileSystem>>referenceTo:
ByteString(String)>>asFileReference
FileStream class>>fullName:
FileStream class>>fileNamed:
SmalltalkImage>>openLog

I was able to improve on Guille’s warning about how to safely clear up Monticello/Metacello (and not use become: String new) with the following (I actually think Metacello should provide a #cleanUp method, so I raised a PR for consideration):

logger cr; nextPutAll: '>Removing Clearing MC Registry'.
MetacelloProjectRegistration resetRegistry.

I’m then able to get my image down from 22mb to 13.8mb (which is pretty good).

As a further experiment I also noticed that there is a fair amount of space trapped in Protocols and ClassOrganisation - so I tried clearing those out (as they are lazily cached) with:

Smalltalk allClassesAndTraits do: [:c | c basicOrganization: nil ].

This seems to give me a further 1mb back (I haven’t tried performance tests on this, but my naive assumption is that a running system that isn’t adding/manipulating code doesn’t use Protocols?). So I’m now at 21mb.

Tim

> On 16 Aug 2017, at 10:53, Guillermo Polito <guillermopol...@gmail.com> wrote:
>
> On Wed, Aug 16, 2017 at 11:46 AM, Tim Mackinnon <tim@testit.works> wrote:
> Hi, tracing through your changes - it looks like:
>
> Smalltalk cleanUp: true except: #() confirming: false.
> Takes care of all the non-unicode changes you proposed (and it seems like it’s a known cleanup protocol).
>
> I based my script on #cleanupForRelease ^^.
> But I did not just blindly execute it as is because I wanted to understand the implications of each line.
>
> I wonder if the Unicode change is worth it/risky as many web based services I might connect to with Zinc do support Unicode so maybe I should keep that one in. (I will for now - might verify how much of a difference it really makes)
>
> No, it should not break any encoding/decoding. The changes I proposed will just nil out two things:
>
> - the uppercase/lowercase mapping unicode tables that says for each codepoint if the codepoint is uppercase/lowercase and allows transformations from/to uppercase/lowercase. This means that these may not work as expected:
>
>   aChar asLowercase
>   aChar asUppercase
>   aChar toLowercase
>   aChar toUppercase
>
> - the unicode classification table that says if a character is letter or digit, and so on. This means that these may not work as expected:
>
>   aChar isLetter
>   aChar isDigit
>   aChar isAlphaNumeric
>
> I think my next port of call is cleanUp for Monticello/Metacello as I see a fair amount of that stuff floating around in my image (after I’ve used it to bootstrap my code).
>
> Tim
>
>> On 16 Aug 2017, at 02:32, Guillermo Polito <guillermopol...@gmail.com> wrote:
>>
>> Actually it happens first that monticello is "nicely" coupled with the changeset system and logs all the source code loaded in change sets :D :/ ¬¬. Also, the first two strings in terms of size are related to unicode tables (we should put them in files instead of in the image and load them on demand), and the two biggest arrays also to unicode. I just tried the following in a clean bootstrapped "minimal" image (metacello):
>>
>> "Careful, this will make that #isLetter, #isUppercase #isLowercase, #toLowercase and #toUppercase only work on ascii"
>> Character characterSet: nil.
>> Unicode classPool at: #GeneralCategory put: nil.
>> Unicode classPool at: #DecimalProperty put: nil.
>>
>> UnicodeDefinition removeFromSystem.
>> ChangeSet removeChangeSetsNamedSuchThat: [ :each | true ].
>> ChangeSet resetCurrentToNewUnnamedChangeSet.
>> MCDefinition clearInstances.
>> Undeclared removeUnreferencedKeys.
>> Smalltalk garbageCollect.
>>
>> like this:
>>
>> ./vm/pharo Pharo7.0-metacello-32bit-fa236b7.image eval --save "Character characterSet: nil. Unicode classPool at: #GeneralCategory put: nil. Unicode classPool at: #DecimalProperty put: nil. UnicodeDefinitions removeFromSystem. ChangeSet removeChangeSetsNamedSuchThat: [ :each | true ]. ChangeSet resetCurrentToNewUnnamedChangeSet. MCDefinition clearInstances. Undeclared removeUnreferencedKeys. Smalltalk garbageCollect."
>>
>> and my image went down from 11MB to 6.6MB (7.0 MB if I don't change back to ascii with the first three lines)
>>
>> Then I tried a tally:
>>
>> ./vm/pharo Pharo7.0-metacello-32bit-fa236b7.image save spacetally
>>
>> ./vm/pharo spacetally.image eval --save "repo := MCFileTreeRepository new directory: '../src' asFileReference. version := repo loadVersionFromFileNamed: 'Tool-Profilers.package'. version load."
>>
>> re-clean since i loaded some packages
>>
>> ./vm/pharo spacetally.image eval --save "ChangeSet removeChangeSetsNamedSuchThat: [ :each | true ]. ChangeSet resetCurrentToNewUnnamedChangeSet. MCDefinition clearInstances. Undeclared removeUnreferencedKeys. Smalltalk garbageCollect."
>>
>> This image is now 6.6MB (7.1MB with the unicode large arrays), 4.1% of strings (274k) what seems reasonable. Remaining big strings are Pharo's licence, the buffer of the changes file and then some class comments (shouldn't they be fetched from disk as any other method source code?).
>>
>> Making again a tally shows that ~30% of the space is taken by Arrays and 21.9% by compiled methods. But, BUT! :) I have ~30k arrays and lots of collections also:
>>
>> "MethodDictionary"     2872 +
>> "IdentitySet"         12781 +
>> "OrderedCollection"    4398 +
>> "Set"                  2959 +
>> "Dictionary"           1997 +
>> "IdentityDictionary"    454
>> ---------------------------
>>                       25461
>>
>> So there are ~5k arrays that are used outside collections.
>>
>> Worth exploring a bit more I think.
>>
>> On Wed, Aug 16, 2017 at 1:23 AM, Guillermo Polito <guillermopol...@gmail.com> wrote:
>>
>> On Tue, Aug 15, 2017 at 11:26 PM, Tim Mackinnon <tim@testit.works> wrote:
>> Hi Guille/Ben - I got a quick moment to try the SpaceTally (aside: it seems very convoluted to load a single package into the image, I was trying to avoid having to create a baselineOf for something so simple - I ended up with:
>>
>> I know, I also believe we have to simplify this. In any case, baselines are healthy as they allow to also express dependencies. Otherwise you'll end up loading dependencies by hand. We'll fix this soon I hope.
>>
>> repo := MCFileTreeRepository new directory: './bootstrap' asFileReference.
>> version := repo loadVersionFromFileNamed: 'Tool-Profilers.package'.
>> version load.
>>
>> Anyway - in my minimal image, like in the fat image there seems to be a surprising amount of bytestrings (4mb worth?). I think that might need some digging into? It seems like a lot somehow. Although Ben’s neat experiment of zipping strings shows that’s not a real route.
>>
>> In a deployed minimal image - maybe I can get rid of some other things like MethodChangeRecords or MCMethodDefinitions (they are smaller wins - but noticeable)
>>
>> Class               code space  # instances  inst space  percent  inst average size
>> ByteString                2640        37365     4823848    21.50             129.10
>> Array                     3742        53002     3961944    17.60              74.75
>> CompiledMethod           19159        30481     2912968    13.00              95.57
>> Association               1148        58348     1867136     8.30              32.00
>> MethodChangeRecord         431        34312     1097984     4.90              32.00
>> ByteArray                 4605          290      908728     4.00            3133.54
>> ByteSymbol                1698        22689      840168     3.70              37.03
>> IdentitySet                408        19076      610432     2.70              32.00
>> MethodDictionary          3310         3520      608688     2.70             172.92
>> WeakArray                 1758         3024      597824     2.70             197.69
>> MCMethodDefinition        4318         6659      426176     1.90              64.00
>> Protocol                  1679         8382      268224     1.20              32.00
>> OrderedCollection         6555         5509      220360     1.00              40.00
>>
>> As an aside - my Gitlab project is public, the scripts that load things up are in ./scripts (build.sh, and minimal.st and loadlocal.st)
>>
>> Tim
>>
>>> On 15 Aug 2017, at 08:02, Guillermo Polito <guillermopol...@gmail.com> wrote:
>>>
>>> On Mon, Aug 14, 2017 at 4:42 PM, Tim Mackinnon <tim@testit.works> wrote:
>>> Hi Guille - just running SpaceTally on my dev image to get a feel for it. It turns out that in the minimal images you’ve been creating, it’s not loaded (makes sense).
>>>
>>> Yup, it's loaded afterwards.
>>>
>>> All packages are loaded through metacello baselines. We should start refactoring and making standalone projects, each one with a baseline for itself, and its own dependencies described.
>>>
>>> I was checking on your gitlab and I have probably no access: how are you finally loading packages in the bootstrap image? Can you share that with us in text? I'd like to improve that situation.
>>>
>>> I’m wondering if there is an easy way to import it in (I guess that package should be in the Pharo git tree I cloned to get Fuel loaded right? Or is there a separate standalone source?).
>>>
>>> Yes it is, you can get the package programmatically doing
>>>
>>> SpaceTally package name
>>>
>>> And furthermore, get the baseline that is currently loading it by doing
>>>
>>> package := SpaceTally package name.
>>> BaselineOf subclasses select: [ :e |
>>>     e project version packages anySatisfy: [ :p | p name = package ]].
>>>
>>> Thanks for all the support, and your email about why the contexts stack up is very well received (I will comment over there).
>>>
>>> By the way - it looks like Martin Fowler picked up on this announcement - so maybe we might get some interest from his mass of followers.
>>>
>>> Tim
>>>
>>>> On 14 Aug 2017, at 10:49, Guillermo Polito <guillermopol...@gmail.com> wrote:
>>>>
>>>> Hi Tim,
>>>>
>>>> On Mon, Aug 14, 2017 at 11:41 AM, Tim Mackinnon <tim@testit.works> wrote:
>>>> Hey guys, thanks for your enthusiasm around this - and I cannot stress enough how this was only possible because of the work that has gone into making Pharo (in particular the 64bit image, as well as having a minimal image, and some great blog posts on serialising contexts) as well as the patience from everyone in answering questions and helping me get it all working.
>>>>
>>>> I’m still quite keen to get my execution time back down under 800ms and I’d like to actually get back to writing a few skills to automate a few things around my house.
>>>>
>>>> To answer Denis’ question -
>>>>
>>>> My final footprint is 30.4mb - that’s composed of a 22mb image (with a simple example that pulls in Fuel, ZTimestamp and the S3 Library which depends on XMLParser) and then the VM (from which I removed obvious DLLs).
>>>>
>>>> In my original experiments with a 6.0 minimal image - I did manage to get to a 13.4mb image (which started out as 12mb original size, and then loaded in STON and had only a simple clock example). I think the sweet spot is around 20mb total footprint as that seems to get me into the 450ms-900ms range.
>>>>
>>>> The 7.0 min image now starts out at 15mb and then I’m not sure why loading Fuel, S3 and XMLParser takes 7mb (it seems big to me - but I’ve not dug into that).
>>>>
>>>> You can do further space analysis using the following expression
>>>>
>>>> SpaceTally new printSpaceAnalysis
>>>>
>>>> You can do that in an eval and check what's taking space. With measures we can iterate and improve :).
>>>>
>>>> I’ve also found (and this on the back of unserialising the context in my example) that the way we build images has 15+ saved stack sessions that have saved on top of each other from the way we build up the images. I don’t yet know the implications of size/speed of these - but we need a better way of folding executions when we snapshot headless images. I’m also not clear if there are any other startup tasks that take precious time (this also has implications for our fat development images as they take much longer to appear than they really should).
>>>>
>>>> I'm working on this as I'm writing this mail ;)
>>>>
>>>> https://pharo.fogbugz.com/f/cases/20309
>>>> https://github.com/pharo-project/pharo/pull/196
>>>>
>>>> I'll write down the implications further in a different thread.
>>>>
>>>> I’ll be exploring some of these size/speed tradeoffs in follow on messages.
>>>>
>>>> But once again, a big thanks - I’ve not enjoyed programming like this for ages.
>>>>
>>>> Tim
>>>>
>>>>> On 12 Aug 2017, at 16:26, Ben Coman <b...@openinworld.com> wrote:
>>>>>
>>>>> hi Tim,
>>>>>
>>>>> That is..... AWESOME!
>>>>>
>>>>> Very nice delivery - it flowed well with great narration.
>>>>>
>>>>> I loved @2:17 "this is the interesting piece, because PharoLambda has serialized the execution context of its application and saved it into [my S3 bucket] ... [then on the local machine] rematerializes a debugger [on that context]."
>>>>>
>>>>> There is a clarity in your video presentation that really may intrigue outsiders. As a community we should push this on the usual hacker forums - ycombinator could be a good starting point (but I'm locked out of my account there).
>>>>> An enticing title could be...
>>>>> "Debugging Lambdas by re-materializing saved execution contexts on your local machine."
>>>>>
>>>>> cheers -ben
>>>>>
>>>>> On Fri, Aug 11, 2017 at 3:37 PM, Denis Kudriashov <dionisi...@gmail.com> wrote:
>>>>> This is cool Tim.
>>>>>
>>>>> So what image size did you deploy at the end?
>>>>>
>>>>> 2017-08-10 15:47 GMT+02:00 Tim Mackinnon <tim@testit.works>:
>>>>> I just wanted to thank everyone for their help in getting my pet project further along, so that now I can announce that PharoLambda is now working with the V7 minimal image and also supports post mortem debugging by saving a zipped fuel context onto S3.
>>>>>
>>>>> This latter item is particularly satisfying as at a recent serverless conference (JeffConf) there was a panel where poor development tools on serverless platforms was highlighted as a real problem.
>>>>>
>>>>> In our community we’ve had these kinds of tools at our fingertips for ages - but I don’t think the wider development community has really noticed. Debugging something short lived like a Lambda execution is quite startling, as the current answer is “add more logging”, and we all know that sucks. To this end, I’ve created a little screencast showing this in action - and it was pretty cool because it was a real example I encountered when I got everything working and was trying my test application out.
>>>>>
>>>>> I’ve also put a bit of work into tuning the excellent GitLab CI tools, so that I can cache many of the artefacts used between different build runs (this might also be of interest to others using CI systems).
>>>>>
>>>>> The Gitlab project is on: https://gitlab.com/macta/PharoLambda
>>>>> And the screencast: https://www.youtube.com/watch?v=bNNCT1hLA3E
>>>>>
>>>>> Tim
>>>>>
>>>>>> On 15 Jul 2017, at 00:39, Tim Mackinnon <tim@testit.works> wrote:
>>>>>>
>>>>>> Hi - I’ve been playing around with getting Pharo to run well on AWS Lambda. It’s early days, but I thought it might be interesting to share what I’ve learned so far.
>>>>>>
>>>>>> Usage examples and code at https://gitlab.com/macta/PharoLambda
>>>>>>
>>>>>> With help from many of the folks here, I’ve been able to get a simple example to run in 500ms-1200ms with a minimal Pharo 6 image. You can easily try it out yourself. This seems slightly better than what the GoLang folks have been able to do.
>>>>>>
>>>>>> Tim
>>>>
>>>> --
>>>> Guille Polito
>>>>
>>>> Research Engineer
>>>> French National Center for Scientific Research - http://www.cnrs.fr
>>>>
>>>> Web: http://guillep.github.io
>>>> Phone: +33 06 52 70 66 13