Hi Ben,

When I got to the end of this email I remembered something quite pertinent - you mentioned that the limit you were hitting was 2Gb... One thing to check is whether the install of Windows you are running on can be poked to actually raise this limit to 3Gb:
https://blogs.technet.microsoft.com/askperf/2007/03/23/memory-management-demystifying-3gb/

Perhaps others with more insider Windows knowledge can chip in there. It will depend on the machine, the version of Windows and probably lots of other factors. Given that 'hardware is cheap' compared to rewriting software - if the Windows install being used currently does not use that 'trick', and can be made to, you'll probably find you get a fair bit of mileage from a bit of computer configuration rather than coding!
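
For reference - and very much as a sketch rather than a recipe (I haven't checked it against your particular machines, and it only helps if the engine executable is built large-address-aware) - the relevant knob is the /3GB switch in boot.ini on XP / Server 2003, or on Vista and later something like:

   bcdedit /set IncreaseUserVa 3072

followed by a reboot ('bcdedit /deletevalue IncreaseUserVa' undoes it). On 64-bit Windows a large-address-aware 32-bit process can get close to 4Gb without any switch at all, so that is worth checking first.
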
Assuming that cannot be done then...

On 2016-08-17 19:52, Ben Rubinstein wrote:
> Please refresh my memory: is there any way to cause/allow garbage to
> be collected without ending all script running?

LiveCode doesn't use what is usually referred to as 'garbage collection', as it generally frees 'things' up as soon as they are no longer referenced. Now I say 'generally' because things fall into two classes:
   1) Values (strings, arrays, data, numbers)

   2) Objects (stacks, cards, buttons etc.)

I'll deal with Objects first:

Objects are deleted as soon as they can be, relative to the requirements of the engine. We actually changed this mechanism to make it less conservative in 6.7.11, 7.1.4 and 8.0 onwards. Previously, deleted objects wouldn't actually get freed until the root event loop runs (i.e. when there is no script running); now they will generally get freed much closer to when they are deleted, especially if they were created 'at the same level or above' where the object is deleted. e.g.
   on foo
     create control bar
     delete control bar
   end foo

Here the delete will free the object immediately (as the engine knows that it cannot have any internal references to it - in particular on the C stack).

It sounds like the problem you are having (assuming you aren't creating and deleting lots of controls) is to do with values and so...

Values are freed *as soon as* there is no longer any reference to them. 
In 6.7 and before that would be whenever a variable is changed (the old 
value was released immediately), or whenever the variable goes out of 
scope (e.g. locals in a handler get released when the handler ends, 
script locals are released when the object is deleted). In 7.0+ this 
happens as soon as there are no variables referencing the same instance 
of the value. e.g.
  (1) local tVariable1, tVariable2
  (2) put "foo" & "bar" into tVariable1
  (3) put tVariable1 into tVariable2
  (4) put empty into tVariable1

After step (3), tVariable1 and tVariable2 will reference the same value. At step (4) the reference tVariable1 holds will be removed, but the value will not be deleted (from memory) until tVariable2 changes, or goes out of scope. The general mechanism is that values are shared when copied into different variables, and are only copied when a variable is mutated. e.g.
  (1) local tVariable1, tVariable2
  (2) put "foo" & "bar" into tVariable1
  (3) put tVariable1 into tVariable2
  (4) put "baz" after tVariable2
  (5) put empty into tVariable1

Here, at step (4), the value referenced by tVariable2 will be copied (and so tVariable1 and tVariable2 will no longer reference the same value), and then changed. This means that at step (5) the value previously referenced by tVariable1 *will* be freed, because it is not shared with tVariable2 (obviously - because tVariable2 is no longer the same value!).

The reason I was being so paedagogic in the above is that it opens an
opportunity for you to potentially reduce the memory footprint of your 
dataset (which sounds like it is what is causing the problem) by doing 
some pre-processing and exploiting the fact that values are not copied 
until they are modified. Of course, I don't know what the structure of 
the data you are processing is - so I'm going to assume you are loading 
in lots of text files and breaking them up into pieces, presumably 
storing in arrays with the individual array elements being numbers and 
strings.

In this case there are a few interesting things to note about the
engine's implementation of values...

Array keys are *always* shared (up to case). When you do:

   put tElement into tArray[tKey]

The engine first 'uniques' tKey - this means it ensures that there is only one copy of tKey (up to case differences) in memory. So - for every single array in memory which contains a key "foo", the value representing the key "foo" will not be copied, just referenced from all the arrays. Note that "foo" and "Foo", whilst they look up the same element (unless caseSensitive is true), will be stored in memory as two different key strings - which leads to memory optimization tip 1:
   When constructing arrays from external data, where the case of the key is irrelevant, use:
     put X into tArray[toLower(Y)] -- or toUpper (whichever you prefer)
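
As a concrete sketch of that tip - entirely hypothetical, in that tRawData, the tab-delimited layout and the handler name are just assumptions about what your files and code might look like:

   -- build a lookup array from tab-delimited text, normalizing key case
   command buildLookup pRawData, @xLookup
     local tLine
     set the itemDelimiter to tab
     repeat for each line tLine in pRawData
       -- key in item 1, the rest of the line as the value (an assumed layout)
       put item 2 to -1 of tLine into xLookup[toLower(item 1 of tLine)]
     end repeat
   end command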

For the values bound to by keys, the story is different. If you do:

   put myString & "1" into tArray["foo"]
   put myString & "1" into tArray["bar"]

Then the two values of the keys "foo" and "bar" *will be different*. This is because they have been constructed differently.

You can optimize this for memory size by using another array to 'index' your string values:

   local sValueCache -- script-local: declare at script level so it persists between calls

   command shareAndStoreKey @xArray, pKey, pValue
     set the caseSensitive to true -- this is assuming your values are sensitive to case
     if pValue is not among the keys of sValueCache then
       put pValue into sValueCache[pValue]
     end if
     put sValueCache[pValue] into xArray[pKey]
   end command

After you have processed all your arrays like this, and 'put empty into sValueCache' - all string elements in your arrays which are case-sensitively the same will share the same value.
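
For example - a minimal sketch where minimizeMemoryOfArray and the tData variables are hypothetical names, assuming it lives in the same script as shareAndStoreKey and sValueCache:

   command minimizeMemoryOfArray @xArray
     local tKeys, tKey
     put the keys of xArray into tKeys
     repeat for each line tKey in tKeys
       shareAndStoreKey xArray, tKey, xArray[tKey]
     end repeat
   end command

   -- after loading everything:
   minimizeMemoryOfArray tDataA
   minimizeMemoryOfArray tDataB
   put empty into sValueCache -- release the index once the values are shared
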
Of course, you can play the same trick with arrays - although it is a 
little more tricky, admittedly.
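
One way you might approach it - again just a sketch (shareAndStoreSubArray and sSubArrayCache are made-up names), not something I've profiled, and it assumes arrayEncode produces identical bytes for arrays with identical contents, which is worth verifying - is to fingerprint each sub-array and share it through a cache in the same way:

   local sSubArrayCache -- script-local cache, as before

   command shareAndStoreSubArray @xArray, pKey, pSubArray
     local tFingerprint
     set the caseSensitive to true -- base64 fingerprints are case-sensitive
     -- use the (text-encoded) serialized form of the sub-array as the cache key
     put base64Encode(arrayEncode(pSubArray)) into tFingerprint
     if tFingerprint is not among the keys of sSubArrayCache then
       put pSubArray into sSubArrayCache[tFingerprint]
     end if
     put sSubArrayCache[tFingerprint] into xArray[pKey]
   end command
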
So, anyway, before anyone asks 'why doesn't the engine just do this?' 
(particularly since it does so for array keys) then the answer is 
performance. It is costly to work out which values (which are computed 
dynamically, or are substrings of another string in different places) 
are actually the same - thus you'd end up saving memory but costing 
performance if the engine uniqued *everything*.

So, the next question is probably going to be, 'why does the engine do it for array keys then?', and the answer here is that string comparison is slow - case-less string comparison more so. When you look up a key in an associative array, it might well take multiple string comparisons to find. By 'uniquing' the strings used as array keys, each of those comparisons becomes a constant-time operation once the engine has processed the lookup request, so finding the actual element you want is cheap. On balance, this means you save time - assuming that you are accessing your arrays much more frequently than building them - which is usually the case.

Now, all the above I say with caution - the engine may change how it works in the future. It might become more 'clever' in some cases, and less 'clever' in others; thus you should only go as far as trying to optimize your code for memory footprint (if you can afford the cost of the pre-processing) if YOU REALLY NEED TO.

Clearly, in your (Ben's) case you really do - you are hitting the Windows 2Gb process limit at the moment, and it sounds like it is a batch process running unattended, so an initial 'memory minimization process' run on the dataset is probably a cost you can afford to pay.

Anyway, without more details of what you are needing to do, the above might be completely useless...

Just my 2 pence.

Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps
