On 25/04/11 23:16, Anders Logg wrote: > On Mon, Apr 25, 2011 at 11:10:59PM +0100, Garth N. Wells wrote: >> >> >> On 25/04/11 23:04, Anders Logg wrote: >>> On Mon, Apr 25, 2011 at 10:56:25PM +0100, Garth N. Wells wrote: >>>> >>>> >>>> On 25/04/11 22:48, Anders Logg wrote: >>>>> On Mon, Apr 25, 2011 at 10:41:58PM +0100, Garth N. Wells wrote: >>>>>> >>>>>> >>>>>> On 25/04/11 22:33, Anders Logg wrote: >>>>>>> On Mon, Apr 25, 2011 at 10:26:18PM +0100, Garth N. Wells wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 25/04/11 22:08, Anders Logg wrote: >>>>>>>>> On Mon, Apr 25, 2011 at 07:40:21PM -0000, Garth Wells wrote: >>>>>>>>>> On 25/04/11 20:00, Johan Hake wrote: >>>>>>>>>>> On Monday April 25 2011 11:26:36 Garth Wells wrote: >>>>>>>>>>>> On 25/04/11 18:51, Anders Logg wrote: >>>>>>>>>>>>> On Mon, Apr 25, 2011 at 05:11:41PM -0000, Garth Wells wrote: >>>>>>>>>>>>>> On 25/04/11 17:53, Johan Hake wrote: >>>>>>>>>>>>>>> On Monday April 25 2011 08:59:18 Garth Wells wrote: >>>>>>>>>>>>>>>> On 25/04/11 16:47, Johan Hake wrote: >>>>>>>>>>>>>>>>> Commenting out the cache is really not a fix. The problem is >>>>>>>>>>>>>>>>> within >>>>>>>>>>>>>>>>> dolfin. Isn't there another way to deal with this? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It is a fix if the cache isn't needed. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sure. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> First: How much penalty are there with a disabled memory >>>>>>>>>>>>>>>>> cache. Maybe >>>>>>>>>>>>>>>>> the problem isn't that bad? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I don't get the point of this cache. The way it is now, a form >>>>>>>>>>>>>>>> is only >>>>>>>>>>>>>>>> preprocessed if it hasn't already been preprocessed, which >>>>>>>>>>>>>>>> seems ok to >>>>>>>>>>>>>>>> me. The old code tried to avoid some preprocessing, but it was >>>>>>>>>>>>>>>> highly >>>>>>>>>>>>>>>> dubious and I doubt that it was effective. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I think the preprocessing stage actually do take some time. >>>>>>>>>>>>>>> AFAIK the >>>>>>>>>>>>>>> preproces stage essentially do two things. It creates a >>>>>>>>>>>>>>> canonical >>>>>>>>>>>>>>> version of the Form so two Forms that are the same, but >>>>>>>>>>>>>>> constructed at >>>>>>>>>>>>>>> different times are beeing treated equal wrt form generation. >>>>>>>>>>>>>>> Then are >>>>>>>>>>>>>>> DOLFIN specific guys extracted. I am not sure what takes the >>>>>>>>>>>>>>> most >>>>>>>>>>>>>>> time. We should probably profiel it... But if it is the latter >>>>>>>>>>>>>>> we >>>>>>>>>>>>>>> could consider putting another cache in place which is more >>>>>>>>>>>>>>> robust wrt >>>>>>>>>>>>>>> changing DOLFIN objects. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It should be easy to avoid the overhead of preprocessing by >>>>>>>>>>>>>> keeping the >>>>>>>>>>>>>> object in scope. If the object changes, the only robust way to >>>>>>>>>>>>>> make sure >>>>>>>>>>>>>> that the form is the same as one in the cache is to compare all >>>>>>>>>>>>>> the >>>>>>>>>>>>>> data. This requires preprocessing the form, which then defeats >>>>>>>>>>>>>> the >>>>>>>>>>>>>> purpose of a cache. It may be possible to add a lightweight >>>>>>>>>>>>>> preprocess >>>>>>>>>>>>>> to UFL, but I don't think that it's worth the effort or extra >>>>>>>>>>>>>> complication. >>>>>>>>>>> >>>>>>>>>>> I think a light weight version might be the way to go. This is then >>>>>>>>>>> stored in >>>>>>>>>>> memory cache. If we are able to strip such a form for all DOLFIN >>>>>>>>>>> specific >>>>>>>>>>> things we would also prevent huge memory leaks with mesh beeing >>>>>>>>>>> kept. >>>>>>>>>>> >>>>>>>>>>> Then we always grab DOLFIN specific data from the passed form >>>>>>>>>>> instead of >>>>>>>>>>> grabbing from the cache. Not sure how easy this will be to >>>>>>>>>>> implement, but I >>>>>>>>>>> think we need to explore it, as the DOLFIN specific part of the >>>>>>>>>>> form really >>>>>>>>>>> has nothing to do with the generated Form. >>>>>>>>>>> >>>>>>>>>>> Martin: >>>>>>>>>>> Why is it important to have the _count in the repr of the form? I >>>>>>>>>>> guess that >>>>>>>>>>> is used in ufl algorithms? Would it be possible to include a second >>>>>>>>>>> repr >>>>>>>>>>> function, which did not include the count? This would then be used >>>>>>>>>>> when the >>>>>>>>>>> signature is checked for. We could then use that repr to generate a >>>>>>>>>>> form which >>>>>>>>>>> is stored in the memory cache. This would then be tripped for any >>>>>>>>>>> DOLFIN >>>>>>>>>>> specific objects. This should work as the _count attribute has >>>>>>>>>>> nothing to do >>>>>>>>>>> with what code gets generated, but it is essential for internal UFL >>>>>>>>>>> algorithms, right? >>>>>>>>>>> >>>>>>>>>>>>> I'm not very happy with this change. >>>>>>>>>>>> >>>>>>>>>>>> The bright side is that slow and correct is a better starting >>>>>>>>>>>> point than >>>>>>>>>>>> fast but wrong ;). >>>>>>>>>>>> >>>>>>>>>>>> An easy fix is to attach the preprocessed form to a Form object. >>>>>>>>>>>> This >>>>>>>>>>>> would work robustly if we can make forms immutable once they've >>>>>>>>>>>> been >>>>>>>>>>>> compiled. Is it possible to make a Python object immutable? >>>>>>>>>>> >>>>>>>>>>> We can probably overload all setattribtue methods which prohibits a >>>>>>>>>>> user to >>>>>>>>>>> write to these but it might not be possible to prohibit a user to >>>>>>>>>>> change >>>>>>>>>>> attributes on instances owned by the Form. I guess this is similare >>>>>>>>>>> to the >>>>>>>>>>> difficulties of preserving constness in C++, but I think it is even >>>>>>>>>>> harder in >>>>>>>>>>> Python. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> What if we have the FFC jit compiler return the preprocessed form, >>>>>>>>>> and >>>>>>>>>> inside dolfin.Form simply do >>>>>>>>>> >>>>>>>>>> class Form(cpp.Form): >>>>>>>>>> def __init__(self, form, . . .. ) >>>>>>>>>> .... >>>>>>>>>> >>>>>>>>>> (...., preprocessed_form) = jit(form, . . . . ) >>>>>>>>>> >>>>>>>>>> form = preprocessed_form >>>>>>>>>> >>>>>>>>>> ..... >>>>>>>>>> >>>>>>>>>> This way, form will have form_data, and the FFC jit function will >>>>>>>>>> know >>>>>>>>>> not to call ufl.preprocess. >>>>>>>>> >>>>>>>>> Here's another strange thing. In the JITObject class, we have two >>>>>>>>> functions: __hash__ and signature. As far as I understand, the first >>>>>>>>> is used to located objects (generated code/modules) in the Instant >>>>>>>>> in-memory cache, while the second is used for the on-disk cache. >>>>>>>>> >>>>>>>>> >From some simple tests I did now, it looks like the __hash__ function >>>>>>>>> does not need to any significant speedup. The JIT benchmark runs just >>>>>>>>> as fast if I call signature from within __hash__. >>>>>>>>> >>>>>>>>> Furthermore, the __hash__ function must also be broken since it relies >>>>>>>>> on calling id on the form. >>>>>>>>> >>>>>>>>> Ideally, we should get Instant to handle the caching, both in-memory >>>>>>>>> and on-disk, by providing two functions __hash__ (fast, for in-memory >>>>>>>>> cache) and signature (slow, for on-disk cache). >>>>>>>>> >>>>>>>>> Since __hash__ cannot call id, it must be able to attach a unique >>>>>>>>> string to the form (perhaps based on an internal counter in FFC). >>>>>>>>> My suggestion would be to add this to UFL, something like set_hash >>>>>>>>> and hash (which would return None if set_hash has not been called). >>>>>>>>> If Martin does not like that, we should be able to handle it on the >>>>>>>>> DOLFIN side. >>>>>>>>> >>>>>>>>> So in conclusion: no in-memory cache in FFC (handled by Instant) and >>>>>>>>> FFC attaches a hash to incoming forms so that Instant may recognize >>>>>>>>> them later. >>>>>>>>> >>>>>>>> >>>>>>>> The code that I disabled was caching preprocessed forms, so I don't see >>>>>>>> how this can be handled by Instant. >>>>>>> >>>>>>> The point would be that one could check that "hash" of the form (some >>>>>>> unique string) instead of computing the signature which involves >>>>>>> preprocessing the form. >>>>>>> >>>>>> >>>>>> How would the hash be computed? To check if the mesh has changed, my >>>>>> limited understanding is that the entire object would have to be >>>>>> serialised, and then a hash computed. How expensive is that? >>>>>> >>>>>> The issue that I ran into was not related to signatures. It was related >>>>>> to the non-UFL data that is attached to arguments. >>>>> >>>>> The hash would be unique to each form. It could just be a counter >>>>> value and the counter would be increased inside Instant for each >>>>> object it gets as input. >>>> >>>> But how does Instant know if a form is new? I also don't see why Instant >>>> should need to know if the mesh associated with a form has changed, but >>>> is for the rest the same. Wouldn't Instant need to be DOLFIN-aware? >>> >>> The hash() function would play the same role as the id() function >>> before with the difference that we can't get the same id for a new >>> form as for an old form that's gone out of scope. >>> >>> Instant should not need to know anything it just does this: >>> >>> check if object has a set_hash() function >>> if so, calls hash() to get the hash value >>> checks the cache for that hash value >>> if not, assign unique value by calling set_hash on the object >>> >>> We would need to make sure from the DOLFIN side that when we change a >>> Form, we also change the hash value (for example by setting it to >>> None) which would trigger the Instant disk cache. >>> >> >> This is the problem - how do we know that it's changed? > > Because we change it. When we modify a Form, we need to invalidate the > hash, by making sure that we modify the Form by calling member > functions that invalidate the hash (setting it to None). >
It's the ufl form that is changed, not the DOLFIN form. Garth >> The original issue is not related to disk vs memory cache, or Instant. >> It is how to avoid calling ufl.preprocess unnecessarily when the form >> repr() is unchanged but the mesh has been changed. > > Exactly, and that's why I suggest introducing the hash as something to > be checked instead of id() which turned out not to be robust. > > And furthermore, the call to id/hash should be handled by Instant, not > FFC. Since Instant handles JIT compiling and caching, FFC should only > need to call Instant, not handle extra caching. > > -- > Anders _______________________________________________ Mailing list: https://launchpad.net/~ffc Post to : ffc@lists.launchpad.net Unsubscribe : https://launchpad.net/~ffc More help : https://help.launchpad.net/ListHelp