Hello,

We already had that discussion in private, but here is an on-list summary:

On Mon, Jan 19, 2009 at 5:39 PM, Guilherme Blanco <guilhermebla...@gmail.com> wrote:
> Ok,
>
> We'll use this method inside Doctrine ORM version 2.0, scheduled to be
> released on September 1st, 2009.
>
> One main location where we are already using it is during the Hydration
> process: the process of grabbing a DB tuple and converting it into an
> Object graph. Here is the usage.
>
> Each Object of the graph is a Value Object
> (http://en.wikipedia.org/wiki/Value_object). So it does not have any
> mapping other than the to-be-persisted ones. No internal method
> implementation is needed. All Active Record-like actions are
> controlled by EntityManager.
>
> Based on that, we have a ClassMetadata that is cached based on class
> name (currently based on spl_object_id, but it's too resource-expensive
> and I'll change that). When we get the DB tuple, we need to
> find the exact ClassMetadata of that item and apply the specific
> DB/PHP type castings, for example. There is also property assignment,
> which is done thanks to the new Reflection API. We store the
> ReflectionProperty of each field and assign it when we have its
> definition.
>
> Another location where we rely on spl_object_id is inside UnitOfWork
> (http://martinfowler.com/eaaCatalog/unitOfWork.html). We generate a
> mapping of each Entity/Collection to be persisted/updated/deleted. We
> define the order in which these things are applied based first on the
> generated OID (the spl_object_id return value) and later on Topological
> Sorting (http://en.wikipedia.org/wiki/Topological_sorting). Finally, we
> start the transaction and the statements.
>
> The point is that we may be doing a huge hydration with lots
> of related objects. We may be dealing with a webpage that fetches
> more than 5000 records with even more associations. All of that
> at runtime. So I have to say performance is something VERY important
> for us.
>
> Why will we not use SplObjectStorage?
> Because it'll be used in different places and should share the same
> OID. Including couple of this component is not a viable idea since
> it'll lead to a more memory-expensive solution, which we're trying to
> optimize a lot, and it will also force us to include another get call
> (through a method call), which will result in an even slower
> implementation.
>
> Here are two files in which we have been using spl_object_id (changed
> now to spl_object_hash, since the idea is to update it with Marcus'
> suggestions):
> Object Driver for Hydration:
> http://trac.doctrine-project.org/browser/trunk/lib/Doctrine/ORM/Internal/Hydration/ObjectDriver.php
> UnitOfWork for Persistence:
> http://trac.doctrine-project.org/browser/trunk/lib/Doctrine/ORM/UnitOfWork.php
>
>
>
> Short version: Because we want a fast, easy way to associate
> information (temporarily) with an object. Most of the time we use the
> object id/hash as a key in an array. Basically, spl_object_hash is
> fine, it would just be nice if it could be improved in speed.
>

All those use cases come down to an [object => data] map, which can be
solved by SplObjectStorage:

$storage = new SplObjectStorage;
$storage[$obj1] = $data;
...
var_dump($storage[$obj1]);
...

There were three concerns:

1) Speed: the main argument for spl_object_id is speed.
=> SplObjectStorage is faster than an array with spl_object_hash (and
   can be made even faster).

2) $storage[$obj1]['index'] = 2; cannot be written directly. This is
   sadly a limitation of ArrayAccess.
=> It can be solved either by doing get+change+set, or by storing an
   ArrayObject instead of an array.

3) Memory: Since the object itself will be referenced in the storage,
   you'll have to delete it from every map in order for the GC to do
   its work.
=> This is actually a safeguard: an object's hash is only unique for as
   long as the object exists. Once the object is freed, its hash can be
   reused:

$a = new StdClass;
$h1 = spl_object_hash($a);
unset($a);
$b = new StdClass;
$h2 = spl_object_hash($b);
var_dump($h1===$h2); // bool(true)

Conclusion: If you clean up your objects without properly taking care of
the metadata stored in the array indexed by object_id, you'll get
unexpected results anyway.

So far it looks like SplObjectStorage is fine with those use cases. If
somebody has a practical (with code) use case in which SplObjectStorage
can't be sanely used and where spl_object_id is the only solution,
please shoot.

>
>
> It'll take me some time to dig into the PHP source to try to implement it.
> I'm not a C developer and it has been more than 4 years since I touched a
> single line of C code. Also, I can read the PHP source, but I'm not able
> to write it.
> I already spoke with Felipe, who will help me with questions about the
> src, but I cannot guarantee I'll be able to do the job.
>
>
> Regards,
>
> On Wed, Dec 17, 2008 at 7:19 PM, Marcus Boerger <he...@php.net> wrote:
>> Hello Etienne,
>>
>> Wednesday, December 17, 2008, 7:59:01 PM, you wrote:
>>
>>> Hello,
>>
>>> On Wed, Dec 17, 2008 at 7:29 PM, Lars Strojny <l...@strojny.net> wrote:
>>>> Hi Guilherme,
>>>>
>>>> thanks for moving the discussion to the list.
>>>>
>>>> On Wednesday, 17.12.2008, at 15:31 -0200, Guilherme Blanco wrote:
>>>> [...]
>>>>> It seems that Marcus controls the commit access to SPL. So I'm turning
>>>>> the conversation async, since I cannot find him online on IRC.
>>>>> So, can anyone review the patch, comment on it and commit if approved?
>>>>
>>>> Just for clarification, it is not about access, but about maintenance.
>>>> So if Marcus gives his go, we can happily apply the patch and add a few
>>>> tests (something you could start preparing now).
>>>>
>>>> cu, Lars
>>>>
>>
>>> Last time I checked with Marcus, there were concerns about disclosing
>>> a valid pointer to the user.
>>> I'd be happy to see a use case where this information is really needed
>>> heavily. The only real use case with heavy usage seems to be
>>> implementing sets of objects, but SplObjectStorage is here for that
>>> precise use case...
>>
>> Correct on all counts, Etienne. The patch might be a tiny bit faster but
>> exposes valid pointers, which is extremely bad and also allows other bad
>> things. That was the only reason I used md5 hashing. What I needed was
>> something that is really unique per object (object pointer or id plus
>> pointer to handler table). Since spl_object_hash() does not say how it
>> creates the hash, it should be fine to change the way it does it. Since
>> in a new session the hashes are of no more use, we can even do that in
>> any new version. However, I must still insist on not exposing any valid
>> information.
>>
>> Last but not least: in your code you know the maximum length of the
>> expression, so you can allocate the string and snprintf into it. Even
>> faster is to do a hexdump into a preallocated string. For the size use:
>> char* hash = (char*)safe_emalloc(sizeof(void*), 2, 1);
>> Now the dump of the two pointers.
>> This approach should make it a bit faster for you. Something that might
>> work is to create a random 128 bit hash key that is xored onto the hash
>> created from the two pointers. This hash key can be allocated for each
>> session the first time the function is used. If you do that I am more
>> than happy to accept that as a replacement for the current
>> spl_object_hash().
>>
>> marcus
>>
>>> Regards
>>
>>
>>> --
>>> Etienne Kneuss
>>> http://www.colder.ch
>>
>>> Men never do evil so completely and cheerfully as
>>> when they do it from a religious conviction.
>>> -- Pascal
>>
>>
>>
>>
>> Best regards,
>> Marcus
>>
>>
>
>
>
> --
> Guilherme Blanco - Web Developer
> CBC - Certified Bindows Consultant
> Cell Phone: +55 (16) 9215-8480
> MSN: guilhermebla...@hotmail.com
> URL: http://blog.bisna.com
> São Paulo - SP/Brazil
>

Regards,

--
Etienne Kneuss
http://www.colder.ch

Men never do evil so completely and cheerfully as
when they do it from a religious conviction.
-- Pascal
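
P.S. For concerns 2 and 3 in the summary, here is a minimal sketch of the two workarounds and of the explicit cleanup. It assumes the per-object metadata is a small array; $entity, $byCopy and $byHandle are illustrative names only and do not come from Doctrine or from the patch under discussion.

```php
<?php
// Illustrative names only: $entity, $byCopy and $byHandle are hypothetical.
$entity = new stdClass;

// Workaround A: get + change + set on a plain array value.
$byCopy = new SplObjectStorage;
$byCopy[$entity] = array('index' => 1);
$data = $byCopy[$entity];   // offsetGet() hands back a copy of the array
$data['index'] = 2;         // change the copy
$byCopy[$entity] = $data;   // write the changed copy back

// Workaround B: store an ArrayObject. Objects are passed by handle,
// so the nested write reaches the stored data directly.
$byHandle = new SplObjectStorage;
$byHandle[$entity] = new ArrayObject(array('index' => 1));
$byHandle[$entity]['index'] = 2;

var_dump($byCopy[$entity]['index']);   // int(2)
var_dump($byHandle[$entity]['index']); // int(2)

// Concern 3: each storage holds a reference to $entity, so the entry
// has to be removed explicitly once the metadata is no longer needed.
unset($byCopy[$entity], $byHandle[$entity]);
var_dump(count($byCopy));   // int(0)
var_dump(count($byHandle)); // int(0)
```

Workaround A keeps plain arrays at the cost of one extra read and write per update; workaround B trades that for one extra object per entry. Which one is cheaper for a large hydration would have to be measured.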