Another thing to be aware of is that none of Geode's special serialization formats maintains referential integrity (RI). If you use DataSerializable or PDX and a single entry value references the same object twice, the deserialized value will reference two objects that have different identities but are clones of each other.

You can, however, use standard Java serialization (which does maintain referential integrity) when serializing the entry value. Geode uses DataSerializer.writeObject to serialize an entry value. If the top-level object is serialized with standard Java serialization, then every object in that graph is also serialized with Java serialization and you have referential integrity across that graph.

Your top-level class could instead be a PDX with two object fields that both reference a Java serializable object. In that case the first instance variable is serialized as a standard Java object graph (with RI across that graph). Then the second instance variable is serialized as its own standard Java object graph (with RI across the second graph). But the two graphs do not know about each other, so you would not have RI if one of the objects referenced by instance variable 1 is also referenced by instance variable 2. If the top-level class were also serialized with standard Java serialization, you would end up with just one serialized object graph and have RI across that entire graph.

By standard Java serialization I mean the different ways you can serialize with the JDK, such as Serializable and Externalizable. Keep in mind that Geode serialization takes precedence: if you have an object that implements both Externalizable and PdxSerializable, Geode will serialize it as PDX and not as an Externalizable.

Also keep in mind that Geode has special support for serializing various JDK classes and Java arrays. You can learn about these by looking at the DataSerializer class.
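To make the one-graph versus two-graph point concrete, here is a minimal sketch that uses only the JDK and does not call Geode at all. Shared, Holder and ReferentialIntegrityDemo are hypothetical names made up for this illustration. Serializing the same object in two separate round trips only stands in for a PDX top-level object whose two fields are each written as their own Java object graph; it is not a description of how Geode is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value class; any Serializable object would do.
class Shared implements Serializable {
    private static final long serialVersionUID = 1L;
    final String id;
    Shared(String id) { this.id = id; }
}

// Hypothetical top-level class whose two fields may point at the same object.
class Holder implements Serializable {
    private static final long serialVersionUID = 1L;
    final Shared first;
    final Shared second;
    Holder(Shared first, Shared second) { this.first = first; this.second = second; }
}

class ReferentialIntegrityDemo {
    // Writes one object graph to a single stream and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // One graph: the whole Holder goes through a single stream.
    static boolean sharedWhenSerializedAsOneGraph() {
        Shared s = new Shared("Bond123");
        Holder copy = roundTrip(new Holder(s, s));
        return copy.first == copy.second;
    }

    // Two graphs: the same object is serialized once per field, each time on its own.
    static boolean sharedWhenFieldsSerializedSeparately() {
        Shared s = new Shared("Bond123");
        Shared firstCopy = roundTrip(s);
        Shared secondCopy = roundTrip(s);
        return firstCopy == secondCopy;
    }

    public static void main(String[] args) {
        System.out.println("one graph:  " + sharedWhenSerializedAsOneGraph());
        System.out.println("two graphs: " + sharedWhenFieldsSerializedSeparately());
    }
}
```

In the first case both fields of the copy point at one object; in the second case you get two equal-looking but distinct copies, which is the loss of RI described above.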
For example, if Geode is asked to serialize an instance of HashMap, it uses its own special support for doing so, which does not preserve RI across the graph reached from that HashMap. But if you created your own class that implemented Map and just delegated to a HashMap instance, you could then have your wrapper class serialized with standard Java serialization and have RI across the entire map.
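A rough sketch of such a wrapper follows, again using only the JDK. DelegatingMap, Instrument and DelegatingMapDemo are hypothetical names, and extending AbstractMap is just a shortcut I chose to keep the Map implementation short. The sketch only shows that standard Java serialization keeps a value shared by two keys as one object. Whether Geode actually falls back to Java serialization for your wrapper depends on your configuration (for example, a registered PdxSerializer could claim the class), so treat that part as an assumption to verify.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical value class stored in the map.
class Instrument implements Serializable {
    private static final long serialVersionUID = 1L;
    final String instrumentId;
    Instrument(String instrumentId) { this.instrumentId = instrumentId; }
}

// Hypothetical wrapper: a Map that is not itself a HashMap but delegates to one.
class DelegatingMap<K, V> extends AbstractMap<K, V> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final HashMap<K, V> delegate = new HashMap<>();

    @Override public V put(K key, V value) { return delegate.put(key, value); }
    @Override public V get(Object key) { return delegate.get(key); }
    @Override public Set<Map.Entry<K, V>> entrySet() { return delegate.entrySet(); }
}

class DelegatingMapDemo {
    // Writes the wrapper (and everything it reaches) to one stream and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Puts the same Instrument under two keys and checks identity after the round trip.
    static boolean sharedValueSurvivesRoundTrip() {
        Instrument bond = new Instrument("Bond123");
        DelegatingMap<String, Instrument> map = new DelegatingMap<>();
        map.put("a", bond);
        map.put("b", bond);
        DelegatingMap<String, Instrument> copy = roundTrip(map);
        return copy.get("a") == copy.get("b");
    }

    public static void main(String[] args) {
        System.out.println("shared value preserved: " + sharedValueSurvivesRoundTrip());
    }
}
```

Because the wrapper and its HashMap are written through one ObjectOutputStream, both keys in the copy refer to a single Instrument.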

On Mon, Aug 8, 2016 at 11:13 AM, Barry Oglesby <bogle...@pivotal.io> wrote:

> We're actually saying the same thing two different ways.
>
> As soon as the entry is serialized, referential integrity to any other
> entry is lost. Serialization will occur between client and server or
> between peers. In your example once Entry2 is serialized, then deserialized
> on another member, Instrument@1234 will be equal to but not identical to
> the Instrument in Entry1.
>
> Thanks,
> Barry Oglesby
>
> On Mon, Aug 8, 2016 at 12:25 AM, Dharam Thacker <dharamthacke...@gmail.com> wrote:
>
>> Hi Barry,
>>
>> "*I think you're talking about referential integrity across region
>> entries - meaning an object in one region entry's value is the same as an
>> object in another region entry's value, and you want them to refer to the
>> same object.*"
>>
>> No, I am talking about equality of reference within same region's entry
>> value. Let me give quick example,
>>
>> [Instrument Region<String, Instrument>]
>>
>> Entry1: Key = Bond123, Value = Instrument@1234[Contents > *instrumentId = Bond123, Basket = null*] {Assume HashCode = 1234}
>> Entry2: Key = CDS100, Value = Instrument@1235[Contents > *instrumentId = CDS100, Basket = {Instrument@1234}*] {Assume HashCode = 1235}
>>
>> As you can notice above, CDS100 instrument is internally referring to
>> "Instrument@1234 (having hashcode = 1234)" from Basket, which already
>> exists within same region as "Entry1".
>>
>> Within plain java, this would be ideally same reference and not 2 objects
>> as Instrument@1234 resolves to same hashcode and equality within same
>> JVM, but what would be the situation in geode assuming it's replicated
>> region?
>>
>> Thanks,
>> Dharam
>>
>> - Dharam Thacker
>>
>> On Fri, Aug 5, 2016 at 10:55 PM, Barry Oglesby <bogle...@pivotal.io> wrote:
>>
>>> I think you're talking about referential integrity across region entries
>>> - meaning an object in one region entry's value is the same as an object in
>>> another region entry's value, and you want them to refer to the same object.
>>>
>>> Unfortunately, referential integrity is not maintained across region
>>> entries. As soon as an entry is serialized and deserialized, referential
>>> integrity is lost. Entries are kept in serialized form in almost all cases
>>> (a local region being an exception). As soon as a client does a put or a
>>> peer does a put into a partitioned region or entries are replicated between
>>> members, the entry is serialized. So, with this case:
>>>
>>> regionEntryA -> instrumentA -> basketA -> listOfInstruments
>>> regionEntryB -> instrumentB -> basketB -> listOfInstruments
>>>
>>> If the same instrument is in both lists, then as soon as you serialize
>>> these entries and deserialize them, you'll have two separate instrument
>>> instances.
>>>
>>> One thing you can do is to store instrument ids in your basket instead
>>> of actual instruments, and then lookup the instrument from the region based
>>> on its id. This would save in serialization speed and use less heap
>>> since you would only be storing and replicating ids rather than entire
>>> instruments from client to server and from peer to peer. The region lookup
>>> could be done in fromData if you're using DataSerializable, or just in a
>>> method call to the instrument to get its basket.
>>>
>>> Serializing an entire instrument every time you add or remove an
>>> instrument from its basket will probably be expensive though. Another thing
>>> you could do is to have a separate basket region that stores the basket
>>> instruments for an instrument. You'd probably just want to store the ids of
>>> the instruments rather than the instruments. I don't know how big that list
>>> can be or how often it is updated, but you could store one entry per
>>> instrument containing the listOfInstruments or an entry per basket
>>> instrument. The difference would be serializing an entire list whenever you
>>> add or delete from it -vs- serializing an entry when it is added or
>>> deleted. Lookup would be a get in the list case and a query in the per
>>> entry case. In either case, you'd get a list of instrument ids, then you
>>> could do a getAll or function to get the actual instruments.
>>>
>>> Thanks,
>>> Barry Oglesby
>>>
>>> On Thu, Aug 4, 2016 at 8:34 PM, Dharam Thacker <dharamthacke...@gmail.com> wrote:
>>>
>>>> OK so it means it follows same contract as per java hashcode equals
>>>> where in case hashcode and equality stays it will always refer to same
>>>> object when newly created provided their pdx serialization binary remains
>>>> same.
>>>>
>>>> I asked as I am going to store complex object graph which internally
>>>> may refer same instruments within same region.
>>>>
>>>> Please correct me or guide me if there is anything else as well I
>>>> should take care of.
>>>>
>>>> Thanks,
>>>> Dharam
>>>>
>>>> On 4 Aug 2016 23:17, "Michael Stolz" <mst...@pivotal.io> wrote:
>>>>
>>>>> If the referenced instruments have the same key they will just behave
>>>>> as updates to the same entry. You will not get duplicates.
>>>>>
>>>>> --
>>>>> Mike Stolz
>>>>> Principal Engineer, GemFire Product Manager
>>>>> Mobile: 631-835-4771
>>>>>
>>>>> On Thu, Aug 4, 2016 at 6:53 AM, Dharam Thacker <dharamthacke...@gmail.com> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I am bit new to this.
>>>>>>
>>>>>> Could some one help me to understand region storage? I have following
>>>>>> Instrument instance which internally refers to many Instrument instances
>>>>>> via Basket.
>>>>>>
>>>>>> There is 1 Instrument region which contains all instruments. Would
>>>>>> there be duplicate instrument objects created within Basket even though
>>>>>> similar Instrument object exists within region (By hashcode/equals -
>>>>>> Reference point of view)
>>>>>>
>>>>>> class Instrument {
>>>>>>     Basket basket;
>>>>>> }
>>>>>>
>>>>>> class Basket {
>>>>>>     List<Constituent> constituents;
>>>>>> }
>>>>>>
>>>>>> class Constituent {
>>>>>>     List<Instrument> instruments;
>>>>>> }
>>>>>>
>>>>>> Thanks,
>>>>>> - Dharam Thacker