Hi Alexander,

Sorry for the late reply on this one. I have embedded my questions and comments inline:

On Sun, Nov 15, 2015 at 7:07 PM, Alexander Filipchik <afilipc...@gmail.com> wrote:

> nodeIterator = store.range(
>     String.join(".", nodeId, String.valueOf(Character.MIN_VALUE)),
>     String.join(".", nodeId, String.valueOf(Character.MAX_VALUE)));
>

Theoretically, what you want is a prefix scan: the start key should be nodeId + '.' and the end key should be nodeId + '.' + maxId, where maxId consists entirely of Character.MAX_VALUE characters and is at least as long as the longest possible nodeId.

> I restreamed RocksDB changelog topic and I can see all these edges stored
> there, but query still returns only 4.3M nodes.
>

Could you clarify what you did to "see all these edges", and how you ran the query that "still returns only 4.3M nodes"?

> 1) Has anyone seen such a behaviour before?
>

Not that I am aware of.

> 2) What is the best way to debug it on a remote machine? Any particular
> logs to look for? Any RocksDB config params that should be enabled?
>

You can try adding a JMX debug port option to task.opts. With Samza 0.10 (latest from trunk), the JMX server port is reported by the AppMaster's web API. As for the state store config, you can try disabling the CachedStore to rule out any potential issues with cache management.

> 3) Is it a good idea to store a graph in such a format?
>

As long as you can partition the data based on nodeId, it should be fine.

>
> Thank you,
> Alex
>

Please let us know if you find any issues with your use case.

-Yi
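
P.S. In case it helps, here is a minimal sketch of the prefix-scan bounds described above. It is only an illustration under assumptions: a java.util.TreeMap stands in for the store, MAX_ID_LENGTH is a hypothetical upper bound on the id length, and the class and helper names are made up. A TreeMap orders strings by UTF-16 code unit, while a RocksDB-backed store typically orders keys by their serialized bytes, so the real ordering depends on the key serde you use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class PrefixScanSketch {
    // Hypothetical upper bound on the length of any node id (an assumption).
    static final int MAX_ID_LENGTH = 16;

    // Start of the scan: the node id followed by the separator.
    static String startKey(String nodeId) {
        return nodeId + ".";
    }

    // End of the scan: the same prefix followed by a maxId made only of
    // Character.MAX_VALUE, at least as long as the longest possible id.
    static String endKey(String nodeId) {
        char[] maxId = new char[MAX_ID_LENGTH];
        Arrays.fill(maxId, Character.MAX_VALUE);
        return nodeId + "." + new String(maxId);
    }

    // Collects every key in [startKey, endKey); the TreeMap is a stand-in
    // for the sorted key-value store.
    static List<String> edgesOf(NavigableMap<String, String> store, String nodeId) {
        return new ArrayList<>(
            store.subMap(startKey(nodeId), true, endKey(nodeId), false).keySet());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> store = new TreeMap<>();
        store.put("a.b", "edge");
        store.put("a.c", "edge");
        store.put("ab.c", "edge"); // different node id that shares a leading character
        store.put("b.a", "edge");
        System.out.println(edgesOf(store, "a")); // prints [a.b, a.c]
    }
}
```

Note that the key "ab.c" is excluded because the separator is part of the start and end keys, so only keys beginning with "a." fall inside the range.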