No, there is no JIRA issue for this yet. The scheme is not 100% safe in theory, but the way we currently use it is not problematic.

On Wed, 20 Jul 2016 at 16:07 Maximilian Michels <m...@apache.org> wrote:

> Is there a JIRA issue for this?
>
> On Mon, Jul 18, 2016 at 12:15 PM, Aljoscha Krettek <aljos...@apache.org> wrote:
> > Ah I see, Stephan and I had a quick chat and it's for cases where there are
> > 42s around the edges of the key/namespace.
> >
> > On Mon, 18 Jul 2016 at 11:51 Aljoscha Krettek <aljos...@apache.org> wrote:
> >
> >> In which cases is it not solved? Because then we should make sure to solve
> >> it.
> >>
> >> On Mon, 18 Jul 2016 at 10:33 Stephan Ewen <se...@apache.org> wrote:
> >>
> >>> Got it. But the ambiguity is not really solved by that, just lessened.
> >>>
> >>> On Sun, Jul 17, 2016 at 2:10 PM, Aljoscha Krettek <aljos...@apache.org> wrote:
> >>>
> >>> > @Stephan It's not about the serializers not being able to read the key. The
> >>> > key/namespace are never read again. It's just about the serialized form
> >>> > possibly being ambiguous since we don't control the TypeSerializers and
> >>> > there might be wonky var-length encoding schemes and what not.
> >>> >
> >>> > On Fri, 15 Jul 2016 at 19:20 Timothy Farkas <timothytiborfar...@gmail.com> wrote:
> >>> >
> >>> > > I've faced a similar issue when serializing data to a key value store. Not
> >>> > > sure how helpful it is for this case but two possible solutions I've used
> >>> > > for persisting keys and values under different namespaces to the same key
> >>> > > value store are:
> >>> > >
> >>> > > - have all namespaces be the same number of bytes and prefix each key with
> >>> > >   its namespace.
> >>> > > - Include the number of bytes in the name space and key. So the bytes would
> >>> > >   look like this:
> >>> > >
> >>> > > [name space num bytes] [ name space] [key num bytes] [key]
> >>> > >
> >>> > > Thanks,
> >>> > > Tim
> >>> > >
> >>> > > On Fri, Jul 15, 2016 at 9:45 AM, Stephan Ewen <se...@apache.org> wrote:
> >>> > >
> >>> > > > Every serializer should know how many bytes to consume. The key serializer
> >>> > > > should not need to look for 42 to know where to terminate.
> >>> > > >
> >>> > > > Otherwise this would be a problem case:
> >>> > > > key[42, 42] - 42 - namespace [42, 42, 42]
> >>> > > > key[42, 42, 42] - 42 - namespace [42, 42]
> >>> > > >
> >>> > > > On Fri, Jul 15, 2016 at 5:38 PM, Aljoscha Krettek <aljos...@apache.org> wrote:
> >>> > > >
> >>> > > > > I left that in on purpose to protect against cases where the combination
> >>> > > > > of key and namespace can be ambiguous. For example, these two combinations
> >>> > > > > of key and namespace have the same written representation:
> >>> > > > > key [0 1 2] namespace [3 4 5] (values in brackets are byte arrays)
> >>> > > > > key [0 1] namespace [2 3 4 5]
> >>> > > > >
> >>> > > > > having the "magic number" in there protects against such cases.
> >>> > > > >
> >>> > > > > On Fri, 15 Jul 2016 at 16:31 Stephan Ewen <se...@apache.org> wrote:
> >>> > > > >
> >>> > > > >> My assumption is that this was a sanity check that actually just stuck in
> >>> > > > >> the code.
> >>> > > > >>
> >>> > > > >> It can probably be removed.
> >>> > > > >>
> >>> > > > >> PS: Moving this to the dev@flink.apache.org list...
> >>> > > > >>
> >>> > > > >> On Fri, Jul 15, 2016 at 11:05 AM, 刘彪 <mmyy1...@gmail.com> wrote:
> >>> > > > >>
> >>> > > > >> > In AbstractRocksDBState.writeKeyAndNamespace():
> >>> > > > >> >
> >>> > > > >> > protected void writeKeyAndNamespace(DataOutputView out) throws IOException {
> >>> > > > >> >     backend.keySerializer().serialize(backend.currentKey(), out);
> >>> > > > >> >     out.writeByte(42);
> >>> > > > >> >     namespaceSerializer.serialize(currentNamespace, out);
> >>> > > > >> > }
> >>> > > > >> >
> >>> > > > >> > Why write a byte 42 between key and namespace? The keySerializer and
> >>> > > > >> > namespaceSerializer know their lengths. It seems we don't need this byte.
> >>> > > > >> >
> >>> > > > >> > Could anybody tell me what it is for? Is there any situation that we must
> >>> > > > >> > have this separator?
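
For reference, a minimal sketch of the collision Stephan describes and of the length-prefixed layout Tim suggests, in plain Java. The class and method names are hypothetical, and raw byte arrays stand in for whatever the key and namespace TypeSerializers would emit. This is not the Flink implementation, only an illustration of why a single separator byte lessens the ambiguity without removing it.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical illustration class; not part of Flink.
final class CompositeKeySketch {

    /** Mirrors the scheme under discussion: key bytes, the byte 42, namespace bytes. */
    static byte[] withSeparator(byte[] key, byte[] namespace) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.write(key);
        out.writeByte(42);
        out.write(namespace);
        out.flush();
        return bytes.toByteArray();
    }

    /** Tim's layout: [name space num bytes] [name space] [key num bytes] [key]. */
    static byte[] withLengthPrefix(byte[] key, byte[] namespace) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(namespace.length); // 4-byte big-endian length
        out.write(namespace);
        out.writeInt(key.length);
        out.write(key);
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The problem case from the thread: 42s around the edges.
        byte[] keyA = {42, 42};
        byte[] namespaceA = {42, 42, 42};
        byte[] keyB = {42, 42, 42};
        byte[] namespaceB = {42, 42};

        // Both pairs serialize to six 42s, so the separator scheme collides.
        System.out.println(Arrays.equals(
                withSeparator(keyA, namespaceA),
                withSeparator(keyB, namespaceB))); // prints true

        // The explicit lengths differ (3 vs 2), so the two encodings differ.
        System.out.println(Arrays.equals(
                withLengthPrefix(keyA, namespaceA),
                withLengthPrefix(keyB, namespaceB))); // prints false
    }
}
```

The trade-off is eight extra bytes per composite key for the two int lengths, in exchange for an encoding that stays unambiguous regardless of what the serializers write.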