On Sat, Aug 13, 2011 at 10:02 PM, sebb <seb...@gmail.com> wrote: > On 12 August 2011 20:56, Gary Gregory <garydgreg...@gmail.com> wrote: >> Hello All: >> >> For a first cut at generics support in Codec, please checkout the >> branch >> https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics >> >> I wrote a migration guide in the root of the project called >> Codec2-Migration.htm. >> >> Let's discuss. > > The original code used Encoder and Decoder interfaces which operated > on Object, and extended these with > BinaryEncoder (byte[]) > StringEncoder(String) > > So for example StringEncoder classes need to implement > encode(Object) : Object > encode(String) : String > > As far as I can tell, those interfaces cannot be directly generified, > because type erasure causes a method signature clash. > So adding generics here means breaking binary compatibilty.
Hello Sebb, Thank you for taking the time to look this over. You are correct, a class cannot implement an interface twice, so it cannot implement the generics version of StringEncoder AND BinaryEncoder. > > Question is, what does adding generics to these interfaces actually provide? As you know. generics let us define precisely what types are supported and avoids the current bug prone guessing game for implementers and call sites. For example, look at the current mess in trunk's BinaryCodec: If I call encode(byte[]), I get a byte[] back. If I call encode(Object) with a byte[], I get a char[] back. Uh? Is the code wrong, is the Javadoc wrong, how am I supposed to know what is right? I say it's a mess. I would flip the question around: If I implemented this today (1) why would I not use generics, and (2) why would I want to support "Object encode(Object)". Codec 2.0 is the opportunity to clean this up. What I write about Encoders here applies to Decoders too. Most encode/decode methods look like this: public Object encode(Object raw) throws EncoderException { if (!(raw instanceof byte[])) { throw new EncoderException("argument not a byte array"); } return toAsciiChars((byte[]) raw); } There is always one (or more) if instanceof expr for a typed method call, and codec-styled class cast exception in the form an EncoderException.This does not smell OO to me. Tangent: Wouldn't this be better typed as a ClassCastException or an IllegalArgumentException anyway? Any advantage of this pattern (is there one? ;) is ruined by the foggy example above. I say foggy and not black box because we are FOSS. I also want typed asymmetric encoders, for example, for the new BM encoder, I want "Set<String> encode(String)", or "Set<CharSequence> encode(CharSequence)" Hex decoding is a similar but differently confusing example: If I call encode(byte[]), the result is: StringUtils.getBytesUnchecked(encodeHexString(array), getCharsetName()); encodeHexString(array) creates a string which encode(byte[]) uses with getBytesUnchecked and the char set name. So encodeHexString(array) calls "new String(encodeHex(data))". If I call encode(Object) with a byte[], the result is: encodeHex(byteArray) Why the difference with the charset name? Do I want to spend the time telling the difference? Is the code wrong? Is the Javadoc wrong? The Hex decode(byte[]|Object) methods suffer from the same problem. Every encode/decode(Object) typed method creates duplication and opportunities for bugs. So we have /at least/ three problems today with the current system which I claim would be eliminated by using generics because there would be one entry point for each type and it would get rid of encode/decode(Object) methods. > Would it not be just as effective to deprecate Decoder and Encoder, > and code can then use either BinaryEncoder or StringEncoder (etc)? We could do that too, but I would still want typed Decoder and Encoder to define codecs that do not fit in BinaryEncoder or StringEncoder. For an example different from the new BM encoder, we could recode the DoubleMetaphone encoder as DoubleMetaphoneResult<String, DoubleMetaphoneResult>. Today, if you want both encodings (normal and alternate,) you have to encode twice with a different parameter to encode(String, boolean) which is not in any interface (and both values are computed internally each time!) Granted, this problem could be solved without generics, but generics makes it obvious what the result type is, instead of reading possibly buggy Javadocs for a return typed as Object. Only changing the result type in 2.0 of "Object encode(Object)" from String to DoubleMetaphoneResult would be a runtime 'surprise' as opposed to a compile time 'surprise.' Catching problems the earlier, the better when migrating. Thank you for the discussion, :) Gary > > At the moment test code is of the form: > > Encoder enc = new Base64(); > ... > byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast > > However this could be written: > > BinaryEncoder enc = new Base64(); > ... > byte [] ba = enc.encode(binary)); // no cast needed > > >> I plan on not touching trunk until the generics code is flushed out >> and brought back to trunk. >> >> Thank you, >> Gary >> >> -- >> http://garygregory.wordpress.com/ >> http://garygregory.com/ >> http://people.apache.org/~ggregory/ >> http://twitter.com/GaryGregory >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Thank you, Gary http://garygregory.wordpress.com/ http://garygregory.com/ http://people.apache.org/~ggregory/ http://twitter.com/GaryGregory --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org