[ 
https://issues.apache.org/jira/browse/KAFKA-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882758#comment-13882758
 ] 

Edward Ribeiro commented on KAFKA-1227:
---------------------------------------

Hello folks,

I've just started to look into your new API design and would like to register a 
few observations, from an API design perspective, for now. I hope you enjoy my 
suggestions and, please, let me know what you think about them. Excuse me in 
advance for the long message. Well, let's start:

It is a good practice to replace the implementation specific (List) of the 
parameter by a more general (abstract or interface) type so

{code}
Cluster(java.util.List<Node> nodes, java.util.List<PartitionInfo> partitions) 
{code}

becomes

{code}
Cluster(java.util.Collection<Node> nodes, java.util.Collection<PartitionInfo> 
partitions) 
{code}

This makes it possible to pass a Set, a List, for example. The same goes to 

{code}
bootstrap(java.util.List<java.net.InetSocketAddress> addresses) 
{code}

that becomes

{code}
bootstrap(java.util.Collection<java.net.InetSocketAddress> addresses) 
{code}

This can seem futile, but I have been parts of ZooKeeper API that need to fix a 
thing, but are literally freezed because their API published a concrete class. 
:( 

Also, the methods who return a collection should also return a more generic 
collection so that the swap in the future (say change the List by a Set) 
doesn't become too difficult. Therefore,

{code}
java.util.List<PartitionInfo>   partitionsFor(java.lang.String topic) 
{code}

becomes

{code}
java.util.Collection<PartitionInfo>     partitionsFor(java.lang.String topic) 
{code}

I have also looked into empty() method. Hey, it returns a new object each time 
it's called! See

{code}
    /**
     * Create an empty cluster instance with no nodes and no topic-partitions.
     */
    public static Cluster empty() {
        return new Cluster(new ArrayList<Node>(0), new 
ArrayList<PartitionInfo>(0));
    }
{code}

There's no need to do this. You can create a EMPTY_CLUSTER field as below and 
then return it each time the method is called. See
{code}
   private static final Cluster EMPTY_CLUSTER = new 
Cluster(Collections.<Node>emptyList(), Collections.<Node>emptyList());

    ... 

    /**
     * Create an empty cluster instance with no nodes and no topic-partitions.
     */
    public static Cluster empty() {
        return EMPTY_CLUSTER;
    }
{code}

This option saves creation of unnecessary objects, and makes it easy to perform 
comparison as "if (myNode == myCluster.empty())" in the client side.;)

I also see the necessity of adding an 'isEmpty()' method so that users can 
check if the Cluster is empty. In the case of adding an isEmpty then the 
declaration of EMPTY_CLUSTER becomes

{code}
   private static final Cluster EMPTY_CLUSTER = new 
Cluster(Collections.<Node>emptyList(), Collections.<Node>emptyList()) {
                                                        public boolean 
isEmpty() {
                                                                return true;
                                                        }
                                                }

{code}

As I said, it was just the first glance over the code. I can possibly have 
further suggestions, more algorithmic oriented or yet from an API design 
perspective, but that's all for now. I hope you like it.

Cheers,
Edward Ribeiro

> Code dump of new producer
> -------------------------
>
>                 Key: KAFKA-1227
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1227
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>         Attachments: KAFKA-1227.patch
>
>
> The plan is to take a dump of the producer code "as is" and then do a series 
> of post-commit reviews to get it into shape. This bug tracks just the code 
> dump.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to