[go-nuts] Implementing method overloading using first class functions

Louki Sumirniy Thu, 03 May 2018 13:14:28 -0700

I have been struggling with getting the most optimal, non-reflective and 
performant way of  implementing something that is a bit like OOP method 
overloading. As some here may know, I am implementing a novel binary tree 
based on complete trees and using ideas from B-heaps in order to achieve 
maximum data cache locality in a search and sort algorithm.


Basically I want to avoid rewriting any code. Note that this is not really 
at all an OOP paradigm that I am trying to ape with this. I want to isolate 
a small set of type specific functions from the much larger general 
functions that deal with tree walking, inserting and deleting. The consumer 
of the code needs to see something that looks like a type specific library, 
and not have to concern themselves with any type assertions or anything 
that complicates their code so that in every way the calling code will look 
the same except for the type of the payload being different and I am not 
excessively concerned with a small amount of call overhead, as it is my 
belief that the data locality structure performance will far exceed the 
small extra cost for this encapsulation. If I am wrong, I suppose I can 
rework it using a code generator to spit out type specific code, but I 
don't want to do this at this stage because it complicates my workflow in 
completing the proof of concept.

So, I'll just briefly explain what I have worked out. Note that I am still 
really a beginner with Go coding, and my reason for posting this is to get 
a little more feedback in case there is any other matters I have not 
accounted for in my construction.

I have the base type that concerns itself with the walk, search, insert and 
delete functions in an interface. Some of the functions deal solely with 
the tree node payload type, and some also require as well tree walk 
functions that are aware only of the abstract characteristics of the 
concrete type, but have a small contact surface with the data type for the 
reason of the page structure as well as the use of zero as the empty node. 
(This theoretically could be  changed in an implementation where zero is a 
valid value but at the same time in a binary tree, implicitly zero is the 
left-edge node anyway and in a binary tree we are trying to keep a sort 
structure so there is only one place a valid null would ever be anyway)

What I have done is create a type struct that gathers all of these 
functions that touch the concrete type, and a set of slightly differently 
named functions from the ones the caller will use, that also need to be 
passed the structure gathering the type specific functions and the walk and 
search/insert delete functions that implicitly must be type aware but only 
of the abstract features of the data, greater, less, equal and empty.

The concrete typed library then contains the type specific functions, these 
are basically comparators, and then the end-user functions are written that 
call functions in the base library that requires access to these type 
specific functions, these are the walk functions (the walk to parent 
function does not need to interact with the concrete type, it only needs to 
understand the geometry of the data structure), but the left/right child 
walks need to be aware of the type as in the structure zero is my 
edge/empty sentinel. Then finally the maybe unique lateral walk functions 
which use the left/right walk functions, these functions need to use the 
simple left/right child walks but also need to know the boundaries of the 
data structure as the allocation of pages in the tree  needs to be sparse 
in order to minimise memory utilisation at the bottom edges. It is a 
complete tree implementation, essentially, but done in the same general way 
as B-heaps, in order to maximise the performance gain by staying within the 
CPU cache.

Of course, a simple code example is needed. First in the base library:

type O struct {
  IsLeft, IsRight, IsEqual, IsEmpty func() bool
  IsOutsideTree, IsInsideTree       func() bool
  LeftChild, RightChild             func()
}

type B interface {
  // These only deal with the concrete type
  IsLeft(interface{}) bool
  IsRight(interface{}) bool
  IsEqual(interface{}) bool
  IsEmpty() bool
  // These deal with the concrete type as well as the parent type
  BIsOutsideTree(O) bool 
  IsOutsideTree() bool
  BIsInsideTree(O) bool 
  IsInsideTree() bool
  BLeftChild(O)  
  LeftChild()
  BRightChild(O)
  RightChild()
  BLeft(O)
  Left()
  BRight(O)
  Right()
  BFind(O)
  BFind(O)
  Find(interface{}) error
  Binsert(O, interface{}) error
  Insert(interface{}) error
  BDelete(O, interface{}) error
  Delete() error
}

I have only included in the above the functions that actually have any 
contact with the concrete type. 

Then, just one example, of how I overload or whatever it might be called, 
so that the base library can depend on the composed/embedded derivative of 
the base interface:

// LeftChild -
func (b *BastU32) LeftChild() {
   b.BLeftChild(b.Overloads)
}

The function called in this can then use the functions passed into it 
without knowing the type, as you can see none of them accept parameters or 
return type specific values.

I bundle all of the functions in the structure at the top because while 
several only require one or two of these functions, some require all of 
them and passing the individual functions makes for a lot of typing in the 
concrete type library.

The b.Overloads - I am not sure, whether it should be pass by value or pass 
by reference. I think pass by reference adds an extra resolution step, but 
it is probably more expensive as every call of the functions will have to 
perform this dereference.

I hope this is clear enough and detailed enough to explain in adequately 
concrete detail what I am trying to achieve and how I am doing it. I don't 
think this is OOP patterning, as I only want to have one level of 
abstraction and the only reason is that it reduces the size of the code 
that needs to be maintained. The base library does not care what the 
payload type is, only how to do these comparisons (the first 4 in the 
interface) and the rest either need those comparators or need a function 
that needs the comparators.

Using first class functions seemed to be the simplest way to implement 
this. It was not otherwise possible to get the base library to use the type 
specific library functions other than by passing the functions and 
rewriting them to use the passed functions, and being there is a reasonably 
sized collection, with 8 functions, and especially the ones in the bottom 
half, nearly all of the set in the struct are required. 

I know the base library functions naming scheme is maybe a bit clumsy, but 
I don't think it's that important.

I think it would be nice if there was some less wordy way to perform the 
override but this would literally be the most direct low level procedural 
programming way of doing it anyway, and as I understand it OOP languages 
hide this but that's how they implement. The big advantage is I can keep 
all the tree and abstract type aware code in the base, and implementation 
for a new type is not excessively long winded. All the non-B-prefixed 
functions have to be written more or less the same as the leftchild example 
above, and the New function that returns the 'class' to the caller only 
directly see the type relevant stuff, the abstract stuff is not directly  
exported.

The concrete type library is all the user of this library needs to know in 
order to use it. They would have to go out of their way to tinker with the 
base library. Because of Go's package scoping rules everything has to be 
exported or this would not be possible to implement in separate packages, 
and stuffing the different type implementations would both be ugly in terms 
of naming the packages. Well, maybe I am wrong about that. If they all 
lived in the same package folder all the delicate internals could be not 
exported and thus not directly accessible.

I think that it is important to keep them apart so that another user of the 
library could either make their own independent repository that imports my 
base library, as there likely is a whole mountain of possible variations. 
For one thing, key/value pairs, sorted by value, where the payload data is 
a struct containing a hash key and something that refers to another data 
set. I designed this originally for a purpose involving searching for half 
hash-collisions, and so for that there would need to be distinct 
comparators that only compare the first or last half of the data in the 
payload, though in every other way the code would be the same. Possibly 
these comparators could be overloaded by a user so only the greater/less 
function (isleft isright) are different.

Thanks in advance for any feedback on this. I struggled for a long time to 
figure out how to do this and I couldn't quite grasp related things. I 
don't think there is really that many real cases where this kind of 
abstraction actually serves any purpose, but any kind of search/indexing 
system likely will need to be able to separate the payload from the 
organising structure. And in an application where several are used (again, 
my intended purpose for developing this) as well as being easier to 
maintain code, the tree side of things is only needed to be stored once as 
it is the same for any other payload data type library using this one, 
which would reduce pressure on the code cache as well.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

[go-nuts] Implementing method overloading using first class functions

Reply via email to