Re: [go-nuts] Returning a pointer or value struct.

Jason E. Aten Fri, 01 Nov 2024 12:14:41 -0700

Hi Tushar,

I think you are getting the tradeoffs and building intuition well.

p.s. Nits, if we want to be super pedantic (keeping in mind Ian's
note that there really are no hard and fast rules)...

1) you probably meant 3 words instead of 3 bytes 
(3 words at 8 bytes per word is 24 bytes; an 8x difference on amd64). 

2) I would not say "then only it makes sense to use pointers". 

Why? Because, as I said, my _default posture_ is to use pointers, almost 
always. 

If the object is bigger than 3 words, I'm going to use a pointer. If the
object is mutable, then we must use a pointer, for correctness, no choice
about it. Your statement reverses my presumptive stance.

Gentle reminder that De Morgan's laws tell
us !(A and B) == (!A or !B) != (!A and !B).  Here A = 3 words or fewer,
and B = immutable struct. You are talking about (!A and !B), which
is different from my heuristic: only when A and B => use value;
otherwise use pointer.
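
To make the logic concrete, here is a minimal sketch of the heuristic as two boolean functions. The names useValue and usePointer are made up for illustration; they only restate the rule of thumb from this thread and are not any official guideline.

```go
package main

import "fmt"

// useValue restates the heuristic: return a value only when the struct is
// small (A: 3 words or fewer) AND immutable (B). Hypothetical helper.
func useValue(small, immutable bool) bool {
	return small && immutable
}

// usePointer is the negation of useValue. By De Morgan's laws,
// !(A and B) is (!A or !B), which is not the same as (!A and !B).
func usePointer(small, immutable bool) bool {
	return !small || !immutable
}

func main() {
	// Print the full truth table for the two conditions.
	for _, small := range []bool{true, false} {
		for _, immutable := range []bool{true, false} {
			fmt.Printf("small=%-5v immutable=%-5v -> value=%-5v pointer=%v\n",
				small, immutable,
				useValue(small, immutable), usePointer(small, immutable))
		}
	}
}
```

Three of the four rows come out as "pointer"; only the small-and-immutable row comes out as "value".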

But let's take your hypothetical at face value. What does it imply?
We have a large and immutable object, and, under the circumstances
you describe, we measure and see that using a pointer gives a speed
advantage, as expected (because the object is large).

I would have been using a pointer _anyway_, because of the size
and the better performance that comes with it. The
immutability is a secondary consideration here: the pointer
gave us speed whether the object is immutable or not.

This is why I would not say, "then only it makes sense to use pointers". It
made sense to use pointers because it was a large
struct to begin with, and passing large structs by pointer is
almost always a win, performance-wise. That was the
aim of the heuristic. In your scenario you confirmed the
speed advantage of handling large objects by pointer.
You should have been preferring pointers once you knew
the object was large, rather than preferring values
to begin with.

We can consider another case that you might inquire
about: what about a big immutable value
that profiles faster when returned and passed around by value?

Then (this is expected to be rare, of course): sure, go
with the faster setup! Values are typically easier for users
to use. They are almost impossible to misuse.

On Friday, November 1, 2024 at 5:21:20 PM UTC Tushar Rawat wrote:

Hi Jason, 

Makes sense.

So ideally even for the types/structs which are *more than 3 bytes* and
*immutable*, if we are able to prove (with some profiling/perf. test) that
there is actually a significant performance improvement on replacing value
with pointer returns (because pointer copy is cheaper due to smaller size), then
only it makes sense to use pointers, otherwise the type being immutable
should be good enough reason to use value type, except the cases like
big-integers where the std library itself suggests using the pointer types
for performance gain.

On Friday, November 1, 2024 at 9:41:45 PM UTC+5:30 Jason E. Aten wrote:

Hi Tushar,

My rule of thumb in practice is: for returning structs, I almost always 
return a pointer. 

The exception, the only time I would return a value in general, is
when:

a) the returned value is _intended_ to be an immutable value, like a
time.Time or a string value;

*and*

b) the returned value is 3 words or less (a word is 8 bytes on a 64-bit
architecture like amd64).
Both time.Time (3 words) and string (2 words) meet these criteria.

Why this (b) heuristic? Because over 3 words,
I assume, as a heuristic that should be measured if it
matters, that it will be faster to copy the one-word pointer
than the struct value. A pointer of course
is more likely to be faster to copy initially, but slower if
there is a cache miss on use or if it induces a lot more garbage for
the garbage collector.

A value is more likely to be on the stack, so typically less garbage;
and since it's more likely to be on the stack, it is also more likely to be
in the hardware's cache lines, so use will be faster. You can see why you
have to measure your actual use to see which is faster, if
profiling shows that it is your bottleneck
and thus matters.
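
One cheap way to see the garbage side of this tradeoff is to count heap allocations with testing.AllocsPerRun. This is only a sketch: the point type, the constructors, and the sink variables are all made up for illustration, and the //go:noinline directives are there to stop the compiler from optimizing the calls away. It measures allocations only, not overall speed, so it is no substitute for profiling your real code.

```go
package main

import (
	"fmt"
	"testing"
)

// point is a hypothetical 3-word struct used only for this measurement.
type point struct{ x, y, z int64 }

// Package-level sinks keep the results alive so the calls are not eliminated.
var (
	sinkPtr *point
	sinkVal point
)

// newPointPtr returns a pointer; the struct escapes to the heap.
//
//go:noinline
func newPointPtr() *point { return &point{1, 2, 3} }

// newPointVal returns a copy of the struct; no heap allocation is needed.
//
//go:noinline
func newPointVal() point { return point{1, 2, 3} }

// allocsPtr reports the average heap allocations per pointer-returning call.
func allocsPtr() float64 {
	return testing.AllocsPerRun(1000, func() { sinkPtr = newPointPtr() })
}

// allocsVal reports the average heap allocations per value-returning call.
func allocsVal() float64 {
	return testing.AllocsPerRun(1000, func() { sinkVal = newPointVal() })
}

func main() {
	fmt.Println("allocs/call returning pointer:", allocsPtr())
	fmt.Println("allocs/call returning value:  ", allocsVal())
}
```

With the current gc toolchain I would expect the pointer version to allocate once per call and the value version not at all, but escape analysis decisions can change between compiler versions, so treat the numbers as something to measure rather than assume.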

And of course the other rules in the previously referenced 
guidelines provided by Ian would also apply.

So if you have a sync.Mutex value inside, or something
else that cannot be copied (sync.RWMutex, sync.WaitGroup),
then you _must_ return a pointer, per their documentation. Notice that if you
had a *sync.Mutex inside, then you might get away with returning a value, but
that gets tricky. Are shallow copies enforced, somehow? Will the user
make a mistake in copying them by value with a default shallow copy? Your
API design needs to balance the possibility of user error with space/time
efficiency.
The docs can say copies are forbidden (like sync or math/big below do),
but the user might not read the docs, or remember their rules.
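
As a minimal sketch of the mutex case: the Counter type and its methods below are hypothetical, made up just for this example. Because the struct embeds a sync.Mutex by value, the constructor returns a pointer and all methods use pointer receivers, so callers never copy the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter holds a sync.Mutex by value, so a Counter must not be copied
// after first use. Hypothetical type for illustration.
type Counter struct {
	mu sync.Mutex
	n  int
}

// NewCounter returns a pointer so that every caller shares the same mutex.
func NewCounter() *Counter { return &Counter{} }

// Inc increments the counter under the lock.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value reads the counter under the lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

func main() {
	c := NewCounter()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	fmt.Println(c.Value()) // all 100 increments are counted
}
```

If someone does write `copied := *c`, the copylocks check in go vet should flag it, which helps with the "user did not read the docs" problem but does not prevent it at compile time.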

More detail/an example:

The idea of an immutable value can be subtle. An integer (as in the math
concept of an integer that can grow towards infinity and possibly become
very big) is a good example of the tradeoffs.

Usually an integer value fits in a word, because _usually_ they
need only be under 64 bits (or 63 bits for signed), and can be represented with
an int (word sized) or int64. For example: if you are incrementing a 63-bit
positive integer once every clock cycle, and your clock rate is
an optimistic 10GHz, so 0.1 nanosecond per cycle, and assuming that said word
integer can be incremented in a single clock cycle, then a 63-bit integer
(2^63 values) could be incremented for 29 years before overflowing,
since 2^63/(1e10 increments/sec*60 sec/minute*60 minutes/hour*
24 hour/day*365.25 days/year) = 29.2271 years. Usually we assume
that our code is doing other things as well, and will be restarted and/or
ported to 128-bit or higher architectures before then.

But notice the distinction when the integers need to get bigger today, say
for checking the math involved with cryptography: the math/big package
uses pointers for big integers because very big integers are going to take
up many more words than 3. So the big package returns pointers and insists
on using pointers. But that can mean user error if the user accidentally
copies them by value. See the discussion at https://pkg.go.dev/math/big#Int

> Operations always take pointer arguments (*Int) rather
> than Int values, and each unique Int value requires its own
> unique *Int pointer. To "copy" an Int value, an existing
> (or newly allocated) Int must be set to a new value using
> the Int.Set <https://pkg.go.dev/math/big#Int.Set> method; shallow copies
> of Ints are not supported and may lead to errors.
>
> func (z *Int) Set(x *Int) *Int // signature of math/big.Int.Set()

Hope this helps capture some of the nuance. Generally pointers
are the safer bet, and values are a performance optimization.

Jason

On Thursday, October 31, 2024 at 7:47:25 PM UTC Tushar wrote:

Got it. I've read both the *references* shared; it seems these rules can be
used to understand if we should pass the value or pointer to a *function
argument*.
But can we really say that the same rules apply for the return types as
well?
I want to make an idiomatic decision about when to return a struct as a value
and when as a pointer.

Regards,
Tushar
On Thursday, October 31, 2024 at 11:03:13 PM UTC+5:30 Ian Lance Taylor 
wrote:

On Thu, Oct 31, 2024 at 6:24 AM Tushar wrote: 
> 
> I have seen a few places where functions return a struct as a value
> (for instance the time package, where a time is always returned as a value);
> however, at some places folks prefer to return a pointer to the struct
> instead of a copy.
>
> Which one is the idiomatic approach, and when should we prefer the other
> one?

There are no hard and fast rules. There are some guidelines at 
https://go.dev/wiki/CodeReviewComments#pass-values and 
https://go.dev/wiki/CodeReviewComments#receiver-type. 

Ian 

-- 
To view this discussion visit 
https://groups.google.com/d/msgid/golang-nuts/4c5ab7e5-33f5-486c-b231-fea6ecd24360n%40googlegroups.com.
