On Wed, 2021-02-24 at 19:18 -0600, Justin Pryzby wrote:
> On Wed, Feb 24, 2021 at 09:14:19AM +0100, Michael J. Baars wrote:
> > I've written this function to insert several rows at once, and noticed a
> > certain PostgreSQL overhead, as you can see from the log file. A lot more
> > data than the user data is actually sent over the net. This has a
> > noticeable impact on the user transmission speed.
> > 
> > I noticed that a libpq query always has a number of arguments of the 
> > following form:
> > 
> > Oid         paramt[cols]    =       { 1082, 701, 701, 701, 701, 701, 20, 701 };
> > int         paraml[cols]    =       { 4, 8, 8, 8, 8, 8, 8, 8 };
> > int         paramf[cols]    =       { 1, 1, 1, 1, 1, 1, 1, 1 };
> > 
> > result = PQexecParams(psql_cnn, (char *) &statement, 1, paramt,
> >                       (const char **) paramv, paraml, paramf, 1);
> > 
> > I think the 'paramf' is completely redundant. The data mode, text or
> > binary, is already specified in the last argument to 'PQexecParams' and
> > does not have to be repeated for every value. Am I correct?
> 
> The last argument is the *result* format.
> The array is for the format of the *input* bind parameters.
> 

Yes, but we are reading from and writing to the same table here. Why specify 
different formats for the input and the output exactly?

Is this vector being sent over the network?
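
As far as I can tell, the per-parameter array would let a single call mix
formats, which one global flag could not express; a minimal sketch of that
(the table t(a int8, b text) and the helper name are made up):

/* Sketch: one binary-format and one text-format parameter in the same
 * PQexecParams() call, with a single text-format result flag. */
#include <arpa/inet.h>   /* htonl */
#include <stdint.h>
#include <string.h>
#include <libpq-fe.h>

PGresult *insert_mixed(PGconn *cnn)
{
    int64_t  id = 42;
    /* Binary int8 values must be sent in network byte order. */
    uint32_t hi = htonl((uint32_t) (id >> 32));
    uint32_t lo = htonl((uint32_t) (id & 0xffffffff));
    char     idbuf[8];
    memcpy(idbuf,     &hi, 4);
    memcpy(idbuf + 4, &lo, 4);

    const char *paramv[2] = { idbuf, "hello" };
    Oid         paramt[2] = { 20, 25 };   /* int8, text */
    int         paraml[2] = { 8, 0 };     /* length ignored for text */
    int         paramf[2] = { 1, 0 };     /* binary, text */

    return PQexecParams(cnn, "INSERT INTO t (a, b) VALUES ($1, $2)",
                        2, paramt, paramv, paraml, paramf,
                        0 /* result format, independent of paramf */);
}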

In the log file you can see that the effective user data being written is only
913 kB, while 7946 kB is actually transmitted over the network when writing
one row at a time. That is an overhead of 770%!
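
That per-statement overhead is why I insert several rows at once in the first
place; roughly this idea (the table t(a), NROWS and the buffer sizes here are
made up):

/* Sketch: build "INSERT INTO t (a) VALUES ($1),($2),..." once and send
 * NROWS rows in a single round trip instead of NROWS round trips. */
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

#define NROWS 64

PGresult *insert_batch(PGconn *cnn, const char *values[NROWS])
{
    char stmt[4096] = "INSERT INTO t (a) VALUES ";
    char ph[16];

    for (int i = 0; i < NROWS; i++) {
        snprintf(ph, sizeof ph, "%s($%d)", i ? "," : "", i + 1);
        strcat(stmt, ph);
    }
    /* All parameters in text format: paramTypes, paramLengths and
     * paramFormats may then all be NULL. */
    return PQexecParams(cnn, stmt, NROWS, NULL, values, NULL, NULL, 0);
}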

> Regarding the redundancy:
> 
> https://www.postgresql.org/docs/current/libpq-exec.html
> > nParams
> >    The number of parameters supplied; it is the length of the arrays
> >    paramTypes[], paramValues[], paramLengths[], and paramFormats[]. (The
> >    array pointers can be NULL when nParams is zero.)
> > paramTypes[]
> >    Specifies, by OID, the data types to be assigned to the parameter
> >    symbols. If paramTypes is NULL, or any particular element in the array
> >    is zero, the server infers a data type for the parameter symbol in the
> >    same way it would do for an untyped literal string.
> > paramValues[]
> >    ...
> > paramLengths[]
> >    Specifies the actual data lengths of binary-format parameters. It is
> >    ignored for null parameters and text-format parameters. The array
> >    pointer can be null when there are no binary parameters.
> > paramFormats[]
> >    Specifies whether parameters are text (put a zero in the array entry
> >    for the corresponding parameter) or binary (put a one in the array
> >    entry for the corresponding parameter). If the array pointer is null
> >    then all parameters are presumed to be text strings.
> 
> nParams specifies the length of the arrays: if you pass an array of length
> greater than nParams, then the rest of the array is ignored.
> 
> You don't *have* to specify Types, and Lengths and Formats can be specified as
> NULL for text format params.
> 
> > semi-prepared
> 
> What does semi-prepared mean?
> 

I'm writing a total of 4096+ rows of n columns each to this table. Some
preparation is in order so that the timer can be placed around the actual
network transmission only: preparing the statement, i.e. the string of input
arguments and the input data structures, is done before the timer is started.
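
In outline it looks like this (a sketch with made-up names; only the execute
call is inside the timed region):

/* Sketch: prepare outside the timed region, then time only the
 * execution, so statement construction doesn't skew the measurement. */
#include <stdio.h>
#include <time.h>
#include <libpq-fe.h>

void timed_insert(PGconn *cnn, const char *value)
{
    struct timespec t0, t1;
    const char *paramv[1] = { value };

    /* Preparation happens before the timer starts. */
    PGresult *prep = PQprepare(cnn, "ins",
                               "INSERT INTO t (a) VALUES ($1)", 1, NULL);
    PQclear(prep);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    PGresult *res = PQexecPrepared(cnn, "ins", 1, paramv, NULL, NULL, 0);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    PQclear(res);

    printf("execute took %.6f s\n",
           (double) (t1.tv_sec - t0.tv_sec) +
           (t1.tv_nsec - t0.tv_nsec) / 1e9);
}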

Also, I noticed that when more columns are used, prepared statements with too
many rows * columns cannot be loaded into PostgreSQL, probably because a
prepared statement is limited to a certain size in memory. PQprepare does not
return an error of that sort; it only shows up during execution of the
statement.
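
So it seems necessary to check the result status of the execute call as well,
not just of PQprepare; a minimal sketch (the statement name is made up):

/* Sketch: PQprepare can report success even though the later execute
 * fails, so both result statuses need to be checked. */
#include <stdio.h>
#include <libpq-fe.h>

int exec_checked(PGconn *cnn, const char *name, int nparams,
                 const char *const *paramv)
{
    PGresult *res = PQexecPrepared(cnn, name, nparams,
                                   paramv, NULL, NULL, 0);
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        fprintf(stderr, "execute failed: %s", PQerrorMessage(cnn));
        PQclear(res);
        return -1;
    }
    PQclear(res);
    return 0;
}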

Thanks,
Mischa.


