SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data

2022-02-28 Thread Loris Bennett
Hi,

I have an SQLAlchemy class for an event:

  class UserEvent(Base):
      __tablename__ = "user_events"

      id = Column('id', Integer, primary_key=True)
      date = Column('date', Date, nullable=False)
      uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
      info = ??

The event may have arbitrary, but dict-like data associated with it,
which I want to add in the field 'info'.  This data never needs to be
modified, once the event has been inserted into the DB.

What type should the info field have?  JSON, PickleType, String, or
something else?

I couldn't find any really reliable-sounding information about the relative
pros and cons, apart from a Reddit thread claiming that pickled dicts
are larger than dicts converted to JSON or String.

Cheers,

Loris

-- 
This signature is currently under construction.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: When to use SQLAlchemy listen events

2022-02-28 Thread Loris Bennett
"Loris Bennett"  writes:

> Hi,
>
> I am wondering whether SQLAlchemy listen events are appropriate for the
> following situation:
>
> I have a table containing users and a table for events related to users
>
>   class User(Base):
>       __tablename__ = "users"
>
>       uid = Column('uid', String(64), primary_key=True)
>       gid = Column('gid', String(64), ForeignKey('groups.gid'),
>                    nullable=False)
>       lang = Column('lang', String(2))
>
>
>   class UserEvent(Base):
>       __tablename__ = "user_events"
>
>       id = Column('id', Integer, primary_key=True)
>       date = Column('date', Date, nullable=False)
>       uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
>       comment = Column('comment', String(256))
>
> (There are also analogous tables for groups and group events).
>
> The functions provided by the interface are things like the following
>
>   add_user(user, group, lang)
>   move_user(user, group)
>   delete_user(user)
>   warn_user(user, reason)
>
> Whereas 'add/move/delete_user' result in changes to the table 'users',
> 'warn_user' does not.  All should produce entries in the table
> 'user_events'. 
>
> There could be more functions similar to 'warn_user' that only create an
> entry in 'user_events'.  Potentially there could be a lot more of
> these than the 'user'-table-changing type.
>
> It seems that for the first three functions, capturing the resulting
> database changes in the table 'user_events' would be a standard use case
> for listen events.  However, the 'warn_user' function is different.
>
> So can/should I shoehorn the 'warn_user' function into being like the
> other three and use listen events, or should I just come up with my own
> mechanism which will allow any function just to add an entry to the
> events table?

So I just ended up writing my own decorator.  That seems more
appropriate and flexible in this instance.
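
The decorator itself was not posted, so the following is only a minimal
sketch of the idea. The names record_event and EVENTS are hypothetical, and
a plain list stands in for the 'user_events' table; a real version would
presumably add a UserEvent row through the SQLAlchemy session instead.

```python
import datetime
import functools

# Stand-in for the 'user_events' table (assumption for this sketch).
EVENTS = []


def record_event(func):
    """Record an event for `user` after the wrapped function succeeds."""
    @functools.wraps(func)
    def wrapper(user, *args, **kwargs):
        result = func(user, *args, **kwargs)
        EVENTS.append({
            "date": datetime.date.today(),
            "uid": user,
            "comment": func.__name__,
        })
        return result
    return wrapper


@record_event
def warn_user(user, reason):
    # Changes nothing in 'users'; only the event entry matters.
    return f"warned {user}: {reason}"


@record_event
def move_user(user, group):
    # A real version would also update the 'users' table here.
    return f"moved {user} to {group}"
```

Because the event is appended only after the wrapped call returns, a
function that raises leaves no entry behind. Both kinds of function
(table-changing or not) are handled the same way.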



Re: C is it always faster than nump?

2022-02-28 Thread Edmondo Giovannozzi
On Saturday, 26 February 2022 at 19:41:37 UTC+1, Dennis Lee Bieber wrote:
> On Fri, 25 Feb 2022 21:44:14 -0800, Dan Stromberg  
> declaimed the following:
> >Fortran, (still last I heard) did not support pointers, which gives Fortran 
> >compilers the chance to exploit a very nice class of optimizations you 
> >can't use nearly as well in languages with pointers. 
> >
> Haven't looked much at Fortran-90/95 then... 
> 
> Variable declaration gained a POINTER qualifier, and there is an 
> ALLOCATE intrinsic to obtain memory. 
> 
> And with difficulty one could get the result in DEC/VMS FORTRAN-77 
> since DEC implemented (across all their language compilers) intrinsics 
> controlling how arguments are passed -- overriding the language native 
> passing: 
> CALL XYZ(%val(M)) 
> would actually pass the value of M, not Fortran default address-of, with 
> the result that XYZ would use that value /as/ the address of the actual 
> argument. (Others were %ref() and %descr() -- descriptor being a small 
> structure with the address reference along with, say, upper/lower bounds; 
> often used for strings).
> -- 
> Wulfraed Dennis Lee Bieber AF6VN 
> wlf...@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

The latest Fortran revision is Fortran 2018.
A variable can also have the VALUE attribute, even though nowhere in the
standard is it written that this means passing the data by value. It just means
that if the variable is changed in a procedure, the changes don't propagate back
to the caller.
With iso_c_binding one can directly call a C function or let a Fortran
procedure appear as a C function. There is C_LOC, which gives the C address
of a variable if needed. Of course, since Fortran 2003 it is fully object oriented.
The claim that it is faster than C is mostly related to aliasing, which is
forbidden in Fortran. C introduced the "restrict" qualifier for the same
reason.
In Fortran you also have array operations like those in numpy.


Re: Threading question .. am I doing this right?

2022-02-28 Thread Robert Latest via Python-list
Chris Angelico wrote:
> I'm still curious as to the workload (requests per second), as it might still
> be worth going for the feeder model. But if your current system works, then
> it may be simplest to debug that rather than change.

It is by all accounts a low-traffic situation, maybe one request/second. But
the view in question opens four plots on one page, generating four separate
requests. So with only two clients and a blocking DB connection, the whole
application with eight uwsgi worker threads comes down. Now with the "extra
load thread" modification, the app worked fine for several days with only two
threads.

Out of curiosity I tried the "feeder thread" approach with a dummy thread that
just sleeps and logs something every few seconds, ten times total. For some
reason it sometimes hangs after eight or nine loops, and then uwsgi cannot
restart gracefully probably because it is still waiting for that thread to
finish. Also my web app is built around setting up the DB connections in the
request context, so using an extra thread outside that context would require
doubling some DB infrastructure. Probably not worth it at this point.
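
For reference, a dummy feeder of the kind described could be sketched as
below. This is not the code that was actually run; the names feeder and
start_feeder are made up, and whether a daemon thread plus a stop event
avoids the uwsgi restart problem is untested.

```python
import logging
import threading

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("feeder")


def feeder(stop, loops=10, interval=3.0):
    """Log something every `interval` seconds, `loops` times in total."""
    for i in range(loops):
        # wait() returns True as soon as `stop` is set, so a shutdown
        # never has to sit out a full sleep.
        if stop.wait(interval):
            return
        log.info("feeder loop %d of %d", i + 1, loops)


def start_feeder(loops=10, interval=3.0):
    """Start the feeder as a daemon thread; return (thread, stop_event)."""
    stop = threading.Event()
    thread = threading.Thread(
        target=feeder, args=(stop, loops, interval), daemon=True
    )
    thread.start()
    return thread, stop
```

Calling stop.set() from a shutdown hook ends the loop promptly, and the
daemon flag means the interpreter does not wait for the thread on exit.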


Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data

2022-02-28 Thread Albert-Jan Roskam
   On Feb 28, 2022 10:11, Loris Bennett  wrote:

 Hi,

 I have an SQLAlchemy class for an event:

   class UserEvent(Base):
       __tablename__ = "user_events"

       id = Column('id', Integer, primary_key=True)
       date = Column('date', Date, nullable=False)
       uid = Column('gid', String(64), ForeignKey('users.uid'),
                    nullable=False)
       info = ??

 The event may have arbitrary, but dict-like data associated with it,
 which I want to add in the field 'info'.  This data never needs to be
 modified, once the event has been inserted into the DB.

 What type should the info field have?  JSON, PickleType, String, or
 something else?

 I couldn't find any really reliable sounding information about the relative
 pros and cons, apart from a Reddit thread claiming that pickled dicts
 are larger than dicts converted to JSON or String.

 Cheers,

 Loris

   
   I think you need a BLOB:
   https://docs.sqlalchemy.org/en/14/core/type_basics.html#sqlalchemy.types.LargeBinary


Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data

2022-02-28 Thread Robert Latest via Python-list
Albert-Jan Roskam wrote:
>  The event may have arbitrary, but dict-like data associated with it,
>  which I want to add in the field 'info'.  This data never needs to be
>  modified, once the event has been inserted into the DB.
>
>  What type should the info field have?  JSON, PickleType, String, or
>  something else?
>
>  I couldn't find any really reliable sounding information about the relative
>  pros and cons, apart from a Reddit thread claiming that pickled dicts
>  are larger than dicts converted to JSON or String.

I've done exactly this. Since my data was strictly ASCII I decided to go for
JSON. But in the end you're the only one who can decide this because only you
know the data. That's why you won't find any hard and fast rule for this.
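
The size claim from the Reddit thread is easy to check against your own
data. A rough sketch with a made-up info dict follows; the numbers depend on
the data and on the pickle protocol, so no result is asserted here.

```python
import json
import pickle

# Hypothetical event payload, ASCII only.
info = {"reason": "quota exceeded", "limit_gb": 100, "notified": True}

# json.dumps() escapes non-ASCII by default, so ASCII encoding is safe.
as_json = json.dumps(info).encode("ascii")
as_pickle = pickle.dumps(info)

print("JSON bytes:  ", len(as_json))
print("pickle bytes:", len(as_pickle))

# Both formats round-trip this particular dict unchanged.
assert json.loads(as_json) == info
assert pickle.loads(as_pickle) == info
```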


Re: Getting Syslog working on OSX Monterey

2022-02-28 Thread Peter J. Holzer
On 2022-02-27 22:16:54 +, Barry wrote:
> If you look at the code of the logging module's syslog handler you will see
> that it does not use syslog. It’s assuming that it can network to a syslog
> listener.
> Such a listener is not running on my systems as far as I know.
> 
> I have always assumed that if I want a logger syslog handler that I would have
> to implement it myself. So far I have code that uses syslog directly and have
> not written that code yet.

What do you mean by using syslog directly? The syslog(3) library
function also just sends messages to a "syslog listener" (more commonly
called a syslog daemon) - at least on any unix-like system I'm familiar
with (which doesn't include MacOS). It will, however, always use the
*local* syslog daemon - AFAIK there is no standard way to open a remote
connection (many syslog daemons can be configured to forward messages to
a remote server, however).

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Getting Syslog working on OSX Monterey

2022-02-28 Thread Barry Scott


> On 28 Feb 2022, at 21:41, Peter J. Holzer  wrote:
> 
> On 2022-02-27 22:16:54 +, Barry wrote:
>> If you look at the code of the logging module's syslog handler you will see
>> that it does not use syslog. It’s assuming that it can network to a syslog
>> listener.
>> Such a listener is not running on my systems as far as I know.
>> 
>> I have always assumed that if I want a logger syslog handler that I would
>> have to implement it myself. So far I have code that uses syslog directly and
>> have not written that code yet.
> 
> What do you mean by using syslog directly? The syslog(3) library
> function also just sends messages to a "syslog listener" (more commonly
> called a syslog daemon) - at least on any unix-like system I'm familiar
> with (which doesn't include MacOS). It will, however, always use the
> *local* syslog daemon - AFAIK there is no standard way to open a remote
> connection (many syslog daemons can be configured to forward messages to
> a remote server, however).

I'm re-reading the code to check on what I'm seeing. (It's been a long
time since I last looked deeply at this code.)

You can write to /dev/log if you pass that to
SysLogHandler(address='/dev/log'), but the default is to use a socket
to talk to a network listener on localhost:514. There are no daemons
listening on port 514 on my Fedora systems or macOS.

That is not what you would expect as the default if you are using the C
API.

What you do not see in SysLogHandler() is an import of syslog,
and hence it's not using openlog() etc. from the syslog API.
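
A small sketch of how one might prefer the local socket over the UDP
default. The helper name make_syslog_handler is hypothetical; the two
socket paths are the ones usually cited for Linux and macOS, and whether
messages then show up in the system log on Monterey is not something this
sketch can promise.

```python
import logging
import logging.handlers
import os


def make_syslog_handler():
    """Prefer a local syslog socket, else fall back to the default."""
    # /dev/log is usual on Linux; /var/run/syslog on macOS.
    for path in ("/dev/log", "/var/run/syslog"):
        if os.path.exists(path):
            try:
                return logging.handlers.SysLogHandler(address=path)
            except OSError:
                continue  # path exists but is not a usable socket
    # Default: UDP datagrams to ('localhost', 514).
    return logging.handlers.SysLogHandler()


logger = logging.getLogger("demo")
logger.addHandler(make_syslog_handler())
logger.warning("hello from SysLogHandler")
```

With the UDP fallback the message is simply sent to port 514 whether or not
anything is listening there, which matches the behaviour described above.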

Barry





Re: Timezone for datetime.date objects

2022-02-28 Thread Morten W. Petersen
Hi Chris, Cameron.

Well, let's say I specify the datetime 2022-02-22 02:02 (AM). I think
everyone could agree that it also means 2022-02-22 02:02:00:00, to
2022-02-22 02:02:59:59.

And I think the same applies for a date. If the pipes are clogged and I
can't take (give) a shit, a shower or do anything else involving fluids, I
can just leave the keys under the doormat, agree a date with the
plumber, and go off to a friend's or relative's place for a couple of days
while waiting for the plumber to do the necessary work.

Usually that would imply that the plumber visits from 07:00 to 15:00 on the
given date, with an implicit timezone. It could also mean that the plumber
shows up at 01:00 at night and fixes it, or at 18:00 in the evening.

If a newspaper talks about New Year's celebrations, and specifically talks
about what happens on the 1st of January, this could mean at 00:01, or a
later dinner party at 20:00. But the celebration that starts at midnight
doesn't start at the same moment all over the world.  So context, or
location and implicit timezone, do matter.

I was also thinking of specifying some range objects for my needs, as that
makes sense from what I've worked with earlier. A range also makes sense
for a month or a year (or even a decade).

But the point is that a date is not just a flat, one-dimensional object.

Regards,

Morten

On Sun, Feb 27, 2022 at 11:11 PM Chris Angelico  wrote:

> On Mon, 28 Feb 2022 at 08:51, Cameron Simpson  wrote:
> >
> > On 27Feb2022 11:16, Morten W. Petersen  wrote:
> > >I was initially using the date object to get the right timespan, but
> > >then found that using the right timezone with that was a bit of a pain.
> > >So I went for the datetime object instead, specifying 0 on hour, minute
> > >and second.
> > >
> > >What's the thinking behind this with the date object?  Wouldn't it be
> > >nice to be able to specify a timezone?
> >
> > This has come up before. My own opinion is that no, it would be a bad
> > idea. You're giving subday resolution to an object which is inherently
> > "days". Leaving aside the many complications it brings (compare two
> > dates, now requiring timezone context?) you've already hit on the easy
> > and simple solution: datetimes.
> >
> > I'd even go so far as to suggest that if you needed a timezone for
> > precision, then dates are the _wrong_ precision to work in.
> >
>
> I would agree. If you have timestamps and you're trying to determine
> whether they're within a certain range, and timezones matter, then
> your range is not days; it begins at a specific point in time and ends
> at a specific point in time. Is that point midnight? 2AM? Start/close
> of business? It could be anything, and I don't see a problem with
> requiring that it be specified.
>
> ChrisA


-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/


Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data

2022-02-28 Thread Greg Ewing

On 1/03/22 6:13 am, Albert-Jan Roskam wrote:

> I think you need a BLOB:
> https://docs.sqlalchemy.org/en/14/core/type_basics.html#sqlalchemy.types.LargeBinary


That won't help on its own, since you still need to choose a
serialisation format to store in the blob.

I'd be inclined to use JSON if the data is something that can
be easily represented that way.
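
To illustrate the point, here is a sketch that uses the standard-library
sqlite3 module rather than SQLAlchemy, purely to stay dependency-free. The
table layout and the info dict are made up for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_events (id INTEGER PRIMARY KEY, info BLOB)")

info = {"reason": "quota exceeded", "limit_gb": 100}

# The BLOB column only stores bytes; choosing JSON (or pickle) as the
# encoding is still up to the application.
payload = json.dumps(info).encode("utf-8")
conn.execute("INSERT INTO user_events (info) VALUES (?)", (payload,))

(stored,) = conn.execute("SELECT info FROM user_events").fetchone()
restored = json.loads(stored.decode("utf-8"))
```

The column type settles nothing about the format: swapping json.dumps()
for pickle.dumps() would work against the same BLOB column.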

--
Greg



Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data

2022-02-28 Thread Cameron Simpson
On 28Feb2022 10:11, Loris Bennett  wrote:
>I have an SQLAlchemy class for an event:
>
>  class UserEvent(Base):
>      __tablename__ = "user_events"
>
>      id = Column('id', Integer, primary_key=True)
>      date = Column('date', Date, nullable=False)
>      uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
>      info = ??
>
>The event may have arbitrary, but dict-like data associated with it,
>which I want to add in the field 'info'.  This data never needs to be
>modified, once the event has been inserted into the DB.
>
>What type should the info field have?  JSON, PickleType, String, or
>something else?

I would use JSON; it expresses dicts well, provided the dicts contain
only basic types (strings, numbers, other dicts/lists of basic types
recursively).

I have personal problems with pickle because non-Python code can't read
it.
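
A quick sketch of what "basic types" means in practice, using made-up
example dicts. The default=str workaround is just one option and loses the
original type on the way back.

```python
import datetime
import json

# Strings, numbers, booleans, lists and nested dicts round-trip as-is.
basic = {"reason": "quota", "limits": [100, 200], "extra": {"ok": True}}
assert json.loads(json.dumps(basic)) == basic

# A date is not a basic JSON type, so plain dumps() refuses it.
with_date = {"expires": datetime.date(2022, 2, 28)}
try:
    json.dumps(with_date)
except TypeError as exc:
    print("not serialisable:", exc)

# One workaround: convert unknown objects to strings on the way out.
# The value comes back as a str, not a date.
text = json.dumps(with_date, default=str)
```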

Cheers,
Cameron Simpson 


Re: Timezone for datetime.date objects

2022-02-28 Thread Chris Angelico
On Tue, 1 Mar 2022 at 09:28, Morten W. Petersen  wrote:
>
> Hi Chris, Cameron.
>
> Well, let's say I specify the datetime 2022-02-22 02:02 (AM). I think 
> everyone could agree that it also means 2022-02-22 02:02:00:00, to 2022-02-22 
> 02:02:59:59.
>

Not sure how many :59s you want there :) I'm going to assume you mean
"02:02:00" to "02:02:59".

> And I think the same applies for a date. If the pipes are clogged and I can't
> take (give) a shit, a shower or do anything else involving fluids, I can just
> leave the keys under the doormat, agree a date with the plumber, and go
> off to a friend's or relative's place for a couple of days while waiting for
> the plumber to do the necessary work.
>

That is one of the fundamental differences between humans and
computers. Humans are very VERY sloppy with time descriptions. With
computers, it's much better to be clear about time ranges; a time does
not imply a specific window size. (And autistic humans are more like
computers.)

> Usually that would imply that the plumber visits from 07:00 to 15:00 on the 
> given date, with an implicit timezone. It could also mean that the plumber 
> shows up at 01:00 at night and fixes it, or at 18:00 in the evening.
>

Around here, that means the plumber might visit between 09:00 and
17:00, but also might not. Humans are sloppy.

> If a newspaper talks about New Year's celebrations, and specifically talks
> about what happens on the 1st of January, this could mean at 00:01, or a
> later dinner party at 20:00. But the celebration that starts at midnight
> doesn't start at the same moment all over the world.  So context, or location
> and implicit timezone, do matter.
>

Yes, it matters... and you can't depend on newspapers to get these
things correct, because humans are sloppy. For instance, a natural
disaster in some other place in the world might be reported with the
timezone where it happened ("the hurricane that hit New Orleans on
Thursday"), but also might use the timezone where the newspaper is
being published ("last night, a hurricane devastated New Orleans"). So
the implicit timezone matters, but is also very unclear.

> I was also thinking of specifying some range objects for my needs, as that
> makes sense from what I've worked with earlier. A range also makes sense
> for a month or a year (or even a decade).
>
> But the point is that a date is not just a flat, one-dimensional object.
>

If your needs require a range, then use a range (or a pair of
datetimes). A date does not imply a range.
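
As a sketch of "use a pair of datetimes": the helper day_range below is a
hypothetical name, and a fixed UTC+1 offset stands in for a real timezone
so that the example needs no tz database.

```python
import datetime as dt


def day_range(day, tz):
    """Return (start, end) for `day` in `tz`; `end` is exclusive."""
    start = dt.datetime.combine(day, dt.time(0, 0), tzinfo=tz)
    return start, start + dt.timedelta(days=1)


# Fixed-offset zone (assumption): UTC+1, with no DST handling.
cet = dt.timezone(dt.timedelta(hours=1))
start, end = day_range(dt.date(2022, 2, 22), cet)

# 23:30 UTC on the 21st is already 00:30 on the 22nd in UTC+1.
ts = dt.datetime(2022, 2, 21, 23, 30, tzinfo=dt.timezone.utc)
in_day = start <= ts < end
```

The comparison works across timezones because both sides are aware
datetimes; the date on the timestamp itself (the 21st) is irrelevant.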

Side point: It's actually not that uncommon for a day to start or end
at some time other than midnight. For instance, here's the timetable
for one of Melbourne's railway lines:

https://d309ul1fvo6zfp.cloudfront.net/1645676503802/train-4-2022-02-28-2022-03-02.pdf

The last outbound train on a Friday night (scroll down to page 8)
departs at 1:20AM from Dandenong Station and arrives at 1:34AM at
Cranbourne Station. It's the same train that departed Westall Station
at 1:03AM. Whichever way you measure it, that service most definitely
runs entirely after midnight, but it counts as a Friday service, not a
Saturday one. So if I were to ask you "how many services run on a
Friday?", you should count this one, regardless of what your phone
says the day is. The best way to define the day would be 3AM to 3AM.
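
A sketch of that 3AM-to-3AM definition. The function name service_date and
the specific calendar date are made up for illustration; only the 1:20AM
departure time comes from the timetable.

```python
import datetime as dt

# Assumed cut-off: the service day runs from 03:00 to 03:00.
DAY_START = dt.timedelta(hours=3)


def service_date(departure):
    """Map a departure time to the timetable day it belongs to."""
    return (departure - DAY_START).date()


# 1:20AM on Saturday 5 March 2022 still counts as Friday's service.
late_train = dt.datetime(2022, 3, 5, 1, 20)
print(service_date(late_train).strftime("%A"))
```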

But that's the railways. Here's a NightRider bus timetable:

https://d309ul1fvo6zfp.cloudfront.net/1645676503802/bus-15131-2021-10-29-2022-12-31.pdf

How many services does THIS route run on a Saturday? Four or five
(depending on direction). It would be best to define THIS day from
midnight, or maybe even noon. The definition is context-sensitive, so
if you need a range, *be explicit*. It's okay for the range to be
implicit to your users, but make it explicit in your code:

day = datetime.datetime(y, m, d, 4, 0, 0), datetime.timedelta(days=1)

Python doesn't have a datetimerange object, so I'd use either a
(datetime, timedelta) pair or a tuple of two datetimes. In some
contexts, it might be safe to let the timedelta be implicit, but at
the very least, it's worth being clear that "this day" means "starting at
4AM on this day", or whatever it may be.
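
A minimal sketch of working with such a (datetime, timedelta) pair. The
helper name contains and the example date are assumptions; the 4AM start
follows the `day = ...` line earlier in this message.

```python
import datetime

# (start, length) pair with example values filled in for y, m, d.
day = datetime.datetime(2022, 2, 28, 4, 0, 0), datetime.timedelta(days=1)


def contains(rng, ts):
    """True if `ts` falls in the half-open range [start, start + length)."""
    start, length = rng
    return start <= ts < start + length
```

Making the range half-open means consecutive days never overlap: 04:00 on
the next morning belongs to the next day only.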

Even when you describe a minute, it is most definitely not 100% clear
whether it means 02:02:00 to 02:03:00 (inclusive/exclusive), 02:02:00
to 02:02:01 (inc/exc), or the exact instant at 02:02:00. All three are
valid meanings, and a full timerange is the only way to be clear.

ChrisA