SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
Hi,

I have an SQLAlchemy class for an event:

    class UserEvent(Base):
        __tablename__ = "user_events"

        id = Column('id', Integer, primary_key=True)
        date = Column('date', Date, nullable=False)
        uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
        info = ??

The event may have arbitrary, but dict-like data associated with it, which I want to add in the field 'info'. This data never needs to be modified, once the event has been inserted into the DB.

What type should the info field have? JSON, PickleType, String, or something else?

I couldn't find any really reliable sounding information about the relative pros and cons, apart from a Reddit thread claiming that pickled dicts are larger than dicts converted to JSON or String.

Cheers,

Loris

--
This signature is currently under construction.
--
https://mail.python.org/mailman/listinfo/python-list
Re: When to use SQLAlchemy listen events
"Loris Bennett" writes:

> Hi,
>
> I am wondering whether SQLAlchemy listen events are appropriate for the
> following situation:
>
> I have a table containing users and a table for events related to users
>
>   class User(Base):
>       __tablename__ = "users"
>
>       uid = Column('uid', String(64), primary_key=True)
>       gid = Column('gid', String(64), ForeignKey('groups.gid'),
>                    nullable=False)
>       lang = Column('lang', String(2))
>
>
>   class UserEvent(Base):
>       __tablename__ = "user_events"
>
>       id = Column('id', Integer, primary_key=True)
>       date = Column('date', Date, nullable=False)
>       uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
>       comment = Column('comment', String(256))
>
> (There are also analogous tables for groups and group events).
>
> The functions provided by the interface are things like the following
>
>   add_user(user, group, lang)
>   move_user(user, group)
>   delete_user(user)
>   warn_user(user, reason)
>
> Whereas 'add/move/delete_user' result in changes to the table 'users',
> 'warn_user' does not. All should produce entries in the table
> 'user_events'.
>
> There could be more functions similar to 'warn_user' that only create an
> entry in 'user_events'. Potentially there could be a lot more of
> these than the 'user'-table-changing type.
>
> It seems like for the first three functions, capturing the resulting
> database changes in the table 'user_events' would be a standard use-case
> for listen events. However, the 'warn_user' function is different.
>
> So can/should I shoehorn the 'warn_user' function into being like the
> other three and use listen events, or should I just come up with my own
> mechanism which will allow any function just to add an entry to the
> events table?

So I just ended up writing my own decorator. That seems more appropriate and flexible in this instance.
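For readers wondering what such a decorator might look like, here is a minimal sketch. It is not the code from the post above: the names `record_event` and `EVENTS` are hypothetical, and a plain list stands in for the 'user_events' table so that the example runs without a database.

```python
import datetime
import functools

# Hypothetical stand-in for the 'user_events' table; a real version
# would add a UserEvent row through a SQLAlchemy session instead.
EVENTS = []


def record_event(comment):
    """Decorator factory: log an event after the wrapped function succeeds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            result = func(user, *args, **kwargs)
            # Only reached if func did not raise.
            EVENTS.append({
                "date": datetime.date.today(),
                "uid": user,
                "comment": comment,
            })
            return result
        return wrapper
    return decorator


@record_event("user warned")
def warn_user(user, reason):
    # Touches no table of its own; only the event is recorded.
    return f"warned {user}: {reason}"


@record_event("user moved")
def move_user(user, group):
    # A real version would also update the 'users' table here.
    return f"moved {user} to {group}"
```

The point of this design is that functions which change the 'users' table and functions which do not are treated identically: anything decorated produces an event entry, with no dependence on a database change to trigger it.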
Re: C is it always faster than nump?
On Saturday, 26 February 2022 at 19:41:37 UTC+1, Dennis Lee Bieber wrote:
> On Fri, 25 Feb 2022 21:44:14 -0800, Dan Stromberg
> declaimed the following:
> >Fortran, (still last I heard) did not support pointers, which gives Fortran
> >compilers the chance to exploit a very nice class of optimizations you
> >can't use nearly as well in languages with pointers.
> >
> Haven't looked much at Fortran-90/95 then...
>
> Variable declaration gained a POINTER qualifier, and there is an
> ALLOCATE intrinsic to obtain memory.
>
> And with difficulty one could get the result in DEC/VMS FORTRAN-77
> since DEC implemented (across all their language compilers) intrinsics
> controlling how arguments are passed -- overriding the language native
> passing:
>     CALL XYZ(%val(M))
> would actually pass the value of M, not Fortran default address-of, with
> the result that XYZ would use that value /as/ the address of the actual
> argument. (Others were %ref() and %descr() -- descriptor being a small
> structure with the address reference along with, say, upper/lower bounds;
> often used for strings).
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlf...@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

The latest Fortran revision is Fortran 2018. A variable can also have the VALUE attribute, even though nowhere in the standard is it written that this means passing the data by value. It just means that if the variable is changed in a procedure, the changes don't propagate back to the caller.

With iso_c_binding one can directly call a C function or let a Fortran procedure appear as a C function. There is C_LOC, which gives the C address of a variable if needed. Of course, since the 2003 revision it is fully object oriented.

The claim that it was faster than C is mostly related to aliasing, which is forbidden in Fortran. C introduced the "restrict" qualifier for the same reason. In Fortran you also have array operations like you have in numpy.
Re: Threading question .. am I doing this right?
Chris Angelico wrote:
> I'm still curious as to the workload (requests per second), as it might still
> be worth going for the feeder model. But if your current system works, then
> it may be simplest to debug that rather than change.

It is by all accounts a low-traffic situation, maybe one request/second. But the view in question opens four plots on one page, generating four separate requests. So with only two clients and a blocking DB connection, the whole application with eight uwsgi worker threads comes down.

Now with the "extra load thread" modification, the app worked fine for several days with only two threads.

Out of curiosity I tried the "feeder thread" approach with a dummy thread that just sleeps and logs something every few seconds, ten times total. For some reason it sometimes hangs after eight or nine loops, and then uwsgi cannot restart gracefully, probably because it is still waiting for that thread to finish.

Also my web app is built around setting up the DB connections in the request context, so using an extra thread outside that context would require doubling some DB infrastructure. Probably not worth it at this point.
Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
On Feb 28, 2022 10:11, Loris Bennett wrote:

> Hi,
>
> I have an SQLAlchemy class for an event:
>
>   class UserEvent(Base):
>       __tablename__ = "user_events"
>
>       id = Column('id', Integer, primary_key=True)
>       date = Column('date', Date, nullable=False)
>       uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
>       info = ??
>
> The event may have arbitrary, but dict-like data associated with it,
> which I want to add in the field 'info'. This data never needs to be
> modified, once the event has been inserted into the DB.
>
> What type should the info field have? JSON, PickleType, String, or
> something else?
>
> I couldn't find any really reliable sounding information about the
> relative pros and cons, apart from a Reddit thread claiming that pickled
> dicts are larger than dicts converted to JSON or String.
>
> Cheers,
>
> Loris

I think you need a BLOB.
https://docs.sqlalchemy.org/en/14/core/type_basics.html#sqlalchemy.types.LargeBinary
Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
Albert-Jan Roskam wrote:
> The event may have arbitrary, but dict-like data associated with it,
> which I want to add in the field 'info'. This data never needs to be
> modified, once the event has been inserted into the DB.
>
> What type should the info field have? JSON, PickleType, String, or
> something else?
>
> I couldn't find any really reliable sounding information about the
> relative pros and cons, apart from a Reddit thread claiming that pickled
> dicts are larger than dicts converted to JSON or String.

I've done exactly this. Since my data was strictly ASCII I decided to go for JSON. But in the end you're the only one who can decide this because only you know the data. That's why you won't find any hard and fast rule for this.
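Since the answer depends on the data, one way to decide is to try both serialisers on a representative dict. The following is a minimal stdlib-only sketch; the contents of `info` are made up for illustration, and the printed sizes will differ for other data and pickle protocols.

```python
import json
import pickle

# Made-up example payload; substitute a dict typical of your events.
info = {"reason": "quota exceeded", "limit_gb": 10, "tags": ["disk", "warning"]}

as_json = json.dumps(info)      # a str, readable from any language
as_pickle = pickle.dumps(info)  # bytes, readable only by Python

print(len(as_json.encode("utf-8")), "bytes as JSON")
print(len(as_pickle), "bytes as pickle")

# Both round-trip this dict unchanged, because it contains only
# strings, numbers and lists.
json_roundtrip = json.loads(as_json)
pickle_roundtrip = pickle.loads(as_pickle)

# JSON is lossy for some Python types: non-string keys become strings
# and tuples become lists. Pickle preserves both.
lossy = json.loads(json.dumps({1: (1, 2)}))
exact = pickle.loads(pickle.dumps({1: (1, 2)}))
```

If the dicts only ever hold basic types, the JSON limitations shown at the end do not apply and the text stays inspectable with ordinary database tools.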
Re: Getting Syslog working on OSX Monterey
On 2022-02-27 22:16:54 +, Barry wrote:
> If you look at the code of the logging modules syslog handle you will see that
> it does not use syslog. It’s assuming that it can network to a syslog
> listener.
> Such a listener is not running on my systems as far as I know.
>
> I have always assumed that if I want a logger syslog handler that I would have
> to implement it myself. So far I have code that uses syslog directly and have
> not written that code yet.

What do you mean by using syslog directly? The syslog(3) library function also just sends messages to a "syslog listener" (more commonly called a syslog daemon) - at least on any unix-like system I'm familiar with (which doesn't include MacOS). It will, however, always use the *local* syslog daemon - AFAIK there is no standard way to open a remote connection (many syslog daemons can be configured to forward messages to a remote server, however).

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
Re: Getting Syslog working on OSX Monterey
> On 28 Feb 2022, at 21:41, Peter J. Holzer wrote:
>
> On 2022-02-27 22:16:54 +, Barry wrote:
>> If you look at the code of the logging modules syslog handle you will see
>> that it does not use syslog. It’s assuming that it can network to a syslog
>> listener.
>> Such a listener is not running on my systems as far as I know.
>>
>> I have always assumed that if I want a logger syslog handler that I would
>> have to implement it myself. So far I have code that uses syslog directly
>> and have not written that code yet.
>
> What do you mean by using syslog directly? The syslog(3) library
> function also just sends messages to a "syslog listener" (more commonly
> called a syslog daemon) - at least on any unix-like system I'm familiar
> with (which doesn't include MacOS). It will, however, always use the
> *local* syslog daemon - AFAIK there is no standard way to open a remote
> connection (many syslog daemons can be configured to forward messages to
> a remote server, however).

I'm re-reading the code to check on what I'm seeing. (It's been a long time since I last looked deeply at this code.)

You can write to /dev/log if you pass that to SysLogHandler(address='/dev/log'), but the default is to use a socket to talk to a network listener on localhost:514. There are no daemons listening on port 514 on my Fedora systems or macOS.

That is not what you would expect as the default if you are using the C API.

What you do not see in SysLogHandler is an "import syslog", and hence it is not using openlog() etc. from the syslog API.

Barry
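A minimal sketch of the point above: pass the local socket path to SysLogHandler explicitly instead of relying on the localhost:514 default. The helper names `pick_syslog_address` and `make_handler` are hypothetical, and the candidate paths are assumptions about typical systems (/dev/log on Linux, /var/run/syslog on macOS); check which socket your own system provides.

```python
import logging
import logging.handlers
import os


def pick_syslog_address(candidates=("/dev/log", "/var/run/syslog")):
    """Return the first existing local syslog socket path.

    Falls back to SysLogHandler's own default, a UDP listener on
    localhost port 514, when none of the candidate paths exist.
    """
    for path in candidates:
        if os.path.exists(path):
            return path
    return ("localhost", logging.handlers.SYSLOG_UDP_PORT)


def make_handler():
    """Build a SysLogHandler bound to the address chosen above.

    Not called at import time: connecting to a Unix socket can raise
    OSError if nothing is listening on it.
    """
    return logging.handlers.SysLogHandler(address=pick_syslog_address())
```

Typical use would be `logging.getLogger("myapp").addHandler(make_handler())`. The alternative discussed in the thread, calling the stdlib `syslog` module directly, goes through the C library's openlog()/syslog() rather than through a handler.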
Re: Timezone for datetime.date objects
Hi Chris, Cameron.

Well, let's say I specify the datetime 2022-02-22 02:02 (AM). I think everyone could agree that it also means 2022-02-22 02:02:00:00, to 2022-02-22 02:02:59:59.

And I think the same applies for a date. If the pipes are clogged and I can't take (give) a shit, a shower or do anything else involving fluids, I can just leave the keys under the doormat, agree a date with the plumber, and go off to a friend's or relative's place for a couple of days while waiting for the plumber to do the necessary work.

Usually that would imply that the plumber visits from 07:00 to 15:00 on the given date, with an implicit timezone. It could also mean that the plumber shows up at 01:00 at night and fixes it, or at 18:00 in the evening.

If a newspaper talks about new year's celebrations, and specifically talks about what happens on the 1st of January, this could mean at 00:01, or a later dinner party at 20:00. But the celebration that starts at midnight doesn't start at the same moment all over the world. So context, or location and implicit timezone, does matter.

I was also thinking of specifying some range objects for my needs, as that makes sense from what I've worked with earlier; it makes sense, for example, for a month or a year (or even a decade) as well.

But the point is that a date is not just a flat, one-dimensional object.

Regards,

Morten

On Sun, Feb 27, 2022 at 11:11 PM Chris Angelico wrote:

> On Mon, 28 Feb 2022 at 08:51, Cameron Simpson wrote:
> >
> > On 27Feb2022 11:16, Morten W. Petersen wrote:
> > >I was initially using the date object to get the right timespan, but
> > >then found that using the right timezone with that was a bit of a pain.
> > >So I went for the datetime object instead, specifying 0 on hour, minute
> > >and second.
> > >
> > >What's the thinking behind this with the date object? Wouldn't it be
> > >nice to be able to specify a timezone?
> >
> > This has come up before. My own opinion is that no, it would be a bad
> > idea. You're giving subday resolution to an object which is inherently
> > "days". Leaving aside the many complications it brings (compare two
> > dates, now requiring timezone context?) you've already hit on the easy
> > and simple solution: datetimes.
> >
> > I'd even go so far as to suggest that if you needed a timezone for
> > precision, then dates are the _wrong_ precision to work in.
> >
>
> I would agree. If you have timestamps and you're trying to determine
> whether they're within a certain range, and timezones matter, then
> your range is not days; it begins at a specific point in time and ends
> at a specific point in time. Is that point midnight? 2AM? Start/close
> of business? It could be anything, and I don't see a problem with
> requiring that it be specified.
>
> ChrisA

--
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here: http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
On 1/03/22 6:13 am, Albert-Jan Roskam wrote:
> I think you need a BLOB.
> https://docs.sqlalchemy.org/en/14/core/type_basics.html#sqlalchemy.types.LargeBinary

That won't help on its own, since you still need to choose a serialisation format to store in the blob.

I'd be inclined to use JSON if the data is something that can be easily represented that way.

--
Greg
Re: SQLAlchemy: JSON vs. PickleType vs. raw string for serialised data
On 28Feb2022 10:11, Loris Bennett wrote:
>I have an SQLAlchemy class for an event:
>
>  class UserEvent(Base):
>      __tablename__ = "user_events"
>
>      id = Column('id', Integer, primary_key=True)
>      date = Column('date', Date, nullable=False)
>      uid = Column('gid', String(64), ForeignKey('users.uid'), nullable=False)
>      info = ??
>
>The event may have arbitrary, but dict-like data associated with it,
>which I want to add in the field 'info'. This data never needs to be
>modified, once the event has been inserted into the DB.
>
>What type should the info field have? JSON, PickleType, String, or
>something else?

I would use JSON; it expresses dicts well, provided the dicts contain only basic types (strings, numbers, other dicts/lists of basic types recursively).

I have personal problems with pickle because non-Python code can't read it.

Cheers,
Cameron Simpson
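A minimal sketch of the JSON suggestion, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. To keep it self-contained, the ForeignKey to the 'users' table is left out and the explicit column-name strings are dropped; the sample uid and `info` values are invented for illustration.

```python
import datetime

from sqlalchemy import Column, Date, Integer, JSON, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class UserEvent(Base):
    __tablename__ = "user_events"

    id = Column(Integer, primary_key=True)
    date = Column(Date, nullable=False)
    # ForeignKey('users.uid') omitted so the sketch needs no users table.
    uid = Column(String(64), nullable=False)
    # Generic JSON type: SQLAlchemy serialises the dict on insert and
    # deserialises it on load.
    info = Column(JSON)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(UserEvent(
        date=datetime.date(2022, 2, 28),
        uid="alice",
        info={"reason": "quota", "limit_gb": 10},
    ))
    session.commit()

    # Read the row back; 'info' comes back as a plain dict.
    stored = session.query(UserEvent).one().info
```

One caveat that does not matter for insert-only data like this: a plain JSON column does not detect in-place changes to the dict, so later edits would need reassignment or `sqlalchemy.ext.mutable.MutableDict`.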
Re: Timezone for datetime.date objects
On Tue, 1 Mar 2022 at 09:28, Morten W. Petersen wrote:
>
> Hi Chris, Cameron.
>
> Well, let's say I specify the datetime 2022-02-22 02:02 (AM). I think
> everyone could agree that it also means 2022-02-22 02:02:00:00, to
> 2022-02-22 02:02:59:59.
>

Not sure how many :59s you want there :) I'm going to assume you mean "02:02:00" to "02:02:59".

> And I think the same applies for a date. If the pipes are clogged and I
> can't take (give) a shit, a shower or do anything else involving fluids, I
> can just leave the keys under the doormat, and agree a date with the
> plumber, and go off to a friend of relatives' place for a couple of days
> while waiting for the plumber to do the necessary work.
>

That is one of the fundamental differences between humans and computers. Humans are very VERY sloppy with time descriptions. With computers, it's much better to be clear about time ranges; a time does not imply a specific window size. (And autistic humans are more like computers.)

> Usually that would imply that the plumber visits from 07:00 to 15:00 on the
> given date, with an implicit timezone. It could also mean that the plumber
> shows up at 01:00 at night and fixes it, or at 18:00 in the evening.
>

Around here, that means the plumber might visit between 09:00 and 17:00, but also might not. Humans are sloppy.

> If a newspaper talks about new years celebrations, and specifically talks
> about what happens on the 1st of January, this could mean at 00:01, or a
> later dinner party at 20:00. But the celebration that starts at midnight,
> doesn't start at the same moment all over the world. So context, or location
> and implicit timezone does matter.
>

Yes, it matters... and you can't depend on newspapers to get these things correct, because humans are sloppy. For instance, a natural disaster in some other place in the world might be reported with the timezone where it happened ("the hurricane that hit New Orleans on Thursday"), but also might use the timezone where the newspaper is being published ("last night, a hurricane devastated New Orleans"). So the implicit timezone matters, but is also very unclear.

> I was also thinking of specifying some range objects for my needs, as that
> makes sense from what I've worked with earlier, this makes sense for example
> for a month or a year (or even decade) as well.
>
> But the point is that a date is not just a flat, one-dimensional object.
>

If your needs require a range, then use a range (or a pair of datetimes). A date does not imply a range.

Side point: It's actually not that uncommon for a day to start or end at some time other than midnight. For instance, here's the timetable for one of Melbourne's railway lines:

https://d309ul1fvo6zfp.cloudfront.net/1645676503802/train-4-2022-02-28-2022-03-02.pdf

The last outbound train on a Friday night (scroll down to page 8) departs at 1:20AM from Dandenong Station and arrives at 1:34AM at Cranbourne Station. It's the same train that departed Westall Station at 1:03AM. Whichever way you measure it, that service most definitely runs entirely after midnight, but it counts as a Friday service, not a Saturday one. So if I were to ask you "how many services run on a Friday?", you should count this one, regardless of what your phone says the day is. The best way to define the day would be 3AM to 3AM.

But that's the railways. Here's a NightRider bus timetable:

https://d309ul1fvo6zfp.cloudfront.net/1645676503802/bus-15131-2021-10-29-2022-12-31.pdf

How many services does THIS route run on a Saturday? Four or five (depending on direction). It would be best to define THIS day from midnight, or maybe even noon.

The definition is context-sensitive, so if you need a range, *be explicit*. It's okay for the range to be implicit to your users, but make it explicit in your code:

    day = datetime.datetime(y, m, d, 4, 0, 0), datetime.timedelta(days=1)

Python doesn't have a datetimerange object, so I'd use either a (datetime, timedelta) pair, or a tuple of two datetimes. In some contexts, it might be safe to let the timedelta be implicit, but at the very least, it's worth being clear that "this day" means "starting at 4AM on this day", or whatever it be.

Even when you describe a minute, it is most definitely not 100% clear whether it means 02:02:00 to 02:03:00 (inclusive/exclusive), 02:02:00 to 02:02:01 (inc/exc), or the exact instant at 02:02:00. All three are valid meanings, and a full timerange is the only way to be clear.

ChrisA
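A minimal sketch of the "tuple of two datetimes" option, using the 3AM-to-3AM railway day from the message above. The helper names `service_day` and `in_range` are hypothetical, and naive datetimes are used for brevity; a real application would attach the local timezone via tzinfo.

```python
import datetime


def service_day(y, m, d, start_hour=3):
    """Return (start, end) for a 'day' that begins at start_hour.

    The range is half-open: start is included, end is excluded.
    """
    start = datetime.datetime(y, m, d, start_hour)
    return start, start + datetime.timedelta(days=1)


def in_range(ts, rng):
    """True if timestamp ts falls inside the half-open range rng."""
    start, end = rng
    return start <= ts < end


# 2022-02-25 was a Friday; its service day runs until 03:00 on Saturday.
friday = service_day(2022, 2, 25)

# A train departing at 01:20 on Saturday's calendar date still counts
# as a Friday service under this definition.
late_train = datetime.datetime(2022, 2, 26, 1, 20)
counts_as_friday = in_range(late_train, friday)
```

Making the range half-open means consecutive service days never overlap and never leave a gap, so every timestamp belongs to exactly one day.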