On 2023-01-14 23:26:27 -0500, Dino wrote:
> Hello, I have built a PoC service in Python Flask for my work, and - now
> that the point is made - I need to make it a little more performant (to be
> honest, chances are that someone else will pick up from where I left off,
> and implement the same service from scratch in a different language (GoLang?
> .Net? Java?) but I am digressing).
>
> Anyway, my Flask service initializes by loading a big "table" of 100k rows
> and 40 columns or so (memory footprint: order of 300 Mb)

300 MB is large enough that you should at least consider putting that into a database (SQLite is probably simplest. Personally I would go with PostgreSQL because I'm most familiar with it and SQLite is a bit of an outlier).

The main reason for putting it into a database is the ability to use indexes, so you don't have to scan all 100k rows for each query.

You may be able to do that for your Python data structures, too: Can you set up dicts which map to the subsets you need often?

There are some specialized in-memory bitmap implementations which can be used for filtering. I've used [Judy bitmaps](https://judy.sourceforge.net/doc/Judy1_3x.htm) in the past (mostly in Perl). These days [Roaring Bitmaps](https://www.roaringbitmap.org/) is probably the most popular. I see several packages on PyPI - but I haven't used any of them yet, so no recommendation from me.

NumPy might also help. You will still have linear scans, but the data is more compact and many of the searches can probably be done in C and not in Python.

> As you can imagine, this is not very performant in its current form, but
> performance was not the point of the PoC - at least initially.

For performance optimization it is very important to actually measure performance, and a good profiler helps very much in identifying hot spots. Unfortunately until recently Python was a bit deficient in this area, but [Scalene](https://pypi.org/project/scalene/) looks promising.

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
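
PS: To make the "dicts which map to subsets" idea above a bit more concrete, here is a minimal sketch. It assumes the table is held as a list of row dicts; the column names (`country`, `status`), the sample rows and the helper names `build_index` and `lookup` are all made up for illustration and are not taken from the original service.

```python
from collections import defaultdict

# Hypothetical stand-in for the real 100k x 40 table: a list of row dicts.
rows = [
    {"id": 0, "country": "AT", "status": "active"},
    {"id": 1, "country": "US", "status": "active"},
    {"id": 2, "country": "AT", "status": "closed"},
    {"id": 3, "country": "DE", "status": "active"},
]


def build_index(rows, column):
    """Map each distinct value of `column` to the set of row positions holding it."""
    index = defaultdict(set)
    for pos, row in enumerate(rows):
        index[row[column]].add(pos)
    return index


# Built once at service start-up, one index per frequently filtered column.
indexes = {col: build_index(rows, col) for col in ("country", "status")}


def lookup(**criteria):
    """Return the rows matching all column=value criteria.

    Only the (usually small) index sets are intersected; the full table
    is never scanned.
    """
    if not criteria:
        return list(rows)
    sets = [indexes[col].get(value, set()) for col, value in criteria.items()]
    matching = set.intersection(*sets)
    return [rows[pos] for pos in sorted(matching)]


result = lookup(country="AT", status="active")
```

This only pays off for equality filters on columns with reasonably few rows per value; range queries or free-text search would need a different structure (or a real database index).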
-- https://mail.python.org/mailman/listinfo/python-list