Re: [sage-devel] Re: A database of "interesting" graphs

William Stein Mon, 18 May 2015 11:33:48 -0700

On Mon, May 18, 2015 at 11:11 AM, Nathann Cohen <nathann.co...@gmail.com> wrote:
> Hellooooooooo Jernej!
>
>> A group of people here at my UNI wants to start creating databases storing
>> interesting graphs and some of their (non-trivial to compute) invariants.
>> The idea is to then make them available through optional Sage spkg's.
>>
>> ...
>>
>> - Is there any other method that you'd suggest for keeping such
>> classifications?
>> - Do you have any other constructive comments/requests before we start
>> designing such thing?
>
> Well, perhaps not an advice but perhaps an opinion: do not trust
> anybody, do not trust any code.
>
> For a start, I wouldn't use the "database" stuff either (not
> maintained=dangerous). About the SPKG: right now the spkg are a mess:
> you cannot download them from Sage's website (broken links), and my
> work on Sage during the last few days has been exclusively focused on
> trying to build an 'up-to-date' list of them. It is no joke: getting
> this list isn't very easy. "sage -standard", "sage -optional" are
> broken, and that's only the beginning.
>
> Also, the long-term status of spkg is a bit uncertain, and I fail to
> see what it would bring you.
>
> To me, the best way to share your graphs and make them easy to use is
> simply to... provide a .py file on your website that makes it easy to
> load them and explore the result of your computations. I thought about
> it a bit, and it does not have to contain much:


I would add to Nathann's opinion my opinion that using SQLite rather
than a .py file would be a better choice in this particular case.
Some advantages to using SQLite over a .py file:

- SQLite databases are usable directly on the sqlite3 command line, and
from pretty much *every* programming language there is, etc. -- not
just from Python (like a .py file).

- You can create very fast indexes on every column of a SQLite table
for quickly querying the database.  You can also do possibly more
complicated queries.

- You can have read/write access to the database from multiple processes
at once (SQLite serializes the writes using file locks).

- Opening the database just means opening a file, hence takes
microseconds, whereas opening a .py file means having Python parse
that entire file, which could be time and memory consuming.  Nathann
does address this point when he writes "The point of splitting the two
is that you have a lot of data, and that you do not want to store
everything in RAM in order to be able to use the data. Then, when you
want the graph, you can do "load_graph(its_ID)" and get it."  However,
SQLite solves precisely this problem very, very well (and in a way
that scales up better, as mentioned below).

- Using SQLite scales up -- if you need something much bigger, you
can switch to PostgreSQL or MariaDB (~MySQL).
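
To make the indexing point concrete, here is a minimal sketch using
Python's built-in sqlite3 module. The table name, column names, IDs and
parameter values are all made up for illustration (they are not from
Jernej's data); only the graph6 strings are real encodings of small
graphs.

```python
import sqlite3

# Hypothetical schema: "graphs", "parameter1" and "graph6" are
# illustrative names, not taken from the actual project.
conn = sqlite3.connect(":memory:")  # use a file path for a real database
conn.execute(
    "CREATE TABLE graphs ("
    " id TEXT PRIMARY KEY,"
    " parameter1 INTEGER,"
    " graph6 TEXT NOT NULL)"
)
# An index on a column lets SQLite answer lookups on that column
# without scanning the whole table.
conn.execute("CREATE INDEX idx_graphs_parameter1 ON graphs (parameter1)")

with conn:  # commits the inserts as one transaction
    conn.executemany(
        "INSERT INTO graphs VALUES (?, ?, ?)",
        [
            ("g1", 2, "Bw"),  # triangle K3 as a graph6 string
            ("g2", 1, "Bg"),  # path on 3 vertices
            ("g3", 2, "C~"),  # complete graph K4
        ],
    )


def ids_with_parameter1(connection, value):
    """Return the IDs of all graphs whose parameter1 equals value."""
    cursor = connection.execute(
        "SELECT id FROM graphs WHERE parameter1 = ? ORDER BY id", (value,)
    )
    return [row[0] for row in cursor]


print(ids_with_parameter1(conn, 2))  # prints ['g1', 'g3']
```

The same query works unchanged from the sqlite3 command line, which is
part of the appeal.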

I think it's perfectly reasonable to use sqlite3 directly
via the Python wrapper here:
https://docs.python.org/2/library/sqlite3.html
You do not have to use SQLAlchemy or the various abstractions in Sage or
anything else.
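
As a rough sketch of what "directly via the Python wrapper" could look
like, the following fetches one graph on demand, in the spirit of
Nathann's load_graph(its_ID). The file name, schema and the function
names build_example_db and load_graph6 are assumptions for illustration
only. Nothing is held in RAM except the row you ask for.

```python
import os
import sqlite3
import tempfile


def build_example_db(path):
    """Create a tiny example database (hypothetical schema and data)."""
    conn = sqlite3.connect(path)
    with conn:  # commit on success
        conn.execute(
            "CREATE TABLE graphs (id TEXT PRIMARY KEY, graph6 TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO graphs VALUES (?, ?)",
            [("12345", "Bw"), ("12346", "C~")],  # K3 and K4 in graph6
        )
    conn.close()


def load_graph6(path, graph_id):
    """Return the graph6 string stored for graph_id, or None if absent."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT graph6 FROM graphs WHERE id = ?", (graph_id,)
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row is not None else None


# Build the example in a temporary directory so the sketch is self-contained.
db_path = os.path.join(tempfile.mkdtemp(), "graphs.db")
build_example_db(db_path)
print(load_graph6(db_path, "12345"))  # prints Bw
```

Inside Sage, the returned string could then be handed to the Graph
constructor, which should accept graph6 strings.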

Anyway, SQLite is a beautiful piece of software and is great for
applications to which it applies, which I think includes yours.

>
> 1) A big list of dictionaries, each of which contains {"ID": '12345',
> "parameter1": ..., "parameter2": ... }
> 2) A function taking an ID as input and returning the graph as output
>
> The point of splitting the two is that you have a lot of data, and
> that you do not want to store everything in RAM in order to be able to
> use the data. Then, when you want the graph, you can do
> "load_graph(its_ID)" and get it.
>
> With some documentation at the head of the python file and a couple of
> examples, that's probably the best and most reliable way to share your
> graphs. It could also be used by anybody using Python (more than just
> Sage users).
>
> Note that it is very easy to query such a database:
>
>     sage: my_filtered_list = [load_graph(g['ID']) for g in db if
> g['parameter1'] == 2]
>
> Also, I just had a look at your files: they are very heavy, but the
> text files are not compressed. You could save a lot by either:
> 1) Compressing them
> 2) Storing the graph as graph6 or sparse6 string
>
> graph6 and sparse6 are "safe" ways to store graphs, as they are
> relatively famous. You can even encode/decode them in command-line
> with Brendan McKay's "Nauty" tools.
>
> This, of course, should only be taken as "an opinion". Good luck! ;-)
>
> Nathann
>
> --
> You received this message because you are subscribed to the Google Groups 
> "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sage-devel+unsubscr...@googlegroups.com.
> To post to this group, send email to sage-devel@googlegroups.com.
> Visit this group at http://groups.google.com/group/sage-devel.
> For more options, visit https://groups.google.com/d/optout.



-- 
William (http://wstein.org)

