On Sat, Jan 1, 2011 at 7:15 PM, Alex Leone <acle...@gmail.com> wrote:
>> Alex -- can you also post your document to the wiki (or a link to it)?
>> http://wiki.sagemath.org/Notebook%20scalability
>
> Done. It's in the Notes section.
>
>> > 1. I wouldn't do an "isAdmin" property for users. Rather, create one or
>> > more groups that are marked as isAdmin and then add the users to that
>> > group. This is basically how it is done nowadays in Linux via the
>> > /etc/sudoers file, where a group "admin" is marked as being special and
>> > the sudo command checks if the user is in the group admin.
>
> This makes sense. I thought that it would have to be a property on each
> user so that lookups would be fast, but I realize that it would probably
> just be set as a session variable when the user logs in.
>
>> > 2. The permissions, I don't really understand it. Why are they in each
>> > group?
>
> For worksheets, I could think of a few different kinds of permissions:
>   1. viewing
>   2. editing
>   3. changing the title?
>   4. deleting the worksheet
> For groups, there aren't as many, but I thought it would be good to reuse
> the same mechanism:
>   1. adding other people to the group
>   2. changing the group name?
>   3. deleting the group
> The 'perms' number is a bit-field. If the first bit (0b0001) is set, then
> the user has permission to do x. If the second bit (0b0010) is set, the
> user has permission to do y, etc.
>
>> > but if there is some crazy long output it might happen.
>
> If the output gets too long, it would get saved to a separate file, just
> like the current notebook saves long output to "full_output.txt".
>
>> > Second,
>> > updates on worksheets only happen on the cell level, never on the
>> > whole document. I know, mongodb has the ability to update a part of a
>> > document via the update command, but I think it's easier to have a
>> > collection of all cells and reference to them.
>>
>> I'm not sure.
>> If you read the mongodb documentation/books, the way Alex laid things out
>> (with all cells in a single document) is repeatedly recommended as the
>> way to go. Updating parts of documents with mongodb is very robust, in my
>> experience. Also, the data locality (having all the cells in the same
>> document) is evidently a big win efficiency-wise.
>>
>> > But still, when a cell is updated, only its "out" field is modified.
>>
>> Its "in" field can also be modified, right, e.g., when you modify the
>> input? And somebody might even change the type (why not?).
>
> I considered both. Here's what I was thinking about:
>   1. List references to cells that would go in a separate collection
>      (db.cells):
>      a. If there was ever fine-grained revision history (e.g., see Google
>         Docs), old cell contents could stay in the db (maybe as diffs), and
>         the worksheet object wouldn't get huge. But then again this could be
>         implemented as a diff of the whole worksheet object or something.
>   2. Put the cells in the worksheet object (as proposed):
>      a. Like William said, it might be better to have all the data localized.
>
>> Alex, I don't think you should use an _id field in the individual
>> cells though. They aren't complete mongodb documents themselves, so
>> don't have to have an "_id" field, and if they do it isn't treated
>> specially like the _id of a complete mongodb document (which is
>> forced to be unique, etc.). Thus using _id could be misleading.
>
> This id helps keep track of cells on the client-side, and also if the cells
> get rearranged. Perhaps just 'id' would be a better name.
>
>> > 4. something trivial, instead of
>> >   out: [{ t:"stdout", data: "..."} , {t:"stderr", data: "..."}]
>> > please just do
>> >   out: { stdout: "...", stderr: "..." }
>> > Mongodb allows listing all keys in such an associative list, so there is
>> > no need for this {t: "..."} thing.
>> > (or even better, get rid of "out"; just a stdout and a stderr key are
>> > good enough since their relative ordering doesn't matter.)
>>
>> +1 -- very good idea.
>
> The output from a cell is a sequence of messages (Stdout, Stderr, Stdin,
> Html, ...). Consider the following code:
>   import sys
>   sys.stdout.write("out1")
>   sys.stderr.write("err1")
>   sys.stdout.write("out2")
>   sys.stderr.write("err2")
> This would generate
>   Stdout("out1")
>   Stderr("err1")
>   Stdout("out2")
>   Stderr("err2")
> The messages need to be displayed in the order that they are produced.
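The ordering argument above can be sketched in Python. This is only an illustration under stated assumptions: the {t, data} message shape is the one from the proposal quoted earlier in this thread, while TaggedStream, capture_messages, and collapse are hypothetical names made up for this example and are not part of the notebook code.

```python
import sys
from contextlib import contextmanager


class TaggedStream:
    """Hypothetical file-like object: each write is appended to a shared log."""

    def __init__(self, kind, log):
        self.kind = kind
        self.log = log

    def write(self, text):
        # Record the message in arrival order, tagged with its stream type.
        self.log.append({"t": self.kind, "data": text})
        return len(text)

    def flush(self):
        pass


@contextmanager
def capture_messages():
    """Temporarily route sys.stdout and sys.stderr into one ordered list."""
    log = []
    old_out, old_err = sys.stdout, sys.stderr
    sys.stdout = TaggedStream("stdout", log)
    sys.stderr = TaggedStream("stderr", log)
    try:
        yield log
    finally:
        # Always restore the real streams, even if the cell code raises.
        sys.stdout, sys.stderr = old_out, old_err


def collapse(messages):
    """Merge messages into one key per stream (the 'stdout/stderr keys' idea)."""
    merged = {}
    for m in messages:
        merged[m["t"]] = merged.get(m["t"], "") + m["data"]
    return merged


with capture_messages() as messages:
    sys.stdout.write("out1")
    sys.stderr.write("err1")
    sys.stdout.write("out2")
    sys.stderr.write("err2")

# The list keeps the interleaving; the collapsed dict cannot.
ordered = [(m["t"], m["data"]) for m in messages]
flat = collapse(messages)
```

Here `ordered` preserves the out1, err1, out2, err2 sequence, whereas `flat` only holds the concatenated text per stream, so the original interleaving can no longer be recovered from it.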
A compromise between your two suggestions is:

  out: [{stdout:"out1"}, {stderr:"err1"}, {stdout:"out2"}, {stderr:"err2"}, {image:"foo.png"}]

>> > 5. Images might probably be referenced explicitly, i.e. out: { img:
>> > <file-id-reference> }
>
> I was thinking that there would be a Plot(...) message, a JMol(...) message,
> etc, which would reference files.

  out: [{stdout:"out1"}, {stderr:"err1"}, ..., {image:"foo.png"}, {jmol:"foo.jmol"}, ...] ?

In some cases it might make sense to be able to specify coordinates or
other rich data:

  {image:"foo.png", position:[3,7]}

This argues for making the output document have a type like you
suggested above, e.g.,

  {t:'image', data:'foo.png', position:[3,7]}

> Currently in the notebook, any computation output is just a stream of bytes.
> But that stream contains different kinds of data - stdout, stderr, latex,
> plots, html tables, jmol plots, references to data files that the cell
> created, etc. So why not have the computation output be that series of
> "messages"?
> - Alex

That does make sense.

William

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org