As a minor tangential remark, there is now a tutorial for creating sqlite databases at http://www.sagemath.org:9001/sqlite-tutorial
+++++++++++++++++++++++++++++++++++++++++++++ On 6/16/07, Jonathan Bober <[EMAIL PROTECTED]> wrote: > > Here is a amusing, but real, snippet from a running sage session (don't > try this at home): > > [EMAIL PROTECTED]:~/math$ sage > ---------------------------------------------------------------------- > | SAGE Version 2.6, Release Date: 2007-06-02 | > | Type notebook() for the GUI, and license() for information. | > ---------------------------------------------------------------------- > Loading SAGE library. Current Mercurial branch is: bober > sage: rsa200 = > 27997833911221327870829467638722601621070446786955428537560009929326128400107609345671052955360856061822351910951365788637105954482006576775098580557613579098734950144178863178946295187237869221823983 > sage: time factor(rsa200) > CPU times: user 0.68 s, sys: 0.00 s, total: 0.68 s > Wall time: 0.69 > 3532461934402770121272604978198464368671197400197625023649303468776121253679423200058547956528088349 > * > 7925869954478333033347085841480059687737975857364219960734330341455767872818152135381409304740185467 > > > So, anyway, I recently was doing something where I thought it would be > nice to have an easily usable database of known factorizations for > interesting numbers, and that it would be especially nice if it was > usable with SAGE. So I searched google for "factorization database" and, > amusingly, got > > http://www.sagemath.org:9002/sage_trac/ticket/191 > > as the third hit. So I wasn't the first with this idea. Anyway, I have > the beginnings of something implemented, but I thought I would ask for > some suggestions before going further. > > What I have been implementing/planning thus far is mostly as follows: > > -I have a global object factor_table, which is an instance of > > sage.databases.factor_table.FactorizationDatabase > > which is a class that inherits from sage.databases.db.Database (so uses > ZODB for the underlying implementation), and has the following > properties: > > -For all integer n, factor_table[n] is of type > sage.rings.integer.Integer and is such that n % factor_table[n] == 0 > > -Something like this... > -(factor_table[n] == -1) ==> (is_pseudoprime(n) == True) > -(factor_table[n] == 1) <==> (is_prime(n) == True)* > -(factor_table[n] == n) ==> (Nothing) > > [*There will be more regarding this later for anyone who reads all of > this email.] > > Some questions: > (1) What should the minimum size be for an entry in the factor table? I > currently have it as 10^40, but I think that this might be too small. > (But I have a relatively fast computer.) Some timings for me: > > sage: q = next_prime(10^20) > sage: p = next_prime(10^21) > sage: time factor(p*q) > CPU times: user 0.33 s, sys: 0.02 s, total: 0.35 s > Wall time: 0.36 > 100000000000000000039 * 1000000000000000000117 > sage: q = next_prime(10^25) > sage: p = next_prime(10^24) > sage: time factor(p*q) > CPU times: user 2.40 s, sys: 0.05 s, total: 2.45 s > Wall time: 2.45 > 1000000000000000000000007 * 10000000000000000000000013 > sage: q = next_prime(10^29) > sage: p = next_prime(10^28) > sage: time factor(p*q) > CPU times: user 16.25 s, sys: 0.10 s, total: 16.35 s > Wall time: 16.48 > 10000000000000000000000000331 * 100000000000000000000000000319 > > > (2) I thought for a while about whether or not factor_table[n] should > always return a prime number (or 1). I decided against this because I > would like to support "partial factorizations", so I can have something > like the following: > > sage: factor_table.get_info(A) > [...] has 5 known prime factors, 2 unfactored composite factors, and at > least 9 distinct prime factors. > > Any reason why this might be a bad idea? > > (3) Instead of just storing one prime factor in factor_table[n], the > full (known) factorization could be stored. This would make things > simpler, but might increase storage space. But then again, it might save > storage space somehow - I'm not sure. The factor table would certainly > have less entries, but they would be bigger entries. (Also, less entries > means less information, but someone is unlikely to find anything useful > in a random entry that just happens to be there because it is a factor > of something "special".) Any thoughts? > > (4) I've changed factor() to search the factor table before trying to > factor normally. This makes the factor table easiest to use. This makes > the factor table easiest to use, but, given that the factor table would > probably be an optional package (since it will probably be quite a large > download), it might be better to just have a special function > factor_table.factor(). The following code is what I have put in > factor(), somewhere in the middle: > > if n > factor_table.minimum_entry and factor_table[n] != n: > p = factor_table[n] > if p == 1: > return factorization.Factorization([(n,1)], unit) > elif p == -1: > if proof and is_prime(n): > return factorization.Factorization([(n,1)], unit) > else: > return factorization.Factorization([(n,1)], unit) > else: > return factor(p,proof=proof) * factor(n/p,proof=proof) > > I don't see any problem with having it there, except for a marginal > slowdown in the general case, but it can fail nicely when the > factorization database is not installed by having factor_table[n] == n > always. > > For quick partial factorizations, I currently have factor_table.factor() > as factoring only using the factor table, and possibly returning > composite factors. > > (5) Regarding (*), above: I would like to be able to have some entries > in the factor table marked as proved primes. The main problem with this > is that there should be a way to verify the table, and verification > should involve proving that all the claimed primes are indeed primes, > but this could take a really long time. > > One option, which I don't like, would be to simply prove a number is > prime exactly once, and then be very careful about making changes to the > table. > > Another thing that might be a possibility in the future is storing a > "certificate" for any number claimed to be proved prime. The ECPP > algorithm can give a prime certificate which can (I think) be checked in > significantly less time than it takes to find the certificate. Then the > certificate could be checked, which might not take too long (I really am > not sure about this). This would require new functionality in SAGE, > which might be difficult to implement (it looks like PARI does not have > ECPP). > > For now, I just ignore this and only check for pseudo-primes, and don't > mark anything as proved prime. > > -- > > OK, I think that that is it for now. Any other thoughts would be good. > It seems that the major work to be done here is to get various factor > tables into usable formats - compared to that, implementation issues > might be easy. > > Congratulations to you if you have read this far. This email was very > long. > > -Jon > > > > > --~--~---------~--~----~------------~-------~--~----~ To post to this group, send email to sage-devel@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sage-devel URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/ -~----------~----~----~----~------~----~------~--~---