[sage-devel] Re: factorization database

David Joyner Sat, 16 Jun 2007 06:56:58 -0700

As a minor tangential remark, there is now a tutorial for creating
sqlite databases at
http://www.sagemath.org:9001/sqlite-tutorial


+++++++++++++++++++++++++++++++++++++++++++++

On 6/16/07, Jonathan Bober <[EMAIL PROTECTED]> wrote:
>
> Here is a amusing, but real, snippet from a running sage session (don't
> try this at home):
>
> [EMAIL PROTECTED]:~/math$ sage
> ----------------------------------------------------------------------
> | SAGE Version 2.6, Release Date: 2007-06-02                         |
> | Type notebook() for the GUI, and license() for information.        |
> ----------------------------------------------------------------------
> Loading SAGE library. Current Mercurial branch is: bober
> sage: rsa200 =
> 27997833911221327870829467638722601621070446786955428537560009929326128400107609345671052955360856061822351910951365788637105954482006576775098580557613579098734950144178863178946295187237869221823983
> sage: time factor(rsa200)
> CPU times: user 0.68 s, sys: 0.00 s, total: 0.68 s
> Wall time: 0.69
> 3532461934402770121272604978198464368671197400197625023649303468776121253679423200058547956528088349
>  * 
> 7925869954478333033347085841480059687737975857364219960734330341455767872818152135381409304740185467
>
>
> So, anyway, I recently was doing something where I thought it would be
> nice to have an easily usable database of known factorizations for
> interesting numbers, and that it would be especially nice if it was
> usable with SAGE. So I searched google for "factorization database" and,
> amusingly, got
>
> http://www.sagemath.org:9002/sage_trac/ticket/191
>
> as the third hit. So I wasn't the first with this idea. Anyway, I have
> the beginnings of something implemented, but I thought I would ask for
> some suggestions before going further.
>
> What I have been implementing/planning thus far is mostly as follows:
>
> -I have a global object factor_table, which is an instance of
>
> sage.databases.factor_table.FactorizationDatabase
>
> which is a class that inherits from sage.databases.db.Database (so uses
> ZODB for the underlying implementation), and has the following
> properties:
>
> -For all integer n, factor_table[n] is of type
> sage.rings.integer.Integer and is such that n % factor_table[n] == 0
>
> -Something like this...
>         -(factor_table[n] == -1) ==> (is_pseudoprime(n) == True)
>         -(factor_table[n] == 1) <==> (is_prime(n) == True)*
>         -(factor_table[n] == n) ==> (Nothing)
>
> [*There will be more regarding this later for anyone who reads all of
> this email.]
>
> Some questions:
> (1) What should the minimum size be for an entry in the factor table? I
> currently have it as 10^40, but I think that this might be too small.
> (But I have a relatively fast computer.) Some timings for me:
>
> sage: q = next_prime(10^20)
> sage: p = next_prime(10^21)
> sage: time factor(p*q)
> CPU times: user 0.33 s, sys: 0.02 s, total: 0.35 s
> Wall time: 0.36
> 100000000000000000039 * 1000000000000000000117
> sage: q = next_prime(10^25)
> sage: p = next_prime(10^24)
> sage: time factor(p*q)
> CPU times: user 2.40 s, sys: 0.05 s, total: 2.45 s
> Wall time: 2.45
> 1000000000000000000000007 * 10000000000000000000000013
> sage: q = next_prime(10^29)
> sage: p = next_prime(10^28)
> sage: time factor(p*q)
> CPU times: user 16.25 s, sys: 0.10 s, total: 16.35 s
> Wall time: 16.48
> 10000000000000000000000000331 * 100000000000000000000000000319
>
>
> (2) I thought for a while about whether or not factor_table[n] should
> always return a prime number (or 1). I decided against this because I
> would like to support "partial factorizations", so I can have something
> like the following:
>
> sage: factor_table.get_info(A)
> [...] has 5 known prime factors, 2 unfactored composite factors, and at
> least 9 distinct prime factors.
>
> Any reason why this might be a bad idea?
>
> (3) Instead of just storing one prime factor in factor_table[n], the
> full (known) factorization could be stored. This would make things
> simpler, but might increase storage space. But then again, it might save
> storage space somehow - I'm not sure. The factor table would certainly
> have less entries, but they would be bigger entries. (Also, less entries
> means less information, but someone is unlikely to find anything useful
> in a random entry that just happens to be there because it is a factor
> of something "special".) Any thoughts?
>
> (4) I've changed factor() to search the factor table before trying to
> factor normally. This makes the factor table easiest to use. This makes
> the factor table easiest to use, but, given that the factor table would
> probably be an optional package (since it will probably be quite a large
> download), it might be better to just have a special function
> factor_table.factor(). The following code is what I have put in
> factor(), somewhere in the middle:
>
>     if n > factor_table.minimum_entry and factor_table[n] != n:
>         p = factor_table[n]
>         if p == 1:
>             return factorization.Factorization([(n,1)], unit)
>         elif p == -1:
>             if proof and is_prime(n):
>                 return factorization.Factorization([(n,1)], unit)
>             else:
>                 return factorization.Factorization([(n,1)], unit)
>         else:
>             return factor(p,proof=proof) * factor(n/p,proof=proof)
>
> I don't see any problem with having it there, except for a marginal
> slowdown in the general case, but it can fail nicely when the
> factorization database is not installed by having factor_table[n] == n
> always.
>
> For quick partial factorizations, I currently have factor_table.factor()
> as factoring only using the factor table, and possibly returning
> composite factors.
>
> (5) Regarding (*), above: I would like to be able to have some entries
> in the factor table marked as proved primes. The main problem with this
> is that there should be a way to verify the table, and verification
> should involve proving that all the claimed primes are indeed primes,
> but this could take a really long time.
>
> One option, which I don't like, would be to simply prove a number is
> prime exactly once, and then be very careful about making changes to the
> table.
>
> Another thing that might be a possibility in the future is storing a
> "certificate" for any number claimed to be proved prime. The ECPP
> algorithm can give a prime certificate which can (I think) be checked in
> significantly less time than it takes to find the certificate. Then the
> certificate could be checked, which might not take too long (I really am
> not sure about this). This would require new functionality in SAGE,
> which might be difficult to implement (it looks like PARI does not have
> ECPP).
>
> For now, I just ignore this and only check for pseudo-primes, and don't
> mark anything as proved prime.
>
> --
>
> OK, I think that that is it for now. Any other thoughts would be good.
> It seems that the major work to be done here is to get various factor
> tables into usable formats - compared to that, implementation issues
> might be easy.
>
> Congratulations to you if you have read this far. This email was very
> long.
>
> -Jon
>
>
> >
>

--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~----------~----~----~----~------~----~------~--~---

[sage-devel] Re: factorization database

Reply via email to