On Feb 13, 10:21 pm, Robert Bradshaw <[EMAIL PROTECTED]> wrote:
> On Feb 11, 2008, at 10:14 AM, Carl Witty wrote:
> > I'm still willing to work on the "randgen" class I described toward
> > the end of this thread:
> > http://groups.google.com/group/sage-devel/browse_thread/thread/c2d86a2685018112/4b3136c4a784015a?#4b3136c4a784015a
> >
> > Basically I'm just waiting for somebody to say "Yes, that looks like a
> > good design" before I start.
>
> I would really like to see a sane, centralized pseudo-random number
> framework. Requiring every function that uses (perhaps implicitly)
> random numbers to pass around optional randgen objects will, in my
> opinion, be both inefficient and cumbersome to program with. Rather,
> I think the best option is to have a global randgen object that holds
> states for the various frameworks we use (e.g. gmp, ntl, etc.) where
> algorithms can access it directly. Swapping out this generator for a
> new one should be handled via python contexts.

I like Robert's suggestion of using Python contexts. So here's a
revised proposal:

I describe a module sage.misc.random, which holds a few global
functions (which will be imported into the command-line namespace) and
the class randgen. The purpose of this module is to manage
pseudorandom number generators and their seeds. Note that the methods
of randgen are intended to be used by library authors (like the
authors of ZZ.random_element() and RR.random_element()), not directly
by end-users; end-users will probably only use the global functions.

A large part of the purpose of this module is to enable
reproducibility: proper use of the functions in this module should
allow results to be consistent from one run of Sage to the next. To
the extent feasible (without modifying the underlying systems), we
will also try to make results consistent from one architecture to
another.

With preparation and care, it may sometimes also be possible to use
these functions to get results that are consistent when some parts of
your algorithm change. We refer to this as isolation; the idea is
that you can allocate a new randgen object for use by some
subalgorithm. Then this subalgorithm can allocate as many random
numbers as it wants. When the subalgorithm is finished, the original
randgen object (which was unaffected by the subalgorithm) is
restored. Thus, changes to your subalgorithm (which might make it
request more random numbers, for instance) do not change the behavior
of the outer algorithm (as long as the return value of the
subalgorithm is unchanged, of course).

Isolation also works the other way: a subalgorithm which wants to give
consistent answers regardless of the current random number state may
allocate a new randgen object (with either a constant seed, or a seed
which is a hash of its input). Once the subalgorithm is complete, the
original randgen object is restored.
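
To make the isolation idea concrete, here is a rough sketch in plain Python 3. It is not the proposed Sage code: the standard library's random.Random stands in for a randgen, and the names current_rng(), isolated_seed(), subalgorithm() and outer() are made up for illustration. The point is only that the outer algorithm draws the same numbers no matter how many the subalgorithm consumes.

```python
import random
from contextlib import contextmanager

# Hypothetical stand-in for the proposed global randgen (not Sage's API).
_current = random.Random(0)


def current_rng():
    """Return the generator that is currently installed globally."""
    return _current


@contextmanager
def isolated_seed(seed):
    """Install a fresh generator for the block; restore the old one on exit."""
    global _current
    saved = _current
    _current = random.Random(seed)
    try:
        yield _current
    finally:
        _current = saved


def subalgorithm(draws):
    # Uses a constant seed, so its answer does not depend on the outer state.
    with isolated_seed(12345):
        return [current_rng().random() for _ in range(draws)]


def outer(draws_in_sub):
    global _current
    _current = random.Random(42)      # plays the role of set_random_seed(42)
    first = current_rng().random()
    subalgorithm(draws_in_sub)        # may consume any number of draws
    second = current_rng().random()   # unaffected by the subalgorithm
    return first, second


print(outer(1) == outer(100))  # True
```

A real implementation would also have to swap the state of every underlying subsystem, which is where the three kinds of interfaces described next come in.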

We do not attempt to provide any actual random number generators,
random algorithms, etc., in this module (or even any wrappers for
random number generators); we only wrap the seed and state handling of
underlying systems.

We can handle seed management for pseudorandom number generators with
three kinds of interfaces. The first and nicest kind is generators
where the current state is a separate object, and pseudorandom number
generation routines take a reference to this object. This is the
nicest kind because we can trivially provide perfect isolation. The
second kind is generators where the current state is a global variable
in the subsystem, but where we can read out the old state before we
replace it. This still allows us to provide perfect isolation, but
doing so requires more discipline on the part of our callers. The
third kind is generators where the current state is a global variable
which can only be written, not read. In these cases we cannot
(reasonably) provide perfect isolation.

The global functions are set_random_seed(), random_seed(), and
initial_seed().

set_random_seed() should only be called from the command line, never
from within library code. When called with an integer parameter, it
creates a new randgen with the given seed, and sets that as the
current global randgen. When called with no parameter, it picks a new
seed itself and prints it.

initial_seed() returns the initial random number seed of the current
global randgen.

    sage: set_random_seed(42)
    sage: initial_seed()
    42
    sage: set_random_seed()
    The new random number seed is: 314159265
    sage: initial_seed()
    314159265

random_seed() returns a new randgen.

    sage: random_seed(42)
    Random seed object with initial seed 42
    sage: rgen = random_seed(); rgen
    Random seed object with initial seed 2718281828

randgen objects are Python context managers, so the typical use case
for random_seed() is actually:

    sage: with random_seed(42): print ZZ.random_element()
    -2
    sage: with random_seed(42): print ZZ.random_element()
    -2
    sage: with random_seed(42): print ZZ.random_element()
    -2

This is how you provide isolation in library code.

randgen is a Cython class. The main state it holds is a
gmp_randstate_t, although it also has some other cached information.

randgen methods include:

randstate_python()
    Returns an instance of random.Random. The first time it is called
    on a given instance of randgen, a new random.Random is created and
    seeded from the gmp_randstate_t; this is saved, and subsequent
    calls return the same random.Random instance.

...
    There will be similar methods for every subsystem of the first
    kind (with separate random state objects) (if there are any more,
    other than Python).

set_seed_libc(force=False)
set_seed_ntl(force=False)
set_seed_pari(force=False)
set_seed_magma(force=False)
set_seed_mathematica(force=False)
set_seed_...()
    Sets the seed of the specified random number generator, from a new
    random number from the gmp_randstate_t. Whenever library code is
    about to use a generator, it should call the corresponding
    method. For each subsystem, we remember (globally) which randgen
    was last used to set its seed. If you try to seed the subsystem
    again from the same randgen, then we return immediately without
    setting the seed (unless called with force=True, in which case the
    seed is set unconditionally; this is a performance vs. isolation
    tradeoff).

new_randgen()
    Creates a new randgen object, seeded from a random number from
    this object's gmp_randstate_t.

initial_seed()
    Returns the initial seed used to create this randgen.

Also, Cython code can just access the gmp_randstate_t directly.
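
The "remember which randgen last seeded each subsystem" rule for the set_seed_* methods could be sketched like this in plain Python 3. Everything here is a stand-in, not the proposed Cython class: RandGenSketch and set_seed_subsystem() are hypothetical names, random.Random replaces the gmp_randstate_t, and seed_fn is whatever callable actually seeds the subsystem (for libc it would wrap srandom(), for example).

```python
import random


class RandGenSketch:
    """Hypothetical pure-Python stand-in for the proposed randgen class."""

    # Subsystem name -> the generator that last seeded it (shared globally).
    _last_seeded_by = {}

    def __init__(self, seed):
        self._initial_seed = seed
        self._state = random.Random(seed)  # stands in for gmp_randstate_t

    def initial_seed(self):
        return self._initial_seed

    def set_seed_subsystem(self, name, seed_fn, force=False):
        """Seed subsystem `name` via seed_fn(seed) unless this generator
        was already the last one to seed it. Returns True if it seeded."""
        if not force and RandGenSketch._last_seeded_by.get(name) is self:
            return False  # already seeded from this generator: cheap no-op
        seed_fn(self._state.getrandbits(32))
        RandGenSketch._last_seeded_by[name] = self
        return True


seeds_seen = []                      # records every seed handed to "demo"
rgen = RandGenSketch(42)
rgen.set_seed_subsystem("demo", seeds_seen.append)              # seeds
rgen.set_seed_subsystem("demo", seeds_seen.append)              # skipped
rgen.set_seed_subsystem("demo", seeds_seen.append, force=True)  # reseeds
print(len(seeds_seen))  # 2
```

The skipped call is the performance half of the tradeoff: library code can call the method before every use without paying for a reseed each time. The cost is that the subsystem's stream is not reset between calls unless force=True is passed.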

Constructor:

randgen()
    Create a new randgen, seeded randomly (from os.urandom() if
    available, from the system time otherwise).

randgen(n)
    Create a new randgen, seeded from n.

The current global randgen is available from current_randgen().

So library code might look like:

    import sage.misc.random as random
    py_random = random.current_randgen().randstate_python()
    print py_random.random()

or:

    random.current_randgen().set_seed_gp()
    print gp('random()')

or, from Cython:

    cdef randgen rgen = random.current_randgen()
    mpz_urandomb(z, rgen.gmp_randstate, 512)

What do you think?