On Tue, Apr 7, 2015 at 9:40 AM, <jonas.thornv...@gmail.com> wrote:
> Well of course you use the same principles as a binary search, setting min
> and max, closing in on the digit. In this case the searched number >
> base^exp and number < base^(exp+1).
>
> But since the search is within large bases up to 32-bit space, base
> 4294967295 is the biggest allowed. I need to find the nearest lesser exp
> in base for each (let's call them pseudo digits). But as you see there it
> will take time to add them up. So better doing a binary search, you know
> min-max half (iteration). You can do the same for a random oracle min max
> within range, and if the number of tries in general is over 22 i think a
> random oracle does it better than a binary search.
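For reference, the exponent search described above (finding the largest exp with base**exp <= number) really is an ordinary binary search on exp. Here's a minimal sketch; the function name and the doubling step to find an upper bound are my own illustration, not from the post:

```python
def largest_exponent(number, base):
    """Return the largest exp with base**exp <= number (number >= 1, base >= 2)."""
    # Find an upper bound for exp by repeated doubling.
    hi = 1
    while base ** hi <= number:
        hi *= 2
    lo = 0  # base**0 == 1 <= number always holds for number >= 1
    # Invariant: base**lo <= number < base**hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if base ** mid <= number:
            lo = mid
        else:
            hi = mid
    return lo

print(largest_exponent(10**12, 4294967295))  # largest 32-bit base from the post -> 1
```

This takes O(log exp) probes rather than adding pseudo-digits up one at a time.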
Okay, so you're talking about doing a binary search but picking the
partition point randomly instead of evenly?

> It was a long time since i did this, but i do know there is a threshold
> where searching min max with the oracle will be faster than the binary
> search.

I don't know where you got this idea from, but it's fiction. If you split
the space in half, then you always reduce the amount of work remaining by
a factor of 2, and the total work is O(log n). If you make an uneven
split, then while you will sometimes get a good partition and reduce the
remaining work by more than a factor of two, more often the target will
fall in the larger subspace and you will have reduced the work by less
than a factor of two. In the worst case, you remove only one element from
the search space at a time and end up with a linear search.

As a demonstration, here's a function that does a binary search for a
number in a range and returns the number of partitions it had to perform
to find the number:

>>> def binsearch(n, min, max):
...     if min == max: return 0
...     med = (max + min) // 2
...     if n <= med:
...         return 1 + binsearch(n, min, med)
...     else:
...         return 1 + binsearch(n, med+1, max)
...

And here's the average number of partitions needed to find a number in
various ranges:

>>> n = 100000
>>> sum(binsearch(i, 1, n) for i in range(1, n+1)) / n
16.68928
>>> n = 1000000
>>> sum(binsearch(i, 1, n) for i in range(1, n+1)) / n
19.951424
>>> n = 10000000
>>> sum(binsearch(i, 1, n) for i in range(1, n+1)) / n
23.3222784

Now let's try a variant function that splits the range at one third
instead of evenly:

>>> def trisearch(n, min, max):
...     if min == max: return 0
...     tern = (min + min + max) // 3
...     if n <= tern:
...         return 1 + trisearch(n, min, tern)
...     else:
...         return 1 + trisearch(n, tern+1, max)
...
>>> n = 100000
>>> sum(trisearch(i, 1, n) for i in range(1, n+1)) / n
17.88387
>>> n = 1000000
>>> sum(trisearch(i, 1, n) for i in range(1, n+1)) / n
21.502276
>>> n = 10000000
>>> sum(trisearch(i, 1, n) for i in range(1, n+1)) / n
25.1157592

The ternary version invariably takes more partitions on average to find
the goal number.

Now let's try a third variant that partitions the range randomly:

>>> from random import randrange
>>> def randsearch(n, min, max):
...     if min == max: return 0
...     part = randrange(min, max)
...     if n <= part:
...         return 1 + randsearch(n, min, part)
...     else:
...         return 1 + randsearch(n, part+1, max)
...
>>> n = 100000
>>> sum(randsearch(i, 1, n) for i in range(1, n+1)) / n
22.17863
>>> n = 1000000
>>> sum(randsearch(i, 1, n) for i in range(1, n+1)) / n
26.78371
>>> n = 10000000
>>> sum(randsearch(i, 1, n) for i in range(1, n+1)) / n
31.3916557

As you can see, this one does a lot worse on average, and at n=1e6 it's
already performing worse than either of the previous versions did at
n=1e7. I'm not even going to try this for n=1e8 because it would take too
long to run.
-- 
https://mail.python.org/mailman/listinfo/python-list
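Incidentally, the measured averages line up with a back-of-envelope estimate (my addition, not from the original post): an even split costs about log2(n) partitions, while a uniformly random split satisfies the same recurrence as the average search depth of a random binary search tree, roughly 2*ln(n) to first order:

```python
import math

# First-order estimates of expected partitions per search:
#   even split:    ~ log2(n)
#   random split:  ~ 2*ln(n)  (random-BST / quicksort-style recurrence)
for n in (100000, 1000000, 10000000):
    print(n, round(math.log2(n), 2), round(2 * math.log(n), 2))
```

These predict roughly 16.6/23.0 at n=1e5, 19.9/27.6 at n=1e6, and 23.3/32.2 at n=1e7, which is in the same ballpark as the measured 16.69/22.18, 19.95/26.78, and 23.32/31.39 above. Since 2/ln(2) is about 2.89, the random split is asymptotically about 44% more expensive than halving, but never asymptotically faster.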