On Tue, Dec 11, 2018 at 1:12 PM Avi Gross <avigr...@verizon.net> wrote:
> I would then "generate" all possible combinations of digits 0-9 of
> length N. There are an amazing number of ways to do that ranging from
> taking a range(10**N) and converting it to a string then a list of
> numeral characters then tossing out any that have any duplicates, to
> creating a recursive function that passes along what has been used so
> far to the next call. Since many of these problems only need ONE
> solution, the generator would only iterate as many times as needed and
> no giant lists of numbers or strings need be in memory at any time.
>
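(For concreteness, the recursive generator described above might look
something like this - a minimal sketch, with names of my own choosing:)

```python
# Yields length-n tuples of distinct digits 0-9, passing along the
# digits used so far to each recursive call.  Because it's a generator,
# only as many tuples as the caller asks for are ever produced.
def distinct_digits(n, used=()):
    if n == 0:
        yield used
        return
    for d in range(10):
        if d not in used:
            yield from distinct_digits(n - 1, used + (d,))

first = next(distinct_digits(3))  # -> (0, 1, 2); no big list in memory
```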
First off, you can slash that number significantly, as there are only
10! (not 10**10) possibilities. Secondly, you can eliminate zero from
consideration for any letter that starts a sequence other than a single
digit, as demonstrated previously. So there are 9! * x, where x is the
number of letters that could potentially be zero. That cuts your total
possibilities from 10,000,000,000 down to, at worst, 3,628,800.

> For each tuple returned by the generator, you can iterate on the set
> of unique letters and use string functions to substitute the digits,
> or perhaps do this all at once. You would do this to all the items in
> what I call the left list as well as all the items on the right list.
> These would not be "numeric" so using int() on each item would work
> EVEN with leading zeroes. Seems safe enough.

Use str.replace() and int(), job done.

> Yes, this is brute force. Using range(1,000,000) (no commas in real
> use)

Use underscores instead:

>>> range(1_000_000)
range(0, 1000000)

> would be a million iterations when there are six unique characters in
> the puzzle and as many as 10 billion if all ten are in use. If you use
> nested loops like I showed a while ago (albeit to make arbitrary ones
> for many sizes could require multiple functions or use of an eval on
> one built by hand) you can cut down the number of iterations as the
> nested loops count down with each one doing one less than the one
> before it. Same goes for the recursive function call method as it
> passes along what numerals have already been used. There may already
> be a permutation function that does this efficiently in C.

Yeah, check itertools :)

> The real cost that dominates here is not memory, albeit garbage
> collection may be busy as it generates lots of temporary small bits of
> data. It is how the number of iterations grows.

Correct. And that's why a pure brute-force solution needs some
refinement. Algorithmic improvements almost always trump mechanical
improvements.
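Putting those pieces together - itertools.permutations for the digit
assignments, a leading-zero check, and str.translate()/int() for the
substitution - a brute-force solver might look roughly like this. (The
puzzle SO + SO = TOO and the function names are just my illustration,
not from the original thread.)

```python
from itertools import permutations

def solve(words, total):
    """Find digit assignments making sum(words) == total, if any."""
    letters = ''.join(sorted(set(''.join(words) + total)))
    leading = {w[0] for w in words + [total]}   # letters that must not be 0
    for digits in permutations('0123456789', len(letters)):
        assignment = dict(zip(letters, digits))
        if any(assignment[ch] == '0' for ch in leading):
            continue  # skip assignments that give a number a leading zero
        trans = str.maketrans(letters, ''.join(digits))
        if sum(int(w.translate(trans)) for w in words) == int(total.translate(trans)):
            return assignment
    return None

print(solve(['SO', 'SO'], 'TOO'))  # -> {'O': '0', 'S': '5', 'T': '1'}, i.e. 50+50=100
```

Note that permutations() hands back only tuples of *distinct* digits, so
the counting-down behaviour of the nested loops comes for free, in C.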
> I have looked at a somewhat related issue of how you generate all
> possible SCRABBLE words (or BOGGLE or ...) given specific starting
> letters. One way to make all possible combinations is along the lines
> above with many needed changes as there can (in theory) be as many as
> 26 unique letters (in English) and you can have multiple copies of
> some letters. If you allow other languages, the problem grows to the
> point where brute force is not realistic. And, ideally, you winnow
> down your choices by checking each word against a large dictionary.

Hmm, IMO that's the wrong way around. Instead, *start* with the
dictionary, and winnow down the possibilities to the words you have. A
decent word list will probably have 100K-1M words in it, which is a
small enough corpus to go through them all.

> Anyone know of one that has a decent selection and can be freely
> imported? I mean one word per line.

What OS are you on? For my usage, I just read from
/usr/share/dict/words and it's usually there. On Debian Linux, that's
provided by the dictionaries-common package, or the wbritish or
wamerican ditch-side-specific packages. In fact, using that file makes
your code independent of language (although you may need to concern
yourself with different alphabets if you want to support Russian/Greek,
and anything involving "letters" doesn't really work with Chinese), so
I would strongly recommend it.

On Windows, where that path doesn't exist and there probably aren't
standard dictionaries, you could download one from wordlist.aspell.net
or wordlist.sourceforge.net - they're freely available, but you would
have to go fetch them.

> I apologize for the length. The main question was whether eval is
> particularly expensive.

Well, yes it IS expensive... but the cost of it is less significant
than the algorithm and consequent number of iterations/attempts.
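(The "start with the dictionary and winnow" approach above is only a
few lines with collections.Counter. The tiny inline word list is just
for illustration - in real use you'd read /usr/share/dict/words or a
downloaded list, one word per line:)

```python
from collections import Counter

def spellable(rack, words):
    """Return the words that can be spelled from the letters in rack."""
    have = Counter(rack.lower())
    # Counter subtraction drops non-positive counts, so an empty result
    # means every letter of w (with multiplicity) is available in rack.
    return [w for w in words if not Counter(w.lower()) - have]

words = ['cat', 'act', 'tact', 'taco', 'dog']
print(spellable('atco', words))  # -> ['cat', 'act', 'taco']
```

One pass over even a million-word list is cheap; it's the 26-letter
brute force that isn't.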
Using eval() on three million possibilities is going to be WAY cheaper
than a more efficient calculation technique used on ten billion.

Write your code with two main priorities:

1) Get your algorithm right
2) Express your algorithm cleanly in code.

Don't worry about performance until you've done the above two steps,
*measured*, and found that it's taking too long.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list