Baba: > def i_palindrome(pal): > while len(pal)>1: > if pal[0] == pal[-1]: > pal=pal[1:-1] > return True > > print i_palindrome('annab')
In normal programming a significant percentage of the time is spent debugging. Experience shows that even short functions may be buggy. If you don't want to waste too much time debugging your code you need to adopt a bit more rigorous approach to write programs. Learning to play piano requires a lot of self-discipline, programming too needs some. Ask yourself what are the exact purposes of your function/class/ program. And then define what are the correct outputs for all input corner cases you are able to find. This is TDD, Thought-driven development. For this program, what's the right output for mixed case strings, for empty strings, for input strings that contain many words with mixed white space and punctuation between them, for unicode strings that contain weird characters? Generally even simple functions may become very complex if you want to manage even complex input cases. You are free to ignore some of those inputs, to keep your code simple enough, but it's usually better for your function to not just ignore some inputs, but give some kind of error for the inputs you don't want to consider. Let's say you accept simple Unicode strings, you don't want to ignore white space, an empty string is palindrome, text cases need to be ignored. I don't like Test-Driven development a lot because your most powerful tool is your brain&mind and not your unit-tests, but I like tests a lot. So you need tests for this small function too. Here doctests are enough. Your first test is better to fail, to be sure your doctest is working: def is_palindrome(text): """ >>> is_palindrome("xy") False """ return True if __name__ == "__main__": import doctest doctest.testmod() print "Doctests done." Code formatting and function and argument names are important to keep code readable to yourself, improve its future maintenance, and reduce the total bug count. You may write this Python function in a high level style like this: def is_palidrome(text): ltext = text.lower() return ltext == ltext[::-1] But that doesn't teach you much programming, so for learning purposes you may try to create one function in lower-level style (that also doesn't stress the garbage collector, despite probably being overall slower in Python). So we use iterative loops and single char tests. But if you use a language as Scheme then a recursive solution too is a normal option. Now you need to invent the algorithm that solves your problem. Inventing algorithms that solve your problems is an art that requires training, experience and even ideas from this little book: http://en.wikipedia.org/wiki/How_to_Solve_It There are few ways to solve that problem, one possible way it to look at the first and last char of the string (ignoring their case), if they are different the given word is not palindrome, otherwise you look at the second and penultimate char, and you perform similar comparisons for all the char pairs. If your string length is odd there is no need to test the last char, but if you want to test it with itself because it keeps the code simpler it's not a problem. To avoid bugs like ones in your code you may think about the loop invariant and loop variant. This blog post shows an example to follow: http://reprog.wordpress.com/2010/04/25/writing-correct-code-part-1-invariants-binary-search-part-4a/ I use two indexes, i and j, that move forward and backwards in sync starting from the first and last char of the 'text' string. The program stops when they are on the same char or they cross. I have used an elaborate loop invariant, for this simple program it's overkill, but shows how you may do it. i ==> <== j +--+--+--+--+--+--+--+--+--+ text | | | | | | | | | | +--+--+--+--+--+--+--+--+--+ 0 1 2 3 4 5 6 7 8 def _is_palindrome_loop_invariant(i, j, length): assert i >= 0 assert i < length assert j >= 0 assert j < length assert i <= j assert i == length - 1 - j return True def is_palindrome(text): """ >>> is_palindrome("") True >>> is_palindrome("1") True >>> is_palindrome(u"x") True >>> is_palindrome("aa") True >>> is_palindrome("ab") False >>> is_palindrome("abc") False >>> [is_palindrome(s) for s in ["abA", "ABA", "ABA"]] [True, True, True] >>> is_palindrome("aibohphobia") True >>> is_palindrome("aibohphobia" * 1000) True >>> is_palindrome(list("aibohphobia")) True >>> is_palindrome([1, 2, 3]) Traceback (most recent call last): ... AttributeError: 'int' object has no attribute 'lower' """ n = len(text) i = 0 j = n - 1 while i < j: assert _is_palindrome_loop_invariant(i, j, n) if text[i].lower() != text[j].lower(): return False else: i += 1 j -= 1 return True if __name__ == "__main__": import doctest doctest.testmod() print "Doctests done." is_palindrome([1, 2, 3]) fails not because the input is a list, but because integers lack the lower() method. It's useful to add few failing doctests too, to show what are the limits of the duck typing of the function. As you see I have tried some corner cases, empty string, single char, two chars, three chars, and more. I have also added a 'stress test' with a larger input. In Python this code is not efficient because of the interpreter overhead and because I call a lower() method on each char (string of length 1), so this is not good Python code. But it's easy to translate to a lower level language, and I think Psyco does a good enough work on it. In C language (plus stdbool) the tolower() is more efficient, but strlen() wastes some time. Often it's better to keep around the length too of your strings, for example using a "fat pointer", as in D. // C code //#define NDEBUG #include "assert.h" #include "stdio.h" #include "string.h" #include "stdbool.h" #include "ctype.h" #include "stdlib.h" bool is_palindrome_loop_invariant(int i, int j, int length) { assert(i >= 0); assert(i < length); assert(j >= 0); assert(j < length); assert(i <= j); assert(i == length - 1 - j); return true; } bool is_palindrome(const char *text) { const int n = strlen(text); int i = 0; int j = n - 1; while (i < j) { assert(is_palindrome_loop_invariant(i, j, n)); if (tolower(text[i]) != tolower(text[j])) { return false; } else { i++; j--; } } return true; } void is_palindrome_test() { assert(is_palindrome("")); assert(is_palindrome("1")); assert(is_palindrome("aa")); assert(!is_palindrome("ab")); assert(!is_palindrome("abc")); assert(is_palindrome("aba")); assert(is_palindrome("abA")); assert(is_palindrome("AbA")); assert(is_palindrome("ABA")); assert(is_palindrome("aibohphobia")); assert(!is_palindrome("aaaaabaaaa")); const int n = 1000; const char *s = "aibohphobia"; const int len_s = strlen(s); char *text = malloc(n * len_s + 1); if (text == NULL) exit(EXIT_FAILURE); int i; char *p = text; for (i = 0; i < n; i++) { memcpy(p, s, len_s); p += len_s; } text[n * len_s] = '\0'; // puts(text); assert(is_palindrome(text)); } int main() { is_palindrome_test(); return EXIT_SUCCESS; } C experts (or good C lints) are probably able to find some loss of precision or loss of sign errors in the following lines: const int n = strlen(text); const int len_s = strlen(s); char *text = malloc(n * len_s + 1); memcpy(p, s, len_s); Writing perfect C code isn't an easy art. (I have used signed values because writing correct C code with unsigned values is a pain, despite avoids loss of precision or sign). Compiling the C code with GCC 4.5.1, and disabled assertions (NDEBUG defined): -O3 -Wall -fomit-frame-pointer -S The loop of is_palindrome() gets compiled to (x86, 32 bit): L10: addl $1, %esi subl $1, %ebx cmpl %ebx, %esi jge L9 L4: movsbl (%edi,%esi), %eax movl %eax, (%esp) call _tolower movl %eax, %ebp movsbl (%edi,%ebx), %eax movl %eax, (%esp) call _tolower cmpl %eax, %ebp je L10 Those two calls to _tolower inside the loop don't help performance, but if you know your strings contain only upper case and lower case chars, and you are able to assume few other things on your machine, it's easy to replace them with non-portable inlined code. The addl and subl are the i++ and j--, GCC has moved them at top as commonly done optimization. The first cmpl is the (i < j) test, and the jge is a jump that ends the loop when it's the right time to do so. The movsbl go read one char, and put them in eax. The movl are necessary to set arguments for the call to the tolower function. The last cmpl compares the results of the tolower, and je (jump equal) jumps to the start of the loop if they are equal. Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list