N-Queen Problem: Python 2.6.5 vs PyPy 1.5.0 (aminsblog.wordpress.com)
61 points by timf on May 29, 2011 | hide | past | favorite | 21 comments


"...all N-Queen problem comes to mere 8 lines of code"

If you are optimizing for LOC, it can come down to basically a single expression; of course that's not a reason to write it as a one-liner:

    import itertools

    def n_queen(n):
        # A permutation already rules out shared rows and columns,
        # so only the diagonal test remains.
        return (p for p in itertools.permutations(xrange(n))
                if all(j - i != abs(p[i] - p[j])
                       for i in xrange(n - 1)
                       for j in xrange(i + 1, n)))
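
As a quick sanity check (Python 3 spelling, with xrange swapped for range), the filter yields the known 92 solutions for the 8x8 board:

```python
import itertools

def n_queen(n):
    # Permutations already guarantee distinct rows and columns,
    # so only the diagonal test remains.
    return (p for p in itertools.permutations(range(n))
            if all(j - i != abs(p[i] - p[j])
                   for i in range(n - 1)
                   for j in range(i + 1, n)))

print(sum(1 for _ in n_queen(8)))  # 92
```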


If you are optimizing for LOC, use a suitable language:

from http://nsl.com/papers/qn.htm , the code that computes the solution is

  qn:{[n],/{:[n=#*x;,*x;,/_f'f x,\:/:(!n)_dvl,/x]}'(f:0 -1 1+/:)@!n}

And it uses backtracking, which makes it dramatically faster.

But armin isn't optimizing for LOC.


This is actually a rather poor way to solve n-queens. Consider those boards that have a queen in the first space of the first row and the second space of the second row. These two queens can attack each other, so no such board is a solution. However, this method will still go through every such board. On the other hand, a backtracking style solution, which builds up the board row by row, will backtrack after the second queen, thus skipping all these boards.
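
The row-by-row pruning described here can be sketched in a few lines (my own names, not the article's code): place one queen per row, and abandon a partial board as soon as the newest queen is attacked.

```python
def n_queens_backtrack(n):
    """Collect full boards, pruning any prefix with an attacking pair."""
    solutions = []
    def place(board):  # board[r] is the column of the queen in row r
        row = len(board)
        if row == n:
            solutions.append(tuple(board))
            return
        for col in range(n):
            # Extend only if the new queen attacks no earlier queen.
            if all(col != c and row - r != abs(col - c)
                   for r, c in enumerate(board)):
                place(board + [col])
    place([])
    return solutions
```

Because whole subtrees are cut off at the first conflict, this examines far fewer boards than filtering all n! permutations.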


True, it's inefficient, but it's good for profiling Python interpreters, since it does a lot of work.

It's also fast enough for low n (up to 8). Why write more code when you can get your result with less? Maybe the time he's saved with this approach can now be used writing a deduplicator (to cull out mirrored and rotated solutions).
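
For what it's worth, such a deduplicator is short too. A sketch (hypothetical names, reusing the permutation filter from above): canonicalize each solution under the board's eight rotations/reflections and keep one representative per class; for n = 8 this reduces the 92 solutions to the 12 fundamental ones.

```python
from itertools import permutations

def n_queen(n):
    # Same permutation filter as in the comment above.
    return [p for p in permutations(range(n))
            if all(j - i != abs(p[i] - p[j])
                   for i in range(n) for j in range(i + 1, n))]

def symmetries(p, n):
    # Yield the 8 rotations/reflections of a board (column-per-row tuple).
    cells = {(r, c) for r, c in enumerate(p)}
    for _ in range(2):
        for _ in range(4):
            yield tuple(c for r, c in sorted(cells))
            cells = {(c, n - 1 - r) for r, c in cells}  # rotate 90 degrees
        cells = {(r, n - 1 - c) for r, c in cells}      # mirror

def unique_solutions(solutions, n):
    # Keep one representative per symmetry class.
    seen, out = set(), []
    for p in solutions:
        canon = min(symmetries(p, n))
        if canon not in seen:
            seen.add(canon)
            out.append(p)
    return out
```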

Maybe the point of the article was not only to show how fast PyPy is, but how convenient itertools.permutations() is. You can't show the convenience of itertools by not using it.

However... You know what? When I was asked to write an n-queens solver recently, I also wrote a backtracker. It just doesn't mean I did it better than this guy.


I know the point of this article is not to give the fastest n-queens algorithm. But what the heck; here is an iterative backtracking solution. On my machine, this code finds all solutions for the 10x10 board in about 1.2 sec, while the code from the article takes just under 18 sec. Using CPython 2.6.5, just like the article. Have not tried PyPy. Output is essentially the same format, except that my generator spits out lists instead of tuples.

Yes, it's twice as long and harder to read. Improvements, anyone?

  def hit_last(p):
      # Does the most recently placed queen attack any earlier queen?
      x = p[-1]
      i = len(p) - 1
      for j in range(i):
          y = p[j]
          if x == y or i - j == abs(x - y):
              return True
      return False

  def n_queen(n):
      p = [0]
      while True:  # Begin iter w/ board good, except maybe last Q
          full = len(p) == n
          hit = hit_last(p)
          if full or hit:  # Will we backtrack?
              if not hit:  # Found solution?
                  yield list(p)  # yield a copy, so later mutation of p can't corrupt it
              while len(p) > 0 and p[-1] == n - 1:
                  p.pop()
              if len(p) == 0:
                  return
              p[-1] += 1
          else:
              p.append(0)


Don't worry, PyPy beasts this as well, at ``n_queens(12)`` on my machine:

    alex@alex-gaynor-laptop:/tmp$ time python queens.py 

    real    0m16.837s
    user    0m16.820s
    sys 0m0.000s
    alex@alex-gaynor-laptop:/tmp$ time pypy queens.py 

    real    0m2.548s
    user    0m2.530s
    sys 0m0.010s


PyPy is very fast, but to me what's far more significant about PyPy is that, more than being just an interpreter, it's a toolchain for writing other interpreters that automatically generates a JIT compiler for you. I'd really be interested to see what sort of performance you could get writing interpreters for other languages in RPython.


I think this is an example where C++ really shines. The code is not much longer (44 vs 33 lines, although the output in the Python version is much nicer) than the Python code, and on my computer it is about 140x faster than Python 2.7.1.


We did a distributed N-queens solver to test the Gifu Prefectural Computing Grid that an ex-job helped to develop, and that was literally all ~35 of my lines of production C code ever. (Though most of the grid turned on "graphical" mode, which both a) solved in Java and b) solved in Java at the speed of the Swing ability to update the visual board... which, after realizing, I patched to sleep frequently and run the C in the background anyway.)

This was ~5 years ago, and I don't remember the exact magnitude of the speedup on Java vs. C. I do remember it being rather less than I was expecting.


That's why you write a C-extension to CPython when you need the performance. And if you don't need the performance you stay sane by not having to write stuff in C++.


Using a library for computing the permutations I guess? I haven't touched C++ for years but I'd be surprised if the 44 lines include a permutations implementation.


std::next_permutation is in the standard library (header <algorithm>).


For the curious, STL <algorithm> is one of those things that you wish you knew ages ago. Usually my code is much cleaner after I run through it and replace relevant parts of it with STL stuff.

http://www.cplusplus.com/reference/algorithm/


The optimized version of the PyPy code is around 17x faster than the first version of the Python code. That makes your C++ code about 8x faster than the PyPy code, but I consider any Python code within an order of magnitude of C++ good.


140x speedup was compared to optimized python version. I've tried it vs. pypy now (version 1.5) - it is 20x faster.


Would it be possible for you to try ShedSkin on it?


I managed to compile the program using shedskin, but it throws a runtime error after it finds the first solution:

    ==== solution 1 =====
    (0, 2, 5, 7, 9, 4, 8, 1, 3, 6)
    terminate called after throwing an instance of '__shedskin__::TypeError*'
    Aborted


Ah, that's too bad... Thanks for the effort, anyway.


The proposed solution uses brute force. How much faster could another approach like backtracking be, if you're willing to write 30% more LOC?


The most interesting observation from my point of view is that the author sped up the PyPy version with a simple manual optimization. This means that PyPy itself still has even more potential for optimization.


Why is there so much CPU time being spent in system? And why is the rest divided between User and Nice in a single-threaded process?



