The Chess Mind

Author: Dennis Monokroussos.
This is a blog for chess fans by a chess fan who is more than a chess fan - other topics do creep in from time to time, per my interest.
All material here is copyrighted, and may not be reproduced without my prior permission.

Tuesday, June 27, 2006

One Horizon Effect, or Two?

A reader (Paul) writes:

Hi, I love the blog. I thought you'd find this entertaining. I was running an internet blitz game through Fritz for suggestions, and it proposed the line below. I knew something was wrong, but couldn't believe just how wrong...(Fritz when asked for a hint at the end of the line sees it immediately of course). Anyhow, I hope you enjoy the small piece of entertainment in return.

1. e4 c5 2. Nf3 Nc6 3. d4 cd 4. c3 dc 5. Nxc3 Nf6 6. Bc4 e5 7. 0-0 Fritz evaluates as 0.22, and recommends instead 7. Ng5 d5 8. Nxd5 Nxd5 9. Bxd5 Bb4+ 10. Kf1 Qf6 11. Bxf7+ Kf8 12. Bd5 Nd4 13. Be3 which it evaluates as 0.75. Just lovely.

Hi Paul,

Thanks for the nice words about the blog, and I appreciate your submission.

I attempted to replicate your results, but was unsuccessful. I'd need to know which version of Fritz you were using (I'm on Fritz 9), at what move it produced that variation, at what depth, and so on. My best guess is that this analysis takes place after Black's 6th move, but on my computer Fritz gives 7.Ng5 d5 8.Nxd5 Nxd5 9.Bxd5 Be6 as Black's best hope, and thinks White is up about a pawn after 10.Bxc6+ bxc6. (That's at depth 13; at depth 14 it continues 11.Qxd8+ Rxd8 12.Nxe6 fxe6, with the same +1.08 evaluation.)

This doesn't change as I move further into your variation. After 9.Bxd5, it considers 9...Bb4+ as its second choice, but follows up 10.Kf1 with 10...Be6, not 10...Qf6. (At least through depth 14.) Once I've entered 10...Qf6, it advocates 11.Bxf7+ Kf8 12.Bd5, but then thinks Black should play 12...h6. Finally, once we get to 12.Bd5, 12...Nd4 is its third choice, and it immediately recommends 13.Kg1, avoiding all the ...Qa6 shenanigans it apparently overlooked in your experience.

Nevertheless, my inability to duplicate your results doesn't disprove the more general phenomenon known as the horizon effect. This refers to the propensity of computers to calculate a variation to a certain depth and evaluate it in their favor, only to find, upon getting nearer to the line's conclusion, that the evaluation was (seriously) mistaken. The problem was initially out of the computer's "sight" - it was like a ship approaching, but not yet having appeared on the horizon.
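To make the idea concrete, here is a minimal sketch in Python. It is a toy, not how Fritz or any real engine works: the names (Node, chain, minimax) and the evaluation numbers are invented for illustration, and real engines add pruning, extensions and quiescence search on top. White can choose a quiet line worth +0.2 or a pawn grab that looks like +1.0 until Black's refutation appears four plies in.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Static evaluation in pawns, from White's point of view (made-up numbers).
    static_eval: float
    children: list = field(default_factory=list)


def chain(evals):
    """Build a single forced line: one node per ply, with the given evaluations."""
    node = None
    for value in reversed(evals):
        node = Node(value, [node] if node is not None else [])
    return node


def minimax(node, depth, white_to_move):
    """Plain fixed-depth minimax: stop at the depth limit and trust the static eval."""
    if depth == 0 or not node.children:
        return node.static_eval
    scores = [minimax(child, depth - 1, not white_to_move) for child in node.children]
    return max(scores) if white_to_move else min(scores)


# Quiet line: a steady +0.2 all the way down.
quiet = chain([0.2, 0.2, 0.2, 0.2])
# Greedy line: looks like +1.0 for three plies, then Black's zinger makes it -5.0.
greedy = chain([1.0, 1.0, 1.0, -5.0])
root = Node(0.0, [quiet, greedy])

for depth in range(1, 5):
    print(depth, minimax(root, depth, white_to_move=True))
```

At depths 1 through 3 the search prefers the pawn grab, because the refutation lies beyond its horizon. At depth 4 it sees the zinger and falls back to the quiet line.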

Much has been made of this weakness in computer chess over the years, but not necessarily correctly. The reason is that this same problem befalls humans: it often happens that we calculate long variations, only to miss a zinger at or near the end of our variation. So why make fun of chess engines for the same thing? As long as computers are too weak to solve the game, and are forced to search, prune, and evaluate without certainty, horizon effect errors are guaranteed to occur. (Indeed, in a trivial sense, all (unintentional) errors are horizon effect errors.)

It seems to me that we can distinguish between types of horizon effect errors, though, in a way that illuminates the difference between human play and that of computers, and which may still be of use in games between the two. The first, not-too-interesting or useful sort is the one we've discussed so far: the missed tactic at the end of a long sequence. Chess engines are getting better and better at not missing these, but I'm not sure those errors can be stamped out completely, prior to the game's being solved.

The second and, for our purposes, more interesting sort is what I've called the frog-in-the-kettle problem. Apparently (I haven't tested this, and earnestly hope no one reading this will, either) if you put a frog in hot water, it will show good sense and jump back out if it can, but if you put it in warm water and heat the pot, it will stay put. The application of this strange fact is not that you should put your computer in a vat of water and heat it - surprise, surprise. Rather, it's this: if you engage in a slow build-up against the enemy king, but do so in such a way that there are no hard-to-meet threats coming up against the enemy king in the next 5-10 moves, it turns out that the program will tend to ignore what you're doing, and will evaluate the position favorably to itself, provided everything else is going well.

As always, programmers are aware of the problem and are doing what they can to fix it, and it's not as easy to exploit this idea as it once was. Even so, as I've shown many times on my ChessBase show and on the blog, too, chess engines tend to underestimate one side's attacking prospects until the threats are right on top of them.

Thus while the first sort of horizon effect is a general problem that afflicts humans and engines alike, this second problem is distinctively silicon-based. A moderately experienced club player will know almost immediately that when the opponent starts massing troops on the border, it's time to bring in the reinforcements, or send the king elsewhere, or do something to deter the opponent's attacking ambitions. Not so for chess engines, even for one that's the strongest player in the world.

Unless Kramnik is reading this blog - and I'll go out on a limb and guess that he isn't - it's unlikely that any of us are going to face a computer in a meaningful event anytime soon. It is useful to keep this second horizon effect idea in mind when using a chess engine to analyze, however. If you're examining a position where one side seems to have a promising attack in the offing, even if it requires a bit of preparation first, then if the computer disagrees, ignore it. Finish the preparatory moves, and keep an eye on the evaluation. Is it creeping in the attacker's favor? Good! Continue in that same vein, and you'll often notice a pro-attacker trend. You'll get more out of your computer when you're aware of this, and it's useful when preparing novelties for your unsuspecting opponents - especially those who don't fully realize the danger of the frog-in-the-kettle horizon effect!
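If you like to keep notes while you analyze, the "is it creeping?" check can be sketched in a few lines of Python. This is only an illustration under my own assumptions: you jot down the engine's evaluation by hand after each preparatory move, in centipawns from the attacker's point of view, and the function name attacker_trend and the sample numbers are made up.

```python
def attacker_trend(evals_cp):
    """Average change per move in the evaluation (centipawns, attacker's view).

    A positive result means the engine is gradually coming around to the attack.
    """
    if len(evals_cp) < 2:
        return 0.0
    return (evals_cp[-1] - evals_cp[0]) / (len(evals_cp) - 1)


# Hypothetical readings taken after each of five preparatory moves.
readings = [-30, -10, 0, 20, 50]
print(attacker_trend(readings))
```

A steadily positive number suggests the engine's initial skepticism was a horizon problem rather than a real refutation. A flat or negative one is a hint that the attack may not be there after all.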

Posted by Dennis Monokroussos on Tuesday June 27, 2006 at 12:35am. 6 Comments 0 Trackbacks

Friday, June 16, 2006

The Readers Write: How Badly Can a Computer Misevaluate, Redux

Martin van Essen writes:

Hi Dennis,

I just read the topic "From the Mailbox: How Badly can a Computer Misevaluate?" from some months ago.

I remember once having set up a position in Chessmaster 9000 (on a humble 500 MHz) involving a black h-pawn (h3 or so) and nine black 'wrong bishops' (eight promoted ones). White's lone king at h1 faced a deficit of approximately 19 pawns according to Chessmaster. I'm curious what other programs have to say about this.

My computer is more powerful than yours, but that didn't matter to Fritz, Shredder or Rybka, which gave evaluations ranging from +24 to +34! Indeed, it wouldn't matter if one were running the position on Hydra or Deeper Blue: either the software "gets it" or it doesn't. The hardware problem is this: the game could continue for almost 350 moves just taking into account the process of getting rid of the promoted bishops. So until computers are a lot more powerful than they are now, the only way for chess engines to get it right is by programmers creating a specific rule that 1-8 wrong-colored bishops draw, provided the defending king can reach the queening square.

Posted by Dennis Monokroussos on Friday June 16, 2006 at 6:44pm. 3 Comments 0 Trackbacks

Saturday, June 3, 2006

And the World Champion is...

Junior, "who" won the 14th World Computer Chess Championship with 9/11, half a point ahead of Shredder and Rajlich (Rybka) and a point and a half ahead of Zappa. All four programs went through the event undefeated, with the one exception of Rajlich's round 3 loss to Shredder.

As for Fritz, it wasn't entered, but we'll get to see its latest and greatest version in action against Kramnik at the end of the year.

And now, a request for my readers. I occasionally see various ratings listed for programs - 2995 for Zappa, 2994 for Rybka from the source I mentioned a few weeks ago; here 2830 for Zappa, 2820 for Rybka, 2810 for Shredder and 2800 for Junior (who finished in inverse rating order); and the now out of date SSDF rating lists as well. What do these ratings actually mean, in FIDE terms, if anything? Does anyone know?

Posted by Dennis Monokroussos on Saturday June 3, 2006 at 12:07am. 2 Comments 0 Trackbacks