
> DeepMind hasn't shown anything breathtaking since their AlphaGo Zero

They went on to make AlphaZero, a generalised version that could learn chess, shogi, or any similar game. The chess version beat a leading conventional chess program with 28 wins, 0 losses, and 72 draws.

That seemed impressive to me.

Also, they used loads of compute during training but not so much during play (5,000 TPUs for training vs. 4 TPUs during play).

Also, it got better than humans at those games from scratch in about 4 hours, whereas humans have had some 2,000 years to study them, so you can forgive it some resource usage.



It's not like humanity really needs another chess-playing program 20 years after IBM solved that problem (but now utilizing 1000x more compute power). I just find all these game-playing contraptions really uninteresting. There are plenty of real-world problems of much higher practicality to be solved. Moravec's paradox in full glow.


The fact that it beat Stockfish 9 is not what is impressive about AlphaZero.

What was impressive was the way Stockfish 9 was beaten. AlphaZero played like a human player, making sacrifices for position that Stockfish evaluated as detrimental. When it played White, the fact that it mostly opened with the queen's pawn (despite the king's pawn being "best by test") and the way AlphaZero exploited Stockfish's pawn structure and tempo to basically remove a bishop from the game was magical.

Yes, since it's a game, it's "useless", but it allowed me (and I'm not the only one) to get a bit better at chess. It's not world hunger, not climate change, it's just a bit of distraction for some people.

PS: I was among those who thought that genetic algorithms + deep learning were not enough to emulate human logical capacities; the AlphaZero vs Stockfish games made me admit I was wrong (even if I still think it only works inside well-defined environments).


Two observations:

Just because Fischer preferred 1. e4 doesn't make it better than other openings. https://en.chessbase.com/post/1-e4-best-by-test-part-1

Playing like a human, to me, also means making human mistakes. A chess-playing computer playing like a 4000-rated "human" is useless; one that can be configured to play at different Elo levels is more interesting, although most engines can already do that, with no ML and no huge amounts of computing power needed.


> What was impressive was the way Stockfish9 was beaten.

Without its opening database and without its endgame tablebase?

Frankly, the Stockfish vs AlphaZero match was the beginning of the AI Winter in my mind. The fact that they disabled Stockfish's primary databases was incredibly fishy IMO and is a major detriment to their paper.

Stockfish's engine is designed to do its own work only in the middlegame. Remove the opening database and the endgame tablebase, and you're not really playing against Stockfish anymore.

The fact that Stockfish's opening play was severely gimped is not a surprise to anybody in the chess community. Stockfish didn't have its opening database enabled... for some reason.


I think for most people, the research interest in games of various sorts is not simply a desire for a better and better game contraption, a better mousetrap. Rather, the thinking is: "playing games takes intelligence; what can we learn about intelligence by building machines that play games?"

Most games are also closed, conveniently grokkable systems with enumerable search spaces, which gives us easily producible measures of the contraptions' abilities.

Whether this is the most effective path to understanding deeper questions about intelligence is an open question.

But I don't think it's fair to say that deeper questions and problems are being foregone simply to play games.

I think most 'games researchers' are pursuing these paths because neither they themselves nor anyone else has put forth any other suggestion that makes them think, "hmm, that's a really good idea; that seems like it might be viable, and there is probably something interesting we could learn from it."

Do you have any suggestions?


This is so true, I can't understand why people miss this. The games are just games. It's intelligence that is the goal.

And comparing AlphaGo Zero against those "other chess programs that existed for 30 years" is exactly missing the point too. Those programs were not constructed with zero knowledge. They were carefully crafted by human players to achieve the result. Are we also going to count all the brain processing power and the time spent by those researchers learning to play chess? AlphaGo Zero did not need any of that, beyond knowledge of the basic rules of the game. Why compare compute requirements for two programs that have fundamentally different goals and achievements? One is carefully crafted by human intervention. The other learns a new game without prior knowledge...


It shows something about the game, but it's clear that humans don't learn the way AlphaZero does, so I don't think AlphaZero illuminated any aspect of human intelligence.


I think that fundamentally the goal of research is not necessarily human-like intelligence, just any high-level general intelligence. It's just that the human brain (and the rest of the body) has been a great example of an intelligent entity from which we could source a lot of inspiration. Whether, and how much, the final result will share technical and structural similarity with a human, the future will tell.


In principle you are right. In practice we will see. My bet is that attempts focused on the human model will bear more fruit in the medium term, because we now have huge capability for observation at scale, which is very exciting. Ethics permitting, obviously!


Not sure if I am reading you correctly, but to me you are basically saying "we have no idea, but we believe that one day it will make sense".

Sounds more like religion and less like science to me.

I guess we could argue until the end of the world about whether intelligence will emerge from ever more clever ways of brute-forcing your way out of problems in a finite space with perfect information. I think it won't.


But humans could learn in the same way that AlphaZero does. We have the same resources and the same capabilities, just running on million-year-old hardware. Humans might not be able to replicate the performance of AlphaZero, but that does not mean it is useless in the study of intelligence.


The problem is that outside perfect-information games, most areas where intelligence is required offer few obvious routes for a computer to learn by perfectly simulating strategies and potential outcomes. Cases where "intelligence" is required typically entail handling human approximations of many unknown and barely-known possibilities with an inadequate dataset. Advances on perfect-information games, which a machine can simulate entirely from the ruleset (perhaps perturbed by adding inputs of human approaches to the problem), might be at best orthogonal to that particular goal. One of the takeaways from AlphaGo Zero massively outperforming AlphaGo is that even a very carefully designed training set, for a problem fairly well understood by humans, might actually retard system performance...


I totally agree with you and share your confusion.

On the topic of the different algorithmic approaches, I find it fascinating how different these two approaches actually end up looking when analyzed by a professional commentator. When you watch the new style with a chess commentator, it feels a lot like listening to the analysis of a human game. The algorithm has very clearly captured strategic concepts in its neural network. Meanwhile, older chess engines tend to reach positions where the computer clearly doesn't know what it's doing. The game reaches a strategic point where the things it's supposed to do lie beyond the horizon of moves it can compute by brute force, so it plays stupidly. These are the positions where, even now, human players can still beat the old-style, otherwise-superhuman chess engines.


The thing is, you can learn new moves and strategies in these games that were never thought of before, and still not understand anything about intelligence at all.


A favourite work by Rodney Brooks: "Elephants Don't Play Chess"

https://people.csail.mit.edu/brooks/papers/elephants.pdf


It's not like the research on games comes at the expense of other, more worthy goals. A well-constrained problem lets you understand the limitations of your method, which is great for making progress. AlphaZero didn't just play chess well; it learned how to play chess well (and the approach could generalize to other games). I'd forgive it 10,000 times the resources for that.


> It is a well constrained problem

But attacking not-well-constrained problems is what's needed to show real progress in AI these days, right?


I'd say getting better sample efficiency is a bigger deal. It isn't like POMDPs are a huge theoretical step away from MDPs. But if you attach one of these things to a robot, taking 10^7 samples to learn a policy is a deal-breaker. So fine, please keep using games for research.
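To make the sample-count point concrete, here is a toy tabular Q-learning loop in Python. The env interface (reset, step, actions, etc.) is a made-up placeholder, not any particular library; the only point is the episode budget:

    import random
    from collections import defaultdict

    def q_learn(env, episodes=10_000_000, lr=0.1, gamma=0.99, eps=0.1):
        # 10^7 episodes are cheap in a simulator and ruinous on a
        # physical robot, where every sample costs real time and wear.
        Q = defaultdict(float)
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                actions = env.actions(state)
                if random.random() < eps:
                    action = random.choice(actions)  # explore
                else:
                    action = max(actions, key=lambda a: Q[(state, a)])
                nxt, reward, done = env.step(action)
                best_next = max((Q[(nxt, a)] for a in env.actions(nxt)),
                                default=0.0)
                # Standard temporal-difference update.
                Q[(state, action)] += lr * (reward + gamma * best_next
                                            - Q[(state, action)])
                state = nxt
        return Q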


>it learned how to play chess well

This. Learning to play a game is one thing; learning how to teach computers to learn a game is another. Yes, chess programs have been good before, but that's missing the point a little. The novel bit is not that it can beat another computer, but how it learned to do so.
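Schematically, the published approach combines self-play with a tree search and a neural network. A heavily simplified sketch; play_one_game and network.fit here are hypothetical stand-ins, not DeepMind's actual API, and the real system adds a lot more (MCTS details, massive parallelism, etc.):

    def train_by_self_play(network, iterations=100, games_per_iter=1000):
        # The only "teacher" is the game's rule set, encoded in
        # play_one_game; no human games or opening theory go in.
        for _ in range(iterations):
            examples = []
            for _ in range(games_per_iter):
                # The current network plays itself; a tree search
                # sharpens its raw move probabilities at each position.
                states, search_policies, outcome = play_one_game(network)
                examples += [(s, p, outcome)
                             for s, p in zip(states, search_policies)]
            # Fit the network to predict the search's move choices
            # (policy) and the final result (value) from each position.
            network.fit(examples)
        return network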


Big Blue relied on humans to do all the training. AlphaGo Zero didn't need humans at all for the training.

That's a pretty major shift for humanity.


It's Deep Blue, not Big Blue. The parameters used by its evaluation function were tuned by the system on games played by human masters.

But it's a mistake to think that a system learning by playing against itself is something new. Arthur Samuel's draughts (checkers) program did that in 1959.


Sorry, mix up, thanks for the correction.

It's not that it's new, it's that they've achieved it. Chess was orders of magnitude harder than draughts, and the solution for draughts didn't scale to chess. But AlphaZero showed that chess fell easily to the same self-play approach that had mastered Go.


Both Samuel's checkers program and Deep Blue used alpha-beta pruning for search plus a heuristic evaluation function. Deep Blue's heuristic function was necessarily more complex, because chess is more complex than draughts. I think master chess games were used for Deep Blue, instead of self-play, because a large database of such games existed, and because so much of its performance came from being able to look so far ahead.
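For the curious, a minimal sketch of that shared skeleton in Python. The evaluate, legal_moves, and apply_move callables are placeholders for the game-specific parts where the real effort went; this is an illustration, not either system's actual code:

    def alphabeta(state, depth, alpha, beta, maximizing,
                  evaluate, legal_moves, apply_move):
        # At the search horizon (or in a terminal position), fall back
        # on the handcrafted heuristic -- the part that made draughts
        # engines and chess engines so different in practice.
        moves = legal_moves(state)
        if depth == 0 or not moves:
            return evaluate(state)
        if maximizing:
            value = float("-inf")
            for move in moves:
                value = max(value, alphabeta(apply_move(state, move),
                                             depth - 1, alpha, beta, False,
                                             evaluate, legal_moves, apply_move))
                alpha = max(alpha, value)
                if alpha >= beta:  # opponent would avoid this line: prune
                    break
        else:
            value = float("inf")
            for move in moves:
                value = min(value, alphabeta(apply_move(state, move),
                                             depth - 1, alpha, beta, True,
                                             evaluate, legal_moves, apply_move))
                beta = min(beta, value)
                if alpha >= beta:
                    break
        return value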


> It's Deep Blue, not Big Blue.

Big Blue is fine - it's referring to the company and not the machine. From Wikipedia "Big Blue is a nickname for IBM"


I meant Deep Blue, but yeah Deep Blue was a play on Big Blue.


I guess there are reasons why researchers build chess programs: it is easy to compare performance between algorithms. When you can solve chess, you can solve a whole class of decision-making problems. Consider it the perfect lab.


What is that class of decision-making problems? It's nice to have a machine really good at playing chess, but it's not something I'd pay for. What decision-making problems are there, in the same class, that I'd pay for?

> Consider it as the perfect lab.

Seems like a lab so simplified that I'm unconvinced of its general applicability. Perfect knowledge of the situation and a very limited set of valid moves at any one time.


> What decision-making problems are there, in the same class, that I'd pay for?

An awful lot of graph and optimization problems. See for instance some examples in https://en.wikipedia.org/wiki/A*_search_algorithm
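As a toy instance, a minimal A* in Python for grid pathfinding; the grid, unit costs, and Manhattan heuristic here are illustrative assumptions:

    import heapq

    def a_star(start, goal, neighbors, cost, heuristic):
        # heuristic must never overestimate the remaining distance,
        # or the returned path may not be optimal.
        frontier = [(heuristic(start), 0, start, [start])]
        best_g = {start: 0}
        while frontier:
            _, g, node, path = heapq.heappop(frontier)
            if node == goal:
                return path
            for nxt in neighbors(node):
                ng = g + cost(node, nxt)
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(frontier,
                                   (ng + heuristic(nxt), ng, nxt, path + [nxt]))
        return None  # goal unreachable

    # 4-connected grid, unit step cost, Manhattan-distance heuristic:
    goal = (3, 4)
    path = a_star((0, 0), goal,
                  neighbors=lambda p: [(p[0]+1, p[1]), (p[0]-1, p[1]),
                                       (p[0], p[1]+1), (p[0], p[1]-1)],
                  cost=lambda a, b: 1,
                  heuristic=lambda p: abs(p[0]-goal[0]) + abs(p[1]-goal[1]))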


Perfect information problem solving is not interesting anymore.

Did they manage to extend it to games with hidden and imperfect information?

(Say, chess with fog of war, also known as Dark Chess. Phantom Go. The pathfinding equivalent would be incremental search.)

Edit: I see they are working on it, predictive state memory paper (MERLIN) is promising but not there yet.


Strongly disagree. There are a lot of approximation algorithms and heuristics in wide use, to the tune of trillions of dollars, in fact, when you consider transportation and logistics, things like ASIC place & route, etc. These are all intractable perfect-information problems that are so widespread and commercially important that they amplify the effect of even modest improvements.

(You said problems, not games...)


Indeed, there are a few problems that, even with perfect information, you will be hard-pressed to solve. But that is only a question of computational power, or of the problem not admitting efficient approximation (not in APX or co-APX).

The thing is, an algorithm that can work with fewer samples while robustly tolerating mistakes in datasets (also known as imperfect information) will be vastly cheaper and easier to operate: less tedious sample collection and labelling.

Working with missing and erroneous information (without a known error value) is necessarily a crucial step towards AGI, as is extracting structure from such data.

This is the difference between an engineering problem and research problem.


Perhaps a unifying way of saying this is: it's a research problem to figure out how to get ML techniques to the point they outperform existing heuristics on "hard" problems. Doing so will result in engineering improvements to the specific systems that need approximate solutions to those problems.

I completely agree about the importance of imperfect-information problems. In practice, many techniques handle some label noise, but not optimally. Even MNIST is much easier to solve if you remove the one incorrectly-labeled training example. (One! Which is barely noise. Though as a reassuring example from the classification domain, JFT is noisy and still yields better real-world performance than training on ImageNet alone.)


> Perfect information problem solving is not interesting anymore.

I guess in the same way that lab chemistry isn't interesting anymore? (Since it often happens in unrealistically clean equipment :-)

I think there is nothing preventing lab research from going on at the same time as the industrialization of yesterday's results. Quite the contrary: in the long run they often depend on each other.


There’s plenty of interesting work on poker bots.


Poker bots actually deal with a (simple) game of imperfect information. It is not the best test, because short memory is sufficient to win at it.

The real challenge is to devise a general algorithm that learns, strategically, to be a good poker player within thousands of games, from just the games played. The DeepStack AI required 10 million simulated games, and good human players outperform it at intermediate training stages.

And then the other part is figuring out the actual rules of a harder game...


I think chess may actually be the worst lab. Decisions in chess are made with perfect knowledge of the current state and future possibilities. Most real-world decisions are made without perfect knowledge.


For chess, the future possibilities are so vast, you can't call them "perfect knowledge" with a straight face.


This is not what the terminology "perfect knowledge" means. Perfect knowledge (more often called "perfect information") refers to games in which all parts of the game state are accessible to every player. In theory, any player in the game has access to all information contained in every game state up to the present and can extrapolate possible forward states. Chess is a very good example of a game of perfect information, because the two players can readily observe the entire board and each other's moves.

A good example of a game of imperfect information is poker, because players have a private hand which is known only to them. Whereas all possible future states of a chess game can be narrowed down according to the current game state, the fundamental uncertainty of poker means there is a combinatorial explosion involved in predicting future states. There's also the element of chance in poker, which further muddies the waters.

Board games are often (but not always) games of perfect and complete information. Card games are typically games of imperfect and complete information. This latter term, "complete information", means that even if not all of the game state is public, the intrinsic rules and structure of the game are public. Both chess and poker are complete, because we know the rules, win conditions and incentives for all players.

This is all to say that games of perfect information are relatively easy for a computer to win, while games of imperfect information are harder. And of course, games of incomplete information can be much more difficult :)
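One way to make the distinction concrete in code is through each player's observation function; this is a toy sketch, not any real engine's state representation:

    # Chess (perfect information): the observation IS the state.
    def chess_observation(state, player):
        return state  # full board and move history, visible to both sides

    # Poker (imperfect information): each player sees only a
    # projection of the true state; opponents' hands are hidden.
    def poker_observation(state, player):
        return {
            "my_hand":   state["hands"][player],  # private to this player
            "community": state["community"],      # public cards
            "bets":      state["bets"],           # public betting history
        }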


A human might not be able to, but a computer can. Isn't the explicit reason research shifted to Go the fact that you can't just number-crunch your way through it?


AlphaGo Zero did precisely that. Most of its computation was done on a huge array of GPUs. The problem with Go is that look-ahead is harder than in chess, as Go has roughly five to ten times as many possible moves at each point in the game. So Go was more of a challenge, and master-level play only became possible with advances in computer hardware.
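Back-of-envelope, using commonly cited average branching factors (roughly 35 for chess, 250 for Go) and typical game lengths in plies; exact figures vary by source:

    import math

    # Game-tree size is roughly branching_factor ** plies.
    print(80 * math.log10(35))    # ~123.5 -> chess tree ~10^124 possible games
    print(150 * math.log10(250))  # ~359.7 -> Go tree ~10^360 possible games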


Chess was already easy for computers. That's why Arimaa came to be.


> When you can solve chess, you can solve a whole class of decision-making problems

If this were true, there would be vast demand for grandmasters in commerce, government, the military... and there just isn't. Poker players suffer from similar delusions about how their game generalises to other domains.


> Poker players suffer from similar delusions about how their game can be generalised to other domains.

Oh that's so true

Poker players in real life would give up more often than not whenever they didn't know enough about a situation or didn't have enough resources to win with high probability.

And people can call your bluff even if you fold.


Those traits seem to me like something most people desperately need... Everyone being confident in their assessment of everything seems like one of the major problems of today's population.


I think batmansmk doesn't mean "when X is good at chess, X is automatically good at lots of other things", but "the traits that make you a good chess player (given enough training) also make you good at lots of other things (given enough training)".


I might suspect (but certainly cannot prove) that the traits that make a human good at playing chess are very different from the traits that make a machine good at playing chess, and as such I don't think we can assume that the machine skilled-chess-player will be good at lots of other things in a way analogous to the human skilled-chess-player.


And Gaius's point stands against this argument as well: chess is seen as such a weak predictor that playing a game of chess, or asking for an official Elo rating, isn't used in hiring screens, for instance.

I suspect that chess as a metagame is just so highly developed that being "good at chess" means your general ability is really overtrained for chess.


Second world chess champion Emanuel Lasker spent a couple of years studying Go and, by his own report, was dejected by his progress. Maybe he would have eventually reached high levels, but I've always found this story fascinating.


True, but I'd phrase it the other way around. The traits that make you (a human) good at general problem solving are also the traits that make you a good chess player. I do suspect, though, that there are some Chess-specific traits which boost your Chess performance but don't help much with general intelligence. (Consider, for example, the fact that Bobby Fischer wasn't considered a genius outside of his chosen field.)


Tell me about it. The brightest minds are working on ads, and we have AI playing social games.

Can AI make the world better? It can, but it won't, since we are humans, and humans will weaponize technology every chance they get. Of course some positive uses will come, but the negative ones will be incredibly destructive.


Just because you haven't seen humongous publicity stunts involving practical uses of AI doesn't mean they aren't being deployed. My company uses similar methods to warn hospitals about patients with a high probability of imminent heart attack or sepsis.

The practical uses of these technologies don't always make national news.

I'm sure you would also have scoffed at the "pointless, impractical, wasteful use of our brightest minds" to make the Flyer hang in the air for 30 yards at Kitty Hawk.


Let's start with defining "better".


>>20 years after IBM solved that problem

We solved nothing.

IBM Deep Blue doesn't exactly think like humans do.

Most of our algorithms really are 'better brute force'.

https://www.theatlantic.com/magazine/archive/2013/11/the-man...


Exactly. To my not-very-well-informed self, even AlphaGo Zero is just a cleverer way to brute-force board games.

Side observers are taking joy in the riskier plays it made (they reminded them of certain grandmasters, I suppose), but that still doesn't mean AGZ is close to any form of intelligence at all. Those "riskier moves" are probably just a way to reduce the problem space more quickly anyway.

The AI field these days seriously reminds me more and more of religion.


>Also it got better than humans in those games from scratch in about 4 hours whereas humans have had 2000 years to study them so you can forgive it some resource usage.

Most humans don't live 2000 years, and realistically they don't spend that much of their time or computing power studying chess. Surely a computer can be more focused at this, and the 4 hours are impressive, but this comparison seems flawed to me.


You're right, though the distinction the parent poster draws is that AlphaGo Zero had no input knowledge to learn from, unlike humans (who read books, listen to other players' wisdom, etc.). It's a fairly well-known phenomenon that, e.g., current-era chess players are far stronger than players of previous eras, and this probably has to do with the accumulation of knowledge over decades, or even hundreds of years. It's incredibly impressive for software to replicate that knowledge base so quickly.


Not so much from the accumulation of knowledge, because players can only study so many games. The difference is largely because there are more people today, they have more free time, and they can play high-level opponents sooner.

Remember, people reach peak play in ~15 years, but they don't necessarily keep up with advances.

PS: You see this across a huge range of fields, from running and figure skating to music: people simply spend more time and resources getting better.


But software is starting from the same base. To claim it isn't would be to claim that the computers programmed themselves completely (which is simply not true).


Sure, there is some base there, and a fair bit of programming went into the structure of the implementation. However, the heuristics themselves were not programmed in, and that is very significant. The software managed to reproduce and beat the previous best (both human and the previous iteration of itself) entirely by playing against itself.

So, in this sense, it's kind of like taking a human, teaching them the exact rules of the game and showing them how to run calculations, and then telling them to sit in a room playing games against themselves. In my experience from chess, you'd be at a huge disadvantage if you started with this zero-knowledge handicap.


> In my experience from chess, you'd be at a huge disadvantage if you started with this zero-knowledge handicap.

One problem is that we can't play millions of games against ourselves in a few hours. We can play a few games, grow tired, and then need to go do something else. Come back the next day, repeat. It's a very slow process, and we have to worry about other things in life. How much of one's time and focus can be spent on learning a game? You could spend 12 hours a day, if you had no other responsibilities, I guess, though that might be counterproductive. We just don't have the same capacity.

If you artificially limited AlphaGo to human capacity, then my money would be on the human being a superior player.


All software starts with a base of 4 billion years of evolution and thousands of years of social progress and so on. But AlphaZero doesn't require knowledge of Go on top of that.


> The chess version beat a leading conventional chess program 28 wins, 0 losses, and 72 draws.

It was not an equal fight, and the full results still haven't been published. I'm not claiming that AlphaZero wouldn't win, but that test was pure garbage.


The results were published - https://arxiv.org/abs/1712.01815

I agree AlphaZero had fancier hardware, so it wasn't really a fair fight.


Stockfish is not designed to scale to supercomputing clusters or TPUs, and AlphaZero wasn't designed to account for how long it takes to make a move, so a fair fight was hard to arrange.


No, those are not the full results. Only 10 example games were published. Where is the rest?


How was it not equal?


There's discussion here: https://chess.stackexchange.com/questions/19366/hardware-use... AlphaZero's hardware was faster, and Stockfish was a year-old version with non-optimal settings. It was still an impressive win, but it would be interesting to do it again on a more level playing field.


And didn’t they just do all of this? It’s not like 5 years have passed. Does he expect results like this every month?


> Also it got better than humans in those games from scratch in about 4 hours whereas humans have had 2000 years to study them so you can forgive it some resource usage.

Few would care. Your examiner doesn't give you extra marks on a given problem for finishing your homework quickly.


Oh wow, it can play chess. Can it efficiently stack shelves in warehouses yet?


"It" can reduce power consumption by 15%.

https://deepmind.com/blog/deepmind-ai-reduces-google-data-ce...

Just because AlphaZero doesn't solve the problem you want it to doesn't mean that advancements aren't being made that matter to someone else. To ignore that seems disingenuous.


There is no human who has studied any of those games for 2000 years. So I think you mean 4 hours versus an average human study of 40 years.



