I'm really excited to see the limits get lifted, in particular around items and ...

star-trek-fleet · on Aug 6, 2018

Great summary.

I'll add:

- The coordination between the AI bots is clearly beyond human level, or at least beyond what humans have demonstrated on a similar style of heroes.

(It might not appear too different from the show match, but based on my 2-3k hours watching pro games, the coordination is noticeably better than that of the best team in history, aka Wings Gaming, the 2016 TI champion).

I am not sure how such coordinations are modeled in dnn, which itself seems the most valuable from this research.

- In general I think with this show match, it pretty much sealed the doom of human players in dota2.

As it shows that the general approach is scalable and capable to handle the problem itself. As from laning to team fight, and item building, the AI did not show weakness at all.

I was worried about the AIs' general inefficiency at deriving a winning strategy, the laning stage, and team fight coordination, but all of these turned out to be obviously superior to human players'.

Drafting will probably be even more favorable to AIs. The challenge would be whether they can train faster by observing the change log, i.e. finding a winning strategy without training from scratch each time after a patch release.

I see no reason the AIs would lose to VP/Liquid/LGD (the top 3 going into TI8). The idea that split-pushing heroes can deal with the team fights seems to underestimate the AIs' discipline, which is clearly superior to the best humans'.

- Last is how much computing resources were used in training and playing. Hopefully Valve can team up with OpenAI to release a benchmark bot team for calibration, and a separate ladder system for playing against different AI strength levels.

ufo · on Aug 6, 2018

The AI did show lots of areas of weakness, which still need better training. Their warding was very awkward, they never used smoke ganks, and they didn't play optimally against invisibility. The human team got a lot of mileage out of rushing a Shadow Blade, which is a cheesy strategy that a coordinated team should be able to easily counter.

That said, the AI still won despite these limitations, which shows that it is very strong in other areas of the game.

halflings · on Aug 6, 2018

> As from laning to team fight, and item building.

I was disappointed to hear one of the devs say that item selection was hard-coded following a popular guide (don't think they communicated this before).

This is OK, because what matters the most is that it won. It was previously impossible to build hard-coded AIs that would even beat decent players, and now the AI has beaten some pro players (albeit not the best of the best); but it'll still be nice to see superior item buying strategies being learned.

darrenkopp · on Aug 7, 2018

It's not as bad as it sounds. The bots are hard-coded to follow an item guide that is published by TorteDeLini, but at the same time many human players do the exact same thing. I'd say that most players from the 1.5K to the low 3K MMR brackets blindly follow those guides in most games.

iotb · on Aug 6, 2018

As someone who plays Dota, the matches were a clear demonstration of sloppy snowballing and brute force cheesing. Coordination 'appears' to be beyond human level because the bots are collectively synced on a team goal value. If a true 'pro' or pro team was allowed to observe and play multiple games against this bot, I'm more than certain one could find an exploit of such an unintelligent mathematical approach to action selection at the individual or group level. In fact, this would be what you would use dark seer strategically for or a whole host of other characters and functionality that is currently banned.

Not everything can be calculated especially when tricks are intentionally done to throw a bot off.

> I am not sure how such coordinations are modeled in dnn, which itself seems the most valuable from this research.

There's a tuned group/individual driver function centered on various calculations. This is not actually a valuable part of the research, as it's dynamic and game-dependent and can't cover all of the possibilities, which is why someone broke their 1v1 bot (corner case).

> In general I think with this show match, it pretty much sealed the doom of human players in dota2.

If you are indeed a player and have viewed that many hours of dota2, I question the nature of such a comment. A great player wouldn't look to see how to 'beat' their bot, as it is a bot w/ no intelligence, the strategy would instead be to try to break it and shove it into corner cases. It's not playing the full range of characters that were intended to disrupt cheesy snowballing, so I wonder why you're making such an optimistic statement, given that you claim you have watched so many dota matches. Do you play much yourself? Maybe that would change your opinion.

> As it shows that the general approach is scalable and capable to handle the problem itself. As from laning to team fight, and item building, the AI did not show weakness at all.

I'm starting to see a pattern with your commentary. The gameplay looks like your typical "south" players. Hardly anything impressive: aggressive boneheaded tower diving and aggressive, cheesy snowballing. If you can last past 30 min, you outwit and outplay such people in the mid/late game.

> superior to humans > superior to humans > superior to humans

More than 50% of the Dota 2 dynamics aren't even present and are restricted at the moment. Are you getting paid for this post?

star-trek-fleet · on Aug 7, 2018

> A great player wouldn't look to see how to 'beat' their bot, as it is a bot w/ no intelligence, the strategy would instead be to try to break it and shove it into corner cases.

I cannot see any difference between "beating opponents" and "break it and shove it into corner cases"; that's always how Dota games play out.

Unless we formed different views on how Dota is played through our thousands of hours watching games (I sensed from your statement that you had similar gaming experiences :) )

iotb · on Aug 7, 2018

I have thousands of hours logged playing various Dota variants. I rarely watch as it's not a good way to retain skills or understand what's going on. If you play enough, you should understand exactly what I'm saying. I watch on rare occasions to get exposed to something I haven't thought of or tried but that's about it. Beating an opponent, especially a good one, is far different than 'breaking a bot'. What I mean by 'breaking the bot' is literally doing things to confuse an unintelligent mathematical algorithm into constant miscalculations and mis-predictions, with bonus points for finding a bug in the way it individually or collectively performs actions/interprets data. It's how their first 1v1 bot was defeated. This doesn't occur to a human being because we have intelligence and aren't just optimization algorithms purring along using game state data.

This is what Strong AI is centered on.

nopinsight · on Aug 7, 2018

Your point regarding the fact that the bots ‘may’ not be adaptive to surprising strategies is a good one. We do not know for sure in the case of OpenAI Five as there are too few public games to look at.

AlphaGo Lee (the version which won 4-1 against Lee Sedol) did seem to get thrown off track by Lee’s surprising move and lost that game.

However, AlphaGo Zero, which is based on some of the same principles/sets of algorithms, was much stronger than AlphaGo Lee (by more than 3 stones according to DeepMind; three stones is about the difference between top pros and top amateurs/beginning pros) and seemed like it would be insusceptible to any surprises thrown its way by human experts.

The difference was that AlphaGo Lee learned from play records of human Go experts, while AlphaGo Zero did not and only learned via self-play. Dota 2 is clearly more complex than Go, but if the same principles apply then an AI trained from pure self-play would be adaptive to most surprises in the domain, if the system had explored those edge cases before (which depends in turn on how the self-play was conducted during training).

(As a side note: OpenAI Five probably chose the “simple-minded” snowballing-cheesing strategy because it determined from extensive experience that the strategy is most likely to yield a win given its capabilities (which give it an advantage over humans in some respects, like instantaneous global information observation, great coordination, consistency, etc). This is very different from the reason some human players choose the strategy. Perhaps it is precisely because the Five bots don’t get sloppy that the strategy is so effective for them.)

star-trek-fleet · on Aug 7, 2018

My feeling is that humans are not adaptive to changing game flows either.

Most pro games are played with a strategy that is settled once draft is finalized. If the strategy turns out not working, Humans did not show noticeably different adaptivity.

Occasionally, a versatile team can transition from a late-game oriented line up to play a split push game. But usually such transition is based on a suiting draft, which requires the team members to be versatile in playing their heroes in slightly different styles; and a well-oiled team coordination to transition from one to another style.

> the “simple-minded” snowballing-cheesing strategy

In the show matches, there is no cheesing. It's plain team fight + push; the AIs executed the plan with ruthless precisions.

TBH, a typical pub game is best described as strategy-less game play. And pro games probably have 3 styles of play:

- Team fight

- Stick-together push

- Split push

The closest thing to a team that shows vastly better versatility is Wings Gaming, which pretty much ran any lineup they felt fit.

Sadly the team disbanded after TI6; otherwise, a match between them and OpenAI would be the most interesting thing I can imagine.

iotb · on Aug 7, 2018

> My feeling is that humans are not adaptive to changing game flows either.

They are. I'd actually argue that this occurs much more in a pub game than with pros. I'm largely against the concept of a pro for this reason, as it amounts more to having settled on lock-in strategies than to intelligent/active/dynamic exchanges. I play a lot of pub games for this reason... To enjoy the heightened dynamics. Tons of rotations and adjustments. Tons of punishments for a great player hot dogging to break their psyche. Lots of very intense examples of dynamic human intelligence.

> Most pro games are played with a strategy that is settled once draft is finalized. If the strategy turns out not working, Humans did not show noticeably different adaptivity.

You're speaking moreso of 'pro games'. I encounter a great deal of dynamics outside of this grouping... It's where a lot of intelligence comes into play. I think a lot of people are completely uninformed about the game who watch others play a lot w/o actually playing themselves. Pro games are literal theatre for the masses like in a large number of professional leagues. The real stuff happens outside of the spotlight.

> Occasionally, a versatile team can transition from a late-game oriented line up to play a split push game.

This happens in just about every game I play... Tons of rotations and readjustments when things aren't working out [happens sometimes w/ no communication]. Tons of split pushes.. Strategic ganks. If people have good emotional stability, there will be a pronounced reflection/change after a massive team death incident.... The point of these games is highly intellectual battles. It's why it's a disservice to restrict any features of the game. It's how they maintain balance to avoid the game devolving into idiotic bot-like cheesing. Games live and die based on how much cheese is present.

> But usually such transition is based on a suiting draft, which requires the team members to be versatile in playing their heroes in slightly different styles;

I'd expect pros to have these skill-sets, yet I don't see it much because in such showcases it's more about optimization and lock-in strategies than dynamics.

> and a well-oiled team coordination to transition from one to another style.

Happens in regular pub and ranked matchups all the time many times w/ little to no communication. As long as someone is not an emotional child, it can sometimes be stressed and instantiated over prolonged swearing and yelling at various players. This is what's maybe missing from the Pro-league... Someone getting in your ass openly for doing something stupid like continuing to battle 3 well organized bots 3v1.

> In the show matches, there is no cheesing. It's plain team fight + push; the AIs executed the plan with ruthless precisions.

One of the games opened with 4 bots diving bottom tower to get a kill and persistently pushing bot. The human 'pro' sat there hashing it out even though he could have run to safety and avoided another death, and no one from top or mid tp'd to bot on the human team. This hamfisted cheesy snowballing occurred in every match on the bot's behalf because OpenAI restricted the gameplay to favor it. Even so, in a pub someone would have been swearing at the top of their lungs on a mic telling the $@(#@(%* at top/mid to immediately TP and punish such a brazen exchange, especially with creeps all over them. Absolutely nothing was precise about the gameplay from the humans or bots. It was the kind of slop I see on servers from the southern portions of the world, and it gets punished heavily by any seasoned players. I guess this is where 'pros' are a meme and I've served a good number of them up with gameplay outside of their carefully scripted comfort zones.

> TBH, a typical pub game is best described as strategy-less game play. And pro games probably have 3 styles of play:

Typical pub is chaos, which is why I've seen a number of pros get their behinds handed to them in it. They're sort of like bots in that they think they have the game completely figured out and have a golden strategy no one can defeat. It's a flaw, not a good trait. In ranked, you're going to see some amazing gameplay even w/ random non-party individuals. Anyone who plays knows about the games where it's like a symphony playing. Limited talking, tons of rotations/ganks/readjustments/team fights/split pushes/team pushes/ratting/baiting/etc.

"Pros" are not Pros in my book. They're a group of players who center on an optimal echelon of gameplay that everyone at that level tends to agree upon. Throw some dynamics in and they fall apart.

What I saw across all of the OpenAI bot games is nothing to write home about. If they were true to things, they'd show how these bots play in all-pick no restrictions. They claim to be after Strong AI, not Weak game-based bots. It's not about winning/losing... It's how you play.

This is enough of my personal commentary on this issue. People are unable to see past these approaches and what they truly are and that's fine with me at this point.

Catch you on the flip side.

nopinsight · on Aug 7, 2018

Really curious: If pub-styled plays are much more adaptive/superior to pro-leagued style as you said, why don’t top pub players simply team up (and perhaps sharpen some ‘simple’ micro skills) and take on pros in TI to win > $10 million prizes and live really well?

What would really happen when pub-styled play is actually used in pro scenes for million-dollar prizes? If it is actually superior, why have none of the pro teams caught on and tried using it to win?

iotb · on Aug 7, 2018

> We do not know for sure in the case of OpenAI Five as there are too few public games to look at.

Thus the nature of a canned showcase demo. We do know they have a slew of restrictions. As an avid player, I know exactly why: because such combos require much deeper and true intellect to play efficiently. Even so, given that I know I'd be up against an optimization algorithm, my strategy would be to create as much chaos and uncertainty as possible. Information theory is clear as to the impact this would have: it would be unstructured noise that would be hard to optimize and likely not seen before or significantly reflected in the AI's weighting system. This is the basis of adversarial attacks. I'm sure with a decent amount of games I'd be able to figure out a suitable one for 5 linked bots.

The perspective as to what's going on with this demo is much different if you actually play the game. I've actually seen a number of games like this bot exhibited. It's a strategy low skilled players engage in with the hope of overwhelming opponents with brute force. The character restrictions favor it. So its not by accident that this all converged into a demo that favors an unintelligent brute force optimization bot.

It favors something that can do range/hit point calculations quickly/accurately. Snowballing is required because there is no broader intelligence among the bots. When the bots snowball, it's essentially just one big optimization function. When they're stretched apart, the calculations are much harder.

Knowing what I know about the game and the fact that I'm up against a Weak AI bot with an optimized model, I'd know exactly how to screw it up with an adversarial attack. I'd train a team of people on that and show everyone exactly what human intelligence is capable of and why it's superior. This happens in your average dota 2 match constantly.. Low skill players attempt brute force strategies just like these bots and you essentially wait them out and pick them apart. This isn't a new and amazing style of gameplay or something. There are already names for it.

When I used the term 'sloppy' I meant against the spirit and nature of the game and w/o consideration of the 'way in which one wins'... Ambushing towers at open 4v1 or 2 is some very hamfisted foolishness. Even in regular pub games with upper avg. players, there'd be a sharp punishment for such bro-tier gameplay. It usually results in an equally massive 'gank'. The way the human players responded in these pressure scenarios really has me questioning the whole event as I see avg. random players make far better decisions every day in dota.

That's just my unfavorable two cents. I'm not impressed because I understand how their bots are doing what they're doing, where the advantages lie, and I'm aware of what restrictions they placed on the game in favor of their bot.

Elon claims he's worried about a dark future with AI, but it's actually solutions like this that are most scary, because there is zero intelligence and a [by any means possible so long as you achieve the objective] steering function. If you want to unleash chaos and destruction on the world and see a darker side to human intelligence you've never seen before, start releasing such 'weak AI' to manipulate people from the shadows. This is not strong AI or a path to it. It's more of the same Weak AI provided with exclusive and insane amounts of computer power/data and an objective to optimize for by any means necessary. In cases where it dominates, it's almost certainly a reliance on finding loopholes/flaws in a particular game, not actual intelligence. You should see the danger in this right away.

Funny, because OpenAI originally opened with the spoopy Terminator-like dangers of AI being so destructive we needed a group like them to save us... only to now openly share such unintelligent and dangerous weak AI optimization platforms amid the mainstream fear. Sort of like the 'Don't be evil' mantra that was just a slogan.

I think this is a great engineering accomplishment that no doubt taught them a lot. I don't see any broader 'safety' ideology underlying this... Just another great team of people trying to achieve AI like everybody else, utilizing popularized approaches. It's better to just come out and say that. We can drop the 'Save the world from AI'/'Safety' superman talk and get down to the brass tacks of what they are doing and how, if at all, it's different from what anyone else is doing in the space.

jjjjjjjjjjjjjjj · on Aug 7, 2018

> The perspective as to what's going on with this demo is much different if you actually play the game. It's a strategy low skilled players engage in with the hope of overwhelming opponents with brute force. The character restrictions favor it.

As a dota player who has been in the 99.5th percentile mmr on several occasions (right now at 98th), I disagree with this and a lot of the stuff you're trying to say. Dota is a strategy game, and the meta dictates what strategies are strong at a given point in time. The death ball strategy that the bots played is a result of that being the best strategy in the bot meta. So in contrast to what you said, it's not low skilled players that play these strategies, but rather high skilled players that play whatever strategy is popular in the meta (regardless of how 'intelligent' it is), in order to increase the chances of winning.

nharada · on Aug 7, 2018

My impression is that when rich and powerful people talk about "the dangers of AI" what they really mean is "the dangers of AI (to me when it's not controlled by me)"

red75prime · on Aug 7, 2018

It is nothing new, or particularly bad. If we (good guys) will not have (insert powerful technology), then bad guys will have it and everyone will be worse off.

make3 · on Aug 7, 2018

"not everything can be calculated" that's not how neural networks work. It develops a super complex model of the game by itself by playing a huge number of games and optimizing to progressively learn to win more.

haeffin · on Aug 6, 2018

Why wouldn't it have good coordination? A bot has access to a perfect model of how the other bot would act - itself.

Also, computer engines didn't seal the doom of human players in chess and in go, so I don't get why it would do so in dota.

BigJono · on Aug 6, 2018

I think by 'seal the doom' he just means that this result shows that OpenAI is almost definitely going to be able to defeat a pro team in an unrestricted game of DotA.

Which I'm still not completely sold on. It's likely, but the remaining restrictions aren't trivial by any means. There's an outside chance that removing one or more of them is going to brickwall their progress.

sakarisson · on Aug 7, 2018

One should keep in mind that some of the restrictions were in place to prevent the bots from having too easy of a time. For example, the anti micro/illusion rule was intended to limit the obviously superior micro coordination of the bots.

Ntrails · on Aug 7, 2018

I'm not sure that's true? I can see the bots being utterly terrifying with meepo in a teamfight - but would need supports stacking, proper farm prioritisation (much more use of jungling and ancients), etc etc.

I genuinely believe the bot would win a game of turbo against any team in the world. But remove _all_ of the restrictions and it's not clear that it doesn't just lose at the moment

Karlozkiller · on Aug 7, 2018

They specifically said they would have to implement a special case for heroes that control more than one unit in the future.

So you're saying that they set up a rule about microing illusions to protect humans from a feature that they have not yet implemented nor, I assume, trained the model on?

hohenheim · on Aug 7, 2018

Not only that, but let's also not forget that humans learn as well, meaning the more games players play against the bot, the better they would become at understanding and defeating it.

randomamazondev · on Aug 6, 2018

> Why wouldn't it have good coordination? A bot has access to a perfect model of how the other bot would act - itself.

As far as I know it is five (Hence the name) individual AI instances controlling each character and with basically no AI to AI communication.

It is not one overriding AI controlling all five.

I have no idea if the AI instance controlling each character is identical though, if so then your statement still holds true I guess (Assuming each AI has the exact same information to work with which might be the case). It would be interesting to see if AIs specialised.

iotb · on Aug 6, 2018

There's a presiding team value function that impacts and steers team play. The bots 'communicate' through this. There's nothing magical going on.

As a counter bot strategy, I'd work on how to break and trick it using multiple-stepped logic that an optimization function would be unable to see beyond. I'd also use varying tactics of chaotic/sporadic configurations. The bot isn't 'playing fair' nor should a human w/ intelligence. The advantage being that a human can think along a multitude of strategies and adapt. The bot is only optimizing some steps ahead.

Their 1v1 bot was defeated in this manner and it just goes to show what true intellect and superiority is. I've played random pub games w/ little to no communication and have had all other 4 players converge on different strategies based on a perception of what's going on. If someone decided to cheese/snowball, you simply wait it out and let them push themselves into a nightmare. I saw little to none of this in the games I watched which leads me to question the intelligence of said 'pros'.

ufo · on Aug 7, 2018

The team value function is just a hyperparameter that describes how greedy the individual agents are. At the start of training the team spirit is 0 and the bots are only rewarded for their own actions. This encourages them to learn basic micro skills, like last hitting. As training progresses the team spirit is increased. When it finally reaches 1, the bots value a reward for a teammate as highly as a reward for themselves.
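To make the team spirit idea concrete, here is a minimal sketch of how such a blend could work. It assumes a plain linear interpolation between each bot's own reward and the team average; the function name `blend_rewards` and the example numbers are made up for illustration and are not taken from OpenAI's code.

```python
from typing import List


def blend_rewards(individual: List[float], team_spirit: float) -> List[float]:
    """Blend each agent's own reward with the team mean.

    Assumption: a linear interpolation controlled by team_spirit in [0, 1].
    At 0 every agent keeps only its own reward; at 1 every agent receives
    the team average, so a teammate's reward counts as much as its own.
    """
    if not individual:
        raise ValueError("need at least one agent reward")
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be between 0 and 1")
    team_mean = sum(individual) / len(individual)
    return [(1.0 - team_spirit) * r + team_spirit * team_mean for r in individual]


# Hypothetical example: one bot gets a last hit worth 1.0, the others get nothing.
raw = [1.0, 0.0, 0.0, 0.0, 0.0]
selfish = blend_rewards(raw, 0.0)   # early training: only the last-hitter is rewarded
shared = blend_rewards(raw, 1.0)    # late training: the reward is split evenly
```

With the parameter at 0 the other four bots have no incentive to help with the last hit; at 1 they benefit from it exactly as much as the bot that landed it.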

The actual source of the "communication" is not the team spirit parameter, but the basic fact that the bots have been trained together and they receive the same inputs when making decisions. Unlike humans, who have a limited focus to their attention, the bots can look at the whole map at once. They don't need to communicate because they already "know" what their allies will do when given the same input.

shelune · on Aug 6, 2018

I think it's because the organizers want to make sure the bots will have a good performance here. Ofc OpenAI is awesome but it's impossible to cover such a complicated game as DotA within just 1 year. What they have achieved though is still awesome.

I'm just slightly bugged by the fact that the developers didn't take action execution delay into consideration. The response time is 200ms, but humans also need some more time to drag the mouse and click to perform the action. Their insane reactions actually make me less impressed.

roenxi · on Aug 6, 2018

Dota is a bit deceptive like that; it is secretly a slow and deliberate game. IMO most of the deaths occur 2-5 seconds before the action begins when a hero gets out of position.

There was a moment in game 1 that was the exception proving the rule for me. The bot playing Lion successfully disabled the human initiator on Earthshaker at the end of the game. It looked like a superhuman reaction, but it was also a bit different from all the other fights of the game, where it was usually the fundamental position being too far in the AI's favour - they had a gold advantage and had been developing a lead through the entire game by consistently trading deaths 1-0, 2-1 or 3-2 in engagements.

The potentially superhuman reaction took the game from "looks like bots are winning" to "humans resign now", but the vast bulk of the advantage was that the bots simply had a better understanding of which team enjoyed a superior position. I would not be surprised if higher bot reaction times (+100-300ms range) weren't all that impactful on the results.

It'll be really interesting when the courier distortion is removed and the AI has to play more defensively. Also, I suppose the actual, harder to articulate, complaint in the "reaction time" complaint is that the bot teammates have the capacity to chain abilities more accurately and have played so many hours together there is an advantage there that isn't 'fair'. It'll be a fun milestone when they can drop a single bot in a pub game where their teammates aren't all that coordinated.

ionforce · on Aug 6, 2018

If we had the compute luxury, I would love to see more AIs trained with faults like a longer reaction time or deficiencies to make them more human.

But yes, very exciting next stages when there are item builds, more heroes, and standard couriering.

ufo · on Aug 6, 2018

I got the impression that they weren't particularly looking for a hero pool with a "deathball" meta, and the hero pool they selected had more to do with those being some of the simplest heroes to program at first. There is a lot of overlap between the heroes they implemented and the set of introductory heroes that are recommended for new players who are playing their first games.

Regarding the insane reaction, I wonder if there is a natural way to handicap the AI to more human-like reaction times. Reaction time is not a good measure on its own because human reaction time can vary a lot depending on the level of surprise.
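As a rough picture of the crudest possible handicap, here is a sketch of a fixed observation delay: the bot only gets to see each game state a set number of ticks after it happened. The class name `ReactionDelay` and the tick-based framing are assumptions for illustration, not how OpenAI Five handles reaction time, and a fixed delay does not capture the surprise-dependent variation mentioned above.

```python
from collections import deque
from typing import Deque, Generic, Optional, TypeVar

T = TypeVar("T")


class ReactionDelay(Generic[T]):
    """Hand observations to the agent only after a fixed number of ticks."""

    def __init__(self, delay_ticks: int) -> None:
        if delay_ticks < 0:
            raise ValueError("delay_ticks must be non-negative")
        self._delay = delay_ticks
        self._buffer: Deque[T] = deque()

    def push(self, observation: T) -> Optional[T]:
        """Store the newest observation and return the one that is now
        delay_ticks old, or None while the buffer is still filling up."""
        self._buffer.append(observation)
        if len(self._buffer) > self._delay:
            return self._buffer.popleft()
        return None


# Hypothetical usage: with a 2-tick delay the agent first "sees" tick 0
# only when tick 2 arrives.
delay = ReactionDelay[str](delay_ticks=2)
seen = [delay.push(f"state@t{t}") for t in range(4)]
```

A more human-like version would presumably have to vary the delay with how unexpected the event is, which is the hard part.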

That said, my impression from watching the game was that the power of the AI had less to do with perfect reaction time and more to do with their "hive mind" coordination. If an enemy ever gets in the wrong position, they are immediately punished by a concerted attack. Humans have a harder time doing this because they need to communicate their intentions first. Sometimes each player will be focusing their attention on a different target.

iotb · on Aug 7, 2018

Yep. This is clear to even a novice player to be a strategically chosen death ball/snowball configuration. Some of the least skilled players in Dota shoot for it, but it is undone by a broader range of unironically named 'intelligence' characters. It also becomes unwound in longer matches. Snowball/death ball cheese is easily bot-able, and a human being would have a hard time winning vs it because you'd have to do range and hit point calculations and act on them faster than a computer, which a human being cannot.

> That said, my impression from watching the game was that the power of the AI had less to do with perfect reaction time and more to do with their "hive mind" coordination.

Yes, which is why they restricted character selection to favor it. There's no reason to call it a hive mind as it doesn't possess that. There's a global steering function presiding over 5 bots that need to act swiftly based on global dmg/etc. This is why they restricted the game to snowball cheese. A human can't beat this just as a human can't beat a TI-89. This is why, if you look closely, they absolutely destroy the bots when the team is separated and the human players aren't playing like greedy noobs diving towers.

> If an enemy ever gets in the wrong position, they are immediately punished by a concerted attack.

> bot : (10-9) =1 (I win) Strong AI right around the corner

ionforce · on Aug 6, 2018

Has there been any official information on how they came to choose their hero pool?

ufo · on Aug 7, 2018

Not that I can recall.

acchow · on Aug 7, 2018

The drafting limitation is a joke.

Game 3, the OpenAI was forced to reveal its carry pick (HARD CARRY SLARK) as first pick. What a farce.

noddingham · on Aug 8, 2018

Watch again. OpenAI picks in game 3 came from the audience and twitch chat. They were purposely making all the bad picks for the OpenAI team.