Friday, March 30, 2012

Point/Counterpoint Result: Rosewater wins Targeted Draw debate, but loses issue

The Magic: the Gathering website's latest idea is to give the public insight into the process through debates about issues. This week, the topic was whether the default setting for a card that allows a player to draw cards should be "You draw N cards" or "Target player draws N cards." Zac Hill took the first side of the argument, while Mark Rosewater took the other side. While Mark defended his side of the argument more deftly and concisely, it was not enough to persuade me of the merits of the position.

Mark summarized his argument this way:

Newer players have to learn the word "target" to play the game, so forcing them to encounter it slightly more often is good for them.

Multiplayer play is an important and growing part of Magic. Templating cards to make them multiplayer friendly is crucial to evolving the game to where it wants to be played.

To help Magic crystallize with newer players, it is key that we build into the game the hooks needed for them to emotionally bond.

Evidence points to the fact that newer players have no problem figuring out what targeted draw does and having good complexity hidden in the game out of their sight makes Magic a better overall game.

Let me take each of these and explain why I think they fall short of the goal:

Newer players have to learn the word "target" to play the game, so forcing them to encounter it slightly more often is good for them.

This is true, as far as it goes, but it doesn't mean that we need to apply it to card draw. There are already several game effects that are targeted, such as damage, healing, power/toughness modifiers, and direct destruction (see Doom Blade and Naturalize). These things are targeted, for the most part, because they need to be so. In order to create a card like Shock, that only deals damage to one creature or player, choosing which one to affect is ably dealt with by the rules for targeting.

Card draw, as a general rule (and that is what we are dealing with when deciding the default setting for a card), does not need to be targeted. It is simple enough to say "You draw two cards", and players will understand that. To see what happens when you try to accomplish the same thing for many other cards, look at the Portal version of Alluring Scent. Dropping the targeting there makes the card harder to understand, not easier. Just because many cards need to be targeted does not mean that every card that can be targeted should be.

Multiplayer play is an important and growing part of Magic. Templating cards to make them multiplayer friendly is crucial to evolving the game to where it wants to be played.

The need to make friendlier cards for multiplayer play is there, and designers need to make sure that such cards are present as each new set comes out. But Zac missed the boat when he compared the targeted draw to a card that reads, "Choose one—Draw two cards, or lose the game. If your opponent has two or fewer cards in his or her library, and does not have hexproof, instead you win the game." First, the analogy fails to capture the functionality of allowing a teammate or opponent to draw in multiplayer in order to get a card that helps your position. Second, even in a duel, there were decks such as Turbo Zvi (see http://www.wizards.com/magic/magazine/Article.aspx?x=mtgcom/daily/bd172 for decklist) that won by repeated application of a small targeted card draw. Thus, the analogy fails to hold, as it doesn't take into account these potential scenarios.

Back to our discussion about targeted draws as a default. Targeted drawing at the same cost as non-targeted drawing is strictly better. Thus, we should impose a cost on it, whether it comes as an increased mana cost, as a restriction to sorceries (while allowing non-targeted card drawing to be instant), or as keeping targeted card drawing out of common. Making the non-targeted draw the default is a means of costing similar to this last option. If the targeted draw is needed for a set, there is nothing stopping designers from adding it. Doing so, however, should be seen as the complexity increase it is.

One other thing to note about multiplayer is that most forms of it are Constructed, not Limited. This means that getting good multiplayer cards into sets is less of a priority at common. Blue Sun's Zenith does its job at creating targeted draws for multiplayer, even though it is rare. It's only in formats like Two-Headed Giant Limited that non-targeted draw as a default limits how much of it is present, but if a set needs targeted draw, it can be added.

To help Magic crystallize with newer players, it is key that we build into the game the hooks needed for them to emotionally bond.

This is a great point, but a bit shortsighted. It's great to have moments where players realize strategy is more complex than they imagined, but the defaults should be the minimum complexity the card needs to function. There are already such moments once a targeted card draw is added to a set, and once that moment is reached, it's good. But there are other ways to make such moments (blocking strategy, triggering fateful hour by striking yourself with a red damage spell, etc.), and these all come from the most basic versions of the cards.

Everyone acknowledges that the non-default targeted card draw version can be made if needed. And when such a card is printed, the person who gets it can use it just as Mark suggested. The chances of getting it are lower, and so the realization may take longer, but it is still there.

Evidence points to the fact that newer players have no problem figuring out what targeted draw does and having good complexity hidden in the game out of their sight makes Magic a better overall game.

Every card creates an experience. Default cards are more prevalent than others, and once players recognize the default settings, they become the standard by which future cards are judged. If the default is the simplest card of its type, the new cards with the option are seen as "upgrades" and are favorably judged by the public for it. If the default is the "better" version, the non-default version is seen as a "downgrade" and the public gets upset at being screwed.

We have seen this before. Shock was a lightning rod for criticism until it had been around for a few years, as it was a downgrade of Lightning Bolt. Then Shock became the default, so the two years of Lightning Bolt's return (in Magic 2010 and 2011) were seen as a positive for Red. The same howling happened with Counterspell vs. Cancel. But there has been no such enmity over cards like Ashcoat Bear and Ajani's Pridemate, which are better than the default Runeclaw Bear and Silvercoat Lion.

The final analysis is this: Players may have to learn how to target, but that doesn't mean that everything that can target should. Multiplayer cards need to be made, but they don't have to clutter up the defaults to do it. The game needs emotional hooks, but making everything an emotional hook is not the answer. And while people may figure out what the targeted card draw does, making it the default sets the public up for disappointment when the non-targeted versions are printed out of necessity. So while Mark Rosewater may have made the more compelling arguments, they were not enough to win this heart to his position.

Saturday, February 18, 2012

Heresy the Second: DCI Tournament Procedures Can Improve

In the years since my last article on this (published in 2005), two of the changes I suggested have actually been tested. To help shape the discussion, here are three questions to get you thinking about the issues to come:

1. Under what circumstances can a person only lose two games in Swiss and finish behind a person who lost six or more games?

2. If you get a bye in the first round, then sweep your next three opponents two games to none, lose the fifth round in three games, and win the sixth in three games, how many more matches must you sweep to get a game win percentage over 80%?

3. List the benefits of finishing first in the Swiss portion of a tournament, and which, if any, also apply to the person who finishes second as opposed to third in the Swiss portion of a tournament?

Take as much time as you need to think about the answers, and feel free to revise your answers as you go along. I'll post the answers later.

If you have ever been to a tournament, you know that you play matches against people, and the results are recorded and used to determine what happens from round to round, but have you ever thought about why certain things happen one way and not another? I would not expect most people to have done this, but I have, and here are some observations that I have made.

Section 1: Game, Set, Match
The first observation is that game records for elimination events are completely irrelevant, and at Swiss, they are almost irrelevant, given the current format for scoring Swiss events. Currently, players are ranked by match record, with a player getting 3 points for a win or a bye, and 1 point for a draw. If there is a tie for an important spot, the computer calculates tiebreakers. In order, these are:

Opponents' Match Win Percentage: This is found by taking each opponent's match points (not counting those earned in byes) and dividing it by a number equal to 3 times the number of opponents that person actually played. If any opponent's match win percentage at this point is less than 1/3, it counts as 1/3 for this purpose. These are then averaged to determine the final value, expressed as a percentage rounded to 4 decimal places (63.2984%, for example).

Player's Game Win Percentage: This is found by taking the player's game points (calculated the same way as match points) and dividing it by a number equal to three times the number of games played.

Opponents' Game Win Percentage: This works like Opponents' Match Win Percentage, except game wins are used.
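For readers who prefer to see the arithmetic, here is a minimal Python sketch of the first two tiebreakers as described above. The function names and the input shape (pairs of points and matches played) are assumptions made for illustration, not the DCI's actual software, and the 1/3 floor is applied only where the description above applies it.

```python
def match_win_pct(match_points, matches_played, floor=1 / 3):
    """One opponent's match win percentage, as a fraction.

    match_points excludes points earned from byes, and matches_played
    counts only opponents that person actually played.
    """
    if matches_played == 0:
        return floor
    return max(match_points / (3 * matches_played), floor)


def opponents_match_win_pct(opponents):
    """Average the floored fractions over (points, played) pairs."""
    values = [match_win_pct(points, played) for points, played in opponents]
    return round(100 * sum(values) / len(values), 4)


def game_win_pct(game_points, games_played):
    """A player's own game win percentage (3 game points per win)."""
    return round(100 * game_points / (3 * games_played), 4)


# Three opponents: 12 points in 5 matches, 3 points in 4, 0 points in 2.
# The last two fall below 1/3 and are each counted as 1/3.
print(opponents_match_win_pct([(12, 5), (3, 4), (0, 2)]))  # 48.8889
print(game_win_pct(27, 12))  # 75.0
```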

As game records only impact the second and third tiebreakers (which, for almost any event over 30 players, go unused), the primary incentive to actually finish a match when you are a game ahead is either the satisfaction of finishing, or the threat of a penalty for slow play. In play, match record is far more important than game record.

This leads to the first question:
1. Under what circumstances can a person only lose two games in Swiss and finish behind a person who lost six or more games?

The answer is, when both those losses occur against the same opponent. In that case, the person in question has lost a match, and is behind every player who has not yet lost a match, even if they have lost far more games.

Does it make sense that a person who goes 15-2 in games over a 7-round Swiss ranks behind someone who went 14-7 just because both of the first person's game losses occurred against the same opponent, while the second player has managed to spread the losses evenly over all seven matches? Does it matter if either or both of those game losses were due to mana problems? Is one extra match win really worth more than a 21.5% advantage in game win percentage?

Now that we're dealing with the game win percentage statistic, let's look at the next question:
2. If you get a bye in the first round, then sweep your next three opponents two games to none, lose the fifth round two games to one, and win the sixth in three games (again, 2-1), how many more matches must you sweep to get a game win percentage over 80%?

The answer is two. At the point of the question, you have 27 game points. (4 match wins with 2 game wins each, plus one match loss with a game win means 9 total game wins. At 3 game points per win, that's 27.) There are 36 game points available (12 total games played), and each sweep adds 6 to each column. The value first passes 80% at 39/48 (81.25%), which is two matches away.

I bring this up because, at the point of the question, the match win percentage, not counting byes, is 80%. So why is the game record only 75% at this point? Because a match that is won by a two game to none sweep only contributes two games to the percentage, while a match that goes to a third game contributes three. Add a third game win to each sweep and we have a game record of 80% now, not two rounds from now.
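The arithmetic behind question 2 can be checked with a short Python sketch. The list-of-pairs representation is an assumption made for illustration, and the bye is left out of the record, as it is in the figures above.

```python
def game_win_pct(results):
    """results is a list of (games_won, games_lost) pairs, one per match."""
    won = sum(w for w, _ in results)
    played = sum(w + l for w, l in results)
    return 100 * won / played


# Rounds 2 through 6 from the question; the first-round bye is excluded.
record = [(2, 0), (2, 0), (2, 0), (1, 2), (2, 1)]
print(game_win_pct(record))  # 75.0

# Keep adding 2-0 sweeps until the percentage is strictly over 80.
sweeps = 0
while game_win_pct(record + [(2, 0)] * sweeps) <= 80:
    sweeps += 1
print(sweeps)  # 2

# If each of the three sweeps had counted as three game wins instead:
print(game_win_pct([(3, 0)] * 3 + [(1, 2), (2, 1)]))  # 80.0
```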

As was established before, this is a minor consideration in the current format, so it hasn't had much impact. Even so, it is still a quirk of the system that needs to be acknowledged.

Section 2: Swiss Folly
Tournament players, ask yourselves: If you can guarantee that you will finish 5th in the Swiss by taking a draw in the final round, but playing would mean that you finish either 1st or 9th, do you take the draw? I can hear all of you saying that you'll take the draw in a heartbeat, and some of you are looking for documents to sign to that effect. For those who don't understand why, I'll break it down here. But at this point, it's time to deal with our last question:

3. List the benefits of finishing first in the Swiss portion of a tournament with a Top 8 final, and which, if any, also apply to the person who finishes second as opposed to third in the Swiss portion of a tournament with a Top 8 final?

The seeding for the final matches is based on Swiss finish (1st plays 8th, 2nd plays 7th, 3rd plays 6th and 4th plays 5th) in both Limited and Constructed. When Rochester Draft (Limited) was used for the finals of Pro Tours (or Pro Tour Qualifiers), the person finishing 1st in the Swiss could choose who drafted first. Now that the default draft option is Booster Draft, even that limited benefit is no longer present. However, when considering the final standings for a Professional-level event, such as a National Championship, Grand Prix, or Pro Tour, the person finishing higher in the Swiss counted as finishing in the higher position among those who were eliminated in the quarterfinals or semifinals.

As the above indicates, there are some benefits to finishing high in the Swiss. But as the first paragraph of this section indicates, most people won't play for them if they risk losing a Top 8 finish that would otherwise be assured. This indicates that the benefits above aren't all that beneficial, at least in comparison to the risk involved.

For those who despise the idea of intentionally taking a draw for a match instead of playing it out, here are two of the factors involved. First, game wins are all but meaningless, so you can't compensate for a match loss by sweeping the remaining games. And second, there isn't that much difference between 1st in the Swiss and 8th, so there's no reason to take the risk of playing a match when a draw gets you into Top 8. Add these together, and you get the general "admission" among judges that intentional draws are here to stay.

Section 3: What do we do now?
But is it really such a foregone conclusion? If you limit yourself to the current system, I might have to agree. But that is begging the question of whether we should consider a change. If such change is considered, here are four places I would look:

1. Include Game Performance in Primary Scoring. In the olden days (before the millennium), I read of some reports from British tournaments that divided the match points based on the outcome of the match. I would modify it a bit (using a 12-point system instead of the 10-point system described) and score accordingly. If a person wins the match 2-0 with no draws, that player would get the full 12 points, and the loser would get 0. If there were any drawn games, the winner would only get 11, and the loser would get 1. A 2-1 win with no draws would score 10 for the winner and 2 for the loser; any drawn games would change it to a 9-3 split. A 1-0 win (with any number of draws) would score 8 for the winner and 4 for the loser, and a drawn match would provide a 6-6 split. Although this gives more credit for a draw (awarding half the match points available instead of 1/3 now), it also works to separate scores even from the opening round.
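The 12-point split in item 1 can be written down as a small lookup. This is only a sketch of the proposal as worded above; the function name and the (wins, losses, draws) arguments are assumptions, and a match with equal game wins is treated as the drawn match.

```python
def split_points(wins, losses, draws=0):
    """Return (my_points, opponent_points) out of 12 for one match."""
    if wins < losses:
        # Score it from the winner's side, then swap the result.
        theirs, mine = split_points(losses, wins, draws)
        return mine, theirs
    if wins == losses:
        return 6, 6                      # drawn match
    if (wins, losses) == (2, 0):
        return (11, 1) if draws else (12, 0)
    if (wins, losses) == (2, 1):
        return (9, 3) if draws else (10, 2)
    if (wins, losses) == (1, 0):
        return 8, 4                      # any number of drawn games
    raise ValueError("not a best-of-three result")


print(split_points(2, 0))      # (12, 0)
print(split_points(2, 1, 1))   # (9, 3)
print(split_points(1, 2))      # (2, 10)
```

Every result hands out exactly 12 points, so a player's total after any round is still directly comparable with everyone else's.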

2. Give a play advantage to finishing higher in the Swiss. Instead of making Swiss finish an afterthought, why not offer a real incentive to finishing higher in the Swiss? The one I'm leaning toward now is to award a game win on each Top 8 match to the player who finished higher in the Swiss. If that were the situation, would the choice be so clear to take the draw for the guaranteed 5th place if playing could get you the Swiss lead, and a bonus game win in all your Top 8 matches?

3. Base all or part of the prize fund on the Swiss rankings, with the rest based on Top 8 performance. This is now common practice at some events where invitations are involved. The organizer is permitted to pay out the non-invite prizes based on Swiss standings, and have the players play only for the invitations. Such a policy needs to be spelled out in advance.

4. Give the higher seed the first play/draw decision in the match. This was used in 2010's Pro Tour Amsterdam as an experiment. We will await the results to see if this becomes official tournament policy.

Other options are available, but the need to consider the tournament structure as it stands is there.

Friday, September 17, 2010

Heresy The Third: ELO Rating System Inappropriate for Magic

(This is a reworking of a previous article from 2005.)
The ELO* rating system has been adopted by Magic for some time now, for reasons I have yet to understand. I don't hold that opinion out of some deep-seated frustration borne of not understanding ELO. In fact, I understand it about as well as anyone, as will be made evident throughout this article. It is precisely because I do understand it that I think it is inappropriate for Magic.
Under the ELO system, ratings are adjusted after each match (or, in the case of chess, after each event) based on a formula that compares how well you did versus how well the formula predicted you would do based on the difference between your rating and that of your opponent(s). If you exceeded expectation, your rating is assumed to be too low and gets adjusted up, whereas if you did worse than expected, your rating is assumed to be too high and gets adjusted down. This is eventually supposed to settle down to the point where you hover around your "true rating" which is supposed to be a measure of your skill level.
Even if we assume that a person's skill rating doesn't change that much, the above is ludicrous. Take three players, Alan, Barry, and Chris. They begin, as every new DCI player does, with a rating of 1600. Let's assume that somehow the only sanctioned matches Alan and Chris have are against Barry, and that they play enough matches so that all three players' ratings stabilize. Barry wins about 76% of his matches against Alan, but Chris wins 76% of his matches from Barry. Over time, the ratings will stabilize to around 1400 for Alan, 1600 for Barry and 1800 for Chris. So far, so good.
The problem with this is that the system makes several implications, none of which are justified. The first is that Chris will win, according to the formula, 91% of matches against Alan. This assumption is dubious at best. Next, once another person enters the mix, the chances of the formula being able to predict performances against every other player go down dramatically. With the vast database of players around**, the chances that the active players' ratings will actually converge at all are virtually nonexistent, even if we assume that only 1% actually play on a regular basis and that we only care about the system converging for these players. Therefore, the very assumptions that the ELO system is built around fail to hold even under ideal conditions.
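The 76% and 91% figures can be reproduced with a short Python sketch, assuming the standard logistic Elo curve on a 400-point scale. That formula is not spelled out in the paragraphs above, but it matches the percentages quoted; the function name is just for illustration.

```python
def expected_score(rating, opponent):
    """Predicted share of match wins under the standard Elo curve."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


print(round(expected_score(1600, 1400), 2))  # 0.76  (Barry vs. Alan)
print(round(expected_score(1800, 1600), 2))  # 0.76  (Chris vs. Barry)
print(round(expected_score(1800, 1400), 2))  # 0.91  (Chris vs. Alan)
```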
The factor which determines the rate of adjustment is called the K-value. This is a number that determines the greatest amount by which a match can change ratings. In cases of ratings that are near one another, the change will be close to half the K-value; changes equal to the full K-value only occur in cases of a complete upset (in a 32K event, that happens when an underdog wins who is 720 rating points or so lower than his or her opponent). However, where are people most likely to have ratings that are vastly different than their "true rating"? Friday Night Magic events. But those are also the events that have the lowest K-values, so a person who plays only in FNM events who is head and shoulders above the competition will take forever to increase to something close to a true rating. How long? Assuming every match is against a 1600 player, it would take a 73-0 match record to reach 1799. By the same comparison, a person who wins a 40K event (Grand Prix or the like) against all 1600-players with a 17-1-1 record will hit around a 1799. But aren't events like that usually populated by people who have been around for so long that their ratings should be established by now? So the events where the ratings can change rapidly are populated by the people whose ratings are likely to be close to their true rating, whereas the events that don't change ratings much have the people who are the most likely to not be near their true rating.
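To get a feel for how slowly a low K-value moves a rating, here is a rough Python simulation. It assumes the standard Elo update on a 400-point scale and a K of 8 for Friday Night Magic, neither of which is stated above, and it does not model however the DCI rounded each adjustment. The loop should therefore land near the 73 wins quoted, not necessarily on it.

```python
def expected_score(rating, opponent):
    """Predicted share of match wins under the standard Elo curve."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


def update(rating, opponent, score, k):
    """score is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating + k * (score - expected_score(rating, opponent))


# Count consecutive wins against 1600-rated opponents needed to reach 1799.
rating, wins = 1600.0, 0
while rating < 1799:
    rating = update(rating, 1600, 1, k=8)  # k=8 is an assumed FNM value
    wins += 1
print(wins)

# An evenly matched win moves the rating by half of K.
print(update(1600, 1600, 1, k=32) - 1600)  # 16.0
```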
And just what is a person's true rating, especially in Constructed? Suppose a person plays well enough to earn a rating of 1840. This is about an 80% win percentage against an "average field" of 1600. Now when the format rotates and this person's killer deck rotates out, or a card from this deck gets banned, what happens to this player's rating? Should it drop if the new state of the format is such that no deck and no player can post better than a 70% win percentage against the field? Even a 70% win percentage against an average field drops the "true rating" around which this person's rating will hover by nearly 100 points (1747 to be exact).
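The 1840 and 1747 figures come from running the rating curve backwards, from a win percentage to a rating gap. A minimal sketch, again assuming the standard 400-point logistic Elo curve (the function name is made up for illustration):

```python
import math


def rating_gap(win_fraction):
    """Rating advantage that predicts the given share of wins."""
    return 400 * math.log10(win_fraction / (1 - win_fraction))


print(round(1600 + rating_gap(0.80)))  # 1841
print(round(1600 + rating_gap(0.70)))  # 1747
```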
Another consideration is that ratings only change when there are new matches to put in them. The qualifier tournaments for PT Paris in 1997 counted in the Eternal ratings. I did well enough there to post an above average rating during that season. But since I haven't played in an Eternal event since, my rating in that category is still at that value. Now take the person who builds up a high enough rating to get invites he or she never uses. You look at the ratings and see "1950" or whatever year after year. Is the person sitting on the rating or maintaining it through new matches? Unless you happen to know the person in question and can ask directly, how would you be able to tell? (Sure, you could look this person up now and then, but who is going to do that?) Meanwhile, if you have a situation where only the top X players get invites, having someone sit on ratings makes it harder for everyone else.
Besides, if ELO ratings were a true measure of a current player's ability, why are there Pro Points? Pro Points begin at 0, and are awarded based on finishes at certain events. (I will refer to this as an additive system, because any adjustment can only add points to your rating.) Lifetime points are used to determine some things, such as Hall of Fame eligibility, whereas annual points are used for Player of the Year awards. This system seems to work well enough there, and I am convinced that a modified version of that system can work in general.
In this grand plan, instead of a person beginning at 1600 and being adjusted up or down on a per-match basis, you would begin at 0 and move up based on performance. Ratings adjustments can be calculated on the spot, with no knowledge of any other player's ratings needed. There is never a time when your lifetime ratings drop.
To handle the problem of people who earned points eons ago, it would be easy enough to base invites due to rating on how many points were earned in the last year. But how do you "discount" older performances in an ELO system? Chess awards bonus points if recent performances were far in excess of predictions, and Arimaa assigns a rating uncertainty (K-value) based on how recent your latest activity has been, but neither one solves the issue of fully removing the older ratings.
In an additive rating system, it makes sense to assign bonus multipliers for high-profile events. Whereas moving 200 points in ELO rating might be impressive for randomly winning a Grand Prix, it doesn't reflect the true value of the win, and that rating boost will get demolished if a losing streak hits. Of course, you still get Pro Points for it, but having a 980 (or 990) point addition to a rating that starts at 0 says a lot more, and can never be taken away.
The possibilities for people to find "lesser, but attainable" goals based on rating multiply. What if we switch over and, by applying the new formula to the matches already in the database, we find some big-name player at a "low" rating of 2850? Now imagine someone seeing that and making it a goal to earn that many rating points in a year. That could prompt someone into entering a late-year Grand Prix just to get the remaining 30-50 points to achieve this goal, or it could spark a race between two people on opposite sides of Europe trying to compete for the Continental Amateur of the Year. Awards such as Pro Player of the Year are intelligible because the Pro Points are additive, and doing the same for everyone can only improve matters. If it's done right, Pro Player Points could become just a subset of standard Rating Points. We can even keep the distinction between Limited and Constructed (and Eternal, if we want) and even break it down by format.
(For example, "Joe Gamerdude is a father of three kids and racked up 2,500 rating points in Scars of Mirrodin Booster Draft in just six months. His hobbies include climbing Blackcleave Cliffs, reading Venser's Journal, and beating people up.")
In a world of additive ratings, you can never be fully assured that you will remain on the top of any lifetime ranking list for any decent period of time. Perhaps I play everything I can to get to 750,000 points (and a supposedly insurmountable points lead in my state) by the end of the decade. What's to stop another person from doing the same thing to beat me? Let a high additive rating sit and it becomes a target.
So how would I award points? I'm so glad you asked. I created a system based on the original ranking formula used by Wizards in 1994 before ELO was adopted. Since the old system was in place before more than 99.9% of the current Magic community ever played (and before a growing number of them were even born), I will explain the old system.
Back in those days, all sanctioned Magic tournaments were single elimination. (On another note, lobbying for other formats, such as Swiss events, was my first entry into Magic heresy.) Everyone who played in the first round earned 10 rating points. Those who survived and played in the second round earned an additional 20, for a total of 30. Likewise, those who played in the 3rd round earned another 30 points, and so on. The winner of the event was treated as playing in an extra round alone, so a person who won a 16-person single elimination event (4 rounds to eliminate everyone) would have earned 150 points (10 + 20 + 30 + 40 + 50).
The first thing to note about this system is that every player who entered got points, even those eliminated in the first round. The second thing to note (if you're a math geek like me) is that the points any person earns in the event are 10 times a number you would get by adding successive integers, so the formula is well-known: 1 + 2 + ... + n = n(n+1)/2. Taking the successive integer formula and multiplying the result by 2 gives n(n+1), which is exactly what my formula says people in a "perfect" single elimination (i.e., one with exactly 8 players, or 16, or 32, etc. so that no one gets a bye) will receive, if we plug in the number of rounds the player plays for n. It is also 1/5 of the number of points the original DCI system would have awarded.
Rather than bore you with the details of how that formula morphed into what I have now, I will give you my formula and show you how the values coincide. I would derive for each player two numbers: W, which is the number of matches the person won plus 1, and V, which is a value which starts at 2 and goes up by one each time you advance into a new bracket. I would then award W x V rating points. This produces for the 16-person single elimination event:
Place                  Wins  W  V  My pts  Old pts
9-16 (1st round)       0     1  2  2       10
5-8 (quarterfinalist)  1     2  3  6       30
3-4 (semifinalist)     2     3  4  12      60
2 (finalist)           3     4  5  20      100
1 (winner)             4     5  6  30      150
As you can see, the values for my rating system are exactly 1/5 what the 1994 system provided.
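The 1/5 relationship for a perfect single elimination can be checked with a few lines of Python. The function names are invented for this sketch; the point rules are the ones described above.

```python
def old_points(rounds_played):
    """1994 system: 10 for round 1, 20 more for round 2, and so on."""
    return sum(10 * r for r in range(1, rounds_played + 1))


def new_points(wins):
    """W x V for a perfect single elimination event."""
    w = wins + 1   # matches won, plus 1
    v = wins + 2   # starts at 2 and rises by one with each bracket
    return w * v


for wins in range(5):
    # The winner is treated as playing one extra round alone, so every
    # player's count of rounds played equals wins + 1.
    print(wins, new_points(wins), old_points(wins + 1))
```

Each row printed matches the table for the 16-person event, with the old value always five times the new one.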
For imperfect single elimination, the lowest V is based on which "perfect" elimination value the number of players is closer to. So an 11-person event is closer to 8 (the lower bound) than 16 (the upper bound), but a 13-person event is closer to the upper bound. Those values that are exactly in between (12, for example) are treated as being closer to the upper bound. The "bottom" V for upper bounded events is again 2, whereas it drops to 1 for lower bounded events. I'll show you a 20-person single elimination in the same format as above. As there are byes, I'll give two lines where a bye is involved.
Place          Wins  W  V  My pts  Old pts
17-20          0     1  1  1       10
9-16 (bye)     0     1  2  2       20
9-16 (no bye)  1     2  2  4       30
5-8 (bye)      1     2  3  6       50
5-8 (no bye)   2     3  3  9       60
3-4 (bye)      2     3  4  12      90
3-4 (no bye)   3     4  4  16      100
2 (bye)        3     4  5  20      140
2 (no bye)     4     5  5  25      150
1 (bye)        4     5  6  30      200
1 (no bye)     5     6  6  36      210
This abandons the 1/5 rule above in favor of trying to balance the event against the ones we have a good formula for. When the number of players is close to a lower bound, there are many people who get a first-round bye. This system treats those people exactly the same as if they had played in the "perfect" elimination with the lower number of people (16 in this case). Those that don't have byes are treated slightly worse off than would be the case if they had been in the event with more players, but then again, they are likely to have faced one or more people who did have byes, so this balances. By similar argument, setting the lowest value of V at 2 for those events closer to the upper bound balances most people (who in this case are the ones without a bye) against those in the "perfect" event where people play the same number of rounds they did, at the "expense" of giving the players with byes a small points bonus.
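The rule for choosing the lowest V in an imperfect single elimination comes down to finding the nearer power of two. A minimal Python sketch, with a made-up function name, follows the rule as described: closer to the lower bound gives 1, closer to the upper bound or exactly in between gives 2, and a perfect event keeps 2.

```python
def bottom_v(players):
    """Lowest V awarded in a single elimination event of this size."""
    lower = 1
    while lower * 2 <= players:
        lower *= 2                 # largest power of two <= players
    upper = lower * 2
    if players == lower:
        return 2                   # "perfect" event, nobody gets a bye
    return 1 if players - lower < upper - players else 2


for n in (11, 12, 13, 16, 20):
    print(n, bottom_v(n))
```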
Now we have the tools to expand this system to every tournament format. We can use the basis of the elimination charts generated above to determine each player's V for the tournament based on their finish. In the rare event that two people are legitimately tied for a position, we average the V those finishes would receive. We know how many matches the person won, so W is easy to find as well. One simple multiplication and we're done.
There are only three real adjustments that I make at this point. The first adjustment is for big events. Big events should be worth a lot more than small events, even if the small event has the same number of people. A 32-person Qualifier Tournament should be more impressive than a 32-person Friday Night Magic. As a simple means of accomplishing this, I would take an event's current K-value and divide by 8. This makes the formula for any player in any place in any event K/8 x W x V. If this were to be adopted, the inflated K-values would no longer be needed and could be refigured to deal with the new system, so that Worlds could count as a "6K" event, meaning that the awards would be 6 x W x V.
The second adjustment concerns drops. In the elimination format, drops are not an issue, as the game automatically drops people when they lose. But in a Swiss, a person who loses the first round can go on to win the event, and people can drop after any round. But also notice that our W was figured as one plus the number of matches won. In terms of ratings, the fact that each match win increases your W, and may increase your standing to the point where your V increases, might lead more people to remain in events past their viability for prizes, especially if the K is big. This is something I personally like, so I offer the further incentive to stay by giving the +1 in W only if the player stays until his or her natural end in the event. If you drop, you lose the part of the W component of the rating award you would have received had you stayed to the end. I know this is the opposite of the position held by many Tournament Organizers, but that discussion is a future Heresy waiting to happen.
The third issue is that of draws. Most rating systems count draws at half the rate of wins. For standing purposes, draws only count as 1/3 of a win at DCI events. So the question that needs to be asked is, do we want to extend this to the rating awards?
The problem is the degree to which fractions enter the system. Count draws as losses and you still get random half-points from tied V scores. Count them as 1/2 of a win and quarter-points enter. But count them as 1/3 of a win, and you have the real possibility that someone might end up with an award of 5.8333 rating points or some such nonsense. This can be eliminated by multiplying the awards by a certain factor (no more than 6), but whether such an adjustment is worth making is a topic I am completely ambivalent about.
So a 16-person Friday Night Magic running 4 rounds of Swiss with no Top 8 will award a first place person 5 (4 wins plus 1 makes a W of 5) times 6 (9th-16th is the lowest tier, which gets 2; 1st place is 4 tiers higher, hence V for this person is 6) for a total of 30 rating points. By the same token, a person who goes 16-1-1 and wins a 400-person Grand Prix will get 980 or 990 points***, assuming a K/8 of 5.
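The arithmetic behind these awards can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, the Friday Night Magic example is taken to have a K/8 of 1 (which is what 5 x 6 = 30 implies), and the tied-V case is an invented set of inputs chosen to show how a figure like 5.8333 can arise.

```python
from fractions import Fraction

def rating_award(k_over_8, wins, v, draws=0, draw_value=0, stayed=True):
    """Return (K/8) x W x V as an exact fraction.

    W is the number of match wins, plus any draw credit, plus 1 if the
    player stayed to his or her natural end in the event.
    """
    w = Fraction(wins) + draws * Fraction(draw_value) + (1 if stayed else 0)
    return Fraction(k_over_8) * w * Fraction(v)

# First place at the 16-person FNM: W = 4 + 1, V = 6 (K/8 of 1 is assumed).
print(rating_award(1, wins=4, v=6))  # 30

# Hypothetical player with 2 wins and V = 2 who drops: no +1 in W.
print(rating_award(1, wins=2, v=2, stayed=False))  # 4

# Hypothetical tie: V averaged to 2.5, one win, one draw counted as 1/3.
odd = rating_award(1, wins=1, v=Fraction(5, 2), draws=1, draw_value=Fraction(1, 3))
print(odd, float(odd))  # 35/6, roughly 5.8333
print(6 * odd)          # 35
```

Working in exact fractions makes the point about draws visible: the awkward award is 35/6, and multiplying every award by 6 turns it back into a whole number.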
This system is fully adjustable based on what people would want to emphasize. If the consensus opinion doesn't like the idea of everyone who stays in an event getting points, the system could be adjusted to only award non-zero V values to people finishing in certain positions, such as Top 8 or "1/5 of those who entered, rounded up, plus ties" or whatever. Maybe the starting V is too low and should be raised to 3 or 4. Maybe some event should offer 10 times normal points. Whatever ends up being decided is fine with me, as long as everyone follows the basic plan. But to me, sticking with an ELO system that flies in the face of Magic reality is worse than any additive system we can devise.
*The ELO rating system is named for Arpad Elo, a mathematician who designed the basic formula for chess. (By convention, ELO [in all caps] is the name of the rating system and "Elo" refers to the mathematician, not that the Magic Floor Rules care.) The formula used is one of a class of functions that have these four mathematical properties:
1) Every rating difference (positive or negative) must produce a Win Expectancy (WE) number between (but not equal to) 0 and 1.
2) My WE versus you plus your WE versus me must add up to exactly 1. As one rating difference will be the negative of the other, it means that WE(D) + WE(-D) = 1 for all values of D, where D is the rating difference.
3) Every WE between 0 and 1 must have a unique rating difference associated with it, if you extend the ratings to include decimal values.
4) The function is always increasing (i.e., the more I am above your rating, the higher my WE will be).
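These four properties can be checked numerically for one member of that class. The sketch below assumes the common logistic form WE(D) = 1 / (1 + 10^(-D/400)); this footnote does not state the formula, so the function names and the 400-point scale are assumptions made for illustration.

```python
import math

def win_expectancy(d, scale=400.0):
    """Logistic win expectancy for rating difference d (assumed form)."""
    return 1.0 / (1.0 + 10.0 ** (-d / scale))

def rating_difference(we, scale=400.0):
    """Inverse of win_expectancy: the one d that produces a given WE."""
    return scale * math.log10(we / (1.0 - we))

diffs = range(-800, 801, 50)
values = [win_expectancy(d) for d in diffs]

# Property 1: every WE lies strictly between 0 and 1.
assert all(0.0 < we < 1.0 for we in values)
# Property 2: WE(D) + WE(-D) = 1.
assert all(abs(win_expectancy(d) + win_expectancy(-d) - 1.0) < 1e-12 for d in diffs)
# Property 3: each WE maps back to exactly one rating difference.
assert all(abs(rating_difference(win_expectancy(d)) - d) < 1e-6 for d in diffs)
# Property 4: the function is always increasing.
assert all(a < b for a, b in zip(values, values[1:]))

print(round(win_expectancy(0), 3), round(win_expectancy(200), 3))  # 0.5 0.76
```

The checks only sample rating differences from -800 to 800, so they illustrate the properties rather than prove them.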
**As of September 13, 2010 there were 320,348 players in the global rankings database, meaning they have played at least one DCI sanctioned match.
***980 if draws count as 1/3, or 990 if they are counted as 1/2 of a win. The reader is encouraged to figure out what W and V are for this example.

Thursday, September 16, 2010

Heresy the First: Chess Clocks in Magic are Necessary

This is a reworking of the article I first published in 2005. It outlines my heretical belief that Magic not only can work with chess clocks, but that current policy is unnecessarily warped by not exploring this option. I have never denied being a heretic on this issue, but I haven't been as forceful about it in a public forum as I will be now. (If you ever see me post, "I won't go into my usual rant about using chess clocks in Magic," it's because I don't want to type the entirety of this article in that spot.)
Throughout the history of Magic there have been people who have suggested chess clocks. Each time, the suggestion has been rejected, for various reasons. I have never understood the arguments, and I think that clocks would solve more problems than they create. This is an attempt to examine the situation as it stands and offer reasonable answers to the objections against their use.
There are several standard reasons given for not using chess clocks. One argument is based on inertia, claiming there is no need to do anything when the current system works. Some objectors cite cost issues, claiming that we shouldn't place a high cost burden on organizers or on the players. But the main complaint is that chess clocks won't work in Magic, and it runs along the lines of, "Given all the times a person can pass priority during a turn, there is no way that clocks can be implemented. If you don't believe me, try doing MTG Online with all stops enabled." I will examine each of these in detail.

Do we have a sufficient system already?
In bridge tournaments, there is no chess clock governing each team's bidding and play. You are given approximately seven minutes per hand to bid and play it (except during major events, where more time is usually given). When the time is called, the current hand is finished, and if there are other hands that haven't been played yet, the director (i.e. judge) can either assign a "late play" (meaning the missing hand will be played after the usual session) or an artificial result (which is normally the average score a player can receive, possibly modified up or down by 10% based on a judgment of who is at fault for the problem).

Although this has been the way of the world for bridge, it's not a perfect solution, even there. If you limit yourself to one "problem round" per session, it's not that hard to run one "late play" that happens after the match is over and be "a bit late" on several others. Furthermore, bridge can get away with this, as most of the events don't have Swiss pairings. Each team and each hand moves in a defined pattern, which you can continue into the next round while the slower players finish the previous one. The policy, in effect, does not punish slow play so much as play around it.

So does this really convert to Magic? Some games of Magic are over in a single turn, while others take 100 or more. Some games are extremely balanced where neither player can advance because they have enough resources to counter each other's plans. These games will last until someone manages to draw the one card that can't be stopped, and if it's four cards from the bottom of the deck, whose fault is it that the game lasts 40 minutes? Should a person resign just to leave enough time for another game (or two)? What if one person attacks "meaninglessly" into a Platinum Angel? Should he be punished for taking an action whose only result seems to be to tap a creature and "burn" up to 30 seconds off the match time? Should the opponent be punished for building a deck that can't win in under 15 minutes? These are the questions that come up in tournament events now, and they can all be directly attributed to the fact that both players are drawing time from the same match clock.

It is true that we've been playing without clocks for several years now, and that there haven't been major meltdowns. But anyone who has a deck that plays quickly and is running up against time pressure because of the opponent knows the situation is problematic. So, while the current system hasn't blown up yet, it isn't so error-free as to preclude looking at alternatives.

Who is going to pay for all these clocks?
If the organizers want to, they can provide clocks. I am not expecting that to happen, though, as this doesn't even happen at chess events. However, a valid chess clock (see below for what I mean by that) can be had new for under $50, which you can't say for a box of any Standard-legal set. The cost for a player isn't that much more than what is already being paid.

Of course, all this assumes that the chess clocks were mandated. I don't have to assume that. I would allow anyone with a valid clock to request that all their matches use a clock, and run the normal method for the other matches. After seeing both methods in action, I am convinced that clocks would become more popular.

So how do you handle all those passes?
This is the most common objection I receive against clock use. It usually is phrased something like, "Take a situation where a person is sitting on less than one minute on their side. What stops the opponent from moving through every single change of priority, forcing that person to waste time reacting to each one?" or "In MTG Online, you can get the person who uses Seeker of Skybreak to untap itself. Since you would have to respond to each activation (even if the board position appears unchanged), a person who falls behind on time can be drained to nothing and lose."

The answer lies in three related ideas. First, we already have procedures in place for dealing with people who deliberately waste time. Judges are already capable of handling the extreme cases by means of penalties for slow play, stalling, and trying to take advantage of a time limit. When a chess clock is involved, slow play and stalling only penalize the person being slow, and thus special rules for it don't need to be created. And as the only time limit you can control is your own, the only way you can take advantage of a time limit is to play faster, which is exactly what we want to encourage.

Second, the shortcuts we already use are adequate for keeping most games at a reasonable pace. Without them, even a normal match could not be completed in the time allotted. It is already common for a person to draw a card, possibly play a land and end the turn. We understand what this means, and if the opponent wants to do something during such a turn, we know how to make this happen. Furthermore, if an opponent insisted on declaring everything just to run out the clock, the judges are already poised to deal with it. If I saw someone repeatedly tap a Seeker of Skybreak to untap itself, I would be quite willing and able to call a judge to deal with it. The presence of the clock doesn't change that option. If a person who has been doing the "draw, go" turns above now suddenly wants to declare every single stop just because I have less than one minute and he has five, that's worthy of a judge call as well. Again, the procedures already in place handle these.

And third, we can take advantage of the newer chess clock technology. Many tournaments nowadays operate with "delay" clocks. These clocks, once the button is pressed to switch priority from one player to another, wait anywhere from 2-5 seconds (the exact time depends on the type of event - blitz games have a limit closer to 2 seconds, whereas games with very long time controls usually have a 5 second delay) before the time starts coming off the player's allotment. So if the response is made (and the clock is pressed) before the delay period expires, no time is lost. If a person wants to do the Seeker of Skybreak trick, he needs to announce the ability, including the fact that it's targeting itself, tap the creature to pay for the ability, and then hit the clock. All I have to do is hit the clock to send it back. Then, he has to untap the creature and repeat the process. Again, all I have to do is hit the clock. I know I can pass within two seconds, so a delay timer of at least that size means that I won't lose any time. I can't guarantee that the opponent won't lose time, and even if he doesn't, we're still going nowhere. Furthermore, I can adopt the Magic Online approach, where an option exists to never respond to that ability. As I said above, this is already covered under current procedures, but even so, the delay timer means that I don't have to worry about it.

In fact, if I am in a game with a delay clock and I get into a time crunch, it's probably to my advantage to find as many stops as I can and use them. Each one gives me a few more seconds of time that is not coming off my clock, so I may, by doing the thing that supposedly harms me, find the time to think of something I wouldn't have otherwise. This to me is the ultimate proof of the emptiness of the original argument.
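
The Seeker of Skybreak loop is easy to put into numbers. The following is a rough sketch, not a model of any real clock: the function names are made up, and the timings (6 seconds for the opponent to announce, tap, and untap; 2 seconds for me to pass back; a 3 second delay) are hypothetical values picked for illustration.

```python
def time_charged(elapsed, delay):
    """Seconds deducted for one action under a simple delay clock."""
    return max(0, elapsed - delay)

def run_loop(iterations, activator_seconds, responder_seconds, delay):
    """Total seconds charged to (activator, responder) over a repeated loop."""
    activator_total = iterations * time_charged(activator_seconds, delay)
    responder_total = iterations * time_charged(responder_seconds, delay)
    return activator_total, responder_total

# Hypothetical timings: 6 s per activation, 2 s to pass back, 3 s delay.
print(run_loop(iterations=20, activator_seconds=6, responder_seconds=2, delay=3))
# (60, 0)
```

Under those assumptions, twenty trips around the loop cost the person running it a full minute and cost the person passing nothing.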

Other derived benefits of using clocks
If clocks are used, we no longer have to worry about how much time we give for delays in handling deck checks and judge calls, as we can simply pause the clock and resume when the situation is resolved. Also, there would be a simple procedure for determining the penalty for not arriving to a match on time. The person who is there simply starts the game clock for each game. If the opponent arrives during that time, that game continues with that much time lost for the person who arrived late. If a game clock expires, that is a game loss for the person who has not yet arrived. Enough game losses will translate into a match loss. If neither opponent is there, the judge can note the time and deduct it from each person until one shows up, and assign double game losses or match losses as necessary. The five-turn rule would become meaningless and vanish.

Rules for Using Clocks (version 2.0)
In the time since the last iteration, I have made a few changes to the rules I would suggest for using a clock. Here is my latest attempt:

Use of Chess Clocks at Magic events

1. Section 1: Definitions

a. A chess clock is defined as a device that keeps a timer for two players, such that no more than one player’s timer is working at any given time.

b. A simple delay function is a feature of a chess clock that, after a player has passed the clock, delays deducting time from the opponent’s clock for a predetermined period of time. If the person whose time would now run manages to complete his or her action before the full time of the delay passes, no time is removed from his or her clock.

c. A Bronstein delay function is a feature of a chess clock that, after a player has passed the clock but before the opponent’s time begins, adds time onto the active player’s clock equal to the lesser of:

i. The amount of time since the clock was passed to the player, or

ii. The delay time defined before the game starts.

d. Passing the clock is defined as invoking the mechanism that causes the timer controlling the current active player’s turn to stop, and engaging the simple or Bronstein delay function.

e. Stopping the clock is defined as invoking the mechanism that causes both timers to stop (along with any timer associated with a delay function), usually as either a prelude to calling for a judge or to signify a game is complete.

f. Flag fall occurs when one or both players exceed their allotted time for game completion.

2. Section 2: Conditions of Use

a. A player who wants to use a chess clock for all his or her matches must bring one with either a simple delay function or Bronstein delay function.

b. When registering for the event, the player must inform the head judge about the clock and his or her desire to use it for the event.

c. The head judge may inspect any chess clock prior to its use in the event. Improper or defective clocks may be removed from the event at the head judge’s discretion. In addition, the head judge may remove the clock if, in his or her determination, the player who brought it does not have sufficient understanding of how to use it.

3. Section 3: Setting up for the match

a. If neither player in a match has a clock, the match proceeds normally. If at least one player has a clock, the following procedures are used instead of the normal rules for timing Magic events.

b. When only one player has a clock, that clock is used. If two players with proper clocks play each other, determine who plays first. The player who does not make the choice about who plays first instead chooses which player’s clock will be used and where it will be placed in the play area. The clock must be set in a place where both players have access to it.

c. Once this is done, the pre-game procedures below are followed to start the first game.

4. Section 4: Pre-game procedures

a. If the clock in question has both a simple delay function and a Bronstein delay function, the simple delay function is used. Otherwise, use whichever of the two functions the clock has.

b. A clock with a simple delay function is set to provide 8 minutes for each player and a 3 second delay between each player’s actions.

c. A clock with a Bronstein delay function but no simple delay is set to 8 minutes and 3 seconds for each player, with a “refund value” of 3 seconds after each move.

d. Players shuffle their libraries, draw opening hands, and resolve mulligans.

e. Once the game is ready to begin, the non-active player starts the opponent’s time.

f. If a player believes his or her opponent is taking too long to perform pre-game procedures, or if a person is tardy, he or she may ask a judge for permission to start the opponent’s time early.
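
The two delay functions defined in Section 1 can be compared directly under the settings given here. The sketch below is an illustration only: the function names are hypothetical, the action lengths are invented, and flag fall is modeled simply as the remaining time reaching zero or less.

```python
def simple_delay_move(remaining, elapsed, delay):
    """Time left after one action when the first `delay` seconds are free."""
    return remaining - max(0, elapsed - delay)

def bronstein_move(remaining, elapsed, delay):
    """Time left after one action when time runs at once and is then refunded."""
    after_running = remaining - elapsed
    if after_running <= 0:
        return after_running  # flag fell before the refund could apply
    return after_running + min(elapsed, delay)

DELAY = 3
simple = 8 * 60              # 8 minutes, simple delay setting
bronstein = 8 * 60 + DELAY   # 8 minutes and 3 seconds, Bronstein setting

for elapsed in [2, 10, 3, 1, 45]:  # hypothetical action lengths in seconds
    simple = simple_delay_move(simple, elapsed, DELAY)
    bronstein = bronstein_move(bronstein, elapsed, DELAY)
    print(elapsed, simple, bronstein)
# 2 480 483
# 10 473 476
# 3 473 476
# 1 473 476
# 45 431 434
```

With these inputs the Bronstein display stays exactly 3 seconds ahead of the simple delay display, which suggests why the Bronstein setting starts with the extra 3 seconds: it gives both kinds of clock the same effective allotment.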

5. Section 5: Passing the clock

a. The clock is passed when any of the following occurs:

i. The active player ends his or her turn.

ii. The active player passes priority.

iii. A spell or ability being resolved only has actions controlled by the other player left to perform. (Example: A player casts Day of Judgment. When it resolves, the player who cast it has already moved all his or her creature cards into the graveyard, but the opponent has not. He or she passes the clock to allow the opponent to finish the action of Day of Judgment.)

iv. The non-active player whose time is running passes priority.

v. The explanation of a proposed shortcut is complete.

vi. The opponent of a player who has proposed a shortcut either accepts it or announces when he or she will interrupt it and what action will be taken at that point.

vii. The non-active player whose time is running finishes resolving actions under his or her control.

viii. The player whose time was started early by permission of a judge arrives after being tardy and/or completes the actions that prompted the judge to grant permission for the early start.

b. The clock is not passed in the following situations:

i. The active player plays a land card.

ii. The active player takes an action that doesn’t go on the stack, such as turning a card with morph face up, or drawing his or her card at the beginning of the draw step.

iii. The player still has actions to perform as part of resolving a spell or ability he or she controls, even if the opponent also has actions to perform.

c. If a situation that doesn’t appear in either of these lists occurs, the head judge can be called to rule on whether the clock should be passed.

d. Etiquette for clock passing

i. Players must pass the clock with the same hand that they used to perform the action. If both hands are involved in the action, either hand may be used.

ii. Players may not have a hand hovering in order to save time passing the clock.

iii. Players may not use excessive force to pass the clock.

6. Section 6: Stopping the clock

a. The clock is stopped in the following circumstances:

i. A player has called for a judge.

ii. The game is over.

b. If a player intentionally stops the clock, but does not call for a judge, this is interpreted as the equivalent of a concession. The player should communicate the concession to his or her opponent.

c. If the game is over, but not the match, reset the clocks and perform the pre-game procedures for the next game.

7. Section 7: Flag fall

a. Even though a flag has fallen, the game proceeds normally until flag fall has been announced by one of the players, or a judge watching the game.

b. Once flag fall has been noticed, no further turns can take place. If flag fall is announced during combat, the combat continues but the player whose flag has fallen can’t cast spells or activate abilities.

c. If the game has not already reached a conclusion by that time, the player whose flag has fallen loses the game.

8. Section 8: IPG considerations

a. When clocks are used, there is no advantage to playing slowly, and thus the rules for Slow Play are not applicable.

b. For purposes of Player Communication Violations, information provided by the chess clock, such as how much time is remaining for each player and whether or not a flag has fallen, is considered derived information.

c. Only the players in the game and judges watching the game may speak about what is on the clock. Comments about a fallen flag or the time on the clock made where the players can hear can be considered Outside Assistance.

d. Judges may investigate the use of excessive force to pass the clock under the guidelines of Unsporting Conduct.

Summary
I don't see any reason why chess clocks can't be used in Magic. On the contrary, there are many issues that are easily solved by them. Every time I sit across from someone with a deck who is spending a minute or more deciding whether to Mana Leak my White Knight, I wish I had a chess clock there. Every time I call a judge over during a time crunch, I wish I could stop the clock and resume it instead of trying to negotiate how much extra time gets added to the match. It's an idea whose time has finally come.