Forum
A place to discuss topics/games with other webDiplomacy players.
North America Minor Powers
So far it seems pretty balanced but I may need to run another test to make sure.

16 replies
Open
Caustic (827 D)
12 Jun 20 UTC
Help
Can someone tell me everything that is needed to create a variant, or point me in the right direction?
2 replies
Open
David E. Cohen (1000 D)
10 Jun 20 UTC
(+1)
New Dip AI Article
If you are interested: https://venturebeat.com/2020/06/10/deepmind-hopes-to-teach-ai-to-cooperate-by-playing-diplomacy/amp/
3 replies
Open
WumbologyDude (1000 D)
12 Jun 20 UTC
(+1)
Touchy Diplomacy
I'm not sure if I can advertise games on this forum but I started a Touchy diplomacy game called Touchy Diplomacy. I've really wanted to try it out so I would love it if I could get some people to get the game started!
1 reply
Open
CalvinWallC (1226 D)
05 Jun 20 UTC
(+10)
Black Lives Matter
I saw that WebDip has put a little header on their website in support of Black Lives Matter and the protests in the United States. I thought it was a really good way to use their platform for good and I was wondering if Vdip could do something similar?
56 replies
Open
OrdinalSean (998 D)
12 Jun 20 UTC
What is a "turn"?
I was just wondering what counts as a turn, for example for "extend the first X turns". Is "Spring 1901" a turn, or does it count all phases, or does it count the movement and build phases but not retreat?
1 reply
Open
Lukas Podolski (1234 D)
11 Jun 20 UTC
Perfidious Zine
Issue 3 was out in May, does anyone know if Issue 4 is out yet?
1 reply
Open
AJManso4 (2318 D)
03 Jun 20 UTC
Reliability
Is it possible to increase reliability rating? I know I can take over games to get unexcused turns back, but does that increase reliability?
4 replies
Open
Anon (?? D)
06 Jun 20 UTC
Classic game - all SCs are countries
please join this game:
https://vdiplomacy.com/board.php?gameID=44026
1 reply
Open
d.a.barchipelago (1726 D)
26 May 20 UTC
(+3)
Suggestion: Close chat option
On large variants, having lots of chats open makes the mobile interface quite cumbersome. It'd be nice if we could close chats, or if chats with dead people closed automatically.
9 replies
Open
Lagaroth (1073 D)
29 May 20 UTC
(+2)
How to do a proper convoy
Is this how to convoy? https://vdiplomacy.com/board.php?gameID=43159
9 replies
Open
Titus (1572 D)
21 Jun 19 UTC
(+3)
How to Create Group Chat Feature for the Site
Does anyone know how a group chat feature could be implemented, allowing players to start a chat with a selected sub-group of the players in a game? Below is my idea for how to implement it. Maybe someone has more specific ideas, or a sense of how easy and costly it would be to implement. My idea is in the following response.

23 replies
Open
GOD (1791 D Mod (B))
24 May 20 UTC
TELECONFERENCE GAME SUNDAY
A friend of mine and I want to play a live game next Sunday, but with voice negotiations through Discord. We've tried that once already and it works really well, pretty close to ftf.
6 replies
Open
Samuel, o Louco (824 D)
29 May 20 UTC
Which font do you use to make texts in diplomacy variants?
I'm trying to create a variant in Photoshop, but I don't know which font to use to write the names of the places; it gets foggy when it is very small. Does anyone know what kind of font I use?
4 replies
Open
Aelfred Smith (1051 D)
26 May 20 UTC
Absolute War
Hello all

We are trying to start a Gobble Earth game and we need a lot of people. You are welcome to join.
1 reply
Open
New Concept: Squadrons
This is just an idea I had: What if there were squadrons (a group of planes) in vDip?
They would be able to go on both land and sea territories, but of course there's a catch, or they would be OP.
The catch: They would not be able to capture territories.
Any thoughts?
6 replies
Open
Aelfred Smith (1051 D)
25 May 20 UTC
Absolute War
Hello all
We are trying to start a Gobble Earth game and we need a lot of people. You are welcome to join.
https://vdiplomacy.com/board.php?gameID=43798
1 reply
Open
Maluco Rasta (1147 D)
24 May 20 UTC
Choose Your Evil Empire
Good day.
I am playing this game and it happened to me to move a fleet and it took the order wrong and went elsewhere.
It may have been a mistake, but now, the next turn, she won't even let me give her an order. It doesn't show me like a fleet of mine in the game.
How do you do with these things?
1 reply
Open
Mercy (2131 D)
30 Apr 20 UTC
(+3)
Suggestion: new rating system
I propose a new rating system that will fix many of the problems people experience with vRating.
Mercy (2131 D)
30 Apr 20 UTC
(+7)
Many of you may already be aware of some of the problems with the current way vRating works. To summarize, the main concerns with vRating are the following:

PROBLEMS WITH VRATING
1. Winners in large variants, like Divided States, make jumps in their rating that are too large.
2. Headhunting is encouraged in vRating, i.e. you gain more rating from eliminating a higher rated player than a lower rated player.
3. The way players who take over a country in Civil Disorder gain or lose vRating needs improvement.

These problems have existed for a long time. How should we go about fixing them, though?

If only there was a global pandemic that would cause somebody - preferably a Diplomacy player that recently graduated in Mathematics - to sit bored at home and spend time looking into this.

Well, we are in luck, because that person is me! After some puzzling, I have derived an alternative rating system that works in much the same way as vRating, except that all of the aforementioned problems are solved.

I am not a moderator and neither do I have the programming skills to actually implement this, though. The purpose of this topic is to gain feedback from the community. Oli only wants to change things if there is enough support for it.

Below, I will address each of the three problems and
- explain the problem in detail;
- argue why I think it is definitely a problem that needs solving;
- intuitively explain how my alternative rating system solves this problem.

But first, I will give a crash course in Elo rating because this is critical in understanding the logic behind everything else I will explain.


ELO RATING: HOW DOES IT WORK?

Elo rating was originally invented by Arpad Elo to rate players in the game of chess. The logic behind Elo rating is as follows. Every player has a rating that reflects their skill in playing chess. When two players play a match of chess against each other, the player with the higher rating is more likely to win. The winner of the match will get an increase to his rating and the loser will get a decrease, and the magnitude of this increase and decrease is bigger the more unexpected the outcome of the match was. For example, if the winner had a much higher rating than the loser, then the outcome was pretty much expected so the change to the ratings of the players will be small. However, if the winner had a lower rating than the loser, that was not expected so the change to the ratings of the players will be bigger. Many people consider this to be a fair system, as you do not lose too much rating from losing against a higher rated opponent.

Above was an intuitive explanation. In Elo rating, this is formalized mathematically. You can skip this paragraph if you wish and still understand most of the rest of this text, but for those interested, here is the mathematical formalization. In Elo rating a game of chess is modelled as two players each drawing a random variable from a normal distribution with the same standard deviation but a different mean. The means are equal to the ratings of the respective players. The winner is whoever draws the higher number. In this model, one can calculate the theoretical probability of either player winning. The amount of rating the loser of the match loses is proportional to the theoretical probability that he would win, and the amount of rating the winner of the match gains is proportional to 1 minus the theoretical probability that he would win. So if for instance the loser has a rating much lower than the winner, he is modelled to draw from a normal distribution with a much lower mean, so his theoretical probability to draw a higher number is small, so he loses only a small amount of rating from his loss. In some Elo-like systems another probability distribution is used instead of a normal distribution.
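For those who prefer code to prose, here is a minimal Python sketch of the two-player model from the paragraph above. The standard deviation `sigma` and the proportionality constant `k` are illustrative assumptions, not values taken from any real rating list.

```python
import math


def win_probability(rating_a, rating_b, sigma=200.0):
    """Probability that A draws the higher number when both players draw
    from normal distributions with means equal to their ratings and the
    same standard deviation sigma (an assumed value)."""
    # The difference of two independent normals is normal with
    # standard deviation sigma * sqrt(2).
    z = (rating_a - rating_b) / (sigma * math.sqrt(2.0))
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def update(winner, loser, k=32.0, sigma=200.0):
    """Return the new (winner, loser) ratings after one match.
    k is an assumed proportionality constant."""
    p_winner = win_probability(winner, loser, sigma)
    # The winner gains k * (1 - P(winner wins)); the loser loses
    # k * P(loser wins), which is the same amount.
    delta = k * (1.0 - p_winner)
    return winner + delta, loser - delta


if __name__ == "__main__":
    print(update(1800.0, 1500.0))  # expected result, small change
    print(update(1500.0, 1800.0))  # upset, large change
```

The two calls at the bottom show the point of the intuitive explanation: an upset moves both ratings much more than an expected result does.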

vRating does not work the same as Elo rating. The biggest difference is that it rates players in a game with more than 2 players. However, vRating is heavily inspired by Elo rating and has the same kind of logic behind it. In 2-player matches, for instance, vRating is the same as Elo rating except that the random numbers are not drawn from a normal distribution.

Now that we have an intuitive understanding of the logic behind rating systems, let us take a look at the problems with vRating.


PROBLEM 1: TOO LARGE JUMPS FOR WINNERS OF BIG GAMES

So far you may have thought: 'But Mercy! I don't think this is a problem. If you win a, let's say, Divided States game, then you DESERVE to gain a huge amount of rating!' Maybe. But something is definitely off with these large jumps. I will show you.

Consider the following example. Alex, Bob and Charly are all playing a different Divided States game, but with the same setting and against similarly rated opponents. Alex has a vRating of 1000, Bob of 1500, and Charly of 2000. All three of them solo. What happens to their ratings? You may naively think that all of them will gain a lot of points and Charly may be the new #1 rated on the site. You'd be wrong. In all likelihood, Alex, not Charly, will be the new #1. Bob's score will fall below the score of Alex and Charly's score will fall below the score of Bob.

How is this possible? It all follows from the simple way a victory is treated in vRating. A solo in a (WTA) Divided States game counts as a simultaneous win against all other 49 participants, a bit as if you beat all of them in a 2-player match at once. Charly is much more likely to win against anybody than Alex is. For the sake of the example, let's say that the ratings are such that Charly is 5x less likely to lose against any opponent in his Divided States game than Alex is. Then by the logic of vRating, Alex's victory is 5x more unexpected and thus Alex gains 5x as many points from his victory as Charly does. I don't see any problem with this so far; for instance I think that if Alex and Charly both beat Bob in a 2-player match, it is fair to award Alex with 5x the number of points Charly gets awarded due to Alex being much lower rated than Charly. But in the case of a Divided States solo it gets interesting. Alex gains 5x the number of points Charly does. Let's say for the sake of the example that Charly gains 600 points to his vRating. Then he jumps from a rating of 2000 to a rating of 2600. But meanwhile, Alex is awarded 5 x 600 = 3000 points for the same kind of victory and so he jumps from 1000 points to 4000 points, surpassing Charly (and everyone else on the site) by a ridiculous amount!

Clearly, this is nonsensical. What is the solution?

One word: Iterate. Let me illustrate what I mean by this by providing an example of an iteration of 10. We will award Alex extra rating 10 times in a row, but each time he gets awarded only 1 tenth of what someone of his rating would normally get. This means that the first time, Alex gains 3000 / 10 = 300 points and gets a rating of 1300. The second time, he wins 1 tenth the amount of rating a 1300 player would receive from winning a Divided States match. This is evidently less than 300. Let's say for the sake of argument that it is 200. Then Alex jumps to a rating of 1300 + 200 = 1500. The third time he gets 1 tenth the rating of what a 1500 player would receive from winning a Divided States match... which is even less than 200. After the tenth time, Alex will still have a rating that is really high, and he will probably be among the top rated players, but he will not have surpassed Bob, and neither will Bob have surpassed Charly.

Note that when I say that people get awarded extra rating 10 times in a row, this is only to explain what the computer is calculating. Practically, everyone will receive new points to their vRating immediately after finishing a game. Also, I gave the example of 10x in order to make the explanation easy to follow. In reality, I think the more the better, but computing power is a limiting factor. We may also choose to iterate more the more players are in the match.
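The iteration idea can be tried out in a few lines of Python. This is only a sketch under loud assumptions: the logistic curve in `expected_result`, the constant `k`, and the 49 opponents all rated 1500 are stand-ins chosen for illustration and are not the real vRating formula. Opponent ratings are held fixed to keep the example short.

```python
def expected_result(rating, opponent, scale=400.0):
    """Assumed Elo-style logistic chance of beating one opponent."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / scale))


def solo_gain(rating, opponents, k=32.0):
    """Points for a solo, treated as one simultaneous 2-player win
    against every opponent (k is an assumed constant)."""
    return k * sum(1.0 - expected_result(rating, o) for o in opponents)


def iterated_rating(rating, opponents, steps=10, k=32.0):
    """Award the solo in `steps` slices. Each slice is 1/steps of what a
    player with the current, already raised, rating would receive.
    steps=1 reproduces the ordinary one-shot update."""
    for _ in range(steps):
        rating += solo_gain(rating, opponents, k) / steps
    return rating


if __name__ == "__main__":
    opponents = [1500.0] * 49  # hypothetical Divided States field
    for name, start in [("Alex", 1000.0), ("Bob", 1500.0), ("Charly", 2000.0)]:
        one_shot = iterated_rating(start, opponents, steps=1)
        ten_steps = iterated_rating(start, opponents, steps=10)
        print(name, round(one_shot), round(ten_steps))
```

With these assumed numbers the one-shot update reverses the order of the three players, while ten slices keep Charly ahead of Bob and Bob ahead of Alex. More slices cost more computation, which is the trade-off mentioned above.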

The scenarios I outlined here are not far-fetched, by the way. Slypups jumped from a rating of 2177 to a rating of 2775 from winning a WTA Europa Renovatio game. Agnaar jumped from a rating of 1612 to a rating of 3109 from also winning a WTA Europa Renovatio game. It is only a matter of time before a low rated player wins a big match and gets catapulted to far above even the highest rating Agnaar has ever had.


PROBLEM 2: HEADHUNTING

Headhunting is the act of deliberately trying to eliminate high rated players in games in order to gain more rating. Headhunting is the reason many high rated players prefer to play anonymously, or just have left the site. Some of you may be thinking: 'Come on Mercy! What is the problem here? Obviously, eliminating a high rated player is worth more than a low rated player. It is only fair if that is reflected in the ratings. High rated players should just stop whining.' Some others yet may think: 'Exactly! Headhunting is no fun and that is why rating systems are stupid.' I am here to tell you that you are both wrong. vRating really does encourage people to specifically hunt down high rated players. There is no reason, though, why any rating system should encourage any kind of behaviour other than maximizing your own score in any given game. If a rating system encourages behaviour that is any different from this, then that is a flaw of the rating system that needs to be addressed.

At this point, let me make a remark. High rated players who complain about headhunting often do not seem to realize that under vRating, there is an opposite effect, too: If you think you will be eliminated, it is best for your rating to throw your centers to the highest rated player on the board, in the hope that this high rated player will achieve a result as good as possible. The reason behind this is that losing to a high rated player costs you less of your vRating than losing to a low rated player does. In the rest of this discussion, I will focus exclusively on the type of headhunting discussed in the previous paragraph, which incentivises players to eliminate top players in some circumstances. However, all of my arguments could be extended to the 'throw to high rated players' problem, which incentivises players to help top players in some other circumstances. The new rating system I propose solves both issues.

Curiously, I didn't actually have to invent a new rating system to solve these issues. There already exists a rating system that does not have them. It is called GhostRating, it is used on webDiplomacy and it is actually mathematically far simpler than vRating (though I will not explain the math here). I do not want to adopt GhostRating, and I will later explain why, but I do think GhostRating is an illustrative example of how headhunting is not necessarily encouraged in rating systems.

Speaking about examples, suppose Alex (a different Alex than the one who just won a Divided States game) wants to play a game of Classic with six of his friends from school. They decide to play on vDiplomacy. All of them have beginner's rating, except for Bob (again, a different one), who already has a few games in vDiplomacy under his belt, and he did well in these games. Alex plays well. He and his ally (not Bob) become the dominant alliance, but his ally refuses to try to 2-way draw with Alex, afraid Alex will betray him and solo. As such, they need a third power for balance and plan to end the game in a 3-way draw. (I know they are boring carebears. It is just an example.) Alex needs to make a decision whether he wants this third player to be Bob or not. If Alex eliminates Bob, he gains more rating than if he decides to draw with Bob, even though in both cases Alex ends the game in a 3-way draw. Therefore, Alex is encouraged to eliminate Bob. Curiously, if Alex decided to draw with Bob, then the rating of Bob becomes completely irrelevant for Alex's gain in rating. Bob may have a vRating of 0 or 3000: in both cases, Alex will gain the same rating from the game.

But if this entire scenario played out on webDiplomacy, things would be reversed. If Alex already knows he will end the game in a 3-way draw, it is irrelevant for his gain in rating whether Bob also draws or gets eliminated. What is not irrelevant, though, is Bob's rating, under any circumstance. The mere fact that a higher rated player like Bob is in the same game as Alex means that Alex will get more points from his 3-way draw, even if Bob is included in the draw.

I will argue why the scenario under GhostRating on webDiplomacy makes sense and under vRating on vDiplomacy does not. I think we can all agree that the presence of a high rated player in your game makes it more difficult to obtain a good result. Therefore, this presence should mean that you gain more points from getting a good result. GhostRating indeed gives you more points if you are in a game with a high rated player, but vRating does not; it only does if you achieved a different result than this high rated player. On this front, GhostRating is therefore better. On the other hand, if you draw, GhostRating does not care if you draw with the high rated player or if the high rated player is one of the eliminated players, but vRating does care. Some of you may think that this makes vRating better. After all, eliminating a high rated player is more difficult than drawing with one, so shouldn't you be rewarded for it? No, I strongly disagree. Eliminating players in alphabetical order is also more difficult than not doing so. Why then don't we award players for eliminating other players in alphabetical order? Because something that does not affect your score in a given game should NOT affect your rating as a player! So it is for eliminating (or helping) specific players because of their rating, as well.

I will proceed by giving an intuitive explanation as to why vRating and GhostRating function so differently. vRating models winning any game as winning a 2-player game against all the other players on the board, and drawing as winning a 2-player game against all the eliminated players, with each 2-player game having a smaller weight than it would have if you won. In each fictitious 2-player game vRating looks at how unexpected the result is (a low rated player is unlikely to beat a high rated player) and awards more points to the winner the more unexpected the win was. GhostRating functions radically differently. It looks at the game as a whole. Based on the ratings of the players in the game, it calculates the expected scores of all the players. For example, the highest rated player in a game of Classic can be expected to, on average, win more than 1 seventh of the pot. At the end of the game, it compares the actual scores to the expected scores, and awards rating proportional to the difference. For example, if your score is higher than would be expected from your rating, you get points, and if your score is a lot higher, you get a lot of points. I think looking at the game as a whole, like GhostRating does, is the correct way to do it.

However, GhostRating has one curious feature. Remember when I said that in a game of Classic, under GhostRating a high rated player is expected to win more than 1 seventh of the pot? This means that if a game of Classic ends in a 7-way draw, the higher rated players in the game lose GhostRating and the lower rated players gain GhostRating. Under vRating, though, no one gains or loses any rating, as no one achieved a better result than anyone else. More generally, one can never lose vRating in a draw; under GhostRating, however, a high rated player who is part of a large draw will lose rating. To tell you all a little anecdote: If I remember correctly, a few years ago the #1 rated player on webDiplomacy (VillageIdiot) temporarily lost his #1 spot because he was forced to accept a 4-way draw in a game (of Classic), and that cost him a lot of rating.
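The "expected score versus actual score" logic, and the 7-way draw effect just described, can be illustrated with a short Python sketch. Note the assumptions: the expected share of the pot is taken to be proportional to rating, and `k` is an arbitrary constant. Both are simplifications for illustration and are not presented as the exact formula webDiplomacy uses.

```python
def expected_shares(ratings):
    """Assumed model: each player's expected share of the pot is
    proportional to their rating."""
    total = sum(ratings)
    return [r / total for r in ratings]


def rating_changes(ratings, actual_shares, k=100.0):
    """Change in rating proportional to (actual share - expected share).
    k is an assumed scaling constant."""
    expected = expected_shares(ratings)
    return [k * (a - e) for a, e in zip(actual_shares, expected)]


if __name__ == "__main__":
    # Hypothetical game of Classic that ends in a 7-way draw.
    ratings = [180.0, 120.0, 100.0, 100.0, 100.0, 100.0, 100.0]
    draw = [1.0 / 7.0] * 7
    for rating, change in zip(ratings, rating_changes(ratings, draw)):
        print(rating, round(change, 2))
```

In this toy model the highest rated player loses points from the 7-way draw and the lower rated players gain them, and the changes add up to zero.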

I do not mind this feature of GhostRating. To the contrary: I like it! It forces high rated players to actually try and do their very best to go for the solo if they want to keep their high rating. However, the new rating system I propose does not have this feature. More specifically, under the rating system I propose it is not possible to lose rating if no one in the game got a better result than you. Why didn't I just adopt the way GhostRating works? There are two reasons.
1. Any new rating system we introduce can and probably will be used retroactively. If we introduce a rating system that is too different from the one we already had, this can mean that playstyles that worked in the past no longer work, and that would be unfair to players. Plus, this point may cause the community to be divided. If I suggest a rating system that is basically the same as vRating except that it fixes some obvious errors, then probably almost everybody will be in favor of all the changes and will support the new rating system being introduced.
2. A rating system in which points are never lost in a draw probably translates better across multiple variants. In some variants a larger number of players tend to get eliminated than in others. If we just had GhostRating on vDip, then high rated players would have an incentive to only play on maps that see many eliminations.

Most of my thinking went into finding a solution to the headhunting problem that does still keep key features of vRating. How does my new rating system achieve this? It is complicated and mathematically quite involved. Here is an intuitive explanation.

The rating system looks at the ratings of the players in a match and the way the match finished, for example '3-way draw'. Then it tries to guess which player got which result based on their rating. For example, if there is a 3-way draw, then there is a big chance the highest rated player in the game was a part of it. Then it compares the actual scores of the players to the scores that were guessed. The rating a player gains or loses is proportional to the difference between his actual score and the score that was expected/guessed. The key difference with GhostRating is that GhostRating does not take into account how the match ended before guessing the players' scores, but my proposed rating system does.

I have to admit, though, that the above paragraph is a lie. This is almost how my proposed rating system works. In reality, I approximate something like this with a little trick. I can only explain this by going deeper into the mathematics. So for those interested, see the explanation in the next paragraph.

The scores of the game get sorted from highest to lowest. For instance, in case of a 3-way draw the first three scores are 1/3 of the pot, and the rest are 0. We model the game as all of the players drawing a random variable from a probability distribution centered around their vRating. There exists a number (let's call it x) such that the expected number of players to draw a number higher than x is equal to 1. We make a model in which any player who draws a number higher than x gets the highest score (in case of a 3-way draw, this is 1/3 of the pot). There exists another number y such that the expected number of players to draw a number higher than y is equal to 2. In our model, any player who draws a number higher than y (but lower than x) gets the second highest score (which is also equal to 1/3 of the pot in case of a 3-way draw). Etcetera. In the end we compare the actual scores to the expected scores that followed from our model, and the change in vRating will be proportional to the difference.
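A rough Python sketch of this threshold model follows. It makes assumptions the text does not fix: the draws follow a logistic distribution with an arbitrary `scale`, the thresholds are found by simple bisection, and the function names are made up for this example. The linked document has the real mathematics; this only shows the mechanics.

```python
import math


def survival(x, rating, scale):
    """P(a player's draw exceeds x), for an assumed logistic
    distribution centered on the player's rating."""
    return 1.0 / (1.0 + math.exp((x - rating) / scale))


def threshold(ratings, k, scale):
    """Find the number t such that the expected number of players
    drawing above t equals k, using bisection."""
    lo = min(ratings) - 40.0 * scale  # nearly everyone draws above lo
    hi = max(ratings) + 40.0 * scale  # nearly no one draws above hi
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if sum(survival(mid, r, scale) for r in ratings) > k:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def expected_scores(ratings, sorted_scores, scale=200.0):
    """sorted_scores: the final scores sorted from highest to lowest,
    e.g. [1/3, 1/3, 1/3, 0, 0, 0, 0] for a 3-way draw in Classic."""
    n = len(ratings)
    cuts = [threshold(ratings, k, scale) for k in range(1, n)]  # x, y, ...
    result = []
    for r in ratings:
        # above[k] = P(draw is higher than the k-th threshold);
        # above[0] = 0 (nothing is above "infinity"), above[n] = 1.
        above = [0.0] + [survival(c, r, scale) for c in cuts] + [1.0]
        # P(landing in band k) = above[k + 1] - above[k].
        result.append(sum(sorted_scores[k] * (above[k + 1] - above[k])
                          for k in range(n)))
    return result


if __name__ == "__main__":
    ratings = [2000.0, 1500.0, 1500.0, 1200.0, 1000.0, 1000.0, 900.0]
    three_way = [1 / 3, 1 / 3, 1 / 3, 0.0, 0.0, 0.0, 0.0]
    print([round(e, 3) for e in expected_scores(ratings, three_way)])
```

The change in rating for each player would then be some constant times (actual score minus expected score). A useful property of this construction is that the expected scores always add up to the total of the actual scores, so in a game where everybody gets the same result nobody's expected score differs from their actual one.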


PROBLEM 3: TAKEOVERS OF COUNTRIES IN CIVIL DISORDER

In order to talk about this subject well, we first need to define the concept of the 'worth' of a position. The worth of a position is equal to the number of centers from that position divided by the average number of centers of all the countries (including defeated ones) on the board. For example, if you have as many centers as everybody else, then your worth is 1. If you have twice as many centers as average, then your worth is 2, etc.

Previously on this site, it was not free to take over a position in Civil Disorder. You had to pay some dPoints. This was equal to 0.5 x the bet size of the variant x the worth of the position. The reason we multiplied with 0.5 was because we wanted there to be an incentive to take over positions. Its effect practically was that e.g. when you took over a position in a game that was still in its first year, you only had to pay half as much compared to the players who had been in the game from the start. Please remember this number 0.5 and what it used to be used for.

Let us now focus on the way taking over a country in CD affects your vRating. My information on this comes from tobi1, who wrote about this before in this thread: https://vdiplomacy.net/forum.php?viewthread=84011#84011

Tobi1 ends his post by stating that he does not know whether this works as intended. I will go out on a limb and claim that it does not work as intended. I will state one simple example.

Consider yet another Alex and Bob. Both have the same vRating and both take over a country in CD in the same game at the same time. Alex takes over a country with a worth of 1 and Bob takes over a country with a worth of 2, so Bob starts out with twice as many centers as Alex. Both play well and end up in the draw. Both gain vRating, but one of them gains more than the other. Can you guess which one of them gains more? Bob, of course. If this example made sense, I would not have presented it.

I admit that one can also come up with examples where the current system does work with regard to takeovers. However, the changes that need to be made in order to eliminate headhunting are of such a nature that we also need to change the way takeovers of powers in CD are treated, anyway. So let me explain how my new system treats it. My explanation is best understood if you have read everything so far and understood the parts that were more mathematically involved.

When guessing the results of the players, it treats the ratings of the players who took over a country in CD differently from their actual rating. More precisely, if w is the worth of their country at the moment they took over, then during the guess a fictitious amount of A x log(Bw) gets added to their vRating, where the log has a base of 1/B and A and B are constants. I wrote this formula in such a way that the constants A and B are easy to understand intuitively. They are what matters, since they fine-tune how takeovers are handled. I will now explain what the numbers A and B mean intuitively.

The number B is comparable to the number 0.5 that the site used before. Previously, if you took over a position that was twice as strong as other positions, you would bet the same amount as players who had been in the game from the start. Similarly, if we choose B = 0.5 then if you take over a position that is twice as strong as other positions, then your vRating will be treated the same as everyone else's. If you take over a weaker position than this, you will win more if you end up winning and lose less if you end up losing. If you take over a position stronger than this, you will win less if you end up winning and lose more if you end up losing. I suggest indeed choosing B = 0.5.

The number A represents how much lower we want the computer to guess your vRating to be if you take over a position that is as strong as all others, let's say at the start of the game. If A = 500 and you take over a position that is as strong as everybody else's, then after the game is over, you will win and/or lose as much as a player who has a vRating of 500 points below yours. I suggest choosing A = 500. This may sound like a lot, but we have to remember that a position with a worth of 1 is not always as strong as everybody else's. Take for example the position of a player who has an average number of centers, but who just got stabbed and left the game as a result. We don't want to punish players who take over such positions.
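To make the roles of A and B concrete, here is a small Python sketch of the worth of a position and the fictitious amount A x log(Bw) with log base 1/B. The function names are made up for this example; the defaults A = 500 and B = 0.5 are the values suggested above.

```python
import math


def worth(position_centers, all_centers):
    """Centers of the position divided by the average number of centers
    over all countries on the board, defeated ones (with 0) included."""
    average = sum(all_centers) / len(all_centers)
    return position_centers / average


def takeover_offset(w, a=500.0, b=0.5):
    """Fictitious amount added to the vRating of a player who took over
    a position of worth w: a * log(b * w), with the log in base 1/b."""
    return a * math.log(b * w, 1.0 / b)


if __name__ == "__main__":
    for w in (0.5, 1.0, 2.0, 4.0):
        print(w, takeover_offset(w))
```

With these constants a position of worth 2 gives an offset of about 0 (the player is treated like everyone else), a position of worth 1 gives about -500, and every further halving of the worth lowers the offset by another 500.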


LAST POINTS

In case anyone wonders: Yes, this rating system is compatible with literally all scoring systems, including PPSC. I feel dirty for having created a rating system that is compatible with an abomination like PPSC, and I do not mean dirty in the positive sense of the word when you are playing Diplomacy. Without this compatibility, I feared it would not be implemented, though.

I have been a bit vague about the mathematics behind the rating system. The following is a link to a text where I dive into the mathematics in more detail.

https://drive.google.com/open?id=1emavObpJ9GB3zjxZLlv0894_eOGmVau_

I hope people like my ideas and I'd be happy to answer questions.

I hope this rating system can be implemented some day.
Battalion (2386 D)
30 Apr 20 UTC
I've not had time to think through your suggestion thoroughly, but I wonder whether it could be extended to take into account the expected result of different nations (in addition to players). It's generally accepted that some nations have a better expected outcome than others in Classic, but it's even more pronounced in some variants. One extreme example is Edwardian3 - there have been 5 games, of which Germany achieved a solo in 4 (and was part of the draw in the 5th)! See https://www.vdiplomacy.com/stats.php?variantID=130
Battalion (2386 D)
30 Apr 20 UTC
Regardless, thanks for the proposal - it's an interesting discussion to be having and addresses some important issues with the current system.
Mercy (2131 D)
30 Apr 20 UTC
This could indeed be extended to take into account the expected result of different nations. One can simply treat a player who plays a weak nation similar to a player who takes over a weak position. However, one would need to implement the strength of the countries manually for each variant. It would be more effective if the program could learn the strength of different countries based on the results of finished games. Mathematically it is doable to write an algorithm for this, but someone would also need to do extra programming work and I don't know how time consuming that would be. Maybe @tobi1 can answer that.

If the above gets implemented, by the way, we might as well score variants for the average difference in score between players: a variant that often ends in a solo would get a high score and a variant that often ends in a large draw would get a low score. This could then be used to implement a rating system that is more like GhostRating (so players can lose rating in a large draw) but that is balanced across multiple variants. This thought has crossed my mind before.

But let's not get ahead of ourselves. :P
drano019 (2710 D Mod)
30 Apr 20 UTC
(+1)
Mercy -

Thanks for taking the time to look into this! I know the rating system frustrates a lot of people (including those who don't even like rating systems, but are caught up in people attacking them or making deals because of how it affects the ratings), and you have a lot of valid points. I won't try to pretend to understand all of the math (although the basic gist of it I understand), but if we can cut out the problems you've mentioned, it would be a major improvement to me.

The biggest issues in my opinion are the "headhunting" and the disparity in gains for different rankings with similar results.

On the headhunting - While this can be avoided by playing anon as people will probably point out, IMO, that's against the spirit of Diplomacy. Diplomacy isn't just a tactics game, it's a game of personality and interaction. Using people's own weaknesses to benefit yourself is, in my opinion, a key part of the game. Anon somewhat destroys that by not letting people get to know their opponents in the same way. IMO, it's totally fair to know that someone is a "carebear" and to play things up Diplomatically with them knowing that. Anon makes that not possible - or at least much more difficult unless they're more explicit in letting people know that's their playstyle. It's one of the reasons I like non-anon if I can (the other being I just like to get to know people personally more).

On the disparity of gains - This is the biggest issue in terms of the ratings I think. You pointed it out well in your examples how it breaks down especially on larger variants. We've had at least a few examples of this in recent history. Agnaar gained 1497 points in a solo victory on Europa Renovatio. Slypups gained 598 for a solo on the same map. Both games were WTA. A little while earlier, ooMatthew2000 gained 1272 points for a solo on WWIV in a WTA game. And more recently as well, Ambassador gained 760 for a Divided States solo, but that was PPSC and he would have gained a lot more in WTA.

In most of these games, people made massive jumps in ranking. Agnaar went from approximately #100 to #1. ooMatthew2000 jumped all the way to #1 from out of nowhere. Ambassador jumped just under #100 to #8 (and would have gone even higher in WTA). Only Slypups didn't jump massively in ranking, and that's because he was already in the top 15.

While all these players accomplished a great goal, it's not really realistic to say that one solo makes someone the best on the site. Especially when a lot of what happens in that large of a map isn't in your control. A 50 player Divided States has a lot happen on the other coastline that you simply have no control of. Even if you try, people will often ignore people across the map for much of the game since they have so many neighbors and regional powers to deal with. Sometimes, you're just lucky with how things turn out.

And that's not even considering that 3 of those 4 people soloed on gunboat. I think most people will agree that gunboat, once you reach the endgame stage, is often much easier to solo than full press. In full press, the best tactician in the alliance trying to stop a solo can dictate how to defend to the weaker tacticians to prevent the solo. In gunboat, the defending team is only as strong as the weakest player, and if that player is on the front line, a solo can often happen when it would be stopped in Full Press.

This is not to say they don't deserve accolades for their accomplishments, they do! But again, one solo on a Gunboat large map does not equal best on the site by a long shot.

Anyways, that's a long rant that I didn't intend to get on. TL;DR - Mercy, I fully support your ideas and would love to see it fleshed out and see if it might be possible to implement to fix these current issues.
alifeee (1229 D)
30 Apr 20 UTC
Well the stats page for each variant has a lot of stats on previous games like performance, which is higher in total with more solos (I think? It's tricky to think about).
Might I also point out another unbalanced map example: https://vdiplomacy.com/stats.php?variantID=24 . Here Colombia has very little choice to not expand into Venezuela.

In your ELO model, what exactly would the standard deviation be? That seems like an important point that would influence the change in vPoints quite a lot.
Also for the iteration, would you iterate everybody's vPoint change at the same time, N times? This would result in high-vPoint players gaining more points when they win against lower scoring players, and vice versa (like you want).
Mittag (1396 D)
30 Apr 20 UTC
It's nice that you have identified three problems and proposed a solution. But, solving problem A one ofter introduces problem B. And from this whole text, I did not get a good sense of what your scoring system is, and what problems it might have. (I assume that you are aware of Arrow's impossibility theorem.) Could you describe your scoring system?

Here is one question, whose answer was not obvious to me: Is your scoring system zero sum?
Mittag (1396 D)
30 Apr 20 UTC
(+2)
Oh, in regard to Battalion's comment, please please do not implement weights based on country "strength".

I don't want to sound completely arrogant, but the performance of different countries on different variants is highly influenced by the fact that a vast majority of players are fairly inexperienced.

Look at classic, for example. In low level games, Austria is often fucked, and its performance is on average the worst. But in a game with experienced players, you almost always get an A/I alliance in the beginning. And since A/I is actually slightly stronger than T/R, from a pure tactical perspective, Austria suddenly performs better than Turkey on average. This is particularly clear in gunboat games. Yet, Turkey is ranked as the strongest power...
Battalion (2386 D)
30 Apr 20 UTC
It was just a thought following a recent game of Edwardian3 I played. It's probably more applicable to the variants with large differences in expected outcome than the more balanced maps, but I maintain it's something which could be made fairer nonetheless.
DogsRule11 (866 D)
30 Apr 20 UTC
It might be nice to have on Africa, considering that Morocco has more solos than 4 other countries combined.
Mercy (2131 D)
30 Apr 20 UTC
@alifeee

"In your ELO model, what exactly would the standard deviation be? That seems like an important point that would influence the change in vPoints quite a lot."

The standard deviation is the same as it is currently under vRating. I deliberately made the new rating system such that there is no difference with vRating in 2-player games. What the exact standard deviation is I haven't calculated, but to give you an idea, if Bob has a vRating that is 500 points higher than Alex, then Bob is expected to win against Alex 75% of the time (in a 2-player match).
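To put a number on that 75% figure, here is a minimal sketch. It assumes the expected score follows an Elo-style logistic curve, which may not be what vRating actually uses; `SCALE` is a hypothetical constant, calibrated only so that a 500-point lead gives 75%.

```python
import math

# Hypothetical scale, calibrated so that a 500-point lead gives a 75% expected score.
# Solving 1 / (1 + 10 ** (-500 / SCALE)) = 0.75 gives SCALE = 500 / log10(3).
SCALE = 500 / math.log10(3)

def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B in a 2-player match (assumed logistic curve)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / SCALE))

bob, alex = 1500, 1000
print(round(expected_score(bob, alex), 3))  # 0.75
print(round(expected_score(alex, bob), 3))  # 0.25
```

Under this assumption the two expected scores always add up to 1, so whatever Bob is expected to win, Alex is expected to lose.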

"Also for the iteration, would you iterate everybody's vPoint change at the same time, N times?"

Yes, I would iterate everybody's vPoint change at the same time. The reason why in my explanation I focused solely on the winners in my Divided States game examples was because their ratings are the only ones that change significantly during this process.

"This would result in high-vPoint players gaining more points when they win against lower scoring players, and vice versa (like you want)."

Either you are wrong or I just don't understand what you mean. The iteration process always causes players to gain fewer points from a win and lose fewer points from a loss, not more, but it is only significant with wins or small draws on large variants.


@Mittag

"It's nice that you have identified three problems and proposed a solution. But, solving problem A one ofter [sic] introduces problem B. And from this whole text, I did not get a good sense of what your scoring system is, and what problems it might have. (I assume that you are aware of Arrow's impossibility theorem.) Could you describe your scoring system?"

No, I am not aware of Arrow's impossibility theorem.

Also, be aware that I am not suggesting any new scoring system. I am instead suggesting a new rating system, which is something radically different. A scoring system scores players for their performance in a given game and a rating system rates players for their performance across multiple games that use a scoring system. Sorry for being nitpicky, but I think that for clarity, it is useful to not mix up these terms.

Frankly my initial post was my best effort in describing how my rating system works. To summarize though, in 2-player maps it works in the exact same way as normal vRating. The main difference is in how the rating system extends to variants for more than 2 players. I describe how this gets extended in the last three paragraphs under the title 'PROBLEM 2: HEADHUNTING'. Then in the other paragraphs I respectively describe the other differences, namely iteration and the way takeovers of Civil Disorders are handled. For the actual precise formulas, see the link at the end of my post.

"Here is one question, whose answer was not obvious to me: Is your scoring system zero sum?"

Yes. If it weren't, that would have been a huge mistake by me. I indeed did not specifically say that it was zero-sum, though it implicitly follows from the fact that the change in vRating of a player is proportional to the difference between his score and his theoretical, expected score. These expected scores are expected values. Clearly, the expected scores of all the players sum to the same total as their actual scores: namely the total pot size. Hence the differences sum up to zero.

@Mittag's next post
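As a quick numerical check of the zero-sum argument, here is a small sketch. The expected scores are made up (any model whose expected scores add up to the pot would do), and `K_FACTOR` is a hypothetical proportionality constant, not the one used on the site.

```python
K_FACTOR = 100.0  # hypothetical proportionality constant, the same for every player

def rating_changes(actual: dict, expected: dict) -> dict:
    """Change in rating is proportional to (actual score - expected score)."""
    return {name: K_FACTOR * (actual[name] - expected[name]) for name in actual}

pot = 1.0
# Made-up expected scores for a 4-player game; they must sum to the pot.
expected = {"A": 0.40, "B": 0.30, "C": 0.20, "D": 0.10}
# Actual result: A and C share a 2-way draw, B and D are defeated.
actual = {"A": 0.5, "B": 0.0, "C": 0.5, "D": 0.0}

assert abs(sum(expected.values()) - pot) < 1e-9
assert abs(sum(actual.values()) - pot) < 1e-9

changes = rating_changes(actual, expected)
print(changes)
print(abs(sum(changes.values())) < 1e-9)  # True: the changes cancel out
```

The cancellation only relies on both dictionaries summing to the pot and on every player sharing the same constant.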

I share your sentiment here. There are some positions that are bad for players of all ratings, but there are other positions that depend a lot on the ratings of the players in the game, such as the Austrian position in Classic. I think players are too often declaring variants to be unbalanced on the grounds of statistics.
BBQSauce123321 (2026 D)
30 Apr 20 UTC
(+1)
@drano I think that actually Mercy's fix makes it BETTER for what you are passionate about: non-anon games. Because now higher ranked players might still get headhunted, but now it's in the name of the game rather than in the name of a score. Having people play for a score makes playing the game less fun.
Mercy (2131 D)
30 Apr 20 UTC
(+1)
Yes. Under the ideal scoring system, players who play to maximize their score have a playstyle that is enjoyed by most players. Under the ideal rating system, players who try to maximize their rating play similar to players who try to maximize their score.
drano019 (2710 D Mod)
30 Apr 20 UTC
@BBQ - I concur. That's why I'm fully supporting his ideas. If it makes it better for non-anon games, and also helps take care of the disparity of gains from large maps, then I'm all for it.
Mittag (1396 D)
30 Apr 20 UTC
Adding a sic as well, Jeezus...

Are you saying we should consider these as three separate changes to the current rating system, and not as _one new_ rating system?

The principle of Arrow's impossibility theorem is simple to explain: if there are more than two players, then there is no fair rating system. Of course, you should specify what you mean by 'fair.' But the (somewhat surprising) point is that even with very mild requirements, 'fair' rating systems are nonexistent.

That of course doesn't mean that we should not try to fix the problems with the current rating system. We should. And I completely agree that the three things you listed are problems. My point is (and, in my experience) the more focused you are on solving problem A, the more likely you are to overlook problem B. So I would prefer to evaluate the rating system in itself, and not only in relation to what problems with the old rating systems it solves.

But let me be clear: Good initiative! I agree that these are problems, and it's nice to see somebody try to solve them. My criticism is that the description of your rating system is fragmented and (therefore) somewhat imprecise. It makes it hard to get a good overview of exactly what it does.
Battalion (2386 D)
30 Apr 20 UTC
@Mercy

"I think players are too often declaring variants to be unbalanced on the grounds of statistics" - surely statistics are an excellent way to determine whether a variant is unbalanced? Granted a sample size of 5 as is the case for Edwardian3 means little, but where there have been a lot of games played I think statistics are very good grounds for suggesting a variant is unbalanced.
@Battalion I strongly disagree. Oftentimes most variants on this site have far too few games played to have any decent statistics. But then you have to consider those that do. Does gunboat affect things? Absolutely, but it's lumped in with press. What if you have maps that a lot of people play and then drop out of, like Divided States? Or people are new to the game and then try to play and instantly get demolished. As mentioned earlier with the example of Austria in Classic, looking at ALL statistics actually obscures the true statistics when it comes to knowledgeable players.
Fake Al (1747 D)
30 Apr 20 UTC
I believe another issue with the variant statistics is that they are not, well, static. Would retroactive changes occur to scores when more games finish? For example, would my score be lowered because someone playing New York soloed in a Divided States game I'm not in and I played a game a while back where I had a draw as New York? Of course, this would only be more pronounced in newer variants. I think it should be up to the players to better research which countries are better favored and adjust their gameplay. That's unless you've got totally unbalanced variants like Rinascimento.
erikip107 (2543 D)
30 Apr 20 UTC
(+1)
I wasn't aware of these issues beforehand, but after reading the post, I can see that they are a problem and (though I don't really understand the math) seeing the support so far, the proposed changes seem good to me!

Another potential issue (which may not exist, but I don't know the way the vdip rating system works fully) with the current system I've thought of before is of when a player is in two or more games at once, especially when the games take different amounts of time to complete.

My understanding of the rating system is that when two games are going to end at around the same time, if the result is going to be (or has been) a defeat and then a draw/solo, it is always better to have the defeat happen first and then the draw/solo, since you'll lose less from the defeat and then gain more for the win/draw (since you'll be rated lower in the calculation from the gain). This could lead to some timing shenanigans where player A purposely quickly throws a losing game (instead of playing it out) and/or stalls out a winning game such that the loss happens first, but not for any game-specific reason. I don't know if this actually does happen, but I noted that this is something that could happen.
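The ordering effect can be sketched with a simplified 2-player Elo-style update. This is an assumption-heavy illustration and not the vDip formula: `SCALE` and `K_FACTOR` are hypothetical constants, and the opponent's rating is held fixed at 1000 in both games.

```python
import math

SCALE = 500 / math.log10(3)  # hypothetical: a 500-point lead gives a 75% expected score
K_FACTOR = 40.0              # hypothetical step size

def expected_score(rating: float, opponent: float) -> float:
    """Expected score against one opponent (assumed logistic curve)."""
    return 1 / (1 + 10 ** ((opponent - rating) / SCALE))

def update(rating: float, opponent: float, score: float) -> float:
    """One Elo-style update; score is 1 for a win and 0 for a loss."""
    return rating + K_FACTOR * (score - expected_score(rating, opponent))

start, opponent = 1000.0, 1000.0

# Same two results, processed in a different order.
loss_then_win = update(update(start, opponent, 0), opponent, 1)
win_then_loss = update(update(start, opponent, 1), opponent, 0)

print(loss_then_win > win_then_loss)  # True
```

In this sketch the loss costs less when it is processed at the lower rating and the win pays more, so the same two results leave the player slightly higher when the loss comes first.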

Also, people can get better (or worse) at this game over periods of time. If someone (Player A) plays a Divided States game as their first game (at 1000 rating) and is pretty much immediately defeated, but then continues to play games and, say, reaches a rating of 1400 before the Divided States game ends, my understanding is that the Divided States rating result would base Player A's losses on their 1400 rating instead of the 1000 they had when they were defeated in the game. This would cause him or her to lose more points than they perhaps should have.

Perhaps we could add to the changes, something that would account for a player's rating at the time of defeat when the game concludes, rather than their current rating at game conclusion? This might fix the second part of the problem but not the first. Also it would hurt more for players who consecutively lose shorter matches (say, if player A had gone down to 800 instead of 1400 during the time the Divided States game happened).

I don't know if it even needs to be addressed, but I thought I would bring it up.
G-Man (2466 D)
30 Apr 20 UTC
Great stuff Mercy! Thanks for all the thoughtpower and time towards improving the rating system. I think we've been flying a bit blind and it looks like your improvements would enable us to fly less blind ;-) No Diplomacy rating system will be perfect, but from what you've explained, I think your system would be better than what we have going. The headhunting and the unsightly ratings bump for lower-ranked players in large variants both really need addressing, and your system would make vast improvements to both. I am fully supportive of implementing your alternative rating system.
Mercy (2131 D)
01 May 20 UTC
@Mittag

"Adding a sic as well, Jeezus..."

I see you noticed. :P

"Are you saying we should consider these as three separate changes to the current rating system, and not as _one new_ rating system?"

If these three separate changes were made to the current rating system, you would end up with my rating system and that's why I wrote my explanation in this way. However, when implementing I think it would be cleaner to just write new code from scratch for the new rating system.

"The principle of Arrow's impossibility theorem is simple to explain: if there are more than two players, then there is no fair rating system. Of course, you should specify what you mean by 'fair.' But the (somewhat surprising) point is that even with very mild requirements, 'fair' rating systems are nonexistent."

Since (according to Wikipedia) Arrow's impossibility theorem is specifically about voting, you should also specify what you define 'voting' to be in this instance. As far as I know, the rating of players is not the result of an election for best player. I don't know who you consider to be the 'voters', nor do I know if you even know that yourself. When I was pondering this point, however, I concluded that if you want to apply Arrow's impossibility theorem to a rating system, the ranked games should be considered the 'voters', and they 'vote' by scoring players, i.e. distributing the pot. Then Arrow's impossibility theorem states that no rating system exists that satisfies four certain criteria: unrestricted domain, non-dictatorship, Pareto efficiency, and independence of irrelevant alternatives. Indeed, my rating system (and any other rating system I know of) does not satisfy the criterion of 'independence of irrelevant alternatives', as it should, since this criterion is silly. If this criterion were met, then since we have no history of games together, if you were to beat me in a 2-player game, you would instantly be higher rated than I, even if I had defeated several players in other games who in turn had defeated you.

"That of course doesn't mean that we should not try to fix the problems with the current rating system. We should. And I completely agree that the three things you listed are problems. My point is (and, in my experience) the more focused you are on solving problem A, the more likely you are to overlook problem B. So I would prefer to evaluate the rating system in itself, and not only in relation to what problems with the old rating systems it solves."

I agree with your general concern here. In my experience, that often happens if someone tries to fix something they themselves do not fully understand, causing them to be unaware of new problems they introduce.

"But let me be clear: Good initiative! I agree that these are problem, and it's nice to see somebody try to solve them. My criticism is that the description of your rating system is fragmented and (therefor) somewhat imprecise. It makes it hard to get a good overlook of exactly what it does."

The core of what it does is the same as in 2-player games and the way it gets extended to games with more players is described in the last three paragraphs under the title 'PROBLEM 2: HEADHUNTING'. The rest of the text describes extensions and extra fine-tuning on top of this core mechanic. You can also find a link to the precise formulas.


@erikip107

Good point! You are correct and I am aware of that issue, but unfortunately, this new rating system does not address it.
Oli (977 D Mod (P))
01 May 20 UTC
The current systems are easy to update from a coding perspective.
If someone would like to add a new system just create the additional code and make a pull-request on github so I can integrate this.

Implementing such a system is quite easy to code, as it's only the logic part that needs some coding. Mathematical code expressions are similar in most coding languages, so even if you have no PHP experience you should understand how this works. Adding this to the (quite complex) webdip code is easy and I can help with this.

You can see how the webdip-scoring is implemented here:
https://github.com/Sleepcap/vDiplomacy/blob/master/objects/scoringsystem.php

And the code for the vDip-rating here:
https://github.com/Sleepcap/vDiplomacy/blob/master/lib/rating.php
Swede03 (1517 D)
01 May 20 UTC
What if instead of doing the ten times/10 idea, we instead treated the winner as having beaten all 49 other players? The ratings are adjusted one by one, so after resolving his victory over the first player, his rating would go from say 1000 to 1020. Now since he is higher rated, he will probably gain fewer and fewer points as the rounds go on for beating each player. Obviously this has the downside of the players at the beginning losing more, but this can be compensated for at the end by taking the mean of what each player lost and subtracting that from every player, adjusted for rating of course.
Mittag (1396 D)
01 May 20 UTC
'nor do I know if you even know that yourself'
You definitely have a tendency to argue by insulting the competence of others. Excuse me if I'm not impressed.

'Voting' simply means that you aggregate separate rankings into one joint ranking. Yes, the games provide the separate rankings and, yes, since not all players play in all games, 'independence of irrelevant alternatives' cannot be fulfilled. This means, as you say, that beating someone the first time you play does not automatically make you the higher rank (which is not controversial). It also means that you can be undefeated against a player over 20 games or more, yet still be ranked below that player. As Arrow said himself, most rating systems work well most of the time, but all rating systems work badly at times.

The only thing you say in the three paragraphs you refer to, is that you condition the expected score of a player on the draw-size. If that's the only difference, then why didn't you just say so?
Chumbles (1380 D)
01 May 20 UTC
(+1)
I am on record as hating the WTA system... but the biggest and most intractable problem with ratings is that you are trying to derive a rating from performances in differing types of games. An analogy would be having a rating system that tried to quantify comparatively across Chess, Contract Bridge, Scrabble, Darts, Othello, Poker, Diplomacy. I once played in a Victor Ludorum with a group (ca. 20 or so) of friends. And there were pretty ferocious arguments about how many points would be awarded for each. But our group's de facto leader gave out a ukase that THIS is how it would run. End of.

Cross game comparison is really a mathematical exercise. It would be better to have a rating for each variant / player once a threshold of, say, 10 games, or the number of players in a game, is reached... so in the latter case you would only rate Viking Diplomacy IV once there were 8 finishes... Asymmetric variants like Rinascimento should NEVER be rated - why should the guy playing France get their rating stuffed up because they've drawn a power that can only win through unbelievable incompetence of others?

But don't introduce ratings influenced by countries' prior performance ffs, in a balanced variant the players are key; distorting ratings by introducing a random seed over time is insane. Typically you get the argument from folks who lose who want something to blame. Turkey is a great example of this Mercy.

But if you are going to have a pan-variant system, it seems to me that Mercy's proposals should be adopted in toto.

"The quality of Mercy's argument is not strained.
It droppeth as the gentle rain from heaven
Upon this heated site. It is thrice blest
As far as this old fool can see"
Mercy (2131 D)
01 May 20 UTC
@Oli

Thanks, I will look into this. But don't underestimate how bad I am at coding! :P


@Swede03

The new rating system I suggest does not model a win or a draw as 'beating' defeated players individually anymore, but looks at the game as a whole. I only explain that in my text after I explain the iteration process, and when I illustrate why I think iteration is a good idea, I use examples from current vRating, which works differently in this particular respect. So this probably wasn't clear, but that is why your proposed solution won't work for the particular rating system I envision. Because of the compensation that needs to be done at the end, I think this method would be more complicated than the iteration method I wrote about anyway, though.


@Mittag

Looking into it some more, I see that actually Arrow's impossibility theorem does not apply. His theorem applies for ranked-choice voting, but games do not rank players from best to worst; they score them.

"The only thing you say in the three paragraphs you refer to, is that you condition the expected score of a player on the draw-size. If that's the only difference, then why didn't you just say so?"

Because that would not be accurate. What you say is the only difference is actually the difference between the logic behind what I propose and GhostRating, not vRating, and also, it doesn't condition on the draw-size, but more generally on the way the game ended (for example in a PPSC game that ends in a solo it conditions on all the different scores obtained). As I summarize in my text: "The key difference with GhostRating is that GhostRating does not take into account how the match ended before guessing the players scores, but my proposed rating system does." Other reasons why I use more words are because I use examples now and then and end with a more in-depth look at the mathematics (as 'conditioning on' and 'taking into account' might otherwise remain vague concepts).
CCR (1957 D)
01 May 20 UTC
Mercy, should add a rule brought from Doug Massey's JDPR. The simple rule: "in the case of a replacement player, he would NOT lower his rating based on that game -- it would remain unchanged." I don't know if it already works this way, but it should.

And does it work for the player who left the game? Think of a player who brings the power to the SC count lead and then CDs; someone else replaces him and solos. Consider that the last player played only 10% of the game. Shouldn't the CD'ed player get points from this game?

The same when the original player loses. The loss is his, not that of the new player who came in taking over a bad position just to aid the game.

So, not only should that "simple rule" above be included; the original player must always answer for the results of his power, and there should be another rule for positive results, shared by all the players who led the power to the final result in a pro rata way, based on their share of seasons or years played.
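One possible reading of these rules, as a rough sketch: the function name `split_rating_change`, its parameters and the numbers are hypothetical, and the choice to charge the whole loss to the original player is just one interpretation of the proposal.

```python
def split_rating_change(change: float, seasons_played: dict, original_player: str) -> dict:
    """Split one power's rating change among everyone who played it.

    A loss (negative change) goes entirely to the original player.
    A gain is shared pro rata by the number of seasons each player played.
    """
    if change < 0:
        return {name: (change if name == original_player else 0.0)
                for name in seasons_played}
    total = sum(seasons_played.values())
    return {name: change * seasons / total
            for name, seasons in seasons_played.items()}

# The replacement played 2 of 20 seasons, i.e. 10% of the game.
seasons = {"original": 18, "replacement": 2}

print(split_rating_change(50.0, seasons, "original"))
# {'original': 45.0, 'replacement': 5.0}
print(split_rating_change(-30.0, seasons, "original"))
# {'original': -30.0, 'replacement': 0.0}
```

Either way the shares add up to the power's total change, so the split does not create or destroy rating points.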



Mittag (1396 D)
01 May 20 UTC
There are several versions of the theorem, and it's pretty obvious the principle applies.

You do realize that this is bizarre, right? You've referred twice to these three paragraphs as to explain how your rating system works. But now you say that these three paragraphs don't accurately describe how your rating system works.

Anyway. So, you condition on how the game ended. In a WTA game, that will be draw size, correct? If so, then let's be clear: You do not solve the headhunting problem. At least not in WTA.

If you condition on the draw size, then the highest ranked player will perform 'as expected' if they make the draw. In effect, the only way the highest ranked player can underperform is to be eliminated. In effect, players who want to maximize their rating should aim to eliminate the higher ranked players. Alas, headhunting.
Mercy (2131 D)
01 May 20 UTC
@CCR

Interesting points. Under my rating system, players who CD are always counted as having lost the game. I have to say that one of the reasons my rating system sometimes punishes players when they take over a position and fail is because of a conversation I have had about this in the past with tobi1 in private. He stated that 'a rating would be quickly reduced to absurdity if members who took over strong CDs for free can easily push the rating for nearly nothing' and I agree. However I think it is an interesting idea to make the loss of players who CD'd depend on how their country ends up scoring, and maybe even having them gain rating if their country ends up doing really well. You could also argue that someone who CDs needs to be punished regardless but I think your take has merit as well.


@Mittag

"There are several versions of the theorem, and it's pretty obvious the principle applies."

I don't think so. If you want to convince me otherwise, please state the exact version of the theorem you think applies, as well as to what aspects of our discussion the mathematical definitions in the theorem translate (such as who is voting and how they do it).

"You do realize that this is bizarre, right? You've referred twice to these three paragraphs as to explain how your rating system works. But now you say that these three paragraphs don't accurately describe how your rating system works."

No, I am merely saying that you do not accurately understand these three paragraphs. It could be that I just suck at explaining.

"Anyway. So, you condition on how the game anded. In a WTA game, that will be draw size, correct? If so, then let's be clear: You do not solve the headhunting problem. At least not in WTA.

If you condition on the draw size, then the highest ranked player will perform 'as expected' if they make the draw. In effect, the only way the highest ranked player can underperform is to be eliminated. In effect, players who want to maximize their rating should aim to eliminate the higher ranked players. Alas, headhunting."

This is not correct. The statement 'In effect, the only way the highest ranked player can underperform is to be eliminated.' is true for every player and thus trivial: if you make it to the draw, you don't underperform. The statement that follows it ('In effect, players who want to maximize their rating should aim to eliminate the higher ranked players.') is incorrect. I don't know how you came to that conclusion.

Maybe the following examples will clear up some confusion. Suppose Alex is rated high and Bob is rated really low. Let's assume you will make it to the draw and can choose between the following two options:
1. You draw with Alex and eliminate Bob;
2. You draw with Bob and eliminate Alex.
In option 1, Alex does not gain much rating (since he was already high rated) and Bob does not lose much rating (since Bob was already low rated). In option 2, Alex loses a lot of rating and Bob gains a lot of rating. In both options, though, your gain in rating is the exact same, so headhunting is not encouraged. Under old vRating, your gain in rating would be higher in option 2. If you think Bob gains rating for eliminating Alex: This is not true. Bob only gains rating from making it into the draw. He would have gained the same rating from drawing with Alex and eliminating you.
As another example of the opposite effect of headhunting, let's assume that you will be eliminated and have the option to throw the game to either Alex or Bob. The one you throw to will win the game. Under old vRating, you lose the least amount of rating if you throw to Alex. Under my proposed system, it doesn't matter.
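The two options can be made concrete with a minimal sketch. It is not the actual formula of the proposal: the `strength` curve, the way possible draw groups are weighted, and `K_FACTOR` are all assumptions made up for illustration. The one property it is meant to show is that the expected score is conditioned on how the game ended (here: the draw size) and not on who was eliminated.

```python
from itertools import combinations
from math import prod

SCALE = 1048.0    # hypothetical rating scale
K_FACTOR = 100.0  # hypothetical proportionality constant

def strength(rating: float) -> float:
    """Assumed strength curve: higher rating means higher weight."""
    return 10 ** (rating / SCALE)

def expected_scores(ratings: dict, draw_size: int, pot: float = 1.0) -> dict:
    """Expected score per player, given only that the game ended in a draw of draw_size."""
    # Assumption: each possible draw group is weighted by the product of its strengths.
    weights = {group: prod(strength(ratings[n]) for n in group)
               for group in combinations(ratings, draw_size)}
    total = sum(weights.values())
    expected = {name: 0.0 for name in ratings}
    for group, weight in weights.items():
        for name in group:
            expected[name] += (weight / total) * pot / draw_size
    return expected

def rating_changes(ratings: dict, survivors: set, pot: float = 1.0) -> dict:
    """Change is proportional to actual score minus expected score."""
    expected = expected_scores(ratings, len(survivors), pot)
    actual = {n: (pot / len(survivors) if n in survivors else 0.0) for n in ratings}
    return {n: K_FACTOR * (actual[n] - expected[n]) for n in ratings}

ratings = {"You": 1500, "Alex": 2200, "Bob": 900}
option_1 = rating_changes(ratings, {"You", "Alex"})  # draw with Alex, eliminate Bob
option_2 = rating_changes(ratings, {"You", "Bob"})   # draw with Bob, eliminate Alex

print(option_1["You"] == option_2["You"])   # True: your gain is the same either way
print(option_2["Alex"] < option_1["Bob"])   # True: Alex loses more when eliminated
```

In this sketch your expected score depends only on the ratings of everyone in the game and on the draw size, so choosing whom to eliminate cannot change your own gain.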
Mittag (1396 D)
01 May 20 UTC
You did notice that the separate rankings, already in Arrow's theorem, are allowed to include ties? That is, scoring the players gives a rank of the players. Theorem still applies. Not that I really understand why you are contesting it, we already concluded that 'independence of irrelevant alternatives' is not fulfilled.

The first reaction to learning about Arrow's theorem (or, any theorem for that sake) should maybe not be to claim that it is false. If you are interested in ranking systems, and you are a mathematician, then probably you should just teach yourself the principles behind it. I bet you'll find it interesting!

Yeah, I think the confusion stems from the fact that you haven't really explained how your system works, only how it relates to the other systems.

I get, in the last example you gave, that my score compared to my expected score does not change depending on who else is in the draw. Still, the last sentence of the three paragraphs we've been talking about reads "the change in vRating will be proportional to the difference." What is the scaling factor here?

Because, in principle, if the 'bet' (i.e., the maximal number of points you can lose) only depends on the ratings of the players involved, then the only way to eliminate headhunting is to make it so that the payoff only depends on the score.

Comparing to 1v1 games (i.e., Elo) is a little misleading here, because in 1v1 it doesn't matter whether it's the 'bet' or the 'payoff' that depends on the rating: the effect is the same. But in games with more than two players this is an important distinction.

66 replies
Samuel, o Louco (824 D)
19 May 20 UTC
Thinking of creating a Divided States variant, but in Brazil
I really like the Divided States variant. And I think it would be really cool in a Brazilian version too, and I'm thinking about creating it. But would that be considered a copy or something?
11 replies
Open
Metramax23 (969 D)
21 May 20 UTC
Join Please!!!- Sticks and Stones
We have 4 open spots guys please join!!!
1 reply
Open
Tener (950 D)
19 May 20 UTC
Private Messaging
Hi all. I am a newbie to vDiplomacy though I have played WebDiplomacy. How does private messaging work in vDiplomacy? I don't see the same account settings for messaging that I am used to. Thanks for helping me.
4 replies
Open
Lukas Podolski (1234 D)
16 May 20 UTC
Middle Earth variant
I recall we used to have one here, albeit quite unbalanced? Will we have anything like this in the near future?
8 replies
Open
vDiplomacy
Why is it called "v"Diplomacy?
8 replies
Open
CBro27 (1453 D)
16 May 20 UTC
Reporting potential cheating to Moderators
How do I report potential cheating in a gunboat game? GameID is 42792
1 reply
Open
TheBubonicTague (969 D)
14 May 20 UTC
Question About Custom Phase Lengths and Swaps
If I choose a custom phase length of 24 hours, then select "10 minutes" for time until phase swap, then "1 day" for phase length after swap, could someone describe what that would look like in-game?
10 replies
Open
umbletheheep (1023 D)
18 Oct 19 UTC
(+3)
New Weekly Diplomacy Newsletter
Hey guys. I've started a weekly diplomacy email. The goal is to keep everyone updated on the current events going on across the community both online and the local scene. I'll also be including weekly strategy articles. You can see the past issues and subscribe here https://bit.ly/2mvDOTX

I hope you'll sign up, but I also wanted to hear what are some things you'd like to see. Be sure to keep me updated also with new variants so I can announce them. Thanks!
12 replies
Open
TheBubonicTague (969 D)
14 May 20 UTC
New to Site: Playing Games With Fewer Players
So, I'm new to the site and am trying to play with a group of friends as a way to stay in contact with each other. There are 6 of us. All of us are new to Diplomacy, and figured it'd be best to play on the Classic map first, but noticed it's for 7 players. Is there a way to pick the Classic map variant but only play with 6 people instead of 7? I can't seem to find anything talking about that. Thanks!
4 replies
Open
Interactive map in variant testing
The interactive map in the variant I am testing does not work very well.
(Sometimes it does not show the movement arrows)
(When I try moving from Newfoundland to Quebec in the interactive order interface, it thinks I am trying to move to Quebec north coast)
21 replies
Open
Shaurya (708 D)
11 May 20 UTC
How do you redeem yourself?
I have 8 unexcused delays and want to play more games, as the only games I have going are very slow paced.
3 replies
Open