Forum
A place to discuss topics/games with other webDiplomacy players.
Page 58 of 160
Oli (977 D Mod (P))
27 Mar 12 UTC
Some changes to the Reliability rating...
I did some tests/tweaks to the reliability rating...
Page 3 of 5
Oli (977 D Mod (P))
28 Mar 12 UTC
And their reliability is internally still 100%...
Guaroz (2030 D (B))
28 Mar 12 UTC
The system Oli just released brings 2 great pieces of news:
1) CDs have the same weight whoever you are, no matter how much you have played here.
2) Taking over is encouraged, since it lets you recover your RR.

This is an impressive innovation, and we now also have a page explaining the system. Still, in the first post I expressed a couple of doubts, which I summarize here:
Doubt #1 - NMRs are still basically unpunished, while some of them are even worse¹ than a CD;
Doubt #2 - Giving TOs the same weight as CDs makes it too easy to abuse the system: I could CD a bad position and TO a better one, getting a meaningless penalty (only 2 missed phases), and my RR would basically not change if I had played enough phases.

So, following an idea of sqrg's, I propose a small but significant change to the current formula in order to minimize these issues.

- - - - HOW THINGS ARE NOW - - - -

First of all, let's make clear what happens to your stats when you miss a phase.

1) You miss a single phase in the middle of the game (NMR).
Your stats count: missed phases: +1

2) You miss the first phase of a game (First turn CD)
Your stats count: missed phases: +1 and Left: +1

3) You miss the phase for the second time in a row (CD)
Your stats count: missed phases: +2 and Left: +1

Now, let's give a short name to each stat (our data):

MP = # of missed phases
CD = # of left games
G = # of played games
TO = # of taken over games
P = # of played phases

Now the weights (parameters). In the current formula they are:
p = 200% | Weight for a single Phase missed
l = 100% | Weight for a Left game
t = 100% | Weight for a TO
h = 10% | Harshness parameter (how much the [CD-TO] imbalance decreases the RR)


This is what the current formula looks like [Oli, please correct me if I'm wrong]:

RR = [100 - p*100*(MP/P)] * [1- h*(l*CD - t*TO)]

(not to be lower than 0 or higher than 100)

Notice that, since l = t = 100% = 1, in Oli's original formula l & t are probably not even mentioned (and wisely, too).
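
As a minimal sketch, the formula above can be written out in Python. The function name `reliability_rating` and its argument names are made up for illustration, and flooring the first factor at 0 as well as clamping the final result is only an assumption about how "not to be lower than 0 or higher than 100" is applied. This is not Oli's actual code.

```python
def reliability_rating(mp, cd, to, played, p=2.0, l=1.0, t=1.0, h=0.1):
    """RR = [100 - p*100*(MP/played)] * [1 - h*(l*CD - t*TO)], clamped to 0..100.

    `played` is P (phases played) in the current formula.
    """
    # First factor: share of missed phases, weighted by p (assumed floor at 0).
    first = max(0.0, 100.0 - p * 100.0 * (mp / played))
    # Second factor: penalty for left games, offset by take-overs.
    second = 1.0 - h * (l * cd - t * to)
    return min(100.0, max(0.0, first * second))


# Doubt #2: CD a bad position (2 missed phases, 1 Left), then take over another.
# 400 phases played is an arbitrary figure chosen for the illustration.
print(reliability_rating(mp=2, cd=1, to=0, played=400))  # after the CD
print(reliability_rating(mp=2, cd=1, to=1, played=400))  # after the TO
```

With these inputs the second call comes back to roughly the same rating the player had before leaving, which is the abuse described in Doubt #2.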

- - - - A SMALL CHANGE - - - -

As sqrg said "The old system did not work because the ratio of phases missed to phases played is always very small."
True! The ratio of phases missed to GAMES played is much more meaningful. Players should miss a phase once in 50 games "because sometimes you're an idiot" and for no other reason.

So I believe we should replace P with G in the formula. ONLY this. Then:

RR = [100 - p*100*(MP/G)] * [1- h*(l*CD - t*TO)]

Once you have played 1,000 games, an NMR will be a very small issue for you. But that will take time and, after all, by then we'll really know if/how reliable you are.

- - - - THE RIGHT PARAMETERS - - - -

If the Formula is the heart of the RR System, Parameters (p, l, t, h) are the blood. They are the tools we use to make the system fair. Hence they're very much open to debate, and what I put here are just my ideas.

Well, it looks self-evident that, after the change, p can't be 200% anymore, or a newcomer with only 2 games played would be wiped out after the first NMR (RR would be 0%).
We should decrease it. With p=120%, a player who has completed 2 games and NMRs once will have RR=40%; he'll be able to join 4 games, and as he completes the 3rd game he'll jump to RR=60%, being able to join 6 games. After the 4th game has ended, RR=70%.
Now Doubt #1 looks fixed.

The l & t parameters are strictly linked to each other, as they measure whether and how much CDs and TOs have different weights. They're also both linked to h, which measures how much the whole CD&TO matter weighs on the final RR.
As stated before, I believe that you should need at least 2 TOs to recover a CD. This can be translated in different ways by setting the l, t and h parameters.
The simplest way is: l = 100% and t = 50%, keeping h = 10%, which looks fair enough. Doubt #2 also looks fixed now.
The new parameters could be:

p = 120% | weight for a single Phase missed
l = 100% | Weight for a Left game
t = 50% | weight for a TO
h = 10% | Harshness parameter (how much the [CD-TO] imbalance decreases the RR)

Let's see what happens with sqrg's examples:
>>Examples: you've played 60 games and missed just 1 phase because sometimes you're an idiot. Calculate chance = 1 - ( 1 / 60 ) = .983 so a rating of 98%.>>
RR = [100 - 1.2*100*(1/60)] * [1 - 0.1*(1*0 - 0.5*0)] = 98%
>>But what if you're really unreliable with 3 missed phases in 4 games? Chance = 1 - ( 3 / 4 ) = 0.25 so a rating of 25%. We can call it asshole rating too.>>
RR = [100 - 1.2*100*(3/4)] * [1 - 0.1*(1*0 - 0.5*0)] = 10%
>>Another example rating. If you missed 5 phases, left 1 game and finished 80 games the rating would be 95%>>
A CD inside? Nononono.
RR = [100 - 1.2*100*(5/80)] * [1 - 0.1*(1*1 - 0.5*0)] = 83.25%

Notice that the last example really encourages you to take over some CDs. With a couple of TOs you'd be back in group "A" again:
RR = [100 - 1.2*100*(5/80)] * [1 - 0.1*(1*1 - 0.5*2)] = 92.5%
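
For anyone who wants to re-run the examples above without a spreadsheet, here is a small Python sketch of the games-based formula with the proposed parameters. The name `proposed_rr` is hypothetical, and flooring the first factor at 0 is an assumption taken from the "not to be lower than 0" wording.

```python
def proposed_rr(mp, cd, to, games, p=1.2, l=1.0, t=0.5, h=0.1):
    """Games-based variant: RR = [100 - p*100*(MP/G)] * [1 - h*(l*CD - t*TO)]."""
    first = max(0.0, 100.0 - p * 100.0 * (mp / games))  # assumed floor at 0
    second = 1.0 - h * (l * cd - t * to)
    return min(100.0, max(0.0, first * second))


cases = {
    "1 NMR in 60 games": dict(mp=1, cd=0, to=0, games=60),
    "3 NMRs in 4 games": dict(mp=3, cd=0, to=0, games=4),
    "5 NMRs, 1 CD, 80 games": dict(mp=5, cd=1, to=0, games=80),
    "same, after 2 TOs": dict(mp=5, cd=1, to=2, games=80),
}
for label, stats in cases.items():
    print(f"{label}: {proposed_rr(**stats):.2f}%")

# Newcomer with one NMR, as games 2, 3 and 4 are completed.
print([round(proposed_rr(mp=1, cd=0, to=0, games=g)) for g in (2, 3, 4)])
```

Changing the default values of p, l, t and h is the quickest way to try other parameter sets.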

- - - - IS IT COMPLEX ? - - - -

No, not really. The "What is the reliability rating?" page would have only a few changes:

Your rating depends on 2 important factors: how many phases you failed to enter orders for, compared to your total GAMES played, and how many games your country went into CivilDisorder in because you didn't even check the gamepage for 2 turns in a row.
The first part is 100 minus phases missed / games played * 120, not to be lower than 0.
Example: If a user missed a phase in 5% of their games, the rating would be 94 (6 lost), 10% would be 88 (12 lost), etc.
From this rating we subtract 10% for each game you left before the end.
The penalty for the "Left" games seems a bit harsh, but many games get totally screwed if a player does not play the game till the end. Most of the time some countries gain really big unearned advantages. But you can even out your lost reliability by taking 2 open spots from games [link here] other players left.

Simple, no?

- - - - A DEBATE? - - - -

Please tell me what you think.
Make your own examples², perhaps you'll find parameters' values better than mine.
NMRs are too penalized? Decrease p!
TOs are still worth too much? Decrease t or increase l! :-)


Finally, if it were possible, I believe that the "half buy-in" rule would make this system really effective, encouraging TOs. A CD should cost you in Reliability, not in D-money.

- - - - § § § - - - -



______notes________
¹Consider these examples:
A) A player misses the first Autumn Diplomacy phase of a game and, as a consequence, fails either to grab an extra SC he could have taken or to contest an SC with an opponent, letting him in.
He won't be able to build all the units he was expected to, or some neighbour will build more than expected. Or both. From the second year on, this player will be in a very harsh position and his neighbours will have a big advantage.
Only because of a single NMR.
B) A game is coming to an end because 2 players can set up an unbreakable stalemate line against a 3rd, bigger Power. One of the first 2 players misses a crucial turn and the stalemate line can no longer be formed. The game ends in a Solo and not in a Draw.
C) Last year of a game in which, unless the biggest power makes some incredible mistake, the Solo is clear and everyone's just trying to survive. A player with 1 unit disappears and CDs. The winner needs his SC and can easily take it; nobody could support him, he could support nobody, and he can't move anywhere useful.

All bad stories, but I don't think that the 'C' CD is worse than 'A' or 'B' NMRs.

- - -

² For Excel addicts:
Put the data and parameter values into these cells:
C1 = MP
C2 = CD
C3 = TO
C4 = G
C5 = p
C6 = l
C7 = t
C8 = h
And somewhere else this formula:
=(100-((C1/C4)*C5*100))*(1-(C6*C2-C7*C3)*C8)
Now change the values in cells C1 to C8 and run your tests!

To test the current system, just put the value of P instead of G into cell 'C4' (total phases instead of total games).
@Guaroz - Who the fuck cares? It's Oli's site and I appreciate everything he did. The system will do what it does and bitching about it here makes you look both petty and like some kind of pseudo-intellectual. Most of us here can do the math and don't need to be hand-guided. Let Oli work out the kinks. He said it is likely to change given time, so quit bitchin' about it.
Hi Fuzzy! I was going to join (even did) then noticed 10 minute phases starting in 1 hour. No can do, I'm afraid. So I left.
fuzzyhartle1 (856 D)
28 Mar 12 UTC
oh well I made another game EvT that is in half an hour. it starts sooner i think when someone joins.
It's 2 hours until I go home. I can't be getting in any live games right now.
Oli (977 D Mod (P))
28 Mar 12 UTC
(+1)
Gathering ideas is always a good thing.

IMHO one of the biggest problems is people need to be able to recover from a bad RR in a reasonable time if they want to. This is just a small site for some funny diplomacy-variants and nothing really important. RL issues happen and people will get CDs and NMRs for issues they can't control. And receiving a Left because you can't access the site for some time and a badass-player does not vote for a pause has a really bad taste...

The RR is more a tool to make people aware of the problems that CDs and NMRs cause in ongoing games. Ultimately someone with a bad rating can just create a new account and get a 100% rating in no time. And most people will just do that if the system puts too much pressure on their shoulders.

Most CDs and NMRs are from new players that do not realize the harm they do to ongoing games. If their rating goes down too much (and at the moment the system does harm the new players much more than the older players) people will adapt and play better games if we let them play here and recover from their early mistakes. It's no help if they just leave the site (or make a new account and abandon their old one).
K, Oli. I just felt like he kept pushing this not liking your system a bit. But I know you can stand up for yourself.

The simple compromise could be to reduce how much a take over recovers. Instead of one to one, make 2 or 3 take overs compensate for a Resign.
Guaroz (2030 D (B))
28 Mar 12 UTC
@ YouCan'tHandleThisDick. Who the fuck are you? You joined VDip a week ago...
If you like to put your finger into assholes and then smell it, please use your own hole. Now.
Guaroz (2030 D (B))
28 Mar 12 UTC
Yes Oli, I fully agree. The parameter values I proposed are, from this point of view, too severe, but they were only an example. Parameters are the tools we use to make the system fair. If things go as I believe and the stats you're gathering confirm it, some fine-tuning will be needed, I'm afraid.

About new players, a good tool could be setting up a separate group for them. They'd be in this group ('N' for Newcomers?) for, say, 1 or 2 months after signing up to the site. During this time they'd have some light limitations on joining new games, but they would be protected against a collapse of their RR after a CD or a few NMRs, which would be catastrophic given their low # of phases (or games) played. This feature would also discourage opening new accounts, because a new account would start out limited for a certain time, so it would probably be better to try to recover the first account.
Would it be doable?
@Guaroz - I have a long and distinguished history at WebDip, WAY longer than you have been here. And I asked Oli to ban my old account so I could come here under a new one (the old one never played anything anyhow). I've been here since almost day one under a different account. So you might want to stop making assumptions about me.

Just check userID=222

But I will apologize for going off on you. You are trying to improve things and I can appreciate that. So sorry about that.
Guaroz (2030 D (B))
28 Mar 12 UTC
@YouCan'tHandleTheTruth - In this case, your apology is welcome. Let's forget it. :)
Thanks. <*extends hand of friendship*>
Guaroz (2030 D (B))
28 Mar 12 UTC
Thanks to you! <*shaking it*>
G-Man (2466 D)
28 Mar 12 UTC
Nice gentlemen. Take it out on the battlefield!!
Decima Legio (1987 D)
28 Mar 12 UTC
If you're going to use the # of games instead of the # of phases, you should be aware that you can't rate a newcomer until he finishes his first game.

YouCan'tDivideByZero...

I still think the #of phases fits better what is below the fraction line.
On the other hand you can roughly assume that 1 game (an "average" variant) is worth 40 phases, so you can just tune the coefficient in front of (MP/P) to have an equivalent formulation of yours, still using #of phases.
It's more stable, especially for people with a small number of phases in the score.
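
The retuning described above can be sketched in a few lines of Python. It assumes every game lasts exactly 40 phases (the rough average quoted in this post; real games vary) and uses p = 1.2 from the earlier proposal. Both function names are hypothetical.

```python
PHASES_PER_GAME = 40  # rough average quoted above; an assumption, not a site constant


def first_factor_by_games(mp, games, p=1.2):
    """100 - p*100*(MP/G), floored at 0."""
    return max(0.0, 100.0 - p * 100.0 * mp / games)


def first_factor_by_phases(mp, phases, p=1.2):
    """Same penalty on a phase basis: the coefficient becomes p * PHASES_PER_GAME."""
    coeff = p * PHASES_PER_GAME
    return max(0.0, 100.0 - coeff * 100.0 * mp / phases)


# One NMR after 2 average-length games (80 phases): both forms agree.
print(first_factor_by_games(mp=1, games=2))
print(first_factor_by_phases(mp=1, phases=80))

# The phase-based form is already defined before the first game is finished.
print(first_factor_by_phases(mp=0, phases=12))
```

The two forms only coincide when a player's games really average 40 phases; for anyone else the phase-based version rates them by what they actually played.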
Guaroz (2030 D (B))
28 Mar 12 UTC
Well, DL, divideByZero could be one more reason to set up the new players' group I mentioned before. Don't you like the idea?
sqrg (1186 D)
28 Mar 12 UTC
Haha nice discussion here. I like your ideas Guaroz, they take a few more things into account. And YCHTT, it's not that we're bitching. We just like to think about this kind of stuff. Not that any of it really matters that much.
You would never catch me taking any of the work Oli does for granted. In fact it was Oli who encouraged me to join this forum post with my idea ;)
In any case, no harm done.

@Decima: haha we can't ;) But it will only matter for very new players and it can be fixed by cheating it not to be zero in those cases.
What you suggest about taking the average number of phases for a game is something i posted a little earlier. It gives you a nice rating too.

That's really interesting, so many ways to look at this.
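
One way to read "cheating it not to be zero" is to floor the games count at 1 before dividing. This is only a guess at what such a fix might look like; `missed_ratio` is a hypothetical helper, not anything from the site's code.

```python
def missed_ratio(missed_phases, games_finished):
    """Missed phases per game, treating an unfinished first game as one game."""
    # max(..., 1) keeps the denominator non-zero for brand-new players.
    return missed_phases / max(games_finished, 1)


print(missed_ratio(0, 0))   # new player, nothing missed
print(missed_ratio(1, 0))   # new player, one NMR in the first game
print(missed_ratio(1, 60))  # veteran with a single NMR
```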
Decima Legio (1987 D)
29 Mar 12 UTC
Yes, exactly.
How are we going to handle the new users is the point where I was trying to bring the discussion with the word "YouCan'tDivideByZero"
Decima Legio (1987 D)
29 Mar 12 UTC
...but reading above, I found another argument why we should continue using #phases instead of #games.
Oli, please correct me if I'm wrong.
#phases stats are gathered since September 2011.
#games stats are there since the beginning I suppose (2 years ago).
Any N.M.R. or C.D. before Sept 2011 is not taken into account by the system,
so if you use a #games-based-formula you'll overestimate many "old guys'" RR.
Oli (977 D Mod (P))
29 Mar 12 UTC
@DC: Yes.
Oli (977 D Mod (P))
29 Mar 12 UTC
Any ideas how to name different "phases played"-groups?
0-50 = Beginner
50-100 = ??
100-500 = ??
500+ = ??
Or any other ideas?
fasces349 (1007 D)
29 Mar 12 UTC
Beginner
Rookie
???
Veteran
Decima Legio (1987 D)
29 Mar 12 UTC
newcomer or something similar fits better for 0-50. I wouldn't want to hurt real life experienced guys
Oli (977 D Mod (P))
29 Mar 12 UTC
Sounds good.
0-50 Newcomer
51 - 200 Rookie
201- 600 Practitioner
600+ Veteran
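
A possible way to turn the list above into a lookup, sketched in Python. The names are hypothetical, and since 600 appears in both "201- 600" and "600+", the sketch assumes exactly 600 phases still counts as Practitioner.

```python
from bisect import bisect_left

# Upper bounds (inclusive) of the first three groups, from the list above.
UPPER_BOUNDS = [50, 200, 600]
GROUP_NAMES = ["Newcomer", "Rookie", "Practitioner", "Veteran"]


def experience_group(phases_played):
    """Map a phases-played count to its group name."""
    # bisect_left returns the index of the first bound >= phases_played,
    # which is exactly the index of the matching group.
    return GROUP_NAMES[bisect_left(UPPER_BOUNDS, phases_played)]


for n in (0, 50, 51, 200, 201, 600, 601):
    print(n, experience_group(n))
```

Trying different cut-offs only needs a different `UPPER_BOUNDS` list.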
King Atom (1186 D)
29 Mar 12 UTC
I think they should be farther spaced than that. You can use those numbers if you want to, but I would prefer that we use something more along the lines of:
0-100 Newcomer
101 - 350 Rookie
351- 999 Practitioner
1000+ Veteran

But if not, that's fine. I just think that if you're going to rank people, you might as well make the highest one be something you have to really strive for...
Oli (977 D Mod (P))
29 Mar 12 UTC
This should just name the different options in the gamecreate-options.
And it does not make much of a difference if you play a 500+ or a 1000+ player. Both committed quite a lot of time on the site and know how these things work.
And if you have only a few Veterans they do not need a special gamecreate-option. They can make a passworded game and a forum post.
It's just to make it easy to set a certain minimum-site-experience for your games, not for people to feel special. That's what the DPoints are for. :-)
King Atom (1186 D)
29 Mar 12 UTC
Alright Oli, I was just letting my competitive nature push me into getting a little bit ahead of myself...
raapers2 (1787 D)
30 Mar 12 UTC
Since I'm on a roll, I thought I'd ask one other thing: is there any way to put the reliability rank/score next to players in anonymous games? I definitely understand if those two are linked code-wise, but it would be really nice to know before/during a game if players are Ds or Fs and might CD at any time. It all depends on your definition of "anonymous" too, I suppose.


140 replies
So how do the points get split if your enemies concede?
I don't really care about points, but just curious how that works out in a PPSC environment. Does the winner still get half the points and the rest get divided between the survivors somehow? Or does the winner get his present worth which is less than a win would get him considering he must be shy of the win.
4 replies
Open
butterhead (1272 D)
11 Apr 12 UTC
Who else has ever done this:
So recently I played a classic game, and I found myself doing something strange, yet it seemed to pay off greatly... I found myself pulling out my board game copy of Diplomacy, putting the units on the map, and after every turn, I would try to find all the different options my neighbors could make, and find ways to counter them. I would use information from the game to pick the most likely moves to be made, and counteract them. So just a question: who else finds themselves doing this?
8 replies
Open
Sid Meier Pirates! - EOG (of sorts)
13 replies
Open
goldfinger0303 (2136 D)
10 Apr 12 UTC
Need France, Excellent position
http://www.vdiplomacy.com/board.php?gameID=6734
4 replies
Open
PowMacP (889 D)
05 Apr 12 UTC
Pirates
Pirate Map
gameID=7321
Missing 6 players
PW: Purps
14 replies
Open
~ Diplomat ~ (1036 D X)
19 Mar 12 UTC
Well who created 9 games named: Brazil kicks butt!!!
He must have done something wrong .. He created same name game 9 times...lol
15 replies
Open
iLLuM (1569 D)
10 Apr 12 UTC
Bug in Pirates (A Pirate's Life For Me)
Some player noticed that two frigates did not bounce, but just changed places. Please check and correct.

http://www.vdiplomacy.com/board.php?gameID=7065
1 reply
Open
airborne (970 D)
09 Apr 12 UTC
Ante Up?
Is it possible to code having a new game setting where there is a minimum bet or an "ante" and anyone joining can bet the ante or more into the game's pot?
23 replies
Open
OatNeil (908 D)
10 Apr 12 UTC
Three more players
For this awesome 10 player game: gameID=7294
0 replies
Open
fasces349 (1007 D)
09 Apr 12 UTC
Golden Fascist Sheep EOG statements
gameID=6648

So longest of 1066 yet, finishing in 1083. Well played by Shep and Goldfinger, it was probably one of the most exciting games (at least imo) I have played in a while.
4 replies
Open
Nemesis17 (1709 D)
08 Apr 12 UTC
World War 4 map
We only need 6 more players please join gameID=7228
4 replies
Open
danr (989 D)
08 Apr 12 UTC
Reliability rating
With only two games played on this page - both being victories with all moves received, I am a bit puzzled that my reliability rating is F. Further, it suggests that I have missed 1 of 3 phases, but in the two games I have played, we did more than 3 phases. Any ideas?
1 reply
Open
gopher27 (1606 D Mod)
06 Apr 12 UTC
Gopher walks down the street
So I saw a flyer today for Re-Occupy Minnesota, and it was emblazoned with the phrase "sic semper tyrannis". If I was trying to build a political mass movement, I think I would avoid referencing the assassination of Abe Lincoln in a manner that seems to support it. Am I crazy?
6 replies
Open
DEFIANT (1311 D)
07 Apr 12 UTC
CRASHED
Just got the message in my game where the time frame is like(4 hours), it says CRASHED there, what exactly does that mean, please?
Thanks
2 replies
Open
Scordatura (1396 D)
07 Apr 12 UTC
GreatLakes testgame
I made a new game on the lab. The first ever! Please join if you are able.
7 replies
Open
~ Diplomat ~ (1036 D X)
07 Apr 12 UTC
NEED SITTER PLEASE HELP...
I need someone to play my games for 5 days as I have to go to rural areas. So no connectivity..I hope someone will help.
12 replies
Open
King Atom (1186 D)
01 Apr 12 UTC
My Glorious Extravaganza!
PLUS: Introducing a game type to earn Honor ₧!
The "Fanfare" Game! Everyone earns Honor ₧ according to this formula:
(# of SC's you have when eliminated)-(Initial # of SC's)+(The highest # of SC's you ever had) It's designed so that everyone can earn as much Honor ₧ as they can!
24 replies
Open
~ Diplomat ~ (1036 D X)
07 Apr 12 UTC
(+1)
To Everyone: Please Help
I have to go out to nowhere where there is no internet connectivity so please vote extend or pause wherever you see one.
2 replies
Open
~ Diplomat ~ (1036 D X)
07 Apr 12 UTC
My First 34 SC win. lol
3 replies
Open
~ Diplomat ~ (1036 D X)
07 Apr 12 UTC
GAME CRASHED: gameID=7108
OLI please do something.
1 reply
Open
~ Diplomat ~ (1036 D X)
22 Mar 12 UTC
Number Of Games at a Time!
I am playing 15 GAMES ...Ah So many .... And In Midst Them is my 100 game!
27 replies
Open
goldfinger0303 (2136 D)
06 Apr 12 UTC
Replacement Holland needed
gameID=6945 Not a hopeless position by any means, but not the easiest pick of the barrel.
1 reply
Open
butterhead (1272 D)
03 Apr 12 UTC
The 360 league
So: We discussed this a couple months back, and then it died, but I just saw a webdip thread on this so I thought I'd bring it back up. who would be interested in forming a group on the Xbox 360 to join up and play some games together. if you are interested, Post here with your xbox live name and the games you like to play(online).
22 replies
Open
Wolfman (1230 D)
06 Apr 12 UTC
A bug or glitch just happened.
In this game: http://www.vdiplomacy.com/board.php?gameID=6012 Russia's move from Coast of Mexico to Guat should have worked. Anyone know why it didn't.
2 replies
Open
Dharmy (956 D)
06 Apr 12 UTC
2 good spots open in a HAVEN
http://www.vdiplomacy.com/board.php?gameID=6379
0 replies
Open
General Cool (978 D)
06 Apr 12 UTC
Leaving a Game
There is someone who left a game, but I can't join it in their place for some reason...
4 replies
Open
goldfinger0303 (2136 D)
05 Apr 12 UTC
What is the draw etiquette on this site?
Because as far as I've seen, there is none.
12 replies
Open
bozo (2302 D)
05 Apr 12 UTC
Replacement Players Needed for Chaos Game
We need two replacement players for a public press Chaos game: gameID=7006
3 replies
Open
Raffy (1482 D)
05 Apr 12 UTC
how many mods are there ?
or is Oli the only 1 ?
7 replies
Open