Monday, September 18, 2017

Turnbull's MonsterMark System

Following on from the discussion of systematized EHD (equivalent hit dice) last week, let's look at a much earlier attempt at the same idea. In the first three issues of White Dwarf magazine, Don Turnbull presented a measurement he called "The Monstermark System". This would be through the summer and fall of 1977, that is, exactly 40 years ago as I write this. (Thanks to Stephen Lewis for the tip-off to these articles!)

In the third article in the series, Turnbull writes:
 Although it has been said by quite a few D&D addicts that the Greyhawk system of experience points, which is based on monsters' hit dice, is too stingy I don't think this is something which can be considered in isolation...  So, circuitously, back to experience points. In my view they are intended to reflect risk. A character gets experience for meleeing with a monster because there is a finite, non-zero, risk that he will be killed or at least suffer wounds which could contribute to his eventual death. He gets experience for gold because he has taken risks to grab it... He should not, however, get experience for finding a magic sword or that seven-spell scroll since these things will assist him in getting experience by other means... Since the whole point of the Monstermark is to measure the risk inherent in tackling a particular monster, experience points should bear a linear relationship to M...
I fully agree with those observations, and my motivation for EHD is exactly the same: to provide a measure of risk, from which we can support a simple, linear calculation for experience points. We both assume a protagonist fighter with a fixed armor type, shield, and a sword; we both give the fighter one attack per round. Now, the basis of his system is this: for the default fighter, compute the amount of damage he would expect to take fighting the monster (assuming the combat never ended from the fighter's death). In this case, the calculation is done by first computing the number of rounds the monster would expect to live (D); and then multiplying that by the expected damage per round (analogous to the DPS -- damage-per-second -- statistics in MMORPGs) for an overall aggression level (A). In the first article, Turnbull presents it like this:
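As a rough sketch of that expected-value arithmetic, here is a small Python version. Turnbull's exact formula is not reproduced in this text, so the function names, the to-hit chances, and the sample monster are all illustrative assumptions rather than his published figures.

```python
def expected_rounds_alive(monster_hp, fighter_hit_chance, fighter_avg_damage):
    """D: rounds the monster is expected to last, approximated as
    hit points divided by the fighter's expected damage per round."""
    return monster_hp / (fighter_hit_chance * fighter_avg_damage)


def aggression(monster_hp, fighter_hit_chance, fighter_avg_damage,
               monster_hit_chance, monster_avg_damage, monster_attacks=1):
    """A: expected total damage dealt to a fighter who never dies,
    i.e. D times the monster's expected damage per round."""
    d = expected_rounds_alive(monster_hp, fighter_hit_chance, fighter_avg_damage)
    damage_per_round = monster_attacks * monster_hit_chance * monster_avg_damage
    return d * damage_per_round


# Hypothetical monster (not from Turnbull's tables): 3.5 hit points,
# hit by the fighter half the time for 3.5 damage, and hitting back
# 30% of the time for 3.5 damage.
sample_a = aggression(3.5, 0.5, 3.5, 0.3, 3.5)
```

With these made-up inputs the monster lasts 2 rounds and deals about 1.05 damage per round, so A comes out at roughly 2.1.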


This seems like a solid, undeniably valid base measure of monster risk level. As long as the monster has no special abilities. Which is, as you know, almost none of them. As soon as a monster has special abilities, then Turnbull is forced to step out of the methodical expected-value analysis and revert back to a purely discretionary set of multipliers, hoping to estimate the power of various abilities, to get the final MonsterMark score (M). As he writes, "All this is very subjective and I would be surprised not to meet with different views, but the following bonus relationships seem to give results which instinctively 'feel' right:"


Now, if you take only one thing away from this blog, I hope that it's this: these kinds of a la carte scoring systems for game entities are always a lost cause. The inter-relationships of different abilities and powers are too complicated to be encapsulated in such a system; the true acid test can only be made by systematic playtesting (which is very hard).

Consider a few short counterexamples -- A giant rat given magic-to-hit defense is effectively unbeatable by the PCs it normally fights; but a very old red dragon, given the same ability, would have little effect against its high-level opponents (surely wielding magic weapons already). If ghouls have possibly paralyzing attacks, then it makes a huge difference if they have one attack for 1d6 damage, versus three attacks for 1d2 damage (even with nearly the same expected damage). Centipedes and carrion crawlers, with a base damage of zero, even with poison or paralysis, would generate a product that is still zero by this multiplicative system. And so on and so forth.
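Two of those counterexamples reduce to simple arithmetic, sketched here in Python. The function names, the multiplier value, and the 50% hit chance are assumptions chosen for illustration; none of them are Turnbull's numbers.

```python
def multiplicative_score(base_aggression, multipliers):
    """Apply ability multipliers to a base aggression score."""
    score = base_aggression
    for m in multipliers:
        score *= m
    return score


def chance_of_any_hit(hit_chance, attacks):
    """Probability that at least one of several independent attacks lands."""
    return 1.0 - (1.0 - hit_chance) ** attacks


# A zero-damage monster stays at zero whatever its (hypothetical) bonus.
crawler_score = multiplicative_score(0.0, [3.0])

# Ghoul variants at an assumed 50% hit chance: average damage per hit is
# 3.5 for 1d6 and 1.5 for 1d2, but exposure to paralysis differs a lot.
one_attack = chance_of_any_hit(0.5, 1)     # 0.5
three_attacks = chance_of_any_hit(0.5, 3)  # 0.875
```

The expected damage per round is close in the two ghoul cases (1.75 versus 2.25 at this hit chance), while the chance of being touched at all, and so of risking paralysis, rises from 50% to 87.5%.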

Nevertheless, Turnbull pushes forward with the tools he has, first presenting a table of basic humanoids without special abilities (of which there's really only a half-dozen), and then separate tables for various other categories of monsters from OD&D, the Greyhawk supplement, and a few magazine articles current at the time. For a few examples of his M scores: orcs get 2.2, ogres 29.9, trolls 158.4, and red dragons 675.5 (by comparison, I give those creatures EHD values, respectively, of 1, 4, 9, and 32; and no, I don't think that going into decimals here is a great idea). Ultimately he recommends giving XP of 10 times his M score, which is generally about double the low Greyhawk XP awards for these sample creatures (whereas I still prefer 100 times the EHD level, in the spirit of Vol-1).
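To make the two XP schemes concrete, this short Python sketch applies both to the four sample creatures quoted above. The M and EHD values come from the text; the function names are just labels for this example.

```python
# (Monstermark M, EHD) pairs as quoted in the paragraph above.
SAMPLES = {
    "orc": (2.2, 1),
    "ogre": (29.9, 4),
    "troll": (158.4, 9),
    "red dragon": (675.5, 32),
}


def xp_turnbull(m):
    """Turnbull's recommendation: 10 XP per point of M."""
    return round(10 * m)


def xp_ehd(ehd):
    """The EHD approach: 100 XP per equivalent hit die."""
    return 100 * ehd


xp_table = {name: (xp_turnbull(m), xp_ehd(e)) for name, (m, e) in SAMPLES.items()}
for name, (turnbull, ehd) in xp_table.items():
    print(f"{name:10s}  Turnbull {turnbull:5d}   EHD {ehd:5d}")
```

The two schemes do not differ by a constant factor: orcs and ogres earn more under the EHD rule (100 and 400 versus 22 and 299), while trolls and red dragons earn more under Turnbull's (1,584 and 6,755 versus 900 and 3,200).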

There are 73 monsters for which Turnbull & I both are willing to give measurements. Consider the correlation between our assessments:


That's not very close at all. The data points are scattered all over the place, not close to any regular relationship; knowing one measure only allows you to predict about 50% of the variation in the other measure. On average, Turnbull's Monstermarks are about 20 times what I find for EHD levels, but that doesn't tell us much. He assumes plate armor for fighters whereas I assume chain (for reasons given last week), but that can't explain the low correlation either. Let's look at some specific cases for why this is.
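For readers who want to check this kind of figure themselves, here is a minimal Python sketch of the calculation, assuming that "about 50% of the variation" refers to the squared Pearson correlation (R-squared). The data below are only the thirteen EHD/Monstermark pairs quoted in this post, not the full 73-monster set, so the result on this subset need not match the figure above.

```python
from math import sqrt

# (EHD, Monstermark) pairs quoted in this post; a small subset of the 73.
PAIRS = [
    (1, 2.2), (4, 29.9), (9, 158.4), (32, 675.5),   # orc, ogre, troll, red dragon
    (25, 128), (13, 56), (14, 120), (9, 22),        # basilisk, medusa, carrion crawler, harpy
    (39, 440), (33, 420), (14, 758), (18, 707),     # vampire, treant, fire lizard, hydra (10 heads)
    (20, 700),                                      # mind flayer
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


ehd_values = [e for e, _ in PAIRS]
mm_values = [m for _, m in PAIRS]
r = pearson_r(ehd_values, mm_values)
r_squared = r * r  # share of variance in one measure explained by the other
print(f"r = {r:.2f}, R^2 = {r_squared:.2f}")
```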

The most obvious problem for Turnbull is this: The Monstermark system cannot handle area effect abilities at all. His model tries to do accounting on the hit points from breath weapons (in the 2nd article), but he steadfastly assumes just a single deathless fighter in melee against a given monster; so, if a red dragon breathes fire, then only damage to that one fighter is accounted. But that doesn't reflect the true risk or utility of area-effect weapons like that; our PCs don't adventure in solitude but in groups of some size. The examples of dragon combat in both OD&D and AD&D show three PCs being incinerated at once from a single breath attack; so the damage/risk multiplier should really be at least several times higher than Turnbull counts. Likewise, petrification weapons get no distinction for delivery by touch or wide-area gaze -- the cockatrice (touch), medusa (gaze), and basilisk (both!) each get an identical 2.5 multiplier for their abilities. This alone probably accounts for a massive skewing in many of his scores, downward from the true risk level. In contrast, my Monster Metrics program runs up to 64 opposition fighters simultaneously against any given monster, and they suffer appropriately from area or gaze weapons.

Some examples where the Monstermarks seem clearly too low:
  • Basilisk (EHD 25, MM 128), with its combined touch-and-gaze petrification, which only gets the same multiplier as a cockatrice does. 
  • Medusa (EHD 13, MM 56), likewise with her area-effect gaze petrification.
  • Carrion Crawler (EHD 14, MM 120); as noted above, the multiplication system from zero damage should come out to zero, so I think he just made this up from whole cloth (note the round number). 
  • Harpy (EHD 9, MM 22), with her mass charm song ability, shouldn't be weaker than an ogre.
Another rather egregious issue is this, although it affects only two creatures: Summoning abilities are entirely left out of the accounting. As noted before, we find these abilities to be among the most potent in the game! But the Monstermark system actually overlooks them entirely, giving no bonus at all for them.
  • Vampire (EHD 39, MM 440), given no summoning abilities.
  • Treant (EHD 33, MM 420), which actually appears in Turnbull's first table of "simple human-type monsters" without any special abilities, and yet its tree-controlling ability allows it to effectively triple its own brute strength. (As an aside, consider a vampires-vs-treants scenario, in which we find two of the most powerful opposition monsters in the game due to their parallel summoning abilities.)
Meanwhile, there are some other monsters with nothing but brute strength that appear too highly scored -- like the Fire Lizard (EHD 14, MM 758), and Hydra with 10 heads (EHD 18, MM 707) -- but I think that this is only an artifact of the special ability monsters being relatively too low. Also, the Mind Flayer's score seems ridiculous (EHD 20, MM 700), granted that he doesn't even note its mind blast power, and was probably again just a raw guess (another suspiciously round number).

Now, there are two other cases that literally jumped off the chart above, such that I felt compelled to remove them as outliers -- and on inspection they are rather obviously in error. These were:
  • Roper (EHD 16, MM 3,750). This is clearly a mistake. Turnbull notes the creature in part 2, p. 15: "These calculations make the Ropers the most fearsome beasts we have met so far; I don't recall ever meeting them down a dungeon, and I devoutly hope I never will." The problem, if I'm reading his attack notation correctly, is that he's applied the Roper's 5d4 damage factor -- which should be just for its mouth -- to every single one of its 6 ranged tentacle attacks. That really would be horrifying! While the Roper is a tough customer, it obviously shouldn't be worth the same as 5 or 6 Red Dragons; that doesn't pass any kind of sanity check.
  • Flesh Golem (EHD 21, MM 1,920). In this case, the problem is that Turnbull shows a radically different AC for the monster than I see in the books: My copy of Sup-I (with correction sheet) gives it AC 9, as does the AD&D Monster Manual. Turnbull shows it as having an AC of -1, which is obviously the diametrical opposite. I'm not sure where he got that from, maybe from a wild guess before the Sup-I correction sheet was available to fill in that statistic? 
There were some other things I had to leave out of the analysis, such as those other golems and elementals that are hit by only +2 or better magic weapons, which have undefined EHD in my model. Turnbull gives medium and large elementals a score of 1,000-2,000, stone golems nearly 13,000, and iron golems just shy of 33,000 (but again their ACs are treated as much harder than in the rulebooks, namely AC -3 and -5, so there are multiple reasons to leave them out of our comparison).

In conclusion, while the motivations are exactly the same, the scores that Turnbull & I come up with are radically different, effectively incommensurable. (If you want the full data, my Monster Database from last week has Turnbull's MonsterMarks entered in hidden column Q.) Of course, while Turnbull's instinct was noble, he didn't have the immense computing power all around us to simulate playtests the way we can today. Now, maybe someone will come back to critique my work in another 40 years -- someone who has access to a complete game engine with all the special abilities, full wizard spell selection, mixed-class PC party simulator, and hard Artificial Intelligence to optimize the best tactical choices on each side -- and in that light my suggestions might look totally naive. We can only hope for such continuity and progress.

Saturday, September 16, 2017

Saturday Software: Monster Metrics v.103

Previously, we've looked at the output of my "Monster Metrics" program (a branch off the "Arena" codebase), which simulates thousands of fighters in combat against specified monsters, so as to gauge their physical power level in terms of Equivalent Hit Dice (EHD). Last Monday, I presented the OED Monster Database, including pretty much every monster in OD&D and the first few supplements, which served as a platform to comprehensively assess every monster's EHD. Of course, the program needs to get updated every time a monster with a new special ability is added, so here is the current codebase with a few comments.

First, I added a couple command-line options which you see below if you want to play around with them.


Usage: MonsterMetrics [monster] [options]
By default, measures all monsters in MonsterDatabase file.
Skips any monsters marked as having undefinable EHD (*)
If monster is named, measures that monster at increased fidelity.
Options include:
-a armor worn by opposing fighters: =l, c, or p (default Chain)
-b chance for magic weapon bonus per level (default =15)
-f number of fights per point in search space (default =100)
-r display only monsters with revised EHD from database
-u display any unknown special abilities in database




For each monster, the program runs through fighters of level 1 to 12 and does a binary search at each level for the number of such fighters which provide the closest to a fair fight (i.e., 50/50 chance of either side winning). Each step in the search runs 100 fights by default to determine the winning percentage (which you can adjust with the -f switch above for greater fidelity and slower running time, if you wish). Then a total EHD is assessed across all levels by computing the weighted total \(EHD = (\sum_{n = 1}^N n \cdot f(n))/N\), where \(f(n)\) is the fair number of fighters at level \(n\) in the table above, and \(N\) is the maximum level considered (in this case, \(N = 12\)). Note that this is likely different from "best fighter level to provide a fair fight", in that special abilities that can wipe out an army of 1st-level fighters, but are impotent against high-level fighters, do get accounted here.
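The search and the weighted total can be sketched in a few lines of Python. This is not the Monster Metrics source: the function names are invented here, the win-rate function is passed in as a stand-in for actually running fights, and it is assumed to be non-decreasing in the number of fighters (noisy simulated rates would not be strictly so). The sketch also searches whole numbers of fighters only; how the real program treats monsters that are no match for a single fighter is not stated here.

```python
def fair_fighter_count(win_rate, max_fighters=64):
    """Binary-search the number of fighters whose win rate is closest to 50%.

    win_rate(k) returns the fighters' share of wins with k fighters and is
    assumed to be non-decreasing in k.
    """
    lo, hi = 1, max_fighters
    while lo < hi:
        mid = (lo + hi) // 2
        if win_rate(mid) < 0.5:
            lo = mid + 1   # too few fighters: look higher
        else:
            hi = mid       # enough fighters: look at or below mid
    # lo is now the smallest count winning at least half the time (or the
    # cap); the count just below it may still be closer to an even fight.
    if lo > 1 and abs(win_rate(lo - 1) - 0.5) < abs(win_rate(lo) - 0.5):
        return lo - 1
    return lo


def equivalent_hit_dice(fair_counts):
    """Weighted total EHD = (sum of n * f(n)) / N.

    fair_counts maps each fighter level n = 1..N to the fair number f(n).
    """
    n_max = max(fair_counts)
    return sum(n * f for n, f in fair_counts.items()) / n_max


# Toy win-rate curve (an assumption): each added fighter adds 10% to the odds.
toy_count = fair_fighter_count(lambda k: min(1.0, k / 10))

# Sanity check on the formula: if a monster is fairly matched by 4/n
# fighters of level n, the weighted total is 4 at every level.
toy_ehd = equivalent_hit_dice({n: 4 / n for n in range(1, 13)})
```

The last lines show why the total divides by N: a monster whose fair match is always "4 fighter levels' worth" comes out at EHD 4, which is the scale on which plain 4 HD ogres are meant to land.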

Every fighter in the simulated combat gets a sword, shield, and chain mail. One might ask, "Why chain mail by default, when most fighters after 1st level will be wearing plate?". But the thing is, I wanted the EHD ratings to actually be scaled to units of monster hit dice, e.g., the number of orcs that a monster is really worth, and those low-level monsters like humanoids all have chain-like armor (goblins/orcs AC 6, gnolls/ogres AC 5, trolls/giants AC 4), so we want to keep the simulation in that scale without adjusting other factors. Doing it this way, the EHD for those low-level types (lacking any special abilities) does in fact match their normal HD (orcs 1, gnolls 2, bugbears 3, ogres 4, etc.). If we switched the default fighter armor to plate, then that would devalue the monster risk, and even the simple monsters would see their EHD fail to synch up with their HD (in that case: gnolls 1, bugbears 2, ogres 3, etc.).

The next important consideration is: what level of magic weapon to give each fighter? Previously, I just assumed a +1 magic sword for every fighter, so as to not make creatures hit by magic totally invulnerable. But we really don't want fixed bonuses like that (or fixed bonuses by level), because it creates singularity dropoffs between levels or steps of bonus (e.g., makes the protection of lycanthropes and gargoyles totally useless, even to 1st-level fighters). So in this version I switched that to a probabilistic factor of 15% per level to get an extra magic boost, and also a silver dagger as a backup weapon. Having tried several levels between 5% (as seen in Vol-2) and 25% (as suggested by some comments online), I found that 15% overall gave the best match to the prior version, while giving a reasonable boost to lycanthropes, etc. And that's also what I do in my OED house rules, giving a 1-in-6 chance per level for a magic boost to characters created at higher levels.
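One plausible reading of that rule, sketched in Python: each level gives an independent 15% chance of one extra plus on the weapon. The function name and the one-roll-per-level interpretation are assumptions; the program may roll or cap the bonus differently.

```python
import random


def magic_weapon_bonus(level, chance_per_level=0.15, rng=random):
    """Roll once per fighter level; each success adds +1 to the weapon.

    Assumed interpretation of '15% per level', not taken from the
    program's source code.
    """
    return sum(1 for _ in range(level) if rng.random() < chance_per_level)


# Seeded generator so a run can be repeated.
rng = random.Random(2017)
bonuses = [magic_weapon_bonus(8, rng=rng) for _ in range(1000)]
average_bonus = sum(bonuses) / len(bonuses)
```

Under this reading the expected bonus is simply level times 0.15 (1.2 for an 8th-level fighter), and some 1st-level fighters do carry a +1 weapon, so lycanthropes and gargoyles are dangerous to them without being untouchable.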

Now, that leaves another problem, namely that any monster hit only by +2 magic or better weapons (e.g., golems or elementals in Sup-I) is totally invulnerable to 1st-level fighters, who can theoretically only have at best a +1 bonus. This means that the number of 1st-level fighters, and thus our EHD calculation, becomes technically infinite. That is: in these cases our model simply fails.

It's for reasons like this that the OED Monster Database shows an asterisk (*) under EHD for some very exotic monsters, to note that the EHD is effectively undefined in our current model. More generally, this is done for any creatures with wizard-like spell capability that isn't implemented in the program (triton, titan, lich, lammasu, gold dragon, beholder), creatures hit only by +2 or better magic weapons (golems, elementals), and creatures totally immune to blows from weapons (various oozes).

So that's the skinny on what the program is now doing under the hood, and why the Monster Database appears the way it does. Note that the included data file MonsterDatabase.csv is an exact duplicate of the Monster Database from Monday (just in CSV format so it can be read in by the software). Hopefully this makes it easy to investigate or add other monsters in the future when we need them.



Monday, September 11, 2017

OED Monster Database

One of the places that OD&D can be most successfully criticized is in its presentation of the monster listing. In many places in the original work key details are missing, contradictory, or left to the DM (e.g., all of the normal and giant animals that appear in the encounter charts). As of the first supplement, one had to look in at least four different places for all of a monster's information: (1) the main table (HD, AC, MV, etc.), (2) the alternative attacks/damage listing, (3) the alignment category, and (4) the main text description, each of which appeared in different far-flung sections, even different books for one monster. The overall situation is what motivated release of the Monster Manual as the first hardcover book, which compiled all the statistics from OD&D monsters in one place, before any other volumes of the AD&D rules.

Below you'll find the OED Monster Database, a compilation I've made over the years for OD&D monster statistics, in line with my OED games and house rules. This gives me a convenient one-stop resource for OD&D monster statistics when I'm writing other material. One might ask: Why not just use the Monster Manual? One reason is that I very much like to stick with the original d6-based hit dice, attacks, and damage, as found in the LBBs (i.e., we do not recognize the alternative monster hits starting in Supplement I). Moreover, here are other reasons why I think this exercise was worthwhile:
  1. Provides a consolidated listing of OD&D-style monster statistics.
  2. Creates automatic summary stat blocks for insertion to adventures (see sheet 2).
  3. Software can provide data-integrity checks for monster records.
  4. Software also assesses "equivalent hit dice" valuations for encounter balancing and XP awards.
  5. Forced me to think through any ambiguous adjudication cases in code (this prompted many "rules archeology" investigations and margin notes that you see on this blog).
Generally I've compiled everything I could find from OD&D Vol-2, Sup-I, land creatures from Sup-II, and TSR (The Strategic Review) No. 1 and 2 -- for a total of 147 monsters. Not included are aquatic monsters from Sup-II, demons from Sup-III, or deities from Sup-IV. For things like giant animals I turned to the Monster Manual and back-ported the information there, translating variant damage into units of d6's as we would normally expect/prefer.

Among the things you'll see is that any kind of special ability is given a keyword (and optionally one numerical parameter) for readability by my "Monster Metrics" program (more on that this Saturday); hopefully a knowledgeable DM can parse what those notes mean. The third-to-last column shows the EHD (equivalent hit dice) as determined by that program. A number of monsters are fundamentally outside the ability of my model to determine EHD, and are so indicated by an asterisk (*). These would include monsters that have expansive wizard-type spell ability or that need spells to defeat -- for example: oozes with weapon immunity, or elementals/golems hit only by +2 or better magic weapons.

XP Awards by EHD


Let's consider XP awards for a minute. My preference is to award XP by simply multiplying HD by 100 (and this is supported by evidence of a fundamentally linear relationship between risk and HD). Of course, this method from OD&D Vol-1 overlooks the value of special abilities. Sup-I introduced a variant XP table, and a secondary column to award bonuses for special abilities (which was of course carried forward into later editions like Holmes, B/X, AD&D, etc.). But in most of these works the DM still needs to make a subjective decision about what abilities warrant this bonus -- and I would argue in many cases it vastly undervalues some very nasty abilities (esp. on creatures with low HD).

There's a much easier way to do this, without any new tablature required or DM subjectivity, by just assigning a revised hit die value for XP purposes, which I've taken to calling "equivalent hit dice" (EHD). I gauge this with my Monster Metrics program by running several thousand fighters of different levels at each monster until we can determine a "fair" fight in each case. But the core of this idea (as simple as it is) predates the Sup-I and later variant XP charts, appearing first (briefly) in The Strategic Review, Vol. 1, No. 2 (Summer 1975), p. 4:
For purposes of experience determination the level, of the monster is equivalent to its hit dice, and additional abilities add to the level in this case. A gorgon is certainly worth about 10 level factors, a balrog nut [sic] less than 12, the largest red dragon not less than 16 or 17, and so on. The referee's judgement must be used to determine such matters, but with the foregoing examples it should prove to be no difficulty.
This seems like a much more elegant way to assess the risk/reward of exotic monsters, and it's also the simplest measurement model I could construct in my software, so this is what I now include for each monster (where appropriate) in the Monster Database. Let's just briefly compare the sample evaluations from TSR #2 to what comes out of my program for EHD:


So we can see that, compared to our model, Gygax seemed to undervalue the more exotic special abilities of monsters (in this case: petrification, immolation, and fire breath), which is consistent with his low valuation for specials in the variant Sup-I XP tables. Perhaps on closer inspection we could generously grant that Gygax's numbers correctly assess the minimum fighter level which could possibly match the creature man-to-man, but this is not an ideal metric of the danger to parties of several men of lower or higher levels.

Finally, let's look at the distribution of the EHD values currently recorded in the monster database:


Here we see that the distribution is a nice logarithmic curve, with the largest number of monsters at the lowest level (EHD 1-5), and decreasing regularly after that. (If we zoom in a bit more we'd see that the highest number of monsters are actually at the level of EHD 2, outnumbering EHD 1 by a small margin.) This seems to be useful for a fantasy campaign, and nicely echoes the curves we'd expect to see in demographic and economic statistics from real-world data.


That being said, here's the link to the OED Monster Database, in the ODS Open Document Spreadsheet format: 


Tuesday, September 5, 2017

OED Fantasy Rules v1.04 Released

As many of us head back to work and school, I wanted to share the labors of some gaming work that I've done over the summer. In that spirit, I've updated my Original Edition Delta house rules to version 1.04, and made them available on the main page of the OED Games website.

If you check it out, among the main changes you'll see is that I've split off the Player's Rules from the Judge's Rules into two separate documents. This allows you to hand the short core version of the rules to interested players (the former fits on one sheet of paper, actually), and consider the slightly longer set of behind-the-screen suggestions for judges (what I personally play by) on your own. Another reason this seemed to make sense is that the player's rules seem to have become pretty stable in the last few years of playtesting, while the judge's rules are still somewhat in flux (in fact, at the end you'll see a short list of planned still-to-come future expansions).

The other big formatting change is that I've started adding extensive endnotes to all the rules, citing classic rules, outside articles and interviews, pulp literature, and blog posts where these ideas germinated and got tossed around -- along with various difficult points considered, "proud nails", and so forth. The hope is that this helps others ("gaming archeologists", as Prof. L. Schwarz calls us) to track down where these ideas came from more quickly, help them get a grip on the various issues being balanced, and save others time from re-doing the same scholarly research over and over again. (Of course, just ignore the section at the end if that's not your bag.)

As for the game rules themselves, you'll find some very-small edits to the Player's Rules, like a slightly streamlined presentation of the weapon and encumbrance mechanics. The Judge's Rules has a lot more new stuff, like consolidations of the research on player statistics, monster metrics, exploration, combat, and rewards that we've seen here on the blog (and some more besides). I've tried to go through and share most of the copious margin-notes that I have in my copies of the OD&D LBBs, that no one has ever seen before. There's also a new Player Aid Card and even a promotional flyer (under Add-Ons) if you want to share the drama with others.

Hope that's helpful to some of your games! As always, thoughtful feedback and comments on what you see are warmly welcomed here. Hope everyone has a safe and rewarding season coming up.




Thursday, August 31, 2017

Gygax on Slings

Earlier this year, I had a post inspired by some scholarly research that said, perhaps counter-intuitively, that slings were at least as powerful a missile weapon as bows, and perhaps moreso -- although the amount of training required for slings was far more extensive and difficult than that needed for later types. I just realized that, like many other topics, Gygax was far out ahead of this one, with a two-page article on exactly that subject in the last Strategic Review, Vol. II, No. 2 (April 1976). He writes:
With great practice the slinger could achieve respectable accuracy — perhaps as excellent as that performed by a well-trained bowman. So on the counts of range and effectiveness the sling was at least the equal to the ancient bow (and just as equal to the medieval bow too), but it was somewhat slower in its rate of fire. Perhaps the telling factor regarding the sling was usage. While it was known by most peoples, few really specialized in its use. Because, like the bow, it required constant training and practice to use effectively, certain peoples constantly supplied most of the slingers to ancient armies — notably the Rhodians and Balaerics. As so many more peoples used the bow, it is natural that the latter would be more commonly found. Also, while it is possible to train troops to the use of the bow so as to make them at least passable archers within a reasonable period of time, the sling (as do the longbow and composite horsebow) requires familiarity and training from youth. Perhaps the disadvantages of slower rate of fire, fewer users, and long training for accuracy eventually caused the sling to be completely displaced by the bow in the Middle Ages, but it certainly wasn’t due to that weapon’s ineffectiveness against the armor of that period. Had slingers been available during the medieval period their ability to employ the shield, their ability to function in wet weather, and the relative ease of procuring or manufacturing missiles (as opposed to arrows or quarrels) would have made them popular contingents until plate armor came into fashion again in the Fourteenth Century. It is worth noting that the Spaniards who encountered the sling in America found this Incan weapon but little inferior to their own arquebuses, that it could hurl a missile which would kill a horse with a single blow, and these slung stones could shatter a sword at 30 yards.
In short, he agrees with all of our recent scholarship except on the issue of slings also possibly being as fast or faster in fire rate than bows (which is reflected in his AD&D rule that gives slings half the rate of bows). He even includes the following illustration, with the caption, "ASSYRIAN SLINGERS, swinging their slings parallel to their bodies, stand behind the archers in this drawing based on a relief from Nineveh showing one of the campaigns of Sennacherib (704-681 B.C) Their place in battle suggests that they outranged archers.":




Monday, August 28, 2017

Testing unbalanced dice in water

I've written about how to use standard statistical procedures to test for unfair dice a few times in the past (one, two, three, four). As noted in the last of those linked articles, for a d20 this probably involves some hundreds of dice-rolls at a minimum to get a test of sufficient power.
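For reference, here is a minimal Python sketch of one such procedure, assuming the test in question is Pearson's chi-square goodness-of-fit test against a uniform distribution. The function names are made up for this example, and the critical value is the standard 5% cutoff for 19 degrees of freedom, so it applies only to a d20.

```python
from collections import Counter

# Upper 5% critical value of the chi-square distribution, 19 degrees of freedom.
CHI2_CRITICAL_D20 = 30.144


def chi_square_statistic(rolls, sides=20):
    """Sum of (observed - expected)^2 / expected over every face of the die."""
    counts = Counter(rolls)
    expected = len(rolls) / sides
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, sides + 1))


def looks_unfair(rolls, sides=20, critical=CHI2_CRITICAL_D20):
    """True if the rolls deviate from uniform beyond the critical value.

    The default critical value is only valid for a 20-sided die.
    """
    return chi_square_statistic(rolls, sides) > critical


# Two extreme made-up samples: perfectly even rolls, and a die that
# always lands on 20.
even_rolls = list(range(1, 21)) * 10
stuck_rolls = [20] * 200
```

Real dice fall somewhere between these extremes, which is why a few hundred recorded rolls are needed before the statistic can separate a mildly biased die from ordinary chance.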

Here's a clever and much faster way of doing a check for unbalanced dice. This is from a video sent to me a while back by reader Ro Annis. Get a bowl of water, pour salt in to increase the buoyancy factor, and throw your dice in. If they repeatedly and consistently spin up the same face, then that die is obviously unbalanced. Like the second die in the video here.

video


I can imagine a few corner-cases where this may not suffice -- like if the die is balanced by weight, but the faces are malformed so as to bias the rolls on a table. But this is a great and fast way to do a first-pass check. Thanks, Ro!


Saturday, August 26, 2017

Saturday Software: Giant Packs in Javascript

A few weeks back I shared my Java application for generating giant packs in Gygax's classic G1-3 adventures. Reader Random Wizard then took my code and made an online version in Javascript at his Kirith.com site. Pretty sweet!