posted on Wednesday, October 11, 2006 4:26 AM by Endie

Datamining as Data Valuation

Over on Terra Nova, Timothy Burke has posted about mining massively multiplayer online game forums for valuable feedback, in the light of SWG Creative Director ChrisCao's recent pained outburst.  His post got me thinking, especially since my own last few posts have been on kinda-related topics: analysis of forums and the dismissal of some forum opinion by the Star Wars Galaxies team.

What Tim is talking about is a little different from what I did with the Terra Nova posting analysis, of course.  He uses the term "datamining" to describe the rapid assessment of the value of an archive, and the act of extracting useful information from such an archive through manual but rapid skim-reading, informed by contextual knowledge that allows one to do this quickly.  As he suggests, it is a skill that comes with practice, especially when one's role exposes one to primary texts.  Having two very different areas of study, I find that much the same ability develops with abstract symbology (when scanning tens of thousands of lines of code) as when quickly reading, for instance, the Corpus Inscriptionum Latinarum.

The article, however, got me thinking about how automation might be used in extracting value from game forums.  At first glance this is an unlikely task.  Game forums are places of natural language, for one thing, and I am not hugely interested in natural language analysis.  To use software to parse and extract actual content from posts is quite possible, but expensive and cumbersome.  I think that Tim is right in bypassing any attempt by technological means to come to conclusions on qualitative grounds.  But weighted quantitative analysis, with a very few specific and easily maintained qualitative

But I suspect that, so long as one accepts a very lossy process of transmission, a few easy tools could be scripted.  What a community relations rep should be able to do is to give empirical data to the dev team on what people are discussing (where are the problems) and what the trend of opinion seems to be.

How to get the data is trivial: forums are stored in relational databases, so retrieval is easy, especially if you warehouse to a de-normalised form (one that is bigger but quicker for specific queries).

To find out what people are discussing, have a list of keywords to be scanned for.  As long as this list can be maintained easily, then there is value here.  If 12% of posters in a given day are mentioning an obscure monster in a mid-level zone then it is worth checking whether there is a possible exploit.  If the trade forum is flooded with mentions of a previously rare loot item, then check for a bug in loot tables.
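As a rough illustration of the keyword scan, here is a minimal Python sketch.  The function name and the shape of the input (one day of `(poster_id, text)` pairs) are my own assumptions for the example, not something any forum package provides.

```python
import re
from collections import defaultdict

def keyword_poster_share(posts, keywords):
    """Return, per keyword, the fraction of distinct posters who mentioned it.

    posts: list of (poster_id, text) pairs for one day (hypothetical shape).
    keywords: list of lowercase terms to scan for.
    """
    posters = set()
    mentioned_by = defaultdict(set)
    for poster_id, text in posts:
        posters.add(poster_id)
        # Crude tokeniser: lowercase runs of letters, digits and apostrophes.
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        for kw in keywords:
            if kw in words:
                mentioned_by[kw].add(poster_id)
    if not posters:
        return {kw: 0.0 for kw in keywords}
    return {kw: len(mentioned_by[kw]) / len(posters) for kw in keywords}
```

Counting distinct posters rather than raw mentions keeps one excitable person from looking like a trend.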

You don't want to get a list of every unique word, and you want some terms to be synonymous.  A feedback loop is useful.  Use a Bayesian algorithm to allow skewed weighting of poster value: if a community rep reads a post and finds it valuable, they should be able to rate it higher (or lower, of course).  This value would, of course, be uber-hidden, and data protection legislation would probably require it to be double-blind.  Have this value decay over time (or probably over a number of posts) towards a norm.  Similarly, allow for weighting by admins of specific words, so that their value drops beneath a threshold that excludes them from reports.  One might rank "the" at zero and "exploit" at 85, for instance.
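The synonym list, the admin word weights and the decaying poster rating can be sketched in a few lines of Python.  To be clear, this is not the Bayesian part: it is plain table lookup plus exponential decay, and every name and number here (apart from "the" at zero and "exploit" at 85) is a made-up assumption.

```python
# Hypothetical admin-maintained tables.
SYNONYMS = {"dupe": "exploit", "duping": "exploit"}
WORD_WEIGHTS = {"the": 0, "exploit": 85}
DEFAULT_WEIGHT = 50      # assumed weight for words nobody has ranked
REPORT_THRESHOLD = 10    # assumed cut-off below which words are dropped

def canonical(word):
    """Map a word to its canonical synonym, if it has one."""
    word = word.lower()
    return SYNONYMS.get(word, word)

def reportable(word):
    """True if the word's weight is high enough to appear in reports."""
    return WORD_WEIGHTS.get(canonical(word), DEFAULT_WEIGHT) >= REPORT_THRESHOLD

def decay_rating(rating, norm=1.0, rate=0.1, posts=1):
    """Pull a poster's rating back towards the norm by `rate` per post."""
    return norm + (rating - norm) * (1 - rate) ** posts
```

With these numbers a poster boosted to 2.0 drifts back to 1.9 after one post, and sits very close to the norm again after a few dozen.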

Look for the first and second derivatives.  If there are 1100 mentions of widgeting per day every day, then be alert when that starts to spike or plummet.  And over a period of days, look for accelerating growth (say, as more people tell their friends about how to make teh fr33 l00t).  If token-counts remain steady over time, you are aware already: exclude them from the report on the basis of the first derivative of the postcount.  When the rate of change itself varies, then include it in the report.
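On daily counts the "derivatives" are just differences between successive days.  A minimal Python sketch, assuming one count per day for a single token and a tolerance I picked out of the air (5% of the mean count):

```python
def differences(series):
    """Day-to-day change: a discrete stand-in for the derivative."""
    return [b - a for a, b in zip(series, series[1:])]

def should_report(daily_counts, tolerance=0.05):
    """Flag a token whose count, or whose rate of change, moves noticeably.

    tolerance is a hypothetical knob: the fraction of the mean daily count
    that a change must exceed before anyone is bothered with it.
    """
    if len(daily_counts) < 3:
        return False  # not enough days for a second difference
    baseline = max(sum(daily_counts) / len(daily_counts), 1)
    limit = tolerance * baseline
    first = differences(daily_counts)   # spikes and plummets
    second = differences(first)         # acceleration
    return any(abs(d) > limit for d in first + second)
```

A token that hovers around 1100 mentions a day stays out of the report; one that goes 1100, 1150, 1250, 1450 gets flagged.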

You have data that can help shape your analysis.  Weight the scoring given to a mention by a forum member with 4000 postings as compared to someone with 20.  I would suggest that the latter would receive a massively higher scoring by default: when twenty non-regulars come and ask about a certain game feature, then suspect a problem.
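One way to express that default weighting in Python is below.  The inverse-square-root curve is purely my assumption; any function that falls as post count rises would serve the same purpose.

```python
import math

def mention_weight(post_count):
    """Hypothetical weight for one mention: non-regulars count for more."""
    return 1.0 / math.sqrt(1 + post_count)

def weighted_mentions(post_counts):
    """Total score for a topic, given the post count of each person mentioning it."""
    return sum(mention_weight(n) for n in post_counts)
```

Under this curve a mention from someone with 20 posts is worth over ten times one from a 4000-post regular, so twenty newcomers asking the same question stand out sharply.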

Some forums allow fellow-posters to evaluate posts.  SWG had this, but it was removed when the boards split into the Guelphs and the Ghibellines.  You want to know what your players think: use it!  But analysis of this data should be shaded in some way.  One useful method is to balance votes for those with persistently extreme voting patterns: those who consistently vote 1/2 or 9/10 for everything have their votes weighted towards the average, just as those who vote with a very high or low average have their overly discriminating or undiscriminating tastes allowed for accordingly.  A report on the posts rated most extremely (lots of very low and very high ratings) by large numbers of people would show where the current fault-lines are.  Of course, the data consumers would be similarly skewed when assessing their valuation practices: if dev Steve always rates low without ever balancing this, then his opinions are to be discounted by a similar factor, or else others will miss data thanks to him.
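A simple way to do that balancing is to restate each vote relative to the voter's own habits.  The Python sketch below uses z-score standardisation, which is my choice of technique rather than anything a forum package hands you; the `(voter_id, post_id, score)` layout is likewise assumed.

```python
from statistics import mean, pstdev

def normalise_votes(votes):
    """Rescale each vote by its voter's own mean and spread.

    votes: list of (voter_id, post_id, score) tuples (hypothetical shape).
    Returns (voter_id, post_id, z) tuples, where z is the vote in standard
    deviations from that voter's average.  Someone who gives every post the
    same score carries no information, so all their votes become 0.0.
    """
    by_voter = {}
    for voter, _post, score in votes:
        by_voter.setdefault(voter, []).append(score)
    stats = {v: (mean(s), pstdev(s)) for v, s in by_voter.items()}
    out = []
    for voter, post, score in votes:
        mu, sigma = stats[voter]
        z = (score - mu) / sigma if sigma > 0 else 0.0
        out.append((voter, post, z))
    return out
```

The same function works on the devs' and reps' own ratings, which is how always-rates-low Steve gets discounted.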

I would present a report on threads with an original post over n00 words where there are more than x replies of y words.  This will show manifesto pieces that strike a note with the community, for better or worse.  Burke himself (posting, I think, as Khaldun) presented a series of posts on SWG's systems called (I think) the seven deadly sins of SWG (now deleted in a forum wipe) which provoked very large numbers of responses.  Lots of people post huge lists of suggested gameplay enhancements, but most fall off the front page in hours.  But those which get large numbers of responses beyond "/Agree" are worth looking at.
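That report is a straightforward filter.  A minimal Python sketch, assuming each thread arrives as a dict with `title`, `op` and `replies` keys (my invention) and with placeholder thresholds standing in for n00, x and y:

```python
def manifesto_threads(threads, min_op_words=300, min_replies=10, min_reply_words=20):
    """List threads with a long opening post and many substantial replies.

    threads: list of dicts with 'title' (str), 'op' (str) and
    'replies' (list of str); this layout is hypothetical.
    Returns (title, substantial_reply_count) pairs, busiest first.
    """
    hits = []
    for thread in threads:
        if len(thread["op"].split()) <= min_op_words:
            continue  # opening post too short to be a manifesto
        # Replies shorter than the cut-off ("/Agree" and friends) are ignored.
        substantial = [r for r in thread["replies"] if len(r.split()) >= min_reply_words]
        if len(substantial) > min_replies:
            hits.append((thread["title"], len(substantial)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```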

Mine for common terms (weighted by Bayesian means as mentioned above: you don't want a list of swear words and obscene puns on senior dev names) in admin-closed threads and present the top dozen.  This will yield information on what players are angry about, and what reps are shutting down discussion of.

Of course, polls work too, but are conducted in most MMOs with the same care for the result as pre-1989 Warsaw Pact plebiscites.  No poll will risk asking questions where a truly undesirable result could be returned (witness a recent SWG poll where the most popular option for a penalty on character death, item decay, was excluded).

In summary, by using decently trained Bayesian filtering and some weighted user valuation based on simple arithmetic over posting practices, together with observation of the rate of change of token ("word") appearance and some specific, tailored reports (who is getting lots of lengthy feedback?), one could gather some decent info.  But someone still has to go in there and read and talk.
