Hi Subfader and everybody,
thanks for the great feedback. Ratings will display this kind of polarization in many cases, not just for trending topics. It is a common issue in online rating systems, and one of its implications is that averages do a very poor job as indicators of the typical quality score for these articles. There are several possible ways of addressing this, but the polarization per se is not necessarily a "bug" in the system: it simply reflects a strong diversity of opinions among raters about the quality of the article.
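To make the point about averages concrete, here is a minimal sketch with made-up ratings (not actual AFT data): a polarized article and a consensual one can have exactly the same mean, even though their distributions tell completely different stories.

```python
# Illustrative ratings on a 1-5 scale; the data is invented, not from AFT.
from statistics import mean, stdev
from collections import Counter

polarized = [1, 1, 1, 5, 5, 5]    # raters split between "awful" and "great"
consensual = [3, 3, 3, 3, 3, 3]   # raters agree the article is average

for name, ratings in [("polarized", polarized), ("consensual", consensual)]:
    print(name,
          "mean:", mean(ratings),                      # identical: 3.0 for both
          "stdev:", round(stdev(ratings), 2),          # very different spread
          "distribution:", dict(Counter(ratings)))
```

The mean is 3.0 in both cases; only the spread (or the full distribution) reveals the disagreement.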
You and others have raised several distinct issues in the AFT threads, and they should not be conflated:
1. how to represent scores of articles with highly polarized ratings, compared to articles without them
2. how to tell apart genuine ratings of an article's quality from hate/love ratings about its topic
3. how to cope with the skewness that trending or controversial articles produce in ratings (assuming they attract visitors who differ substantially from those of non-trending, non-controversial topics)
4. how to increase the number of observations for non-trending articles
5. how to cope with explicit gaming
Each of these issues needs a dedicated solution, and we shouldn't assume, for example, that (1) necessarily follows from (2) unless the data shows it.
As for the idea of arbitrarily switching off the tool on some articles, WhatamIdoing hit the nail on the head: this won't allow us to gather the data we need to address the problems AFT is currently facing. As I said before, we won't analyze and solve these problems by reducing the amount of data we collect, but by deciding what we do with the data once it has been collected (how we generate aggregate scores from raw rating data, how we use those scores to rank or filter articles, etc.). These are problems where expertise from editors familiar with community dynamics would be extremely useful.
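As one hedged sketch of what "deciding how we generate aggregate scores" could mean in practice: a damped (Bayesian-style) average pulls articles with few ratings toward a global prior, so a handful of extreme ratings cannot dominate. The prior values below are illustrative assumptions, not AFT settings.

```python
# Hypothetical parameters, chosen for illustration only.
PRIOR_MEAN = 3.0    # assumed site-wide average rating on a 1-5 scale
PRIOR_WEIGHT = 10   # assumed number of "virtual" ratings backing the prior

def damped_average(ratings):
    """Blend of the observed mean and the global prior, weighted by sample size."""
    n = len(ratings)
    return (PRIOR_WEIGHT * PRIOR_MEAN + sum(ratings)) / (PRIOR_WEIGHT + n)

print(damped_average([5]))        # one perfect rating: score stays near the prior
print(damped_average([5] * 100))  # many perfect ratings: score approaches 5
```

This is only one of many possible rules; the point is that the aggregation step, not the collection step, is where such problems are best handled.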
We definitely do not want to exclude anonymous ratings: the tool is designed to gather a broader source of data about quality and to work as a vector for new user engagement. There are ideas we can experiment with right now, such as looking at rater consistency across multiple ratings, but anonymous ratings clearly have value and we shouldn't throw the baby out with the bath water.
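To give one hedged sketch of what "rater consistency" could look like as a metric (the data and field names below are hypothetical, not AFT's schema): measure how far a rater's scores deviate, on average, from the consensus of the other raters on the same articles.

```python
from statistics import mean

def consistency(rater_id, ratings):
    """Mean absolute deviation of one rater from the other raters' consensus.

    `ratings` is a list of (rater_id, article_id, score) tuples.
    Lower values mean the rater tends to agree with everyone else.
    """
    deviations = []
    for rid, aid, score in ratings:
        if rid != rater_id:
            continue
        others = [s for r, a, s in ratings if a == aid and r != rater_id]
        if others:
            deviations.append(abs(score - mean(others)))
    return mean(deviations) if deviations else None

# Invented example data: carol consistently rates against the consensus.
data = [
    ("alice", "Foo", 4), ("bob", "Foo", 4), ("carol", "Foo", 1),
    ("alice", "Bar", 3), ("bob", "Bar", 3), ("carol", "Bar", 5),
]
print(consistency("bob", data))    # low deviation: agrees with consensus
print(consistency("carol", data))  # high deviation: habitual outlier
```

A signal like this would be one input among many, not a verdict on any individual rater.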
One last word: there's a lot of concern expressed on this page about negative ratings, but no one seems to consider the fact that Article Feedback could (and should) be used to praise editors who do a great job. Our community currently works in a gratification void: the only explicit praise (a very small part of) our community gets comes via barnstars, personal messages on user talk pages, and other community-centric mechanisms such as WP1.0 ratings. But shouldn't we allow readers to express gratitude to editors for their hard work? Shouldn't we let editors know, for example, that an article to which they substantially contributed is visited by 100K users every day and consistently rated high on quality scales? I believe there are smart ways of using AFT for this purpose.