Moderator Tools/Automoderator

From mediawiki.org

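Automoderator is an automated anti-vandalism tool developed by the Moderator Tools team. It uses a machine learning model to score edits and automatically reverts edits based on that score. Administrators can enable it and adjust its configuration. Automoderator is similar to existing anti-vandalism bots such as ClueBot NG, SeroBOT, and Dexbot, but it is available to all language communities. For technical details of the extension, see Extension:AutoModerator.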

This work is currently being tracked as part of the WE1.3.1 WMF hypothesis: If we enable additional customization of Automoderator's behavior and make changes based on pilot project feedback in Q1, more moderators will be satisfied with its feature set and reliability, and will opt to use it on their Wikimedia project, thereby increasing adoption of the product.

Communities can now request that Automoderator be deployed on their Wikipedia.

Updates

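  • September 2024 - Indonesian Wikipedia starts using AutoModerator (Automoderator).
  • June 2024 - Turkish Wikipedia starts using Automoderator (Otomoderatör).
  • February 2024 - Designs have been posted for the initial version of the landing and configuration pages. Thoughts and suggestions are welcome!
  • February 2024 - We have posted initial results from the testing phase.
  • October 2023 - We are collecting feedback on our measurement plan, to decide which data we should use to evaluate this project, and have made testing data available to gather input on Automoderator's decision-making.
  • August 2023 - We recently presented this project, and other moderator-focused projects, at Wikimania. You can find the session recording here.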

Motivation

Wikimania presentation (13:50)

A substantial number of edits are made to Wikimedia projects which should unambiguously be undone, reverting a page back to its previous state. Patrollers and administrators have to spend a lot of time manually reviewing and reverting these edits, which contributes to a feeling on many larger wikis that there is an overwhelming amount of work requiring attention compared to the number of active moderators. We would like to reduce these burdens, freeing up moderator time to work on other tasks.

Indonesian Wikipedia community call (11:50)

Many online community websites, including Reddit, Twitch, and Discord, provide 'automoderation' functionality, whereby community moderators can set up a mix of specific and algorithmic automated moderation actions. On Wikipedia, AbuseFilter provides specific, rules-based functionality, but it can be frustrating when moderators have to, for example, painstakingly define a regular expression for every spelling variation of a swear word. It is also complicated and easy to break, causing many communities to avoid using it. At least a dozen communities have anti-vandalism bots, but these are community maintained, require local technical expertise, and usually have opaque configurations. These bots are also largely based on the ORES damaging model, which has not been retrained in a long time and has limited language support.

Goals

  • Reduce moderation backlogs by preventing bad edits from entering patroller queues.
  • Give moderators confidence that automoderation is reliable and is not producing significant false positives.
  • Ensure that editors caught in a false positive have clear avenues to flag the error / have their edit reinstated.

Design research

To learn about the research and design process we went through to define Automoderator's behaviour and interfaces, see /Design.

Model

Automoderator uses the 'revert risk' machine learning models developed by the Wikimedia Foundation Research team. There are two versions of this model:

  1. A multilingual model, with support for 47 languages.
  2. A language-agnostic model.

These models can calculate a score for every revision denoting the likelihood that the edit should be reverted. Each community can set their own threshold for this score, above which edits are reverted (see below).

The models currently support only Wikipedia, but could be trained on other Wikimedia projects in the future. Additionally, they are currently trained only on the main (article) namespace. We would like to investigate re-training the model on an ongoing basis as false positives are reported by the community. (T337501)

Before we moved forward with this project we provided opportunities for testing out the language-agnostic model against recent edits, so that patrollers could understand how accurate the model is and whether they felt confident using it in the way we proposed. The details and results of this test can be found at /Testing.

How it works

Diagram demonstrating the Automoderator software decision process.

Automoderator fetches a score for every main namespace edit on a Wikimedia project, based on how likely that edit is to be reverted, and reverts any edit which scores above a threshold that local administrators can configure. The revert is carried out by a system account, so it looks and behaves like other accounts - it has a Contributions page, User page, shows up in page histories, etc.

To reduce false positives and other undesirable behaviour, Automoderator will never revert the following kinds of edits:

  • An editor reverting one of their own edits
  • Reverts of one of Automoderator's actions
  • Those made by administrators or bots
  • New page creations
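The decision process described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the extension's actual code: the `Edit` class, its field names, and the example threshold of 0.97 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """Hypothetical, simplified view of an edit (field names are illustrative)."""
    namespace: int                       # 0 is the main (article) namespace
    score: float                         # revert-risk score between 0 and 1
    is_self_revert: bool = False         # editor reverting their own edit
    reverts_automoderator: bool = False  # revert of an Automoderator action
    author_is_admin: bool = False
    author_is_bot: bool = False
    is_page_creation: bool = False


def should_revert(edit: Edit, threshold: float) -> bool:
    """Return True if this sketch of Automoderator would revert the edit."""
    # Only main namespace edits are scored.
    if edit.namespace != 0:
        return False
    # Kinds of edits that are never reverted.
    if (
        edit.is_self_revert
        or edit.reverts_automoderator
        or edit.author_is_admin
        or edit.author_is_bot
        or edit.is_page_creation
    ):
        return False
    # Revert only edits scoring above the locally configured threshold.
    return edit.score > threshold


# Example with an assumed threshold of 0.97.
print(should_revert(Edit(namespace=0, score=0.99), threshold=0.97))
print(should_revert(Edit(namespace=0, score=0.99, author_is_bot=True), threshold=0.97))
```

Checking the exclusions before comparing the score means an excluded edit is never reverted, however high its score.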

Configuration

Automoderator's configuration page (September 2024)

Automoderator is configured via a Community Configuration form located at Special:CommunityConfiguration/AutoModerator, which edits the page MediaWiki:AutoModeratorConfig.json (the latter can be watchlisted so that updates show up in your Watchlist). After deployment, Automoderator will not begin running until a local administrator turns it on via the configuration page. In addition to turning Automoderator on or off, there are a range of configurations which can be customised to fit your community's needs, including the revert threshold, minor and bot edit flags, and whether Automoderator sends a talk page message after reverting (see below).

Certain configuration, such as Automoderator's username, can only be performed by MediaWiki developers. To request such a change, or to request other kinds of customisation, please file a task on Phabricator.

Localisation of Automoderator should primarily be carried out via TranslateWiki, but local overrides can also be made by editing the relevant system message (Automoderator's strings all begin with automoderator-).

Caution levels

One of the most important configurations to set is the 'Caution level' or 'threshold' - this determines the trade-off Automoderator will make between coverage (how many bad edits are reverted) and accuracy (how frequently it will make mistakes). The higher the caution level, the fewer edits will be reverted, but the higher the accuracy; the lower the caution level, the more edits will be reverted, but the lower the accuracy. We recommend starting at a high caution level and gradually decreasing it over time as your community becomes comfortable with how Automoderator is behaving.
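The trade-off can be illustrated with a small Python sketch. It assumes that a higher caution level corresponds to a higher score threshold, and it uses a made-up toy list of scored edits, not real Automoderator data. The function name and the two example thresholds are likewise hypothetical.

```python
# Toy data, invented purely for illustration: (revert-risk score, is the edit actually bad?)
TOY_EDITS = [
    (0.99, True), (0.97, True), (0.95, False), (0.93, True),
    (0.90, True), (0.85, False), (0.80, True), (0.70, False),
]


def coverage_and_accuracy(scored_edits, threshold):
    """Return (coverage, accuracy) for reverting every edit scoring above threshold.

    coverage: share of all bad edits that get reverted.
    accuracy: share of reverted edits that really were bad.
    """
    reverted = [is_bad for score, is_bad in scored_edits if score > threshold]
    total_bad = sum(1 for _, is_bad in scored_edits if is_bad)
    bad_reverted = sum(reverted)
    coverage = bad_reverted / total_bad if total_bad else 0.0
    # With no reverts there are no mistakes, so treat accuracy as 1.0.
    accuracy = bad_reverted / len(reverted) if reverted else 1.0
    return coverage, accuracy


# A cautious (high) threshold versus a less cautious (low) one.
for threshold in (0.96, 0.75):
    coverage, accuracy = coverage_and_accuracy(TOY_EDITS, threshold)
    print(f"threshold={threshold}: coverage={coverage:.2f}, accuracy={accuracy:.2f}")
```

On this toy data the cautious threshold reverts only a few edits but makes no mistakes, while the lower threshold catches every bad edit at the cost of also reverting some good ones.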

Talk page message

To ensure that reverted editors who were making a good faith change are well equipped to understand why they were reverted, and to report false positives, Automoderator has an optional feature to send every reverted user a talk page message. This message can be translated in TranslateWiki and customised locally via the Automoderator-wiki-revert-message system message. The default (English) text reads as follows:

Hello! I am AutoModerator, an automated system which uses a machine learning model to identify and revert potentially bad edits to ensure Wikipedia remains reliable and trustworthy. Unfortunately, I reverted one of your recent edits to Article title.

If the same user receives another revert soon after the first, they will be sent a shorter message under the same section heading. Default (English) text:

I also reverted one of your recent edits to Article title because it seemed unconstructive. Automoderator (talk) 01:23, 1 January 2024 (UTC)
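The choice between the full and the shorter message can be sketched as follows. The text does not say how soon "soon after" is, so the `window` parameter and the 24-hour value in the example are assumptions, as are the function name and its return values.

```python
from datetime import datetime, timedelta
from typing import Optional


def pick_message(last_message_at: Optional[datetime],
                 now: datetime,
                 window: timedelta) -> str:
    """Return "follow_up" if the user was messaged within the window, else "full".

    `window` is a hypothetical setting: the real definition of "soon after"
    is not stated in the text.
    """
    if last_message_at is not None and now - last_message_at <= window:
        # Shorter message, posted under the existing section heading.
        return "follow_up"
    # First revert, or the previous one was long ago: send the full message.
    return "full"


now = datetime(2024, 1, 1, 1, 23)
print(pick_message(None, now, timedelta(hours=24)))
print(pick_message(now - timedelta(hours=1), now, timedelta(hours=24)))
```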

False positive reporting

Automoderator's 'report false positive' link.

Because no machine learning model is perfect, Automoderator will sometimes accidentally revert good edits. When this happens we want to reduce friction for the user who was reverted, and give them clear next steps. As such, an important step in configuring Automoderator is creating a false positive reporting page. This is a normal wiki page, which will be linked to by Automoderator in the talk page message, and in page histories and user contributions, as an additional possible action for an edit, alongside Undo and/or Thank.