Jump to content

Collaboration/Team/Processes

From mediawiki.org
maximize project velocity!

We (the Collaboration team working on Flow) are an agile team. We come up with processes that work to deliver software, donning suits as required.

TODO: The team is also responsible for Thanks, PageTriage, MoodBar, WikiLove, etc. plus extensions written by the old Growth team such as GettingStarted. Update all these links plus bug triage query to include all these extensions.

Project

[edit]

Review means release!

[edit]
Echo and Flow open patches to review

When you +2 a change to Flow in gerrit, Jenkins will merge it to master. It's now on the release train to appear in front of users!

Meanwhile...
Our team tests new features (primarily on the Beta cluster). After features are merged, they go through QA Review. After that, they go through Product Review, after which the Product Manager "Accepts" a story by marking it Resolved.

So when you +2 any Flow change in gerrit, you must do some housekeeping:

  • In the Phabricator workboard for the sprint, drag the task for the change to "QA Review" and make sure it's obvious how to test.

Master has to work

[edit]

Everything we merge to Flow master is deployed on en-beta within minutes, and then our Jenkins Continuous Integration setup runs our simple tests/browser browser tests against it every 12 hours or so.

So merged commits must not break.

  • You should rebase against latest master and verify it works locally and run tests before you move a card to "Needs Review" (because it's awaiting final code review).

Any problems you encounter with master should be filed as bugs in phabricator:, because master is externally visible on en-beta.

Flow releases to production

[edit]

Flow is enabled on all of OfficeWiki, a whole namespace on Catalan, and growing number of pages on MediaWiki.org, and several language wikis including enwiki (see Flow/Rollout which also explains the process).

So whatever is in master as of Tuesday morning PST will be deployed to mediawiki.org around noon Tuesday as part of the new 1.2xwmfNN release in the WMF's one-week release cadence. This then "flows" to group 1 wikis Wednesday, and then to all Wikipedias on Thursday.

  • We have an informal code freeze 24 hours before the Tuesday deployment. Starting Monday 11am PDT do not +2 a gerrit change unless it's vital, safe, and tested.
  • Any patches requiring DB updates should be flagged (do they need to be in DB review?).

Maintenance scripts

[edit]

Any maintenance scripts run on production should first be run on a test wiki, such as Beta Cluster and/or testwiki. This applies whether or not the maintenance script is intended to do an update.

Backwards and forwards compatibility

[edit]

Note that mw.org, meta, and enwiki will be running different releases yet talking to the same Flow DB and external store, so all DB updates must support later and older code.

Likewise, Flow information in the cache (memcache/redis) may be shared between different code releases, so we have to carefully consider bumps to $wgFlowCacheVersion.

The Release readiness review ranger

[edit]

On Monday after our unofficial code freeze, an engineer on the team reviews the state of Flow prior to the "release train" phase 0.

  • Git log Echo and Flow (and other extensions) to see changes since the last branch. Try out changes.
  • Review browser tests on beta labs on the Echo+Flow CI dashboardtests dashboard, they should be passing. If not, try Flow on the Beta cluster.
  • In particular Flow interacts with Echo, Parsoid, Thanks, AbuseFilter, ConfirmEdit & SpamBlacklist.

Fixes and backports

[edit]

Flow no longer has a standing window, but we can request one as needed, or use SWAT. We usually use it for DB updates and other complicated feature deployments, and to roll Flow out to new pages and wikis.

If we need to backport a fix:

  • prioritize with Danny and quiddity
  • get OK from greg-g to add it to Deployments calendar, probably in a SWAT deploy window.
  • Danny or the responsible engineer notes on the task that it will be backported.
  • get it merged to master well in advance
  • prepare backports, get them merged, the chosen (the bump should be done automatically, though)
  • use the Etherpad http://etherpad.wikimedia.org/p/Flow_Deployments to keep track of complicated deployments
  • do prolific manual testing afterwards

Project management

[edit]

Phabricator notes

[edit]

Each of our extensions (Flow, Echo, Thanks, PageCuration) is a project in Phabricator. Bugs and feature requests and work items for each extension go in there. Each extension project has a workboard, but we don't use these workboards to plan work.

Instead we have a Collaboration-Team project for managing work, phab:/tag/collaboration-team/, and we do use this projects's a workboard phab:project/board/65/ extensively.

This is automatically added to most extensions that our team is responsible for, whenever any change is made to the task.

There are separate sprint boards, which are listed here.

More details

[edit]

Story grooming

[edit]
  • All design issues should be wrapped up before the next iteration, so their due date is the Tuesday before the iteration starts, so engineers can estimate at the Wednesday poker meeting.
  • want designers to be working two iterations out,
  • When cards move to Flow current iteration board, the designer remains on the card.

Story review and estimation

[edit]

We use http://hatjitsu.wmflabs.org for estimation

Engineers READ THIS BIT

[edit]

Developers work in the current iteration.

  • Work right-to-left – help with code review in Gerrit (focusing on Needs Review and tasks where you've already started review), make progress on the cards you have "In Development"

After that,

  • Take the top-most card from "Sprint ZZ column", add your name as a member to it, drag it to the "In Development" column.
  • When you think you're done, rebase your change on latest master, make sure it still works (run all tests), and move to the "Needs Review" column.
  • If you need design guidance or have questions about the design (or other issues), ask on IRC.

When you +2 a change in Gerrit and it's merged:

  • move the card to QA Review.

When you need something from design, talk on IRC, and:

  • Add Pau Giner to the card.
  • Add a comment with @username I need XYZ to create a mention.
  • If you're blocked, mention this on IRC and/or at standup, and move on.

Bugs

[edit]

We review bugs in Phabricator

Bug triage

[edit]

Takes place asynchronously, with catch up in the Bug triage meeting.

When you triage a bug, edit it.

  • add informative tags to Projects field, e.g. design, easy, Performance. Note "Project" encompasses tags, teams, extensions, etc. – the icon and color distinguish different kinds.
  • The Collaboration-Team project, which is automatically added, defaults to the backlog column. If applicable, drag it to a better column such as "Story grooming/estimation" or "Ready for next sprint".

Every Phabricator project has a workboard, but we don't want to have to visit multiple extensions' workboards to manage team work. So we don't use each extension's default workboard. We should view these workboards occasionally to see if anyone outside the team is doing project management in them.

Definition of Unbreak Now

[edit]

A task is Unbreak Now if it meets one of the following criteria:

  • Anything that makes it impossible to use one of the following features:
    • Reading
    • Posting new topics
    • Replying
    • Viewing history
    • Moderation
  • Any fatal
  • Major performance problems
  • Anything that impacts other teams' products in similar ways
  • Anything that breaks continuous integration, preventing patches from being merged

Definition of Done

[edit]

For a software development task to be marked Resolved and Done, a patch must be:

  • Merged to master.
  • If it meets the definition of Unbreak Now: A cherry-pick must be created for all deployed branches that are affected. (There are always two branches deployed, but in some cases only one is affected). In addition, the deployment of the cherry-picks must be scheduled.
  • If it is user-facing (and generally for internal issues too), it must also be QA-reviewed and product-reviewed. For items in the sprint, simply move the task to the QA Review column. It will later be Product Reviewed.

After all of the above is done, mark then task Resolved then move it to the Done column. The Done column is a requirement of the Sprint extension, so our Burndown charts are accurate.

Gerrit notes

[edit]
  • We need to review patches to all the extensions we maintain; however, prioritize ones connected to the sprint board. Here's a dashboard for Echo and Flow; same dashboard for 1+ week old (this time period can be tweaked freely according to the search documentation).
    • TODO: Update this link, also need to review and monitor Thanks, PageTriage, MoodBar, WikiLove, etc. plus extensions written by the the old Growth team such as GettingStarted.
  • Your commit message must have a line about testing. What unit test or browser test exercises the feature, or why there's no test.
  • Use comments to be clear in what you want. "I want Moriel to review the data change before I'm comfortable +2ing"
  • +1 means "OK to merge but I want someone else to take a look"
  • two +1s means it's good enough to merge. The second +1 on a patch should either +2 it or explain why not in comment.
  • Gerrit forgets +1s when someone submits a new patch set, so whoever does so should summarize the state, e.g. "mlitn +1'd patch 5 and this addresses his concerns, so need one more approval."

Working on a long-lived branch

[edit]
  • Create the branch in gerrit, e.g. frontend-rewrite
  • Check it out locally
  • modify .gitreview in it to add defaultbranch=frontend-rewrite and commit the fix
  • To update the branch to latest master
    1. $ git checkout origin/frontend-rewrite; git merge origin/master; git review
    2. git review will ask if you want to submit all the commits on the branch; it's OK to do so, these are the existing commits that were merged in, gerrit doesn't really submit them as new.
  • to set a labs machine to check out that branch, crontab -e and add
*/5 * * * * cd /mnt/vagrant/mediawiki/extensions/Flow && git checkout frontend-rewrite && git pull --ff-only

Code hygiene

[edit]
  • The Jenkins jobs for the extension does PHP and jshint checks, and runs our PHP unit tests
    • To see the tests, look at the comments from jenkins-bot on a change, e.g. gerrit:115110 and follow the links to the CI run
  • we could enable php codesniffer (phpcs --standard=path/to/codesniffer/MediaWiki . ), but lots of warnings
  • Engineers must set up the pre-review and pre-commit git hooks:
    • make installhooks
    • these run similar tests upon git commit.
  • ToDo: everyone commit automation improvements (git commit hooks, local test scripts, Makefile commands, etc.) to Makefile and scripts

Test

[edit]

We have various tests (see Makefile) but not great coverage. Before you make a change:

  • write tests for it
  • when fixing a regression, try to write a test that fails without the fix.
  • make phpunit, run QUnit and browsertests if applicable
  • PHPUnit and QUnit also run on Jenkins.

Browser testing

[edit]

It's not hard to run Flow's tests/browser, see Quality Assurance/Browser testing/Running tests Every developer needs to understand the features we're testing and proactively fix tests as part of changes so the tests remain green. Assume that any front-end change will require updating tests. Just read the high-level test scenarios in tests/browser/features/*.feature.

To run tests yourself, read Quality Assurance/Browser testing/Running tests. In a nutshell:

  • It's a role in MediaWiki-Vagrant, or install rvm, then Ruby, gem, and bundle manager.
  • On you local wiki, add 'Talk:Flow QA' to $wgFlowOccupyPages; optionally you can create a "Selenium_user" account so your own username don't get all its Echo notifications.
    • Give the username running the tests (yourself or "Selenium_user") block-user, suppress and delete rights in order to test these features. In a typical configuration, use Special:UserRights, and add the user to the "administrator" and "oversight" groups.
  • To run the tests
    • in MW-Vagrant against MW-Vagrant: enter make vagrant-browsertests
    • locally against a web server such as ee-flow:
$ cd core/extensions/Flow/tests/browser
$  SCREENSHOT_FAILURES=true \
   HEADLESS=true \
   BROWSER=chrome
   MEDIAWIKI_USER=Selenium_user \
   MEDIAWIKI_PASSWORD=AskSpage \
   MEDIAWIKI_URL=http://ee-flow.wmflabs.org/wiki/ \
   MEDIAWIKI_API_URL=http://ee-flow.wmflabs.org/w/api.php \
   bundle exec cucumber -f pretty features/''somefeatureFileName''.feature
    • change the variables above to run the browser tests against your own wiki
    • the default browser is Firefox (note Firefox 32 is broken), there's also BROWSER=phantomjs
    • you can append :<linenumber> to run the test with that line number, e.g. ... bundle exec cucumber features/flow_logged_in.feature:24
  • if you just supply features, all browser tests will run, and some will fail.
  • The browser tests assume an existing Flow QA board with existing posts, so on a new wiki run the tests three times and thereafter they should stabilize.

Browser tests are annotated with the browser(s) and server(s) on which they should pass, such as

 @chrome @en.wikipedia.beta.wmflabs.org @firefox @internet_explorer_10 @phantomjs @test2.wikipedia.org

(and with other annotations like @clean, @wip, etc.). Continuous integration passes the appropriate tags to cucumber to only run the tests with these tags, e.g.

 ... bundle exec cucumber --tags @en.wikipedia.beta.wmflabs.org --tags @firefox

If a browser test fails, check its annotations, maybe it's not expected to work in your configurations; conversely if it passes, make sure it has annotations for your browser.

Running more than one test often fails on phantomjs version 1.9.7 because the its WebDriver test doesn't reset session state between tests, a known issue.

A big risk of a change is it degrades user experience on one of many browsers none of us use. To try Flow out on alternate browsers, see Browser testing and design tools on officewiki.

Performance

[edit]

Flow is slow. It has to get faster, it mustn't get slower.

Labs machines

[edit]

The Collaboration team has several labs instances in the editor-engagement project.

  • ee-flow was an old instance with the mediawiki-install::labs role and extensive local mods, so we had to manually maintain it (see wikitech:Help:Single_Node_MediaWiki)
    • User ebernhardson has a crontab on ee-flow that pulls master of extensions/Flow and extensions/Echo. The /etc/motd message that you see when you ssh to ee-flow should tell you if this is running.
  • flow-tests is a newer instance with the role::labs::vagrant, so you can sort of administer it like a MediaWiki-Vagrant host (see wikitech:Labs-vagrant), e.g. labs-vagrant enable-role and labs-vagrant git-update

Do not check out git files on either machine as root

We store changes to LocalSettings.php and other files on these labs instances in a common git repo that they all mount:

 ee-flow$ /srv/mediawiki/orig/LocalSettings.php -> /data/project/git_files/ee-flow/orig/LocalSettings.php

so commit changes to these files to git.

Security

[edit]

Read and follow Security for developers and its accompanying Security checklist for developers.

Templating security

[edit]

Chris Steipp commented on Flow's templating security review bug "please follow my suggestion about policy (here), and make sure the team has a policy that,

  • If substitutions are used in html attributes, those attributes must be quoted with double quotes.
  • Make sure any SafeString objects have their escaping as close to the creation of the SafeString as possible, and that should be as close to the output as possible. It would be really helpful if I don't have to trace back more than 1 (or at most 2) function calls to see the escaping.

Refactoring

[edit]
  • If it feels more scope than a bug fix, make a card for it, use meetings to discuss adding to iteration.
  • Engineers need to campaign for their ideas – bring up major code changes in the "daily" standup, referring to the task

Retrospectives

[edit]

Currently at the Flow_retrospective etherpad. We've also tried dogfooding Flow by doing retrospectives on a test instance Flow board. In late 2013 we did some retrospectives in the Afterparty etherpad.

During the meeting, start by adding issues you want to discuss in one of the three sections ("What worked", "What didn't work", "What's still confusing"). Then, if you're interested in discussing someone else's issue, add +1.

Meetings

[edit]
  • Collaboration team calendar resource: In Google Calendar, add wikimedia.org_megi4kcdmff8tvglq32291snj0@group.calendar.google.com to Other calendars, see Calendars page on office wiki for help.

Absences

[edit]

If you're going on vacation or will be absent for 1/2-day,

  1. Add the absence to the Google Collaboration calendar, so others see it. (add Collaboration calendar to My calendars and show it)
  2. Invite yourself to the event, so others will see you're busy when they try to schedule you
  3. Decline the meetings while you're out so others don't wait for you.

See also

[edit]
  • Flow/Rollout has the list of Flow-enabled pages and process for adding another.