Last Time

Current Quarter Goals

TEC3:O6:O6.1:Q3: Deployment Pipeline Documentation
TEC3:O3:O3.1:Q3: Move cxserver, citoid, changeprop, eventgate (new service) and ORES (partially) through the production CD Pipeline

General

Pipeline cabal meetup as part of All-Hands?
- do we have any current projects (2 weeks from now) that would benefit from in-person high-bandwidth?
- Hacking on ORES maybe?
- Productivity aside, it was fun at the previous Hackathon. :)
- Joe: We have some time Monday evening (maybe) but ideally we'd work at the hackathon that no one goes to :(
- Lars: is anyone opposed to hanging out?
- Alex: maybe a small group since that's more productive
- Joe: we should come up with some ideas for stuff to work on

TODO: start an email thread

Lars's email > I'm looking for feedback on whether the vision I'm describing is where we want to end up.
- fselles: sounds like a comprehensive plan. re: deployment velocity, we need existing metrics for this
  - thcipriani: aside: RelEng is thinking about this
  - alex: don't we have statsd counters?
  - thcipriani: we do, but we can't trace individual scap commands to patches or windows etc
  - jeena: don't we care about how long it takes an individual patch to hit production?
  - dan: I think elasticsearch might be better than statsd since we need to tie this together using logged metadata (repo deployed, scap commands, patchsets deployed, etc)
- Joe: one thing I didn't see was self-servicing, i.e., create a repo and everything is setup for a developer -- how much toil is needed for this?
- Alex: SRE/serviceops, RelEng, 4 different commits in 4 different projects -- there is quite a lot of friction here
- Joe: something more is needed in terms of UI from the point of view of the developer, we should think about setup from all points of view, maybe when someone creates a .pipeline file it sets up the pipeline for them
- Lars: I agree that the developer experience should be massively simpler than the current one; as I was thinking about this I hadn't gotten as far as UI yet
- Joe: we want to take it further!
- Lars: there is a proposal to start continuous deployment with the Blubberoid service, i.e., not load balancers and k8s; is there any objection to having a continuously deployed Blubberoid?
- Joe: what you are proposing is > CDep, it is total ownership of a service -- Icinga needs work to allow this -- but I think we can experiment with CDep with Blubberoid
- Lars: Proposal for this is due to Blubberoid being a nice, safe, small, and friendly service -- it has no dependencies or databases; input over http and output over http -- can't get more simple
- fselles: icinga does need work, but we need metrics
- Lars: RelEng will start thinking about what we need to make this happen
- Joe: we need to work on permissions for k8s
- Antoine: or we get an Icinga container in the pod that runs the service and deploy it ourselves via helm?
- Joe: Pearson does a namespace per project including a Jenkins instance but let's not do this :) Sadly CDep may not be possible for some services since there are many interdependent services, so it's best to start with something simple
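
A minimal sketch of the patch-to-production metric raised in the deployment velocity discussion above, assuming merge events and deployment log entries are available as records that share a patchset identifier. The field names (`patchset`, `merged_at`, `finished_at`, `patchsets`) and the sample data are hypothetical; they are not an existing scap or Elasticsearch schema.

```python
from datetime import datetime
from statistics import median


def lead_times(merges, deploys):
    """Seconds from merge to the first deployment that includes each patchset.

    Patchsets that have not been deployed yet are left out of the result.
    """
    merged_at = {m["patchset"]: m["merged_at"] for m in merges}
    first_deploy = {}
    # Walk deployments in chronological order so the earliest one wins.
    for deploy in sorted(deploys, key=lambda d: d["finished_at"]):
        for patchset in deploy["patchsets"]:
            first_deploy.setdefault(patchset, deploy["finished_at"])
    return {
        patchset: (first_deploy[patchset] - merged).total_seconds()
        for patchset, merged in merged_at.items()
        if patchset in first_deploy
    }


# Hypothetical records, shaped like the logged metadata mentioned above.
merges = [
    {"patchset": "change-1", "merged_at": datetime(2019, 1, 8, 10, 0)},
    {"patchset": "change-2", "merged_at": datetime(2019, 1, 8, 11, 30)},
    {"patchset": "change-3", "merged_at": datetime(2019, 1, 8, 12, 0)},
]
deploys = [
    {"repo": "example/service", "finished_at": datetime(2019, 1, 8, 15, 0),
     "patchsets": ["change-2"]},
    {"repo": "example/service", "finished_at": datetime(2019, 1, 8, 12, 0),
     "patchsets": ["change-1", "change-2"]},
]

times = lead_times(merges, deploys)
print(times)
print("median seconds to production:", median(times.values()))
```

Keying everything on the patchset is what a plain counter cannot do, which is the reason given above for preferring logged metadata over statsd.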

RelEng

Dan is working on the task "Manually defining artifacts results in default copy of all project files"

Code proposal: .pipeline/blubber.yaml:

development:
  copies:
    - from: build
      source: /bin/foo
      destination: /bin/foo
    - from: local
      source: ./config.dev.yaml
      destination: ./config.yaml

and a short-hand format that expands to the full structure:
```
development:
  copies: [build, local]
```
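
A minimal sketch of how the short-hand might expand, assuming each bare name stands for an entry with only `from` set, so that source and destination fall back to their defaults. The function name `expand_copies` and the exact expanded shape are assumptions for illustration, not Blubber's implementation.

```python
def expand_copies(copies):
    """Expand short-hand entries (bare strings) into the long mapping form.

    Assumption: a bare name such as "build" means {"from": "build"}, with
    source and destination left unset so the defaults apply.
    """
    expanded = []
    for entry in copies:
        if isinstance(entry, str):
            expanded.append({"from": entry})
        elif isinstance(entry, dict):
            # Already in long form: keep a copy unchanged.
            expanded.append(dict(entry))
        else:
            raise TypeError(f"unsupported copies entry: {entry!r}")
    return expanded


short_hand = {"development": {"copies": ["build", "local"]}}
long_form = {
    variant: {**cfg, "copies": expand_copies(cfg.get("copies", []))}
    for variant, cfg in short_hand.items()
}
print(long_form)
```

Accepting both forms in one list lets a variant mix short-hand names with fully specified entries.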

The continuous release pipeline should support more than one service per repo
- Implicit assumption that every repo is one service. Counterexample: MediaWiki! (good point :))
- Implicit assumption that there is one test entrypoint per repo
- Code proposal: .pipeline/config.yaml

pipelines:
  serviceOne:
    blubberfile: serviceOne/blubber.yaml # could be the default based on service name for the dir
    helmConfig: serviceOne/helm.yaml # ditto
    directory: src/serviceOne
    variants:
      test: [phpunit, mocha] # defaults to ["test"]
      production: foo # defaults to "production", also supports false for test-only runs
  serviceTwo:
    directory: src/serviceTwo

- Joe: let's keep everything that the developer needs to control in the repo; what Dan is proposing seems sane to me
- Dan: this is the inverse pattern of mediawiki
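
The defaults noted in the comments of the `.pipeline/config.yaml` proposal above could be resolved roughly as follows. This is a sketch under stated assumptions: the config is shown as an already-parsed Python dict (to avoid depending on a YAML library), `resolve_pipeline` is a hypothetical helper, and the default for `directory` is a guess since the proposal does not state one.

```python
def resolve_pipeline(name, cfg):
    """Fill in the defaults described in the proposal's comments."""
    cfg = cfg or {}
    variants = cfg.get("variants") or {}

    test = variants.get("test", ["test"])  # defaults to ["test"]
    if isinstance(test, str):
        test = [test]

    return {
        # Assumption: the proposal gives no default directory; use the repo root.
        "directory": cfg.get("directory", "."),
        # Defaults derived from the service name, as the comments suggest.
        "blubberfile": cfg.get("blubberfile", f"{name}/blubber.yaml"),
        "helmConfig": cfg.get("helmConfig", f"{name}/helm.yaml"),
        "test": test,
        # Defaults to "production"; False means a test-only pipeline.
        "production": variants.get("production", "production"),
    }


config = {
    "pipelines": {
        "serviceOne": {
            "blubberfile": "serviceOne/blubber.yaml",
            "helmConfig": "serviceOne/helm.yaml",
            "directory": "src/serviceOne",
            "variants": {"test": ["phpunit", "mocha"], "production": "foo"},
        },
        "serviceTwo": {"directory": "src/serviceTwo"},
    }
}

resolved = {
    name: resolve_pipeline(name, cfg)
    for name, cfg in config["pipelines"].items()
}
for name, settings in resolved.items():
    print(name, settings)
```

With these defaults, `serviceTwo` needs only a `directory` entry, which keeps the per-service configuration in the repo short.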

Added Wikimedia Portals to tbd on the migration to k8s task https://phabricator.wikimedia.org/T198901#4881831
- seems self-contained
- gets it out of the mediawiki deployment tree (/srv/mediawiki-staging)
- no more portals deploy in SWAT

Minor update things

Blubber docs sparkle: https://wikitech.wikimedia.org/wiki/Blubber
Blubber binary downloads on releases: https://releases.wikimedia.org/blubber/
- Thanks Alex for the review!
ASIDE: Moving scap back to gerrit -- going to use the test portion of the pipeline to run tests -- was really nice and simple (for a person who has contributed to blubber anyway) https://phabricator.wikimedia.org/D1138

Serviceops

Zotero has been handed over to Marrielle today \o/
- Managed to deploy, rollback and get changes through the pipeline
- One issue that did come up was the difficulty of finding out the version/tag of the image.
- Should jenkins-bot comment on the change and say "Here's the newly created image: <version>"? +1 +1 +1 (even better if it's not in a comment but somewhere more visible)
  - Joe: main technical points are there, but we need to polish the ui of the pipeline
  - Jeena: is there no visual indication in jenkins?
  - Dan: Kinda sorta -- we have the blue ocean dashboard, but it's not the default, it needs work -- feedback needs to be addressed sooner rather than later
- fselles: I ran a patch through the pipeline and it failed, which is fine, but I had no way to rerun it
- thcipriani: currently you can comment "recheck" on a patch, but that is totally not discoverable, I want a gerrit plugin for this

Services

As Always