Wikimedia Release Engineering Team/Deployment pipeline/2019-01-17

From mediawiki.org

Last Time

Current Quarter Goals

General

  • Pipeline cabal meetup as part of All-Hands?
    • do we have any current projects (2 weeks from now) that would benefit from in-person high-bandwidth?
    • Hacking on ORES maybe?
    • Productivity aside, it was fun at the previous Hackathon. :)
    • Joe: We have some time Monday evening (maybe) but ideally we'd work at the hackathon that no one goes to :(
    • Lars: is anyone opposed to hanging out?
    • Alex: maybe a small group since that's more productive
    • Joe: we should come up with some ideas for stuff to work on

TODO: start an email thread

  • Lars's email > I'm looking for feedback on whether the vision I'm describing is where we want to end up.
    • fselles: sounds like a comprehensive plan. re:deployment velocity we need existing metrics for this
      • thcipriani: as an aside, RelEng is thinking about this
      • alex: don't we have statsd counters?
      • thcipriani: we do, but we can't trace individual scap commands to patches or windows etc
      • jeena: don't we care about how long it takes an individual patch to hit production?
      • dan: I think elasticsearch might be better than statsd since we need to tie this together using logged metadata (repo deployed, scap commands, patchsets deployed, etc.)
    • Joe: one thing I didn't see was self-servicing, i.e., create a repo and everything is setup for a developer -- how much toil is needed for this?
    • Alex: SRE/serviceops, RelEng, 4 different commits in 4 different projects -- there is quite a lot of friction here
    • Joe: something more is needed in terms of UI from the point of view of the developer, we should think about setup from all points of view, maybe when someone creates a .pipeline file it sets up the pipeline for them
    • Lars: I agree that the developer experience should be massively simpler than the current one; as I was thinking about this I hadn't gotten as far as the UI yet
    • Joe: we want to take it further!
    • Lars: there is a proposal to start continuous deployment with the Blubberoid service, i.e., not load balancers and k8s; is there any objection to having a continuously deployed Blubberoid?
    • Joe: what you are proposing is more than CDep, it is total ownership of a service -- Icinga needs work to allow this -- but I think we can experiment with CDep with Blubberoid
    • Lars: Proposal for this is due to Blubberoid being a nice, safe, small, and friendly service -- it has no dependencies or databases; input over http and output over http -- can't get more simple
    • fselles: icinga does need work, but we need metrics
    • Lars: RelEng will start thinking about what we need to make this happen
    • Joe: we need to work on permissions for k8s
    • Antoine: or we get an Icinga container in the pod that runs the service and deploy it ourselves via helm?
    • Joe: Pearson does a namespace per project including a Jenkins instance but let's not do this :) Sadly CDep may not be possible for some services since there are many interdependent services, so it's best to start with something simple
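
The deployment-velocity thread above (how long an individual patch takes to hit production, and tying that to logged metadata) can be sketched roughly as follows. This is only an illustration: the record layout, field names, repo name, and change numbers are all assumptions made up for the example, not an existing scap or Elasticsearch schema.

```python
from datetime import datetime
from statistics import median

# Hypothetical deployment log records. Every field name and value here is an
# assumption for illustration; it is not an existing scap or Elasticsearch schema.
DEPLOY_EVENTS = [
    {
        "repo": "example/service-one",
        "command": "scap deploy",
        "deployed_at": "2019-01-15T18:30:00+00:00",
        "patchsets": [
            {"change": "100001", "merged_at": "2019-01-15T10:00:00+00:00"},
            {"change": "100002", "merged_at": "2019-01-14T18:30:00+00:00"},
        ],
    },
    {
        "repo": "example/service-one",
        "command": "scap deploy",
        "deployed_at": "2019-01-16T12:00:00+00:00",
        "patchsets": [
            # 100002 was already deployed above, so only its first deploy counts.
            {"change": "100002", "merged_at": "2019-01-14T18:30:00+00:00"},
            {"change": "100003", "merged_at": "2019-01-16T09:00:00+00:00"},
        ],
    },
]


def lead_times_hours(events):
    """Map each change to the hours between its merge and its first deployment."""
    lead_times = {}
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["deployed_at"]))
    for event in ordered:
        deployed = datetime.fromisoformat(event["deployed_at"])
        for patchset in event["patchsets"]:
            if patchset["change"] in lead_times:
                continue  # keep the earliest deployment only
            merged = datetime.fromisoformat(patchset["merged_at"])
            lead_times[patchset["change"]] = (deployed - merged).total_seconds() / 3600
    return lead_times


if __name__ == "__main__":
    times = lead_times_hours(DEPLOY_EVENTS)
    for change, hours in sorted(times.items()):
        print(f"{change}: {hours:.1f} h from merge to production")
    print(f"median: {median(times.values()):.1f} h")
```

The point of the sketch is that the calculation needs the deploy command and the patchsets in the same record, which is the kind of linkage plain counters do not carry.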

RelEng

  • Dan is working on "Manually defining artifacts results in default copy of all project files"
    • Code proposal: .pipeline/blubber.yaml:
      development:
        copies:
          - from: build
            source: /bin/foo
            destination: /bin/foo
          - from: local
            source: ./config.dev.yaml
            destination: ./config.yaml
      
    • and a short-hand format/structure that expands
      development:
        copies: [build, local]
      
pipelines:
  serviceOne:
    blubberfile: serviceOne/blubber.yaml # could be the default based on service name for the dir
    helmConfig: serviceOne/helm.yaml # ditto
    directory: src/serviceOne
    variants:
      test: [phpunit, mocha] # defaults to ["test"]
      production: foo # defaults to "production", also supports false for test-only runs
  serviceTwo:
    directory: src/serviceTwo
    • Joe: let's keep everything that the developer needs to control in the repo; what Dan is proposing seems sane to me
    • Dan: this is the inverse pattern of MediaWiki
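
As a rough illustration of the defaults noted in the comments of the proposal above, the sketch below expands the `copies` shorthand and fills in per-service pipeline defaults. The function names are hypothetical, and two behaviours are assumptions rather than anything decided in the meeting: a shorthand name expands to an entry with only `from` set, and `helmConfig` defaults the same way as `blubberfile`.

```python
def expand_copies(copies):
    """Expand shorthand names such as "build" into mappings with a `from` key.

    Assumption: a bare name sets only `from`; source and destination are left
    to whatever defaults the real tool would apply.
    """
    expanded = []
    for entry in copies:
        if isinstance(entry, str):
            expanded.append({"from": entry})
        else:
            expanded.append(dict(entry))  # already in the long form
    return expanded


def resolve_pipeline(name, config=None):
    """Fill in the defaults described in the proposal's comments for one service."""
    config = config or {}
    variants = config.get("variants") or {}
    return {
        # No default for `directory` is stated in the notes, so it stays None if unset.
        "directory": config.get("directory"),
        # "could be the default based on service name for the dir"
        "blubberfile": config.get("blubberfile", f"{name}/blubber.yaml"),
        "helmConfig": config.get("helmConfig", f"{name}/helm.yaml"),
        # test defaults to ["test"]
        "test": list(variants.get("test", ["test"])),
        # production defaults to "production"; False means a test-only run
        "production": variants.get("production", "production"),
    }


if __name__ == "__main__":
    print(expand_copies(["build", "local"]))
    print(resolve_pipeline("serviceTwo", {"directory": "src/serviceTwo"}))
```

With this shape, a service like serviceTwo only has to declare its directory, which matches the goal of keeping developer-controlled settings small and in the repo.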

Minor update things

Serviceops

  • Zotero has been handed over to Marrielle today \o/
    • Managed to deploy, rollback and get changes through the pipeline
    • One issue that did come up was the difficulty of finding out the version/tag of the image.
    • Should jenkins-bot comment on the change and say "Here's the newly created image: <version>" +1 +1 (even better if it's not in a comment but somewhere more visible)+1
      • Joe: main technical points are there, but we need to polish the ui of the pipeline
      • Jeena: is there no visual indication in jenkins?
      • Dan: Kinda sorta -- we have the blue ocean dashboard, but it's not the default, it needs work -- feedback needs to be addressed sooner rather than later
    • fselles: I ran a patch through the pipeline and it failed, which is fine, but I had no way to rerun it
    • thcipriani: currently you can comment "recheck" on a patch, but that is totally not discoverable, I want a gerrit plugin for this
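
One way the jenkins-bot idea above could look is sketched below: it only builds the URL and JSON body for posting a comment on a change, without sending anything. It assumes Gerrit's REST "set review" endpoint (`/a/changes/{change-id}/revisions/{revision-id}/review` with a `message` field); the host name, change number, and image version are made up for the example, and how jenkins-bot is actually wired up is not covered by these notes.

```python
import json
from urllib.parse import quote


def image_comment(version):
    """Format the comment text suggested in the meeting."""
    return f"Here's the newly created image: {version}"


def build_review_request(base_url, change_id, message, revision="current"):
    """Return (url, json_body) for posting a comment on a Gerrit change.

    Assumes Gerrit's REST "set review" endpoint; the request is only built
    here, not sent, so authentication is left out.
    """
    change = quote(str(change_id), safe="")
    url = f"{base_url.rstrip('/')}/a/changes/{change}/revisions/{revision}/review"
    body = json.dumps({"message": message})
    return url, body


if __name__ == "__main__":
    # Hypothetical host, change number, and image version.
    url, body = build_review_request(
        "https://gerrit.example.org/",
        12345,
        image_comment("2019-01-17-120000-production"),
    )
    print("POST", url)
    print(body)
```

The same request shape would also carry a "recheck" message, which is roughly what a rerun button in a Gerrit plugin would need to send.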

Services

As Always
