Wikimedia Discovery/Meetings/Search retrospective 2017-01-25

From mediawiki.org

Format

Glad/Sad/Mad: http://retrospectivewiki.org/index.php?title=Glad,_Sad,_Mad

Note that "mad" and "sad" don't have to mean literally angry or saddened. They can be used in a playful way as well.

The previous retrospective was a Team Health Check: https://www.mediawiki.org/wiki/Wikimedia_Discovery/Meetings/Search_team_health_check_2016-11-30

Action items from the previous retro:

  • Dan: work with apps team on full-text searching thing -- Done (passed along info)
  • Dan: helping slow moving UI work move faster -- Seems to be going faster now, although not because of specific action by Dan

What has happened (since 2016-10-27)?

(Mostly pulled from Discovery/Status updates)

  • Inter-wiki search progress (http://sistersearch.wmflabs.org/ )
  • Closed many backlog tasks based on earlier work
  • EPIC: Review current ElasticSearch configuration, and use relevance lab to run tests to optimise the configuration to improve search result relevance
  • Implement a new fulltext query
  • Image search by file size and file type
  • BM25 is now enabled on the ten wikis with largest traffic.
  • Determined goals for the upcoming quarter
  • A workshop was held in Germany on advanced search syntax on Wikipedia
  • Load tests for cross-project searching were completed successfully.
  • We've put together a draft proposal for how to deal with the interaction of all the possible additional search options
  • Upgraded to Java 8
  • The time needed to restart our elasticsearch clusters is improving (T145065)
  • Holidays
  • Dev Summit/All Hands
  • Secondary search results are now possible over the API!
  • Finalized the second BM25 testing analysis
  • Finished writing up, summarizing, and recommending extensive changes to TextCat for language identification
  • Refactoring and cleanup, including moving phan to Jenkins
  • Guillaume investigating I/O performance of elasticsearch servers
  • New elasticsearch and WDQS servers racked and (almost) configured
  • Katie asked us about how we manage quality
  • Created a new search/Learn to Rank (LTR) plugin (not prod-ready)
  • Various investigations of using dynamic Bayesian networks for estimating relevance

What has made you mad?

  • Timeout issues with insource (regex search) in production. Still not fixed.
  • Google hangouts being stupid and slow +1 (oh yes)
    • (Google hangout is SOO much better than the proprietary solution I was used to)
  • My untameable hair
  • My webcam not working after OS update (mine too - had to get a new one from tech)
  • security patch breaking production - something is missing there in the process
    • Maybe we should nudge a bit to see if the process could be improved

What has made you sad?

  • Realizing there is so much I do not understand about disk IO
  • Mikhail was unable to attend today's retro
  • Yuri leaving Discovery (no longer staff; remaining as a volunteer)
  • Discussion on quality died too soon (was it wrapped up or just left hanging?)
    • This was from a question from Katie. Seems to have just been left hanging. Maybe we are happy?
    • Main focus of Katie may have been regarding other teams
  • Results of initial learning to rank experiments were only promising for popular queries (because it was trained on popular queries)
  • kerfuffle over Interactive Team (and timing of announcement + vacation)
  • sticking my nose back in GC tuning (I already know this is going to eat time like crazy)
  • BM25 being delayed by technical difficulties (rightly so, but still disappointed)
  • Elasticsearch major upgrades still require full cluster restart
    • That's the ES plan, so not something we can necessarily fix
  • Somewhat weak attendance (by everyone else, not Discovery people) at the Discovery quarterly review... hard to interpret whether this is confidence or indifference in our work
    • Reading was more full
    • Probably doesn't even mean anything, honestly

What has made you glad?

  • Seeing people in person at the All Hands! +1 +1 +1 +1 +1 +1
  • A lot of good planning and designing stuff done @ dev summit & all hands
  • Eating In-n-Out burger during the All Hands! +1 (note: In-n-Out != White Castle) (so true!)
  • Starting to understand *some* things about disk IO
  • Seeing the progress on the Labs instance for the front end/back end of the new sister search +1
  • talking with other PMs (outside of Discovery) about how to improve discovery of articles written in languages other than English
  • The excellent conversation and documentation around solving problems and researching solutions (a.k.a. Trey's notes) +that!+1+1+1
  • Knowing it is possible to restart an elasticsearch node in < 2 minutes
  • Big improvements in TextCat performance (on test data, but still) +1
  • ES 2 -> 5 doesn't seem to be as big a change as 1 -> 2
  • Quarterly review went well and the audience seemed excited about the new search stuff coming out in Q3 +1
  • Getting more involved with the (entire) Search team again (Deb) +1
  • Dan's untamable hair

What else is on your mind?

  • Do we have a longer-term maintenance plan for the things the interactive team has been doing (maps, graphoid, etc.)?

Discussion

  • Will Discovery continue to make sense if we have just 2 projects, one of which only has Stas working on it?
    • We still have portal and analysis as well
    • We're not aware of any thoughts or discussions about Discovery going away
  • Wasn't aware of stuff happening with the interactive team until the announcement
    • Doesn't directly affect my work
    • Says something about Discovery that one team is being disbanded and that isn't affecting the other teams. Low cohesion.
      • Not unique to Discovery. Even worse in reading, since they have a micro-vertical. Same in editing.
    • Being remote leads to hearing less gossip (not that gossip is a good thing)
      • These teams (search and interactive) were pretty siloed. Maybe we should look for ways for members to support other teams.
      • Remoteness is not actually a huge factor in these kinds of things, so this is a broader communication issue
      • People in the office might have noticed other people looking unhappy
    • Seemed like it happened so quickly
      • It had been brewing for months
      • Some had been documented
    • Communication is bidirectional. The interactive team didn't communicate out as much as they could have.
  • We used to have all-Discovery retrospectives every month
    • We stopped due to scheduling problems, retros getting longer, and more conversations not relevant to most people in the room
    • Maybe all-Discovery retros would have shared knowledge of some of the issues with the interactive team
    • If more of us had known about problems, how might we have helped?
    • We don't have a lot of people moving between teams or asking for help from other teams (which has pros and cons)
  • Do we have a plan for long-term maintenance of the work the interactive team was doing?
    • Plans are being made; discussions are active
  • We chose an imperfect early announcement rather than a later perfect one.

Action items

  • Stas: Look into (talking about) improving how security patches are handled
  • Kevin: Consider scheduling a work-centric version of the unmeeting
    • or maybe a hangout that's always on? or on for a period of time to talk about stuff?
    • Maybe every 4th unmeeting?
    • Maybe allow non-tech work conversations in unmeetings?