
Wikimedia Release Engineering Team/DataDataData Sync Up/2019-06-04

From mediawiki.org

2019-06-04


Phab task


Last time


Today's Agenda

    • Intro email was sent to Analytics. Nuria responded, asking for a more formal use case document.
    • Dan is on leave starting next week.

Schema and use cases


What data we have currently or are planning to collect

  • Schema
  • Data samples

How we might want to query that data

  • Our data is highly structured (see schemas)
    • Is Hadoop or ES more appropriate for that? Would we lose structure by putting it in Hadoop?
    • How much do we have to know about the data's structure before we put it in ES?
      • Can relationships/schema be changed after data is stored?

TODOs (by next meeting)
