Problem: how do teams collaborate on data?
Pro:
- need is apparent, and can be intertwined with Netcheck work
- can work with Netlify + RKI data team on it
Con:
- not necessarily a hot topic for Renard / academia in general (SWE research problems ...)
Topics that Renard / Ferdous / Athar are publishing on:
- local hierarchical classification (publishes a Python Library!)
- Deep learning with Pathogens
- detecting viruses in genome sequencing
The Python library topic could be a good argument for convincing BYR.
Idea: Create a tool / workflow that statisticians can use to verify + document their data.
Topic Draft: tooling to document and verify assumptions on evolving data
Statistical Tests for Verification + Explanation
Goal: easily keep track of knowledge about data
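To make the idea more concrete, here is a minimal sketch of what "document + verify assumptions" could look like in code. Everything in it is hypothetical and only for illustration: the `Assumption` structure, the helper names `in_range` and `mean_close_to`, the threshold `max_z`, and the sample numbers are assumptions, not an existing tool. The mean check is a simple z-score comparison against a reference sample, standing in for whatever statistical test would actually be chosen.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Callable, Dict, List, Sequence

Check = Callable[[Sequence[float]], bool]


@dataclass
class Assumption:
    """One documented piece of knowledge about a data column (hypothetical)."""
    name: str
    description: str  # the human-readable documentation
    check: Check      # the automated verification for a new batch


def in_range(low: float, high: float) -> Check:
    """Build a check that every value lies within [low, high]."""
    return lambda values: all(low <= v <= high for v in values)


def mean_close_to(reference: Sequence[float], max_z: float = 3.0) -> Check:
    """Build a check that the batch mean stays near the reference mean.

    Uses a plain z-score with the reference standard deviation; the
    threshold max_z is an illustrative assumption.
    """
    ref_mean = mean(reference)
    ref_sd = stdev(reference)

    def check(values: Sequence[float]) -> bool:
        standard_error = ref_sd / len(values) ** 0.5
        return abs(mean(values) - ref_mean) <= max_z * standard_error

    return check


def verify(assumptions: List[Assumption], batch: Sequence[float]) -> Dict[str, bool]:
    """Run every documented assumption against a new batch of data."""
    return {a.name: a.check(batch) for a in assumptions}


# Made-up reference sample and assumptions for a single numeric column.
reference = [10, 12, 11, 13, 9, 11, 10, 12]
assumptions = [
    Assumption("plausible_range", "Values are between 0 and 20.", in_range(0, 20)),
    Assumption("stable_mean", "Mean stays close to the reference sample.",
               mean_close_to(reference)),
]

for batch in ([11, 10, 12, 11], [18, 19, 17, 18]):
    print(batch, verify(assumptions, batch))
```

The point of the design is that the description and the check live in the same object, so the documentation cannot silently drift away from what is actually verified as the data evolves.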
DACS Fit:
- DACS develops statistical and computational methods to automatically analyze large volumes of data, filter out relevant signals, and appropriately incorporate prior knowledge. [...] What matters here is the targeted tailoring of methods to specific, practical problems.
How will I work?
- survey different teams (across different companies + industries) to find out how they keep track of their data knowledge
- identify problems