Heya, y’all! It’s the time of the week where I tell you what I did this week, what I learned, and what I’ll do next week.
Coming out of the Easter holidays, Tuesday and Wednesday were packed with interviews: Theresa Kocher and Pia Rissom from RKI, Katharina Baum of DACS, and Paul Sieben from the Naumann BP. I didn’t get around to fully writing up my learnings, but here’s the gist:
With those learnings, I started working on a first implementation. The current working title is “datafox”, and the code lives at http://github.com/skn0tt/datafox.
Here’s a code snippet for how developers will be able to use it:
import pandas as pd
import datafox
datafox.connect(server="lynx.services.rki.de", api_key="abcde")
income_df = pd.read_csv("some_csv")
with datafox.test(income_df).describe(
    title="Income", description="Income description"
) as income:
    with income.age as age:
        age.expect_numbers()
        age.expect_between(0, 100)

    with income.salary as salary:
        salary.expect_numbers()
        salary.expect_not_null()
        salary.expect_between(0, "300k")
        salary.expect_normal(alpha=0.05)
## alternative, without with statements:
income_datafox = datafox.test(income_df)
income_datafox.describe(
title="Income Distribution", description="Used for analyzing market changes"
)
income_datafox.age.describe(title="age of participants")
income_datafox.age.expect_numbers()
income_datafox.age.expect_between(0, 100)
income_datafox.salary.describe(unit="$")
income_datafox.salary.expect_numbers()
income_datafox.salary.expect_not_null()
income_datafox.salary.expect_between(0, "300k")
income_datafox.salary.expect_normal(alpha=0.05)
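For anyone wondering how an interface like this can work under the hood, here is a minimal sketch. It is not the datafox implementation: `TableProxy`, `ColumnProxy`, and `failures()` are hypothetical names, a plain dict of lists stands in for the pandas DataFrame to keep the example dependency-free, and only two of the expectations are mimicked. The idea is that `__getattr__` turns `table.age` into a per-column object, and `__enter__`/`__exit__` make that object usable in a `with` statement.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnProxy:
    """Hypothetical per-column object: runs checks and collects failures."""
    name: str
    values: list
    failures: list = field(default_factory=list)

    def expect_not_null(self):
        if any(v is None for v in self.values):
            self.failures.append(f"{self.name}: contains null values")
        return self

    def expect_between(self, low, high):
        # Nulls are skipped here; expect_not_null is responsible for them.
        bad = [v for v in self.values if v is not None and not (low <= v <= high)]
        if bad:
            self.failures.append(
                f"{self.name}: {len(bad)} value(s) outside [{low}, {high}]"
            )
        return self

    # These two methods are all that `with table.age as age:` needs.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions raised inside the block


class TableProxy:
    """Hypothetical table wrapper (assumption: dict of column name -> list)."""

    def __init__(self, columns):
        self._columns = {n: ColumnProxy(n, vals) for n, vals in columns.items()}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so `table.age`
        # resolves to the proxy for the "age" column.
        try:
            return self.__dict__["_columns"][name]
        except KeyError:
            raise AttributeError(name) from None

    def failures(self):
        return [f for col in self._columns.values() for f in col.failures]


income = TableProxy({"age": [23, 41, 130], "salary": [52000, None, 61000]})

with income.age as age:
    age.expect_between(0, 100)

income.salary.expect_not_null()

print(income.failures())
# ['age: 1 value(s) outside [0, 100]', 'salary: contains null values']
```

Because `__enter__` returns the column object unchanged, the `with` style and the plain attribute style end up calling exactly the same methods.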
I started some brainstorming for a domain model that I can use to capture all expectations & errors:
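To give a rough idea of what a model like this could look like in code, here is an illustrative sketch using dataclasses. It is an assumption for illustration only, not the actual datafox model: the class names (`Expectation`, `Result`, `Report`) and all of their fields are made up.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Expectation:
    column: str                                  # column the check applies to
    kind: str                                    # e.g. "between", "not_null"
    params: dict = field(default_factory=dict)   # arguments of the check


@dataclass(frozen=True)
class Result:
    expectation: Expectation
    passed: bool
    message: str = ""                            # explanation when a check fails


@dataclass
class Report:
    title: str
    description: str = ""
    results: list = field(default_factory=list)

    def errors(self):
        """Return only the results whose check did not pass."""
        return [r for r in self.results if not r.passed]


report = Report(title="Income")
between = Expectation("age", "between", {"low": 0, "high": 100})
report.results.append(Result(between, passed=False, message="1 value outside [0, 100]"))
report.results.append(Result(Expectation("salary", "not_null"), passed=True))

print([r.expectation.column for r in report.errors()])  # ['age']

# asdict() turns the nested dataclasses into plain dicts and lists,
# which is handy if the report should later be serialized (e.g. as JSON).
payload = asdict(report)
print(payload["results"][0]["expectation"]["kind"])      # between
```

Keeping expectations and their results as separate objects means the same expectation can be re-run against new data without losing earlier results.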
As always, here’s the list of other TODOs I accomplished this week:
Next week will hopefully bring two more interviews with NetCheck staff, plus a proper start on implementing all of this!
That’s it for the week. Off to Friday beers, have a wonderful weekend, y’all! Simon