Frits Hermans
I work as a data scientist for a financial institution. My main interests are entity resolution, fuzzy matching, classification on imbalanced data, and aggregation learning.
Some of the libraries I created or co-created:
Deduplipy - Entity resolution package (deduplipy.com, GitHub, PyData Global presentation)
Spark-Matcher - Entity resolution and fuzzy matching at scale in Spark (GitHub)
PyMinHash - Minhashing in Python (GitHub)
Other: