A selection of projects I have been working on.
Not-for-profit
100 Queries
In the context of SlimZoeken, we used web search as a lens to investigate the following question:
- What is the content that (Dutch) children find on the internet?
For this study, we selected one hundred queries from a list of two hundred random, authentic searches by primary school pupils.
We manually annotated the search results to evaluate their quality and suitability for a specific audience. The presented approach
- provides real-world insights into the web as seen through the eyes of Dutch children, and
- introduces a method that can be applied to various cases.
Output:
- Git repository with notebook for the data analysis
- Presentation at the i&i spring conference 2024 (in Dutch; Conference for Computer Science & Digital Literacy Teachers)
- Presentation at the HSN Conference 2024 (in Dutch; Conference for Teaching Dutch)
The Syllabus
The Syllabus is a project related to the Center for the Advancement of Infrastructural Imagination (CAII):
We are a non-profit knowledge curation platform committed to defending and strengthening a well-informed public sphere. To do that, we strive to unearth, disseminate, and highlight high-quality information – without deepening public dependence on opaque algorithmic solutions pushed by Big Tech. We do so both by offering individual subscribers a “clean” feed of high-quality content – and by working with institutions (think-tanks, foundations, media, companies) who have their own bespoke information needs.
I have been involved since the early stages of the project, designing and implementing Natural Language Processing (NLP) solutions that semi-automatically support the curation process.
Research and Development
In my role as a research software engineer at the Netherlands eScience Center, I have been involved in the following projects:
As part of our fellowship programme, I have been a mentor for:
- 4CAT: a research tool that can be used to analyse and process data from online social platforms.
- Inseq: a PyTorch-based hackable toolkit to democratize the study of interpretability for sequence generation models.
Journalism
Until 2014, I worked as an editor for various open source and software magazines (in German). Since then, I have written occasionally about open source software, large language models and artificial intelligence, and other (mostly) technology-related topics.
- LinuxMagazine (in English)
- Linux-Magazin (in German)
- iX (in German)
- LinuxUser and EasyLinux (in German)
- Jungle World (in German)