11/06/2018: Learning from Partitioned Data
Monday 11/06/2018 at 15:00 in room B1.09
The next MSDSlab meeting will be on Monday (instead of Thursday) the 11th of June and will be presented by Lianne Ippel from Maastricht University. Lianne will present on two themes within the topic of learning from partitioned data.
- Row-by-row (streaming) learning (horizontal) and
- Privacy preserving machine learning (vertical).
- Check out the NWA big data route, especially the VWdata project: https://wetenschapsagenda.nl/route/waardecreatie-door-verantwoorde-toegang-tot-en-gebruik-van-big-data/ and/or (this one is in English) https://www.dutchdigitaldelta.nl/en/big-data/vwdata
- For those unfamiliar with FAIR data, read: ‘The FAIR Guiding Principles for scientific data management and stewardship’, Wilkinson, Dumontier, …, & Mons (2016), doi: 10.1038/sdata.2016.18
Over the last decade, the social research workflow has changed greatly. While data were previously often collected using paper-and-pencil questionnaires, nowadays they are often collected using webpages and smartphone applications. This change in data gathering has had many consequences, but in this talk I focus in particular on the partitioning of data. I will discuss two types of partitioned data. In horizontally partitioned data, the same variables are available for each respondent, but not all respondents are available in one central place (e.g., streaming data). In vertically partitioned data, by contrast, the same respondents are available at different sites or institutes, but each site can have its own set of features, which may or may not be shareable with other sites, e.g., due to the sensitive nature of the features. For these non-shareable features, privacy-preserving data mining/machine learning techniques are required. Your input during this part of the talk will be much appreciated!
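As an illustration of the two partitioning schemes described in the abstract (not taken from the talk itself), here is a minimal Python sketch. The toy respondents, the variable names `age` and `income`, the two "sites", and the helper `streaming_mean` are all hypothetical. The row-by-row mean update is just one simple example of learning from a stream without keeping earlier rows in memory.

```python
# Toy data: each row is one respondent (hypothetical values).
rows = [
    {"id": 1, "age": 34, "income": 41000},
    {"id": 2, "age": 51, "income": 58000},
    {"id": 3, "age": 27, "income": 36000},
    {"id": 4, "age": 45, "income": 52000},
]

# Horizontal partitioning: same variables, different respondents per part.
horizontal_parts = [rows[:2], rows[2:]]

# Vertical partitioning: same respondents, different variables per site.
site_a = [{"id": r["id"], "age": r["age"]} for r in rows]
site_b = [{"id": r["id"], "income": r["income"]} for r in rows]


def streaming_mean(values):
    """Update a mean one observation at a time, without storing past rows."""
    n, mean = 0, 0.0
    for x in values:
        n += 1
        mean += (x - mean) / n  # incremental update of the running mean
    return mean


# Process the horizontal parts row by row, as if they arrived as a stream.
stream = (r["age"] for part in horizontal_parts for r in part)
mean_age = streaming_mean(stream)
print(mean_age)
```

In the vertical case, `site_a` and `site_b` share only the respondent `id`. Any analysis that combines `age` and `income` would need either to pool the columns or to use a privacy-preserving technique, which this sketch does not attempt.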
Lianne Ippel recently started as a postdoctoral researcher at the Institute for Data Science at Maastricht University. She received her PhD degree from Tilburg University for her thesis “Multilevel Modeling for Data Streams with Dependent Observations”, for which she won the ‘Best Thesis Award’ at the General Online Research conference in Cologne (2018). Her research interests center on the ethical and responsible use of machine learning and machine learning models in relation to methodological issues such as response style, measurement invariance, and missing data.