This paper proposes a range of probabilistic models of local expertise based on geo-tagged social network streams. We assume that frequent visits result in greater familiarity with the location in question. To capture this notion, we rely on spatio-temporal information from users’ online check-in profiles. We evaluate the proposed models on a large-scale sample of geo-tagged and manually annotated Twitter streams. Our experiments show that the proposed methods outperform both intuitive baselines and established models such as the Iterative Inference scheme.
This paper has been accepted for presentation at ECIR 2016.
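As a rough illustration of the assumption that frequent visits imply familiarity, the sketch below scores users for a location by their smoothed share of check-ins there. It is not the model from the paper: the function name `expertise_scores`, the additive smoothing constant `alpha`, the input format, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def expertise_scores(checkins, location, alpha=1.0):
    """Estimate P(user | location) from check-in counts with additive smoothing.

    checkins: iterable of (user, location) pairs (hypothetical input format).
    alpha:    additive smoothing constant (an assumption, not from the paper).
    """
    counts = defaultdict(Counter)
    users = set()
    for user, loc in checkins:
        counts[loc][user] += 1
        users.add(user)

    # Normalise over all known users so the scores sum to 1.
    total = sum(counts[location].values()) + alpha * len(users)
    return {u: (counts[location][u] + alpha) / total for u in sorted(users)}


# Toy data: alice checks in at the cafe three times, bob once.
toy_checkins = [
    ("alice", "cafe"), ("alice", "cafe"), ("alice", "cafe"),
    ("bob", "cafe"), ("bob", "museum"),
]
scores = expertise_scores(toy_checkins, "cafe")
ranking = sorted(scores, key=scores.get, reverse=True)
```

A model along the lines of the abstract would also use the temporal side of the check-in profile (for example recency or regularity of visits), which this count-based sketch ignores.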
The use of crowdsourcing for document relevance assessment has been found to be a viable alternative to corpus annotation by highly trained experts. The question of quality control is a recurring challenge that is often addressed by aggregating multiple individual assessments of the same topic-document pair from independent workers. In the past, such aggregation schemes have been weighted or filtered by estimates of worker reliability based on a multitude of behavioral features. We propose an alternative approach by relying on document information. Inspired by the clustering hypothesis, we assume textually similar documents to show similar degrees of relevance towards a given topic. Following up on this intuition, we propagate crowd-generated relevance judgments to similar documents, effectively smoothing the distribution of relevance labels across the similarity space.
Our experiments are based on TREC Crowdsourcing Track data and show that even simple aggregation methods utilizing document similarity information significantly improve over majority voting in terms of accuracy as well as cost efficiency. Combining methods for both aggregation and active learning based on document information improves the results even further.
This paper has been accepted for presentation at the 24th ACM Conference on Information and Knowledge Management (CIKM) in Melbourne, Australia.
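The idea of propagating crowd judgments to textually similar documents can be sketched as follows. This is a minimal illustration under stated assumptions, not the aggregation method evaluated in the paper: documents are compared with cosine similarity over raw term counts, and the blending parameter `weight` and all function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def majority_vote(labels):
    """Fraction of positive (1) votes; None if the document has no votes."""
    return sum(labels) / len(labels) if labels else None


def smoothed_relevance(docs, votes, weight=0.5):
    """Blend each document's own vote share with a similarity-weighted
    average of the vote shares of the other documents.

    docs:   {doc_id: text}
    votes:  {doc_id: [0/1 labels from crowd workers]}
    weight: share given to the neighbours (hypothetical parameter).
    """
    vectors = {d: Counter(text.lower().split()) for d, text in docs.items()}
    own = {d: majority_vote(votes.get(d, [])) for d in docs}
    result = {}
    for d in docs:
        num = den = 0.0
        for other in docs:
            if other == d or own[other] is None:
                continue
            sim = cosine(vectors[d], vectors[other])
            num += sim * own[other]
            den += sim
        neighbour = num / den if den else None
        if own[d] is None:
            result[d] = neighbour  # unjudged: rely on neighbours only
        elif neighbour is None:
            result[d] = own[d]     # no similar judged documents
        else:
            result[d] = (1 - weight) * own[d] + weight * neighbour
    return result


# Toy data: d2 has no crowd votes but closely resembles d1.
docs = {
    "d1": "solar panel efficiency study",
    "d2": "solar panel efficiency report",
    "d3": "football match results",
}
votes = {"d1": [1, 1, 1], "d2": [], "d3": [0, 0]}
estimates = smoothed_relevance(docs, votes)
```

In the toy data, the unjudged document `d2` inherits its estimate entirely from `d1`, because it shares no terms with `d3`. This is also where the cost saving comes from: fewer documents need direct crowd judgments.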
Information about a user’s domain knowledge and interest can be an important signal for many information retrieval tasks such as query suggestion or result ranking. State-of-the-art user models rely on coarse-grained representations of the user’s previous knowledge about a topic or domain. We study query refinement using eye-tracking in order to gain precise and detailed insight into which terms the user was exposed to in a search session and which ones they showed a particular interest in. We measure fixations on the term level, allowing for a detailed model of user attention. To allow for widespread exploitation of our findings, we generalize from the restrictive eye-gaze tracking to a more accessible signal: mouse cursor traces. Based on the public API of a popular search engine, we demonstrate how query suggestion candidates can be ranked according to traces of user attention and interest, resulting in significantly better performance than achieved by an attention-oblivious industry solution. Our experiments suggest that modelling term-level user attention can be achieved with great reliability and holds significant potential for supporting a range of traditional IR tasks.
The full version of this work has been accepted for presentation at the 38th Annual ACM SIGIR Conference in Santiago, Chile.
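To make the ranking step in the abstract above concrete, the sketch below reorders query suggestion candidates by how much attention the user paid to their terms. It is only an illustration: the scoring rule (mean per-term weight), the function name `rank_suggestions`, and the toy attention weights are assumptions, and the paper's actual model may differ.

```python
def rank_suggestions(candidates, attention):
    """Order query suggestions by the mean attention weight of their terms.

    candidates: list of suggestion strings.
    attention:  {term: weight}, e.g. accumulated fixation or hover time
                (hypothetical input; unseen terms count as 0).
    """
    def score(candidate):
        terms = candidate.lower().split()
        return sum(attention.get(t, 0.0) for t in terms) / len(terms) if terms else 0.0

    # sorted() is stable, so ties keep the original (e.g. search engine) order.
    return sorted(candidates, key=score, reverse=True)


# Toy data: the user dwelt on "habitat" far longer than on "price".
attention = {"jaguar": 1.2, "habitat": 3.5, "price": 0.1}
candidates = ["jaguar price", "jaguar habitat", "jaguar car dealer"]
ranked = rank_suggestions(candidates, attention)
```

The same function works whether the weights come from eye-tracking fixations or from mouse cursor traces, since it only needs a per-term weight.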
Many generative language and relevance models assume conditional independence between the likelihood of observing individual terms. This assumption is obviously naive, but also hard to replace or relax. There are only very few term pairs that actually show significant conditional dependencies, while the vast majority of co-located terms have no implications for the document’s topical nature or relevance towards a given topic. It is exactly this situation that we capture in a formal framework: a limited number of meaningful dependencies in a system of largely independent observations. Making use of the formal copula framework, we describe the strength of causal dependency in terms of a number of established term co-occurrence metrics. Our experiments based on the well-known ClueWeb’12 corpus and TREC 2013 topics indicate significant gains in retrieval performance when we formally account for the dependency structure underlying pieces of natural language text.
The full version of this work has been accepted for presentation at the 38th Annual ACM SIGIR Conference in Santiago, Chile.
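The abstract above does not name the co-occurrence metrics it uses. As one well-known example of such a metric, the sketch below computes document-level pointwise mutual information (PMI) for term pairs; the choice of PMI, the whitespace tokenisation, and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Document-level PMI (base 2) for every term pair that co-occurs at least once."""
    n = len(documents)
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing both terms of a pair
    for text in documents:
        terms = sorted(set(text.lower().split()))
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))  # pairs are alphabetically ordered
    return {
        pair: math.log2((count / n) / ((term_df[pair[0]] / n) * (term_df[pair[1]] / n)))
        for pair, count in pair_df.items()
    }


# Toy corpus: "new york" co-occurs more often than chance, "new car" less often.
corpus = ["new york city", "new york times", "new car", "city car park"]
table = pmi_table(corpus)
```

Positive values indicate pairs that co-occur more often than independence would predict. Most pairs in a realistic corpus sit near zero, which matches the abstract's picture of a few meaningful dependencies among largely independent observations.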
Crowdsourcing has developed to become a magic bullet for the data and annotation needs of modern-day IR researchers. The number of academic studies as well as industrial applications that employ the crowd for creating, curating, annotating or aggregating documents is growing steadily. Aside from the multitude of scientific papers relying on crowd labour for system evaluation, there has been a strong interdisciplinary line of work dedicated to finding effective and efficient forms of using this emerging labour market. Central research questions include (1) how to estimate and optimize the reliability and accuracy of often untrained workers in comparison with highly trained professionals; (2) how to identify or prevent noise and spam in the submissions; and (3) how to distribute tasks and remunerations across workers most cost-efficiently. The vast majority of studies understand crowdsourcing as the act of making micro payments to individuals in return for compartmentalized units of creative or intelligent labour.
Gamification proposes an alternative incentive model in which entertainment replaces money as the motivating force drawing the workers. Under this alternative paradigm, tasks are embedded in game environments in order to increase the attractiveness and immersion of the work interface. While gamification rightfully points out that paid crowdsourcing is not the only viable option for harnessing crowd labour, it is still merely another concrete instantiation of the community’s actual need: a formal worker incentive model for crowdsourcing. Only by understanding individual motivations can we deliver truly adequate reward schemes that ensure faithful contributions and long-term worker engagement. It is unreasonable to assume that the binary money vs. entertainment decision reflects the full complexity of the worker motivation spectrum. What about education, socializing, vanity, or charity? All of these are valid examples of factors that compel people to lend us their work force. This is not to say that we necessarily have to promote edufication and all its possible siblings as new paradigms; they should merely start to take their well-deserved space on our mental map of crowdsourcing incentives.
In this talk, we will cover a range of interesting scenarios in which different incentive models may fundamentally change the way in which we can tap the considerable potential of crowd labour. We will discuss cases in which standard crowdsourcing and gamification schemes reach the limits of their capabilities, forcing us to rely on alternative strategies. Finally, we will investigate whether crowdsourcing indeed even has to be an active occupation or whether it can happen as a by-product of more organic human behaviour.
If you are interested in the full talk, please join us for the GamifIR Workshop at ECIR 2015 in Vienna, Austria.
Last week, I defended my PhD thesis entitled “Contextual Multidimensional Relevance Models” at TU Delft in The Netherlands. This is, in brief, what awaits the reader:
Information retrieval systems centrally build upon the concept of relevance in order to rank documents in response to a user’s query. Assessing relevance is a non-trivial operation that can be influenced by a multitude of factors that go beyond mere topical overlap with the query. This thesis argues that relevance depends on personal (Chapter 2) and situational (Chapter 3) context. In many use cases, there is no single interpretation of the concept that would optimally satisfy all users in all possible situations.
We postulate that relevance should be explicitly modelled as a composite notion comprised of individual relevance dimensions. To this end, we show how automatic inference schemes based on document content and user activity can be used in order to estimate such constituents of relevance (Chapter 4). Alternatively, we can employ human expertise, harnessed, for example, via commercial crowdsourcing or serious games to judge the degree to which a document satisfies a given set of relevance dimensions (Chapter 5).
Finally, we need a model that allows us to estimate the joint distribution of relevance across all previously obtained dimensions. In this thesis, we propose using copulas, a model family originating from the field of quantitative finance that decouples observations from their dependency structure and can account for complex non-linear dependencies among relevance dimensions (Chapter 6).
Get your copy here.
Modern relevance models consider a wide range of criteria in order to identify those documents that are expected to satisfy the user’s information need. With growing dimensionality of the underlying relevance spaces the need for sophisticated score combination and estimation schemes arises. In this paper, we investigate the use of copulas, a model family from the domain of robust statistics, for the formal estimation of the probability of relevance in high-dimensional spaces. Our experiments are based on the MSLR-WEB10K and WEB30K datasets, two annotated, publicly available samples of hundreds of thousands of real Web search impressions, and suggest that copulas can significantly outperform linear combination models for high-dimensional problems. Our models achieve a performance on par with that of state-of-the-art machine learning approaches.
This paper was accepted for presentation at the 23rd ACM Conference on Information and Knowledge Management (CIKM) in Shanghai, China.
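To illustrate how a copula separates marginal score distributions from their dependency structure, the sketch below fits a bivariate Gaussian copula to two relevance dimensions. The abstract above does not state which copula family was used, so the Gaussian family, the rank-based pseudo-observations, the dimension names, and the toy scores are all assumptions chosen for brevity.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def pseudo_observations(values):
    """Map raw scores to (0, 1) via ranks, discarding the marginal distribution.

    Ties are broken by input order in this simplified version.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return [r / (len(values) + 1) for r in ranks]


def normal_score_correlation(u, v):
    """Pearson correlation of the normal scores, used here as the copula parameter."""
    x = [_STD_NORMAL.inv_cdf(p) for p in u]
    y = [_STD_NORMAL.inv_cdf(p) for p in v]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula at (u, v), with |rho| < 1."""
    x, y = _STD_NORMAL.inv_cdf(u), _STD_NORMAL.inv_cdf(v)
    exponent = -(rho ** 2 * (x ** 2 + y ** 2) - 2 * rho * x * y) / (2 * (1 - rho ** 2))
    return math.exp(exponent) / math.sqrt(1 - rho ** 2)


# Toy scores for two hypothetical relevance dimensions of five documents.
topicality = [0.9, 0.7, 0.4, 0.3, 0.1]
freshness = [0.8, 0.9, 0.5, 0.2, 0.3]
u = pseudo_observations(topicality)
v = pseudo_observations(freshness)
rho = normal_score_correlation(u, v)
```

With `rho = 0` the density equals 1 everywhere, which corresponds to treating the dimensions as independent. A positive `rho` raises the density where both dimensions are high or both are low, a kind of interaction that a linear combination of scores cannot express.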
Data visualization and exploration tools are crucial for data scientists, especially during pilot studies. In this paper, we present an extensible open-source workbench for aggregating, summarizing and filtering social network profiles derived from tweets. We demonstrate its range of basic features for two use cases: geo-spatial profile summarization based on check-in histories and social media based complaint discovery in water management.
This paper was accepted as a demonstrator at the 6th Information & Interaction in Context Conference in Regensburg, Germany.
Crowdsourcing is often applied for the task of replacing the scarce or expensive labour of experts with that of untrained workers. In this paper, we argue that this objective might not always be desirable, and that we should instead aim at leveraging the considerable work force of the crowd in order to support the highly trained expert. Here, we demonstrate this different paradigm on the example of detecting malignant breast cancer in medical images. We compare the effectiveness and efficiency of experts to that of crowd workers, finding that experts achieve significantly better performance at greater cost. In a second series of experiments, we show how the comparably cheap results produced by crowdsourcing workers can serve to make experts more efficient AND more effective at the same time.
The full version of this article has been accepted for presentation at the ECIR 2014 Workshop on Gamification for Information Retrieval (GamifIR) in Amsterdam, The Netherlands.
The Internet is the largest source of information in the world. Search engines help people navigate the huge space of available data in order to acquire new skills and knowledge. In this paper, we present an in-depth analysis of sessions in which people explicitly search for new knowledge on the Web based on the log files of a popular search engine. We investigate within-session and cross-session developments of expertise, focusing on how the language and search behavior of a user on a topic evolves over time. In this way, we identify those sessions and page visits that appear to significantly boost the learning process. Our experiments demonstrate a strong connection between clicks and several metrics related to expertise. Based on models of the user and their specific context, we present a method capable of automatically predicting, with good accuracy, which clicks will lead to enhanced learning. Our findings provide insight into how search engines might better help users learn as they search.
This work has been accepted for publication at the 7th ACM Conference on Web Search and Data Mining (WSDM) in New York City.