Virtual worlds are a topic of steadily growing relevance. Some providers report user numbers that exceed the population of entire real-world nations. Virtual worlds typically exhibit a high degree of complexity, in some areas approaching the real world’s richness of detail. Without “living” in a given virtual world, it is hard to gain insights into that world and its inhabitants. Knowledge about users’ roles within a virtual world can be of socio-economic and scientific interest. Our ACM Computers in Entertainment article describes an automatic means of inferring such roles from textual communication. We give an introduction to virtual worlds, formalize the task of virtual world role detection, and evaluate performance against a manually annotated large-scale corpus. We present and discuss various approaches to virtual world role detection, showing that performance close to that of human judges can be achieved. We close by demonstrating a successful application of role detection in a free-to-play MMORPG.
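To give a flavour of text-based role detection, here is a minimal sketch of one possible approach: a naive-Bayes-style word-frequency classifier over chat lines. The toy data, the two roles ("healer", "trader"), and all function names are illustrative assumptions; the article's actual corpus, features, and role taxonomy differ.

```python
from collections import Counter
import math

# Hypothetical toy data: chat lines labelled with a player role.
TRAIN = [
    ("need healing on tank now", "healer"),
    ("casting heal on the warrior", "healer"),
    ("selling rare swords cheap", "trader"),
    ("buying ore paying gold", "trader"),
]

def train(data):
    """Collect per-role word counts for a naive-Bayes-style score."""
    counts, totals = {}, Counter()
    for text, role in data:
        words = text.split()
        counts.setdefault(role, Counter()).update(words)
        totals[role] += len(words)
    return counts, totals

def predict(text, counts, totals, alpha=1.0):
    """Pick the role maximising the add-alpha-smoothed log-likelihood of the text."""
    vocab = {w for c in counts.values() for w in c}
    best, best_score = None, float("-inf")
    for role in counts:
        score = sum(
            math.log((counts[role][w] + alpha) / (totals[role] + alpha * len(vocab)))
            for w in text.split()
        )
        if score > best_score:
            best, best_score = role, score
    return best

counts, totals = train(TRAIN)
print(predict("heal the tank", counts, totals))   # -> healer
print(predict("selling gold", counts, totals))    # -> trader
```

With enough labelled chat data per role, even such a simple unigram model can separate vocabulary-heavy roles; the approaches discussed in the article are of course more involved.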
Crowdsourcing has successfully established itself as a widely used means of collecting large-scale scientific corpora. Many research fields, including Information Retrieval, rely on this novel way of acquiring data. However, it is undermined by a significant share of workers who are primarily interested in producing quick generic answers rather than correct ones, in order to optimize their time efficiency and, in turn, earn more money. Recently, numerous sophisticated schemes for identifying such workers have been proposed. These, however, often require additional resources or impose artificial limitations on the task. We took a different approach and investigated means of making crowdsourced tasks a priori less attractive to cheaters.
“Increasing Cheat Robustness of Crowdsourcing Tasks” has been accepted for publication in Information Retrieval.
Relevance assessment of query–document pairs is expensive and boring? It does not have to be. We are currently developing a playful means of determining relevance: we select keywords from the pages and ask judges to associate them with topics in a game.
Please follow the link to the current version of the GeAnn annotation game.
Twitter is a widely used social networking service that enables its users to post short text-based messages, so-called tweets. POI (Point of Interest) tags on tweets provide human-readable, high-level information about a place that is more meaningful and easier to interpret than a pair of coordinates. We studied the prediction of POI tags based on a tweet’s textual content and time of posting. Potential applications include accurate positioning when GPS devices fail and the disambiguation of places located near each other. We treat this task as a ranking problem, i.e., we rank a set of candidate POIs for a given tweet using statistical models of language use and of the temporal distribution of tweets. To tackle the sparsity of tweets tagged with POIs, we use web pages retrieved by search engines as an additional source of evidence. Our experiments show that tweets indeed relate to their places of origin in both the textual and the temporal dimension.
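The ranking idea can be sketched as a log-linear mix of a smoothed unigram language model per POI and a temporal prior over posting hours. The two POIs, their word and hour counts, and the mixing weight below are made-up toy values, not the models or data from the study.

```python
import math
from collections import Counter

# Hypothetical POI models: word counts (from tweets/web pages) and an
# hourly posting histogram per POI.
POIS = {
    "Central Station": {"words": Counter({"train": 5, "delay": 3, "platform": 2}),
                        "hours": Counter({8: 10, 9: 8, 17: 9})},
    "City Stadium":    {"words": Counter({"goal": 6, "match": 4, "crowd": 2}),
                        "hours": Counter({20: 12, 21: 9})},
}

def score(tweet_words, hour, poi, lam=0.5, alpha=1.0):
    """Log-linear combination of a smoothed unigram LM and a temporal prior."""
    m = POIS[poi]
    n_words = sum(m["words"].values())
    vocab = {w for p in POIS.values() for w in p["words"]}
    text = sum(math.log((m["words"][w] + alpha) / (n_words + alpha * len(vocab)))
               for w in tweet_words)
    n_hours = sum(m["hours"].values())
    time = math.log((m["hours"][hour] + alpha) / (n_hours + alpha * 24))
    return lam * text + (1 - lam) * time

def rank(tweet, hour):
    """Order all candidate POIs by descending score for the given tweet."""
    words = tweet.lower().split()
    return sorted(POIS, key=lambda p: score(words, hour, p), reverse=True)

print(rank("amazing goal great match", 20))  # -> ['City Stadium', 'Central Station']
```

The parameter `lam` trades off textual against temporal evidence; pooling web-page text into the per-POI word counts is one way to counter the sparsity of POI-tagged tweets.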
This initial exploratory study will be presented as a poster at the 20th ACM International Conference on Information and Knowledge Management (CIKM) in Glasgow, UK.
Crowdsourcing is frequently used to obtain relevance judgments for query/document pairs. To obtain accurate judgments, each pair is judged by several workers. Consensus is usually determined by majority voting, and malicious submissions are typically countered by injecting gold-set questions with known answers. We put the performance of gold sets and majority voting to the test. After analyzing crowdsourcing results for a relevance judgment task, we design and evaluate an alternative method to reduce spam and increase accuracy. Using large-scale simulations, we compare the performance of different algorithms, inspecting accuracy and cost across experimental settings. The results show that gold sets and majority voting are less robust to malicious submissions than many believe and can easily be outperformed.
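A simulation of the kind described can be sketched in a few lines: binary judgments, a fixed share of spammers answering at random, honest workers with a given accuracy, and majority voting over a panel of judges. All parameter values here are illustrative, not those used in the study.

```python
import random

def simulate(n_pairs=1000, judges=3, spam_rate=0.4, worker_acc=0.9, seed=0):
    """Fraction of pairs whose majority vote matches the true binary label.
    Spammers answer uniformly at random; honest workers err with
    probability 1 - worker_acc."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_pairs):
        truth = rng.random() < 0.5
        votes_for_truth = 0
        for _ in range(judges):
            if rng.random() < spam_rate:          # spammer: random answer
                vote = rng.random() < 0.5
            else:                                 # honest worker
                vote = truth if rng.random() < worker_acc else not truth
            votes_for_truth += int(vote == truth)
        correct += int(votes_for_truth * 2 > judges)  # strict majority
    return correct / n_pairs

for rate in (0.0, 0.2, 0.4, 0.6):
    print(f"spam rate {rate:.1f}: accuracy {simulate(spam_rate=rate):.3f}")
```

Even this toy model shows accuracy degrading steeply as the spammer share grows, which hints at why plain majority voting is fragile against coordinated or frequent malicious submissions.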
The full study has been accepted for publication in the SIGIR 2011 Workshop on Crowdsourcing for Information Retrieval (CIR) in Beijing, China.
Lately, crowdsourcing has become an accepted means of creating resources for tasks that require human intelligence. Information Retrieval and related fields frequently exploit it for system building and evaluation purposes. However, malicious workers often try to maximise their financial gains by producing generic answers rather than actually working on the task. Identifying these individuals is a challenging process into which both crowdsourcing providers and requesters invest significant amounts of time.
Based on our experience from several crowdsourcing experiments, Arjen P. de Vries and I compiled an overview of malicious strategies typically observed on crowdsourcing platforms and evaluated how careful HIT design can discourage cheaters. Drawing on a range of experiments, we conclude that malicious workers are encountered less frequently in novel tasks that involve a degree of creativity and abstraction. While there are various means of identifying forged submissions, setting tasks up in a non-repetitive way and requiring creative input can greatly increase the share of faithful workers.
The Internet plays an important role in people’s daily lives. This is true not only for adults but also for children; current web search engines, however, are designed with adult users and their cognitive abilities in mind. Consequently, children face considerable barriers when using these information systems. At ECIR 2011 in Dublin, we will demonstrate the use of query assistance and search moderation techniques as well as appropriate interface design to overcome or mitigate these challenges. The demonstrator builds on the modular base infrastructure of the open source PuppyIR Framework.
Content-sharing platforms such as YouTube or MyVideo attract huge and still rapidly growing user numbers, among them a steadily growing share of children. Despite this trend, the content of many popular videos is not suitable for children and should therefore not be shown to them. In our work, we present an automatic method for determining a shared video’s suitability for children based on non-audio-visual data. We evaluate its performance on a corpus of web videos annotated by domain experts. Finally, we show how community expertise in the form of user comments and ratings can yield better predictions than information directly related to the video.
“Identifying Suitable YouTube Videos for Children”, written together with Arjen P. de Vries, was accepted as a full paper at the NEM Summit.
“Web Page Classification on Child Suitability”, written together with Pavel Serdyukov and Arjen P. de Vries, was accepted as a short paper at CIKM 2010.