User Modeling, Interaction
and Experience on the Web

List of accepted papers:

  • Aspect-Aware Latent Factor Model: Rating Prediction with Ratings and Reviews
    Authors: Zhiyong Cheng, Ying Ding, Lei Zhu and Mohan Kankanhalli

    Keywords: Aspect-aware, Matrix Factorization, Recommendation, Topic Model

    Although latent factor models (e.g., matrix factorization) achieve good accuracy in rating prediction, they suffer from several problems, including cold-start, non-transparency, and suboptimal recommendation for local users or items. In this paper, we exploit textual review information together with ratings to tackle these limitations. First, we apply a proposed aspect-aware topic model (ATM) to the review text to model user preferences and item features from different aspects, and to estimate the aspect importance of a user towards an item. The aspect importance is then integrated into a novel aspect-aware latent factor model (ALFM), which learns users’ and items’ latent factors based on ratings. In particular, ALFM introduces a weighted matrix to associate those latent factors with the same set of aspects discovered by ATM, such that the latent factors can be used to estimate aspect ratings. Finally, the overall rating is computed via a linear combination of the aspect ratings, which are weighted by the corresponding aspect importance. In this way, our model can alleviate the data sparsity problem and gain good interpretability for recommendation. In addition, an aspect rating is weighted by an aspect importance, which depends on the targeted user’s preferences and the targeted item’s features. Therefore, it is expected that the proposed method can model a user’s preferences on an item more accurately for each user-item pair locally. Comprehensive experimental studies have been conducted on 19 datasets from Amazon and the Yelp 2017 Challenge dataset. Results show that our method achieves significant improvement compared with strong baseline methods, especially for users with only a few ratings. Moreover, our model can interpret the recommendation results in depth.
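
    The final step described in this abstract, an overall rating formed as a linear combination of aspect ratings weighted by aspect importance, can be illustrated with a minimal sketch. The function name and the normalization of the weights to sum to one are assumptions made for illustration; the abstract does not specify how ALFM scales the importance values or estimates the aspect ratings.

```python
def overall_rating(aspect_ratings, aspect_importance):
    """Combine per-aspect ratings into one rating.

    aspect_ratings: estimated rating of the item on each aspect.
    aspect_importance: non-negative importance of each aspect for this
        user-item pair (normalized here to sum to 1; an assumption).
    """
    if len(aspect_ratings) != len(aspect_importance):
        raise ValueError("one importance value is needed per aspect rating")
    total = sum(aspect_importance)
    if total <= 0:
        raise ValueError("aspect importance must have a positive sum")
    weights = [w / total for w in aspect_importance]
    # Linear combination: sum over aspects of weight * aspect rating.
    return sum(w * r for w, r in zip(weights, aspect_ratings))
```

    Because the weights depend on the specific user-item pair, the same aspect ratings can yield different overall ratings for different users.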

  • Aesthetic-based Clothing Recommendation
    Authors: Wenhui Yu, Huidi Zhang, Xiangnan He, Xu Chen, Li Xiong and Zheng Qin

    Keywords: Clothing recommendation, side information, aesthetic features, tensor factorization, dynamic collaborative filtering

    Recently, product images have gained increasing attention in clothing recommendation, since the visual appearance of items has a significant impact on consumers’ decisions. Existing models usually extract conventional features, such as convolutional neural network (CNN) features, scale-invariant feature transform (SIFT) features, and color histograms, to represent item image characteristics and capture user visual preferences. However, one important feature, the aesthetic feature, is typically ignored. It is vital in recommendation, since users’ decisions depend largely on whether the clothing is in line with their aesthetics, while conventional image features cannot portray this directly. To bridge this gap, we propose to introduce aesthetic information, which is more closely related to users’ preferences, into clothing recommender systems. To do so, we first present the aesthetic features extracted by a pre-trained neural network, which is a brain-inspired deep structure trained for the aesthetic assessment task. Considering that aesthetic preference varies across people and over time, we propose a novel tensor factorization model as a basic model and then incorporate the aesthetic features into it. Finally, extensive experiments on real-world datasets demonstrate that our approach can capture the aesthetic preferences of consumers and significantly outperform several state-of-the-art models.

  • On the Causal Effect of Badges
    Authors: Tomasz Kusmierczyk and Manuel Gomez Rodriguez

    Keywords: badges, social platform, statistical testing, causality, natural experiment, bootstrap, difference-in-differences, point processes, confounders, counterfactual world

    A wide variety of online platforms use digital badges to encourage users to take certain types of desirable actions. However, despite their growing popularity, their causal effect on users’ behavior is not well understood. This is partly due to the lack of counterfactual data and the myriad of complex factors that influence users’ behavior over time. As a consequence, their design and deployment lack general principles.
    In this paper, we focus on first-time badges, which are awarded after a user takes a particular type of action for the first time, and study their causal effect by harnessing the delayed introduction of several badges in a popular Q&A website. In doing so, we introduce a novel causal inference framework for first-time badges whose main technical innovations are a robust survival-based hypothesis testing procedure, which controls for the heterogeneity in the benefit users obtain from taking an action, and a bootstrap difference-in-differences method, which controls for the random fluctuations in users’ behavior over time. Our results suggest that first-time badges steer users’ behavior if the initial benefit a user obtains from taking the corresponding action is sufficiently low; otherwise, we do not find significant effects. Moreover, for badges that successfully steered user behavior, we perform a counterfactual analysis and show that they significantly improved the functioning of the site at a community level.

  • Robust Factorization Machines for User Response Prediction
    Authors: Surabhi Punjabi and Priyanka Bhatt

    Keywords: Factorization Machines, Field Aware Factorization Machines, Robust Optimization, Computational Advertising, Response Prediction, Interval Uncertainty

    Factorization machines (FMs) are a state-of-the-art model class for user response prediction in the computational advertising domain. Rapid growth of internet and mobile device usage has given rise to multiple customer touchpoints. This, coupled with factors like a high cookie churn rate, results in a fragmented view of user activity at the advertiser’s end. Current literature assumes procured user signals to be the absolute truth, an assumption contested by the absence of deterministic identity linkage across a user’s multiple avatars. This is the first work advocating the application of Robust Optimization (RO) principles to design approaches that account for these data uncertainties and are immune to perturbations. We propose two novel algorithms: robust factorization machine (RFM) and its field-aware variant (RFFM), under interval uncertainty. These formulations are generic and can find applicability in any classification setting under noise. We provide a distributed and scalable Spark implementation using parallel stochastic gradient descent. In the experiments conducted on three real-world datasets, the robust counterparts outperform the baselines significantly under perturbed settings. Our experimental findings reveal interesting connections between the choice of uncertainty set and the noise-proofness of the resulting models.

  • Bayesian Models for Product Size Recommendations
    Authors: Vivek Sembium, Rajeev Rastogi, Lavanya Sita Tekumalla and Atul Saroop

    Keywords: Personalization, Size Recommendation, Bayesian, Polya-Gamma, Probit, Variational Inference

    Lack of calibrated product sizing in popular categories such as apparel and shoes leads to customers purchasing incorrect sizes, which in turn results in high return rates due to fit issues. We address the problem of product size recommendations based on customer purchase and return data. We propose a novel approach based on Bayesian logit and probit regression models with ordinal categories {Small, Fit, Large} to model size fits as a function of the difference between latent sizes of customers and products. We propose posterior computation based on mean-field variational inference, leveraging the Polya-Gamma augmentation for the logit prior, that results in simple updates, enabling our technique to efficiently handle large datasets. Our experiments with real-life shoe datasets show that our model outperforms the state of the art in 5 of 6 datasets and leads to an improvement of 17-26% in AUC over baselines when predicting size fit outcomes.

  • Variational Autoencoders for Collaborative Filtering
    Authors: Dawen Liang, Rahul Krishnan, Matthew Hoffman and Tony Jebara

    Keywords: Recommender systems, collaborative filtering, implicit feedback, variational autoencoder, Bayesian models

    We extend variational autoencoders (VAEs) to collaborative filtering for implicit feedback. This
    non-linear probabilistic model enables us to go beyond the limited modeling capacity of
    linear factor models which still largely dominate collaborative filtering research.
    We introduce a generative model with multinomial likelihood
    and use Bayesian inference to learn this powerful generative model.
    Despite widespread use in language modeling and economics, the multinomial
    likelihood receives less attention in the recommender systems literature.
    We introduce a different regularization parameter for the learning objective, which proves to be crucial for achieving competitive performance.
    Remarkably, there is an efficient way to tune the parameter using annealing.
    The resulting model and learning algorithm have information-theoretic connections to maximum entropy discrimination and the information bottleneck principle.
    Empirically, we show that the proposed approach significantly outperforms several state-of-the-art baselines,
    including two recently-proposed neural network approaches, on several real-world datasets.
    We also provide extended experiments comparing the multinomial likelihood with other commonly
    used likelihood functions in the latent factor collaborative filtering literature and show
    favorable results. Finally, we identify the pros and cons of employing a principled Bayesian
    inference approach and characterize settings where it provides the most significant improvements.
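
    The learning objective sketched in this abstract, a multinomial likelihood plus a regularization term whose weight is tuned by annealing, might look roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: it assumes the regularization parameter (called beta here) scales a Gaussian KL term, and the linear annealing schedule and all function names are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log of the softmax over item logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def multinomial_log_likelihood(clicks, logits):
    """Sum of click counts times log item probabilities (up to a constant)."""
    return sum(x * lp for x, lp in zip(clicks, log_softmax(logits)))


def gaussian_kl(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to the standard normal."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))


def annealed_beta(step, anneal_steps, beta_max):
    """Linear warm-up of the regularization weight (assumed schedule)."""
    return min(beta_max, beta_max * step / anneal_steps)


def negative_elbo(clicks, logits, mu, logvar, beta):
    """Loss for one user: reconstruction term plus beta-weighted KL term."""
    return (-multinomial_log_likelihood(clicks, logits)
            + beta * gaussian_kl(mu, logvar))
```

    With beta fixed at 1 this reduces to the usual negative evidence lower bound; smaller values weaken the regularization.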

  • Learning causal effects from many randomized experiments using regularized instrumental variables
    Authors: Alexander Peysakhovich and Dean Eckles

    Keywords: causal inference, experimentation, instrumental variables, machine learning

    Scientific and business practices are increasingly resulting in large collections of randomized experiments. Analyzed together, these collections can tell us things that individual experiments in the collection cannot. We study how to learn causal relationships between variables from the kinds of collections faced by modern data scientists: the number of experiments is large, many experiments have very small effects, and the analyst lacks metadata (e.g., descriptions of the interventions). Here we use experimental groups as instrumental variables (IV) and show that a standard method (two-stage least squares) is biased even when the number of experiments is infinite. We show how a sparsity-inducing l_0 regularization can — in a reversal of the standard bias–variance tradeoff in regularization — reduce bias (and thus error) of interventional predictions. Because we are interested in interventional loss minimization we also propose a modified cross-validation procedure (IVCV) to feasibly select the regularization parameter. We show, using a trick from Monte Carlo sampling, that IVCV can be done using summary statistics instead of raw data. This makes our full procedure simple to use in many real-world applications.

  • Camel: Content-Aware and Meta-path Augmented Metric Learning for Author Identification
    Authors: Chuxu Zhang, Chao Huang, Lu Yu, Xiangliang Zhang and Nitesh Chawla

    Keywords: Author Identification, Heterogeneous Networks, Representation Learning, Metric Learning, Deep Learning

    In this paper, we study the problem of author identification in big scholarly data, which is to effectively rank potential authors for each anonymous paper by using historical data. Most of the existing de-anonymization approaches predict the relevance score of a paper-author pair via feature engineering, which is not only time and storage consuming, but also introduces irrelevant and redundant features or misses important attributes. Representation learning can automate the feature generation process by learning nodes’ embeddings in an academic network to infer the correlation of a paper-author pair. However, the learned embeddings are often for general purpose (independent of the specific task), or based on network structure only (without considering the node content). To address these issues and make further progress in solving the author identification problem, we propose a content-aware and meta-path augmented metric learning model. Specifically, first, the directly correlated paper-author pairs are modeled based on distance metric learning by introducing a push loss function. Next, the paper’s content embedding encoded by the gated recurrent neural network is integrated into the distance loss. Moreover, the historical bibliographic data of papers is utilized to construct an academic heterogeneous network, wherein a meta-path guided walk integrative learning module based on the task-dependent and content-aware Skipgram model is designed to formulate the correlations between each paper and its indirect author neighbors, and further augments the model. The results of extensive evaluations and analytical experiments on the well-known AMiner dataset demonstrate that the proposed model achieves better performance compared to the state-of-the-art baselines. It achieves an average improvement of 8.3% over the best baseline method.

  • Prediction of Sparse User-Item Consumption Rates with Zero-Inflated Poisson Regression
    Authors: Moshe Lichman and Padhraic Smyth

    Keywords: consumption rate modeling, repeat consumption, explore-exploit, zero-inflated poisson

    There are a variety of applications where user behavior consists of a combination of both repeat item consumption and new item consumption, such as listening to music artists, visiting Web sites, purchasing groceries, and so on.
    In this paper we address the problem of building user models that can predict the rate at which individuals consume both old and new items. We use zero-inflated Poisson (ZIP) regression models as the basis for our modeling approach, leading to a general framework for modeling user-item consumption rates over time. We show that these models are more flexible in capturing user behavior than alternatives such as well-known latent factor and embedding models. We compare the performance of ZIP regression and latent factor and embedding models on three different data sets involving music, restaurant reviews, and social media. The ZIP regression models are systematically more accurate across all three data sets and across different prediction metrics.
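
    For readers unfamiliar with the zero-inflated Poisson distribution underlying this approach, the sketch below shows its standard probability mass function: with some probability the count is a structural zero, and otherwise it is drawn from a Poisson distribution. The function and parameter names are illustrative; in a ZIP regression both parameters would be functions of covariates, which this sketch leaves out.

```python
import math


def zip_pmf(k, rate, zero_prob):
    """Probability of observing count k under a zero-inflated Poisson.

    rate: Poisson rate of the count component.
    zero_prob: probability of a structural zero (e.g., an item the user
        never consumes), mixed with the Poisson component.
    """
    poisson = math.exp(-rate) * rate ** k / math.factorial(k)
    if k == 0:
        # Zeros come from either the structural-zero or the Poisson part.
        return zero_prob + (1.0 - zero_prob) * poisson
    return (1.0 - zero_prob) * poisson


def zip_mean(rate, zero_prob):
    """Expected count under the zero-inflated Poisson."""
    return (1.0 - zero_prob) * rate
```

    The extra mass at zero is what lets the distribution fit sparse user-item counts, where most pairs show no consumption at all.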

  • Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking
    Authors: Yi Tay, Anh Tuan Luu and Siu Cheung Hui

    Keywords: Collaborative Filtering, Recommender Systems, Neural Networks, Deep Learning, Attention Mechanism

    This paper proposes a new neural architecture for collaborative ranking with implicit feedback. Our model, LRML (Latent Relational Metric Learning), is a novel extension of metric learning approaches for recommendation. More specifically, instead of simple push-pull mechanisms between user and item pairs, we propose to learn latent relations for each user-item interaction. This helps to alleviate the potential geometric inflexibility of existing metric learning approaches. This not only enables better performance but also a greater extent of modeling capability, allowing our model to scale to a larger number of interactions. In order to do so, we employ an augmented memory module and learn to attend over these memory blocks to construct latent relations. The attention module is controlled by the user-item interaction, making the learned relation vector specific to each user-item pair. Hence, this can be interpreted as learning an exclusive and optimal relational translation for each user-item interaction. The proposed architecture demonstrates state-of-the-art performance across multiple recommendation benchmarks. LRML outperforms other metric learning models by 6%-7.5% in terms of Hits@10 and nDCG@10 on large datasets such as Netflix and MovieLens20M. Moreover, qualitative studies also demonstrate evidence that our proposed model is able to infer and encode explicit sentiment, temporal and attribute information despite being only trained on implicit feedback. As such, this ascertains the ability of LRML to uncover hidden relational structure within implicit datasets.

  • AdaError: An Adaptive Learning Rate Method for Matrix Approximation-based Collaborative Filtering
    Authors: Dongsheng Li, Chao Chen, Qin Lv, Hansu Gu, Tun Lu, Li Shang, Ning Gu and Stephen Chu

    Keywords: recommender systems, collaborative filtering, matrix approximation

    In matrix approximation (MA)-based collaborative filtering (CF) algorithms, gradient-based learning methods, e.g., stochastic gradient descent (SGD), are widely adopted to learn MA models based on observed user-item ratings. However, one of the common issues in existing gradient-based learning methods is how to determine proper learning rates, because the model convergence will be inaccurate or very slow if the learning rate is too large or too small, respectively.
    This paper proposes AdaError, an adaptive learning rate method for matrix approximation-based collaborative filtering. AdaError can reduce the learning rates for noisy ratings to prevent the learned models from overreacting to the noise. Meanwhile, AdaError can adaptively shrink the learning rates to eliminate the need for manually tuning the learning rates. Our theoretical and empirical analysis shows that the generalization performance of learned MA models can be improved using AdaError. Experimental studies on MovieLens and Netflix datasets demonstrate that the proposed method can outperform state-of-the-art adaptive learning rate methods in matrix approximation-based collaborative filtering. Meanwhile, by applying the proposed AdaError method to a standard matrix approximation method, we can achieve statistically significant improvements in both rating prediction accuracy and top-N recommendation accuracy compared with state-of-the-art collaborative filtering methods.

  • Anxiety and Information Seeking: Evidence From Large-Scale Mouse Tracking
    Authors: Brit Youngmann and Elad Yom-Tov

    Keywords: Mouse tracking, Relevance, User interaction, Anxiety

    People seeking information through search engines are assumed to
    behave similarly, regardless of the topic which they are searching.
    Here we use mouse tracking, which is correlated with gaze, to show
    that the information seeking patterns of people differ dramatically
    depending on their level of anxiety at the time of the search.
    We investigate the behavior of people during searches for medical
    symptoms, ranging from benign indications, where users are
    not usually anxious, to ones which could portend life-threatening
    conditions, where extreme anxiety is expected. We show that for the
    latter, 90% of people never saw more than the top 67% of the screen,
    compared to over 95% scanned by people seeking information on
    benign symptoms, even though relevant documents are similarly
    distributed in the results pages to these queries. Based on this observation,
    we develop a model which can predict the level of anxiety
    experienced by a user, using attributes derived from mouse tracking
    data and other user interactions. The model achieves Kendall’s Tau
    of 0.48 with the medical severity of the symptoms searched.
    We show the importance of using information about the users’
    level of anxiety as predicted by the model, when measuring search
    engine performance. Our results prove that ignoring this information
    can lead to significant over-estimation of performance. Additionally,
    we show the utility of the model in three special instances: where
    multiple symptoms are searched concurrently; where the searcher
    has an underlying medical condition; and when users seek information
    on ways to commit suicide. In the last of these, our results demonstrate
    the importance of help-line notices, and emphasize the need to measure
    the effective number of results seen by the user.
    Our results indicate that measures of relevance which use anxiety
    information can lead to more accurate understanding of the quality
    of search results, especially when delivering potentially life-saving
    information to users.

  • Through a Gender Lens: Learning Usage Patterns of Emojis from Large-Scale Android Users
    Authors: Zhenpeng Chen, Xuan Lu, Wei Ai, Huoran Li, Qiaozhu Mei and Xuanzhe Liu

    Keywords: Emojis, Gender, User profiling, Language-independent

    Based on a large dataset of emoji usage collected from smartphone users across the world, this paper investigates the usage of emojis from a gender perspective. We present various interesting findings that evidence a considerable difference in emoji usage between male and female users. Such a difference is significant not just in a statistical sense; it is sufficient for a machine learning algorithm to accurately infer the gender of a user purely based on the emojis used in their messages. In real-world scenarios where gender inference is a necessity, models based on emojis have unique advantages over existing models that are based on textual or contextual information. Emojis not only provide a language-independent indicator, but also alleviate the risk of leaking private user information through the analysis of text and context.

  • Coevolutionary Recommendation Model: Mutual Learning between Rating and Reviews
    Authors: Yichao Lu, Ruihai Dong and Barry Smyth

    Keywords: Recommender Systems, User Experience, Natural Language Processing

    Collaborative filtering (CF) is a common recommendation approach that relies on user-item ratings. However, the natural sparsity of user-item rating data can be problematic in many domains and settings, limiting the ability to generate accurate predictions and effective recommendations. Moreover, in some CF approaches latent features are often used to represent users and items, which can lead to a lack of recommendation transparency and explainability. User-generated customer reviews are now commonplace on many web sites, providing users with an opportunity to convey their experiences and opinions of products and services. As such, these reviews have the potential to serve as a useful source of recommendation data, by capturing valuable sentiment information about particular product features. In this paper, we present a novel deep learning recommendation model, which co-learns user and item information from ratings and customer reviews, by optimizing matrix factorization and an attention-based GRU network. Using real-world datasets we show a significant improvement in recommendation performance, compared to a variety of alternatives. Furthermore, the approach is useful when it comes to assigning intuitive meanings to latent features to improve the transparency and explainability of recommender systems.

  • How to Impute Missing Ratings? Claims, Solution, and Its Application to Collaborative Filtering
    Authors: Youngnam Lee, Sang-Wook Kim, Sunju Park and Xing Xie

    Keywords: Recommender Systems, Collaborative Filtering, Data Sparsity, Data Imputation

    Data sparsity is one of the biggest problems faced by collaborative filtering used in recommender systems. Data imputation alleviates the data sparsity problem by inferring missing ratings and imputing them to the original rating matrix. In this paper, we identify the limitations of existing data imputation approaches and suggest three new claims that all data imputation approaches should follow to achieve high recommendation accuracy. Furthermore, we propose a deep-learning based approach to compute imputed values that satisfies all three claims. Based on our hypothesis that most pre-use preferences (e.g., impressions) on items lead to their post-use preferences (e.g., ratings), our approach tries to understand via deep learning how pre-use preferences lead to post-use preferences differently depending on the characteristics of users and items. Through extensive experiments on real-world datasets, we verify our three claims and hypothesis, and also demonstrate that our approach significantly outperforms existing state-of-the-art approaches.

  • When Sheep Shop: Measuring Herding Effects in Product Ratings with Natural Experiments
    Authors: Gael Lederrey and Robert West

    Keywords: product reviews, ratings, herding, social influence, natural experiment, observational study

    As online shopping becomes ever more prevalent, customers rely increasingly on product rating websites for making purchase decisions. The reliability of online ratings, however, is potentially compromised by the so-called herding effect: when rating a product, customers may be biased to follow other customers’ previous ratings of the same product. This is problematic because it skews long-term customer perception through haphazard early ratings.
    The study of herding poses important methodological challenges. Observational studies are impeded by the lack of counterfactuals: simply correlating early with subsequent ratings is insufficient because we cannot know what the subsequent ratings would have looked like had the first ratings been different. Experimental studies are rarely an option because either they manipulate real customers’ attitudes toward real products, or they examine lab settings that might differ fundamentally from real settings.
    The methodology introduced here exploits a situation that comes close to an experiment, although it is purely observational – a natural experiment. Our key methodological device consists in studying the same product on two separate rating sites, focusing on products that received a high first rating on one site, and a low first rating on the other. This largely controls for confounds such as a product’s inherent quality, advertising, and producer identity, and lets us isolate the effect of the first rating on subsequent ratings. In a case study, we focus on beers as products and jointly study two beer rating sites, but our method applies to any pair of sites across which products can be matched. We find clear evidence of herding in beer ratings. For instance, if a beer receives a very high first rating, its second rating is on average half a standard deviation higher, compared to a situation where the identical beer receives a very low first rating. Moreover, herding effects tend to last a long time and are noticeable even after 20 or more ratings. Our results have important implications for the design of better rating systems.

  • Modeling Interdependent and Periodic Real-World Action Sequences
    Authors: Takeshi Kurashima, Tim Althoff and Jure Leskovec

    Keywords: real-world behavior, human action sequence, real-world actions, periodic behavior, user modeling, activity tracking, activity logging, quantified self, mobile health, point process

    Mobile health applications, including those that track activities such as exercise, sleep, and diet, are becoming widely used. Accurately predicting human actions in the real world is essential for targeted recommendations that could improve our health and for personalization of these applications. However, making such predictions is extremely difficult due to the complexities of human behavior, which consists of a large number of potential actions that vary over time, depend on each other, and are periodic. Previous work has not jointly modeled these dynamics and has largely focused on item consumption patterns instead of broader types of behaviors such as eating, commuting or exercising.
    In this work, we develop a novel statistical model, called TIPAS, for Time-varying, Interdependent, and Periodic Action Sequences. Our approach is based on personalized, multivariate temporal point processes that model time-varying action propensities through a mixture of Gaussian intensities. Our model captures short-term and long-term periodic interdependencies between actions through Hawkes process-based self-excitations. We evaluate our approach on two activity logging datasets comprising 12 million real-world actions (e.g., eating, sleep, and exercise) taken by 20 thousand users over 17 months. We demonstrate that our approach allows us to make successful predictions of future user actions and their timing. Specifically, TIPAS improves predictions of actions, and their timing, over existing methods across two datasets by up to 156% and 37%, respectively. Performance improvements are particularly large for relatively rare and periodic actions such as walking and biking, improving over baselines by up to 256%. This demonstrates that explicit modeling of dependencies and periodicities in real-world behavior enables successful predictions of future actions, with implications for modeling human behavior, app personalization, and targeting of health interventions.
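
    The kind of intensity function this abstract describes, a time-varying base rate built from Gaussian components plus Hawkes-style excitation from past actions, can be sketched as follows. This is a simplified illustration and not the TIPAS model itself: the exponential decay kernel, the single shared decay rate, the absence of explicit periodic wrap-around, and all names are assumptions.

```python
import math


def intensity(t, target, base_bumps, history, excitation, decay):
    """Intensity of action `target` at time t.

    base_bumps: dict mapping action -> list of (center, width, weight)
        Gaussian components describing the time-varying propensity.
    history: list of (time, action) pairs for past events.
    excitation: dict mapping (source_action, target_action) -> strength
        with which a past source action raises the target's intensity.
    decay: rate of the exponential excitation kernel (assumed form).
    """
    base = sum(weight * math.exp(-0.5 * ((t - center) / width) ** 2)
               for center, width, weight in base_bumps.get(target, []))
    excite = sum(excitation.get((source, target), 0.0)
                 * math.exp(-decay * (t - s))
                 for s, source in history if s < t)
    return base + excite
```

    For example, a nonzero entry for ("eat", "run") in the hypothetical `excitation` mapping raises the intensity of running shortly after a meal is logged, which is one way to express interdependence between actions.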

  • The Effect of Ad Blocking on User Engagement with the Web
    Authors: Ben Miroglio, David Zeber, Jofish Kaye and Rebecca Weiss

    Keywords: Ad Blocking, Propensity Scoring, Natural Experiment, Web engagement

    Web users are increasingly turning to ad blockers to avoid ads, which are often perceived as annoying or an invasion of privacy. While there has been significant research into the factors driving ad blocker adoption and the detrimental effect on ad publishers on the Web, the resulting effects of ad blocker usage on Web users’ browsing experience are not well understood. To approach this problem, we conduct a retrospective natural field experiment using Firefox browser usage data, with the goal of estimating the effect of ad blocking on user engagement with the Web. We focus on new users who installed an ad blocker after a baseline observation period, to avoid comparing different populations. Their subsequent browser activity is compared against that of a control group, whose members do not use ad blockers, over a corresponding observation period, controlling for prior baseline usage. In order to estimate causal effects, we employ propensity score matching on a number of other features recorded during the baseline period. In the group that installed an ad blocker, we find significant increases in both active time spent in the browser (+28% over control) and the number of pages viewed (+15% over control), while seeing no change in the number of searches. Additionally, by reapplying the same methodology to other popular Firefox browser extensions, we show that these effects are specific to ad blockers. We conclude that ad blocking has a positive impact on user engagement with the Web, suggesting that any costs of using ad blockers to users’ browsing experience are largely drowned out by the utility that they offer.