Autonomous systems on the Web
Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications
Authors: Haowen Xu, Wenxiao Chen, Nengwen Zhao, Zeyan Li, Jiahao Bu, Zhihan Li, Ying Liu, Youjian Zhao, Dan Pei, Yang Feng, Jie Chen, Zhaogang Wang and Honglin Qiao
Keywords: variational auto-encoder, anomaly detection, seasonal KPI
To ensure undisrupted business, large Internet companies need to closely monitor various KPIs (e.g., Page Views, number of online users, and number of orders) of its Web applications, to accurately detect anomalies and trigger timely troubleshooting/mitigation. However, anomaly detection for these seasonal KPIs with various patterns and data quality has been a great challenge, especially without labels. In this paper, we proposed Donut, an unsupervised anomaly detection algorithm based on VAE. Thanks to a few of our key techniques, Donut greatly outperforms a state-of-arts supervised ensemble approach and a baseline VAE approach, and its best F-scores range from 0.75 to 0.9 for the studied KPIs from a top global Internet company. We come up with a novel KDE interpretation of reconstruction for Donut, making it the first VAE-based anomaly detection algorithm with solid theoretical explanation.
Sharing Deep Neural Network Models with Interpretation
Authors: Huijun Wu, Chen Wang, Jie Yin, Kai Lu and Liming Zhu
Keywords: deep neural network, model sharing, model interpretation
Despite outperforming humans in many tasks, deep neural network models are also criticized for the lack of transparency and interpretability in decision making. The opaqueness results in uncertainty and low confidence when deploying such a model in model sharing scenarios, where the model is developed by a third party. For a supervised machine learning model, sharing training process including training data provides an effective way to gain trust and to better understand model predictions. However, it is not always possible to share all training data due to privacy and policy constraints. In this paper, we propose a method to disclose a small set of training data that is just sufficient for users to get the insight of a complicated model. The method constructs a boundary tree using selected training data and the tree is able to approximate the complicated model with high fidelity. We show that traversing data points in the tree gives users significantly better understanding of the model and paves the way for trustworthy model sharing.
DRN: A Deep Reinforcement Learning Framework for News Recommendation
Authors: Guanjie Zheng, Fuzheng Zhang, Zihan Zheng, Yang Xiang, Nicholas Jing Yuan, Xing Xie and Zhenhui Jessie Li
Keywords: Reinforcement learning, Deep Q-Learning, News recommendation
In this paper, we propose a novel Deep Reinforcement Learning framework for news recommendation. Online personalized news recommendation is a highly challenging problem due to the dynamic nature of news features and user preferences. Although some online recommendation models have been proposed to address the dynamic nature of news recommendation, these methods have three major issues. First, they only try to model current reward (e.g., Click Through Rate). Second, very few work consider to use user feedback other than click / no click labels (e.g., how frequent user returns) to help improve recommendation. Third, these methods tend to keep recommending similar news to users, which may cause users to get bored. Therefore, to address the aforementioned challenges, we propose a Deep Q-Learning based recommendation framework, which can model future reward explicitly. We further consider user return pattern as a supplement to click / no click label in order to capture more user feedback information. In addition, an effective exploration strategy is incorporated to find new attractive news for users. Extensive experiments are conducted on the offline dataset and online production environment of Chinese Bing News and have shown the superior performance of our methods.
ROSC: Robust Spectral Clustering on Multi-scale Data
Authors: Xiang Li, Ben Kao, Siqiang Luo and Martin Ester
Keywords: Spectral clustering, robustness, multi-scale data
We investigate the effectiveness of spectral methods in clustering multi-scale data, which is data whose clusters are of various sizes and densities. We review existing spectral methods that are designed to handle multi-scale data and propose an alternative approach that is orthogonal to existing methods. We put forward the algorithm ROSC, which computes an affinity matrix that takes into account both objects’ feature similarity and reachability similarity. We perform extensive experiments comparing ROSC against 9 other methods on both real and synthetic datasets. Our results show that ROSC performs very well against the competitors. In particular, it is very robust in that it consistently performs well over all the datasets tested. Also, it outperforms others by wide margins for datasets that are highly multi-scale.