Program of track Cognitive Computing

List of accepted papers:

  • AI Cognition in Searching for Relevant Knowledge from Scholarly Big Data, Using a Multi-layer Perceptron and Recurrent Convolutional Neural Network Model
    Authors: Saeed-Ul Hassan, Iqra Safder and Naif Aljohani

    Keywords: AI enabled Search Engine, Algorithm search, Cognitive computing, Scholarly big data, Multi-layer perceptron (MLP), Recurrent convolutional neural network (RCNN)

    Abstract:
    Although information retrieval systems have shown tremendous improvements over the years in searching for relevant scientific literature, human cognition is still required to search for specific document elements in full-text publications. For instance, pseudocodes pertaining to algorithms published in scientific publications cannot be correctly matched against user queries, hence the process requires human involvement. AlgorithmSeer, a state-of-the-art technique, claims to replace humans in this task, but one of the limitations of such an algorithm search engine is that the metadata is simply a textual description of each pseudocode, without any algorithm-specific information. Hence, the search is performed merely by matching the user query to the textual metadata and ranking the results using conventional textual similarity techniques. The ability to automatically identify algorithm-specific metadata such as precision, recall, or f-measure would be useful when searching for algorithms. In this article, we propose a set of algorithms to extract further information pertaining to the performance of each algorithm. Specifically, sentences in an article that convey information about the efficiency of the corresponding algorithm are identified and extracted using a recurrent convolutional neural network (RCNN). Furthermore, we propose improving the efficacy of the pseudocode detection task by using a multi-layer perceptron (MLP) classifier trained with 15 features, which improves the classification performance of the state-of-the-art pseudocode detection methods used in AlgorithmSeer by 27%. Finally, we show the advantages of the AI-enabled search engine (based on RCNN and MLP models) over conventional text-retrieval models.
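
    Illustrative sketch (not the authors' code): the paper's MLP step classifies text regions as pseudocode or not from 15 handcrafted features. The 15 features and training data are not public here, so the vectors below are random placeholders; scikit-learn's MLPClassifier stands in for the model.

```python
# Minimal sketch: binary pseudocode detection with an MLP over 15 features.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Hypothetical 15-dimensional feature vectors per candidate region
# (e.g., indentation depth, keyword counts, symbol density, ...).
X = rng.random((1000, 15))
y = rng.integers(0, 2, size=1000)  # 1 = pseudocode, 0 = ordinary text

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("F1:", f1_score(y_te, clf.predict(X_te)))
```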

  • A framework for human-in-the-loop monitoring of concept-drift detection in event log stream
    Authors: Sylvio Barbon Junior, Gabriel Marques Tavares, Victor Guilherme Turrisi Da Costa, Paolo Ceravolo and Ernesto Damiani

    Keywords: Process Mining, DBScan, concept-drift, Clustering, Stream Mining

    Abstract:
    One of the main challenges of Cognitive Computing (CC) is reacting to evolving environments in near-real time. Therefore, it is expected that CC models provide solutions by examining a summary of past history, rather than using full historical data. This strategy has significant benefits in terms of response time and space complexity but poses new challenges in terms of concept-drift detection, where both long-term and short-term dynamics should be taken into account. In this paper, we introduce the Concept-Drift in Event Stream Framework (CDESF), which addresses some of these challenges for data streams recording the execution of a Web-based business process. Thanks to CDESF support for feature transformation, we perform density clustering in the transformed feature space of the process event stream, track concept drift over time, and identify anomalous cases in the form of outliers. We validate our approach using logs of an e-healthcare process.
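
    Illustrative sketch: density clustering over a window of an event stream, with DBSCAN (named in the keywords) flagging outlier cases. CDESF's actual feature transformation is not reproduced; the 2-D case features below are synthetic.

```python
# Minimal sketch: DBSCAN over a window of transformed case features.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
window = rng.normal(size=(200, 2))   # recent cases (hypothetical features)
window[-5:] += 6.0                   # a few drifting/anomalous cases

labels = DBSCAN(eps=0.7, min_samples=5).fit_predict(window)
outliers = np.where(labels == -1)[0]
print("outlier case indices:", outliers)
# Re-clustering successive windows and comparing the cluster structure over
# time is one simple way to surface concept drift.
```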

  • Analyzing and Predicting Emoji Usages in Social Media
    Authors: Peijun Zhao, Jia Jia, Yongsheng An, Jie Liang, Lexing Xie and Jiebo Luo

    Keywords: emoji, GRU, multimodality, multitask

    Abstract:
    Emojis can be regarded as a language for the graphical expression of emotions, and have been widely used in social media. They can express more delicate feelings beyond textual information and make communication more harmonious. Recent advances in machine learning enable the automatic addition of emojis to text messages. However, the usages of emojis are so complicated that analyzing and predicting emojis is a challenging problem. In this paper, we first construct a benchmark dataset of emojis with tweets and systematically investigate emoji usage with respect to tweet content, tweet structure and user demographics. Inspired by the investigation results, we further propose a multitask multimodality gated recurrent unit (mmGRU) model to predict the categories and positions of emojis. The model leverages not only multimodality information such as text, image and user demographics, but also the strong correlations between emoji categories and their positions. Our experimental results show that the proposed method can significantly improve the accuracy of predicting emojis for tweets (+9.0% in F1-value for category and +4.6% in F1-value for position on average). Based on the experimental results, we conduct a series of case studies to further unveil how emojis are used in social media.
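
    Illustrative sketch of the multitask idea: one shared GRU text encoder with two heads, one for emoji category and one for position. The real mmGRU also fuses image and demographic modalities, which are omitted here; all dimensions are illustrative.

```python
# Minimal multitask GRU sketch in PyTorch (text modality only).
import torch
import torch.nn as nn

class MultitaskGRU(nn.Module):
    def __init__(self, vocab=10000, emb=128, hid=256, n_cat=64, n_pos=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hid, batch_first=True)
        self.cat_head = nn.Linear(hid, n_cat)   # emoji category
        self.pos_head = nn.Linear(hid, n_pos)   # emoji position

    def forward(self, tokens):
        _, h = self.gru(self.emb(tokens))       # h: (1, batch, hid)
        h = h.squeeze(0)
        return self.cat_head(h), self.pos_head(h)

model = MultitaskGRU()
tokens = torch.randint(0, 10000, (4, 20))       # batch of 4 tweets, 20 tokens
cat_logits, pos_logits = model(tokens)
# Training would sum the two cross-entropy losses so the tasks share gradients.
print(cat_logits.shape, pos_logits.shape)
```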

  • Automatic Hierarchical Table of Contents Generation for Educational Videos
    Authors: Debabrata Mahapatra, Ragunathan Mariappan and Vaibhav Rajan

    Keywords: Table of Contents, Educational Video, Shot Segmentation, Text Summarization, Tree Knapsack

    Abstract:
    The number of freely available online educational videos from universities and other organizations is growing rapidly. Accurate indexing and summarization are essential for efficient search, recommendation and effective consumption of videos. In this paper, we describe a new method of automatically creating a hierarchical table of contents for a video. It provides a summary of the video content along with a textbook-like facility for nonlinear navigation and search through the video. Our multimodal approach combines new methods for shot-level video segmentation and for hierarchical summarization. Empirical results demonstrate the efficacy of our approach on many educational videos.

  • The Grass is Greener on the Other Side: Understanding the Effects of Green Spaces on Twitter User Sentiments
    Authors: Kwan Hui Lim, Kate E. Lee, Dave Kendal, Lida Rashidi, Elham Naghizade, Stephan Winter and Maria Vasardani

    Keywords: Green Spaces, Urban Areas, Empirical Study, Twitter

    Abstract:
    Green spaces are believed to improve the well-being of users in urban areas. While there is urban research exploring the emotional benefits of green spaces, these works are based on user surveys and case studies, which are typically small in scale, intrusive, time-intensive and costly. In contrast to earlier works, we utilize a non-intrusive methodology to understand green space effects at large scale and in greater detail, via the digital traces left by Twitter users. Using this methodology, we perform an empirical study on the effects of green spaces on user sentiments and emotions in Melbourne, Australia, and our main findings are: (i) tweets in green spaces evoke more positive and less negative emotions, compared to those in urban areas; (ii) each season affects various emotion types differently; (iii) there are interesting changes in sentiments based on the hour, day and month that a tweet was posted; and (iv) negative sentiments are typically associated with large transport infrastructures such as train interchanges, major road junctions and railway tracks. The novelty of our study is the combination of psychological theory with data collection and analysis techniques on a large-scale Twitter dataset, which overcomes the limitations of traditional methods in urban research.

  • Lifecycle-Based Event Detection from Microblogs
    Authors: Lin Mu, Peiquan Jin, Lizhou Zheng, En-Hong Chen and Lihua Yue

    Keywords: Event Evolution, Emotional Evolution, Tracking, Microblog, Detection

    Abstract:
    Microblogging platforms like Twitter and Sina Weibo have become an important information source for event detection and monitoring. In many decision-making scenarios, it is not enough to only provide a structural tuple for an event, e.g., a 5W1H (who, what, when, where, why, how) record. Extracting such event tuples has been widely studied before. However, in addition to event tuples, people need to know the evolution lifecycle of an event. The lifecycle description of an event is more helpful for decision making because people can focus on the progress and trend of events. In this paper, we propose a novel method for efficiently detecting and tracking event evolution on microblogging platforms. The major features of our study are: (1) It provides a novel event-type-driven method to extract event tuples, which forms the foundation for event evolution analysis. (2) It describes the lifecycle of an event by a staged model, and provides effective algorithms for detecting the stages of an event. (3) It offers emotional analysis over the stages of an event, through which people are able to know the public emotional tendency toward a specific event at different time periods. We build a prototype system and present its architecture and implementation details in the paper. In addition, we conduct experiments on real microblog datasets. The results in terms of precision, recall, and F-measure suggest the effectiveness and efficiency of our proposal.

  • Human-Guided Flood Mapping: From Experts to the Crowd
    Authors: Jiongqian Liang, Peter Jacobs and Srinivasan Parthasarathy

    Keywords: Flood mapping, Human-guided, Semi-supervised, Crowdsourcing

    Abstract:
    Flood mapping is the process of distinguishing flooded areas from non-flooded areas during and shortly after a disaster. It can be very useful for prioritizing relief efforts and in assessing flood risk. Typical approaches to flood mapping rely on analyzing satellite imagery. Identification of water areas in such images can be challenging considering the heterogeneity in water body size and shape, cloud cover, and natural variations in land cover. In this effort, we introduce a novel semi-supervised learning algorithm, called HUman-Guided Flood Mapping (HUG-FM), specifically designed to tackle the flood mapping problem. We first divide the satellite image into patches using a graph-based approach. A domain expert is then asked to provide labels for a few patches; we learn a classifier based on the provided labels to classify the other patches as either water or land. We test the efficacy and efficiency of our algorithm on satellite imagery from several recent flood-induced emergencies including the 2015 Chennai flood, the 2016 Houston flood, and the 2016 North Carolina flood. Results show that our algorithm can robustly and correctly detect water areas compared to baseline methods. Moreover, we study whether expert guidance can be replaced by the wisdom of a crowd. To this end, we design an online crowdsourcing platform based on HUG-FM and propose a novel ensemble method to leverage crowdsourcing efforts. We conduct an experiment with more than 50 participants and show that crowdsourced HUG-FM (CHUG-FM) can approach or even exceed the performance of a single expert providing guidance (HUG-FM).
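
    Illustrative sketch of the human-in-the-loop idea: a handful of expert-labeled patches propagate labels to the rest. HUG-FM's own graph construction over image patches is not reproduced; scikit-learn's LabelSpreading stands in as a generic graph-based semi-supervised learner, and the patch features are synthetic.

```python
# Minimal sketch: semi-supervised water/land labeling from a few expert labels.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(2)
patches = rng.random((300, 8))        # hypothetical per-patch feature vectors
labels = np.full(300, -1)             # -1 marks unlabeled patches
labels[:5] = 1                        # expert says: water
labels[5:10] = 0                      # expert says: land

model = LabelSpreading(kernel="knn", n_neighbors=7).fit(patches, labels)
pred = model.transduction_            # water/land label for every patch
print("predicted water patches:", int((pred == 1).sum()))
```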

  • Human-level multiple choice question guessing without domain knowledge: Machine-learning of framing effects.
    Authors: Patrick Watson, Tengfei Ma, Ravi Tejwani, Jaewook Ahn, Maria Chang and Sharad Sundararajan

    Keywords: Web-based assessment, Naive Features, Cognitive Computing for Education, AI for education, Machine Learning for Education, Open Educational Resources, Framing effects

    Abstract:
    The availability of open educational resources (OER) has enabled educators and researchers to access a variety of learning assessments online. OER communities are particularly useful for gathering multiple choice questions (MCQs), which are easy to grade, but difficult to design well. To account for this, OERs often rely on crowd-sourced data to validate the quality of MCQs. However, because crowds contain many non-experts, and are susceptible to question framing effects (Tversky & Kahneman, 1981), they may produce ratings driven by guessing on the basis of surface-level linguistic features, rather than deep topic knowledge. Consumers of OER multiple choice questions (and authors of original multiple choice questions) would benefit from a tool that automatically provided feedback on assessment quality and assessed the degree to which OER MCQs are susceptible to framing effects. This paper describes a model that is trained to use domain-naive strategies to guess which multiple choice answer is correct. The extent to which this model can predict the correct answer to an MCQ is an indicator that the MCQ is a poor measure of domain-specific knowledge. We describe an integration of this model with a front-end visualizer and MCQ authoring tool.
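
    Illustrative toy example of a domain-naive guessing strategy: pick the option with the highest lexical overlap with the question stem, one classic framing cue. The paper's trained model is far richer; the helper naive_guess below is hypothetical and only shows the kind of surface feature such a model can exploit.

```python
# Toy sketch: guess an MCQ answer from surface features alone.
def naive_guess(stem: str, options: list[str]) -> int:
    stem_words = set(stem.lower().split())
    overlaps = [len(stem_words & set(opt.lower().split())) for opt in options]
    return max(range(len(options)), key=lambda i: overlaps[i])

q = "Which layer of the OSI model handles routing between networks?"
opts = ["The network layer handles routing", "Cables", "Pixels", "Fonts"]
print("naive guess:", opts[naive_guess(q, opts)])
# If naive strategies like this score above chance on an MCQ, the item may
# measure test-taking skill rather than domain knowledge.
```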

  • A NeuRetrieval Model for Human-Computer Conversation
    Authors: Rui Yan and Dongyan Zhao

    Keywords: Conversation system, Neural networks, Retrieval model

    Abstract:
    Establishing an automatic conversation system between humans and computers is regarded as one of the hardest problems in computer science. It requires interdisciplinary techniques from information retrieval, natural language processing, and data management, as well as artificial intelligence. The arrival of the big data era reveals the feasibility of creating a conversation system empowered by data-driven approaches. We are now able to collect extremely large conversational datasets on the Web and organize them to launch a human-computer conversation system. Owing to the diversity of Web resources available, a retrieval-based conversation system will be able to find at least some responses in the massive data repository for any user input. Given a human-issued utterance, i.e., a query, a retrieval-based conversation system will search for appropriate replies, conduct relevance ranking, and then output the most relevant one as the response. In this paper, we propose a novel retrieval model named NeuRetrieval for short text understanding, representation, and semantic matching. The proposed model is general and unified for both single-turn and multi-turn conversation scenarios in the open domain. In the experiments, we investigate the effectiveness of the proposed deep neural network model for human-computer conversations. We demonstrate performance improvements against a series of baseline methods in terms of p@1, MAP, nDCG, and MRR evaluation metrics. In contrast with previously proposed methods, NeuRetrieval is tailored for conversation scenarios and demonstrated to be more effective.

  • How to improve the answering effectiveness in Pay-for-Knowledge Community: An exploratory application of Intelligent QA System
    Authors: Yihang Cheng, Xi Zhang, Hao Wang and Shan Jiang

    Keywords: Pay-for-knowledge Community (PKC), Intelligent QA System, Answering effectiveness

    Abstract:
    Community Question Answering (CQA) has emerged recently and become popular. During the communication process, different kinds of knowledge can be merged. Businesses apply the concept of paying for knowledge to monetize this knowledge, and as a consequence, Pay-for-Knowledge Communities (PKC) have emerged. However, in a pay-for-knowledge community, it takes a long time to choose valuable questions. Previous work has mainly addressed this problem within existing PKC platforms. With the development of cognitive computing and intelligent QA systems, applying such systems to PKC platforms becomes possible. The challenge is how to combine an intelligent QA system with a PKC platform, because the questions are more varied than the easy questions that current QA systems can answer. We present a Four-Module QA Model based on a conventional intelligent QA system. Compared to a conventional system, our model classifies the questions into categories using traditional machine learning methods. We use the answers in each category, together with each entity of the corresponding question, as the document database, retrieving the best answer among past answers by comparing their TF-IDF weighted bag-of-words vectors, or generating a new answer containing the key words through an LSTM algorithm adapted to PKC features. Experiments were conducted on a dataset covering the QA activity of 1222 users collected from Zhihu. The proposed model is expected to improve the business model of Pay-for-Knowledge Communities and increase QA effectiveness.
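
    Illustrative sketch of the TF-IDF retrieval step described above: represent past answers as TF-IDF weighted bag-of-words vectors and return the one closest to the new question. The texts are placeholders; the paper additionally routes questions through category classifiers and an LSTM-based generator.

```python
# Minimal sketch: pick the best past answer by TF-IDF cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

past_answers = [
    "Use gradient descent to minimize the loss function.",
    "A balanced diet and regular sleep improve focus.",
    "Index your database columns that appear in WHERE clauses.",
]
query = "How do I make my SQL queries faster?"

vec = TfidfVectorizer()
answer_mat = vec.fit_transform(past_answers)
scores = cosine_similarity(vec.transform([query]), answer_mat)[0]
print("best past answer:", past_answers[scores.argmax()])
```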

  • Region-wise Ranking of Sports Players based on Link Fusion
    Authors: Ali Daud, Akbar Hussain, Rabeeh Ayaz Abbasi, Naif Radi Aljohani, Tehmina Amjad and Hassan Dawood

    Keywords: Ranking, Intra-type and Inter-type Links, Players, Sports, Region-wise Ranking, Link Fusion

    Abstract:
    Players are ranked in various sports to show their importance relative to other players. Existing methods only consider intra-type links (e.g., player to player and team to team), but ignore inter-type links (e.g., one type of player to another type of player, such as batsman to bowler, and player to team). They also ignore the spatiality of the players. There is a strong relationship among players and their teams, which can be represented as a network consisting of multi-type interrelated objects. In this paper, we propose a players’ ranking method, called Region-wise Players Link Fusion (RPLF), which is applied to the sport of cricket. RPLF considers players’ region-wise intra-type and inter-type relation-based features to rank the players. Considering multi-type interrelated objects is based on the intuition that a batsman scoring high against top bowlers of a strong team, or a bowler taking wickets against top batsmen of a strong team, is considered a good player. The experimental results show that RPLF provides promising insights into players’ rankings. RPLF is a generic method and can be applied to different sports for ranking players.

  • Measuring the Impact of Topic Drift in Scholarly Networks
    Authors: Tehmina Amjad, Ali Daud and Min Song

    Abstract:
    With the increase in collaboration among researchers of various disciplines, changing the research topic or working on multiple topics is not an unusual behavior. Several comprehensive efforts have been made toward predicting, quantifying, and studying a researcher’s impact. The question of how a change in the field of interest over time, or working on more than one topic, influences scientific impact remains unanswered. In this research, we study the effect of topic drift on the scientific impact of an author. We apply the Author Conference Topic (ACT) model to extract the topic distributions of individual authors who are working on multiple topics, in order to compare and analyze them against authors who work on a single topic. We analyze the productivity of the authors on the basis of publication count, citation count and h-index. We find that authors who stick to one topic produce a higher impact and gain more attention. To further strengthen our results, we gather the h-index of top-ranked authors working on one topic and top-ranked authors working on multiple topics and examine whether there are similar trends in their progress. The results show evidence of a significant impact of topic drift on the career choices of researchers.
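
    Since the study ranks authors by h-index, here is the standard definition in code: the largest h such that the author has h papers with at least h citations each.

```python
# The h-index of an author, given the citation counts of their papers.
def h_index(citations: list[int]) -> int:
    cites = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(cites, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # -> 4 (four papers with >= 4 citations each)
print(h_index([25, 8, 5, 3, 3]))  # -> 3
```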

  • PersuAIDE! An Adaptive Persuasive Text Generation System for Fashion Domain
    Authors: Vitobha Munigala, Abhijit Mishra, Srikanth Govindaraj Tamilselvam, Shreya Khare, Riddhiman Dasgupta and Anush Sankaran

    Keywords: Persuasiveness, Persuasive Systems, Persuasion in Fashion, Style-tip Generation

    Abstract:
    Persuasiveness is a creative art which aims at inducing a certain set of beliefs in the target audience. In an e-commerce setting, for a newly launched product, persuasive descriptions are often composed to motivate an online buyer towards a successful purchase. Such descriptions can be catchy taglines, product summaries, style tips, etc. In this paper, we present PersuAIDE! – a persuasive system based on linguistic creativity to generate various forms of persuasive sentences from an input product specification. To demonstrate the effectiveness of the proposed system, we have applied the technology to the fashion domain, where, for a given fashion product like “red collar shirt”, we are able to generate descriptive sentences that not only explain the item but also garner positive attention, making them persuasive. PersuAIDE! identifies fashion-related keywords from input specifications and intelligently expands the keywords into creative phrases. Once such compatible phrases are obtained, persuasive descriptions are synthesized from the set of phrases and input keywords with the help of a neural language model trained on a large domain-specific fashion corpus. We evaluate the system on a large fashion corpus collected from different sources using (a) automatic text generation metrics used for machine translation and automatic summarization evaluation and readability measurement, and (b) human judgment scores evaluating the persuasiveness and fluency of the generated text. Experimental results and qualitative analysis show that an unsupervised system like ours can produce more creative and better constructed persuasive output than supervised generative counterparts based on neural sequence-to-sequence models and statistical machine translation.

  • When E-commerce Meets Social Media: Identifying Business on WeChat Moment Using Bilateral-Attention LSTM
    Authors: Tianlang Chen, Yuxiao Chen, Han Guo and Jiebo Luo

    Keywords: attention model, joint visual-textual learning, multimodality analysis, WeChat business

    Abstract:
    WeChat Business, developed on WeChat, the most extensively used instant messaging platform in China, is a new business model that has burst into people’s lives in the e-commerce era. As one of the most typical WeChat Business behaviors, WeChat users can advertise products, advocate companies and share customer feedback with their WeChat friends by posting a WeChat Moment, a public status that contains images and text. Given its popularity and significance, in this paper we propose a novel Bilateral-Attention LSTM network (BiATT-LSTM) to identify WeChat Business Moments based on their texts and images. In particular, different from previous schemes that equally consider visual and textual modalities for a joint visual-textual classification task, we start our work with a text classification task based on an LSTM network, and then incorporate a bilateral-attention mechanism that can automatically learn two kinds of explicit attention weights for each word, namely (1) a global weight that is insensitive to the images in the same Moment as the word, and (2) a local weight that is sensitive to the images in the same Moment. In this process, we utilize visual information as guidance to figure out the local weight of a word in a specific Moment. Two-level experiments demonstrate the effectiveness of our framework. It outperforms other schemes that jointly model visual and textual modalities. We also visualize the bilateral-attention mechanism to illustrate how it helps joint visual-textual classification. Finally, we extract typical image and text patterns among users with different tendencies toward WeChat Business, to reveal the correlation between users’ WeChat Business activities and their lifestyle patterns.

  • Disease Tracking in GCC Region Using Arabic Language Tweets
    Authors: Muhammad Usman Ilyas and Jalal S. Alowibdi

    Keywords: Epidemiology, disease tracking, gulf region, Arabic, Twitter

    Abstract:
    Several prior studies have demonstrated the possibility of tracking the outbreak and spread of diseases using public tweets and other social media platforms. However, almost all such prior studies were restricted to geographically filtered English language tweets only. This study is the first to attempt a similar approach for Arabic language tweets originating from the Gulf Cooperation Council (GCC) countries. We obtained a list of commonly occurring diseases in the region from the Saudi Ministry of Health. We used both the English disease names as well as their Arabic translations to filter the stream of tweets. We acquired historical tweets for a period spanning 29 months. All tweets were geographically filtered for the Middle East and matched against the list of disease names in both English and Arabic. We observed that only a small fraction of tweets were in English, demonstrating that prior approaches to disease tracking relying on English language features are less effective for this region. We also demonstrate how Arabic language tweets can be used rather effectively to track the spread of some infectious diseases in the region. We verified our approach by demonstrating a high degree of correlation between the occurrence of MERS-Coronavirus (MERS-CoV) cases and Arabic language tweets about the disease. We also show that infectious diseases generating fewer tweets, as well as non-infectious diseases, do not exhibit the same high correlation. Finally, we verify the usefulness of tracking cases using Twitter mentions by comparing against a ground truth dataset of MERS-CoV cases obtained from the Saudi Ministry of Health.
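
    Illustrative sketch of the correlation check described above, with made-up numbers: weekly counts of disease-name tweets versus officially reported cases. scipy's pearsonr stands in for whatever statistical test the authors actually used.

```python
# Minimal sketch: correlate tweet volume with reported case counts.
from scipy.stats import pearsonr

weekly_tweets = [12, 30, 45, 80, 60, 25, 10, 8]    # hypothetical counts
weekly_cases  = [ 1,  4,  7, 15, 11,  5,  2,  1]   # hypothetical ground truth

r, p = pearsonr(weekly_tweets, weekly_cases)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
# A high r with a small p would support using tweet volume as a proxy signal.
```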

  • Learning Procedures from Text: Codifying How-to Procedures in Deep Neural Networks
    Authors: Hogun Park and Hamid Reza Motahari Nezhad

    Keywords: Deep Neural Networks, Learning Processes from Text, Intelligent Virtual Bots

    Abstract:
    A lot of knowledge about procedures is described in text. A procedure consists of a set of methods/tasks to achieve a goal. A key challenge is how to automatically construct a knowledge base from such procedure texts. Past work has mostly focused on entity-relationship extraction within sentences, or on high-level clustering of the extracted entities, for knowledge base construction. However, these approaches are often inaccurate due to error propagation from other NLP modules, and few works have tried to detect procedure-specific relationships (e.g., is-method-of, is-alternative-of, or is-subtask-of) in procedure texts. In this paper, we propose an end-to-end neural network architecture for relationship classification, which is an essential component for constructing the procedure knowledge base. The proposed architecture not only takes natural language sentences as inputs, but also shows good performance even with a small training dataset. Using this architecture, we construct a how-to knowledge base from the largest procedure-sharing community, wikihow.com. In our evaluation, it outperforms existing relationship extraction algorithms and extracts a richer knowledge graph from the procedure texts.

  • Pain Prediction in Humans using Human Brain Activity Data
    Authors: Mustansar Ali Ghazanfar, Zara Mansoor, Syed M Anwer, Ahmed S. Alfakeeh and Khaled H. Alyoubi

    Keywords: EEG Analysis, Pain prediction, SVM, KNN, Time domain, Frequency domain, Classification

    Abstract:
    This research article focuses on the analysis of electroencephalography (EEG) signals of the brain during pain perception. The proposed system is based on the hypothesis that a noticeable change occurs in mental conditions while experiencing pain. When the human body is injured, sensory receptors in the brain enter a stimulated state. The injury may be intentional or the result of an accident. Pain warnings are natural in humans and protect the body from further negative effects. In this article, an innovative and robust system based on prominent features is proposed to predict the state of pain using an EEG. The brain signals of subjects were captured using two low-cost EEG headsets: the NeuroSky MindWave Mobile and the Emotiv Insight. Time and frequency domain features were selected to represent the observed signals. The results showed that a combination of time and frequency domain features is the most informative approach to pain prediction.
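
    Illustrative pipeline sketch only: extract simple time- and frequency-domain features from EEG windows and classify pain vs. no-pain with an SVM (named in the keywords). The band boundaries, sampling rate, and synthetic signals are assumptions, not the paper's actual feature set.

```python
# Minimal sketch: time + band-power features, then an SVM classifier.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

FS = 256  # sampling rate in Hz (assumed)

def features(window):
    psd = np.abs(np.fft.rfft(window)) ** 2
    freqs = np.fft.rfftfreq(len(window), d=1 / FS)
    bands = [(4, 8), (8, 13), (13, 30)]          # theta, alpha, beta
    band_power = [psd[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands]
    return [window.mean(), window.std(), *band_power]

rng = np.random.default_rng(3)
windows = rng.normal(size=(200, FS * 2))         # 200 two-second windows
labels = rng.integers(0, 2, size=200)            # 1 = pain, 0 = no pain

X = np.array([features(w) for w in windows])
print("CV accuracy:", cross_val_score(SVC(), X, labels, cv=5).mean())
```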

  • Making the Most Cost-effective Decision in Online Paid Q&A Community: An Expert Recommender System with Motivation Modeling and Knowledge Pricing
    Authors: Yunhao Zheng, Xi Zhang and Yuting Xiao

    Keywords: Expert recommendation, Knowledge pricing, Motivation modeling

    Abstract:
    Recommending proper experts to knowledge buyers is a significant problem in online paid Q&A communities (OPQCs). Existing approaches to online expert recommendation have mainly focused on exploiting semantic similarities and social network influence, while personalizing recommendations according to individuals’ motivations has not received much attention. In this paper, we propose a personalized expert recommender system, which integrates a buyer’s motivation for knowledge, social influence, and money in a unified framework. As an innovative application of cognitive computing, our recommender system is capable of providing users with the best matching experts so as to help them make the most cost-effective choice in an OPQC. To this end, the Paragraph Vector technique is used to construct a domain knowledge base (KB) in a multilayer information retrieval (IR) framework. We then perform knowledge pricing based on the buyer’s query and bid in the context of a bilateral-monopoly knowledge market. After that, a Markov chain-based method with user motivation learning is introduced to find the best matching experts. Finally, we evaluate the proposed approach using datasets collected from the two largest OPQCs in China, Fenda and Zhihu. The experimental results show encouraging success in effectively offering reasonable personalization options. As an innovative approach to the expert matching problem in OPQCs, this research provides flexibility in customizing the recommendation heuristics based on user motivation, and demonstrates its contribution to a higher rate of optimal knowledge seller-buyer matching.

  • Discovering Connotations as Labels for Weakly Supervised Image-Sentence Data
    Authors: Aditya Mogadala, Bhargav Kanuparthi, Achim Rettinger and York Sure-Vetter

    Keywords: Image-Sentence Connotation Labels, Weakly Supervised Deep Learning, Multi-label Prediction

    Abstract:
    We address the task of labeling image-sentence pairs at large scale with varied concepts representing connotations. That is, for any given image-sentence pair, we aim to annotate it with connotations that capture the intrinsic intension. To achieve this, we propose a connotation multimodal embedding model (CMEM) with a novel loss function. Its unique characteristics over previous models are that it (i) leverages multimodal data, in contrast to only visual information, (ii) is robust to outlier labels in a multi-label scenario, and (iii) works well with large-scale weakly supervised data. Extensive quantitative evaluation is conducted to exhibit the effectiveness of CMEM for the detection of multiple labels against other state-of-the-art approaches. We also show that, in addition to the annotation of images with connotation labels, a byproduct of our model inherently supports cross-modal retrieval.

  • Perceiving Commercial Activeness Over Satellite Images
    Authors: Zhiyuan He, Su Yang, Weishan Zhang and Jiulong Zhang

    Keywords: Urban Perception, Deep Learning, Ubiquitous Computing

    Abstract:
    Different urban regions usually have different levels of commercial hotness due to the different social contexts inside them. As satellite imagery promises high-resolution, low-cost, real-time, and ubiquitous data acquisition, this study aims to solve the commercial hotness prediction problem, as well as the correlated social context mining problem, via visual pattern analysis on satellite images. The goal is to reveal the underlying law correlating visual patterns of satellite images with commercial hotness, so as to infer the commercial hotness map of a whole city for government regulation and business planning. We propose a novel deep learning-based model, which learns semantic information from raw satellite images to enable predicting regional commercial hotness. First, we collect satellite images from Google Map and label these images with POI categories according to the annotations from OpenStreetMap. Then, we train a deep convolutional network that leverages raw images to infer the social attributes of the region of interest. Finally, we use three classical regression methods to predict regional commercial hotness from the corresponding social contexts reflected in satellite images of Shanghai, where the applied deep features are learned from examples from Beijing to guarantee generality. The results show that the proposed model is robust enough to reach 82% precision on average. To the best of our knowledge, this is the first work focused on discovering relations between commercial hotness and satellite images. A web service is developed to demonstrate how business planning can be done with reference to the predicted commercial hotness of a given region.

  • Netizen-Style Commenting on Fashion Photos: Dataset and Diversity Measures
    Authors: Wen Hua Lin, Kuan-Ting Chen, Hung Yueh Chiang and Winston Hsu

    Keywords: Fashion, Image Captioning, Commenting, Diversity, Deep Learning, Topic Model

    Abstract:
    Recently, deep neural network models have achieved promising results in the image captioning task. Yet the “vanilla” sentences generated by current works, which only describe shallow appearances (e.g., types, colors), do not satisfy the netizen style, and thus lack engagement, context, and user intention. To tackle this problem, we propose Netizen Style Commenting (NSC), which automatically generates characteristic comments for a user-contributed fashion photo. We are devoted to modulating the comments in a vivid “netizen” style which reflects the culture of a designated social community, hoping to facilitate more engagement with users. In this work, we design a novel framework that consists of three major components: (1) We construct a large-scale clothing dataset named NetiLook, which contains 300K posts (photos) with 5M comments, to discover netizen-style comments. (2) We propose three unique measures to estimate the diversity of comments. (3) We bring diversity by marrying topic models with neural networks to make up for the insufficiency of conventional image captioning works. Experimenting on Flickr30k and our NetiLook dataset, we demonstrate that our proposed approaches benefit fashion photo commenting and improve image captioning tasks in both accuracy and diversity.
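
    The paper proposes its own three diversity measures for generated comments; as a hedged stand-in, here is the common distinct-n metric (ratio of unique n-grams to total n-grams), which captures the same intuition that varied comments reuse fewer phrases.

```python
# Minimal sketch: distinct-n diversity of a set of generated comments.
def distinct_n(comments: list[str], n: int = 2) -> float:
    ngrams = []
    for c in comments:
        toks = c.lower().split()
        ngrams += [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

bland = ["nice dress", "nice dress", "nice dress"]
varied = ["love the vintage vibe", "those boots steal the show", "nice dress"]
print(distinct_n(bland), distinct_n(varied))  # low vs. high diversity
```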

  • Learning the Chinese Sentence Representation with LSTM Autoencoder
    Authors: Mu-Yen Chen, Tien-Chi Huang, Yu Shu, Chia-Chen Chen, Tsung-Che Hsieh and Neil Y. Yen

    Keywords: Deep learning, Auto Encoder, Long Short-Term Memory(LSTM), Chinese Sentence Representation, Sentiment classification

    Abstract:
    This study retains the meaning of the original text using an Autoencoder (AE). Three types of loss are used to train the neural network model, with the aim that, after compressing sentence features, the model can still reconstruct the original input sentences and classify the correct targets, such as positive or negative sentiment. In this way, it is expected to learn the more relevant (compressed) sentence features for classifying the targets, rather than relying on a classification loss alone, which may classify based on meaningless features (words). The results show that adding additional features for error correction does not interfere with learning, and that not all words need to be restored without distortion after applying the AE method.
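
    Illustrative PyTorch sketch of the idea above: an LSTM encoder compresses a sentence, one head reconstructs the input, and a classifier head predicts sentiment from the same compressed code. All dimensions are illustrative, and the study's three loss variants are collapsed here into reconstruction plus classification.

```python
# Minimal sketch: LSTM autoencoder with an auxiliary sentiment head.
import torch
import torch.nn as nn

class LSTMAutoencoderClassifier(nn.Module):
    def __init__(self, vocab=5000, emb=64, hid=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.encoder = nn.LSTM(emb, hid, batch_first=True)
        self.decoder = nn.LSTM(hid, hid, batch_first=True)
        self.vocab_out = nn.Linear(hid, vocab)     # reconstruction head
        self.cls_head = nn.Linear(hid, n_classes)  # sentiment head

    def forward(self, tokens):
        _, (h, _) = self.encoder(self.emb(tokens))   # h: (1, batch, hid)
        code = h.squeeze(0)                          # sentence representation
        seq = code.unsqueeze(1).repeat(1, tokens.size(1), 1)
        dec_out, _ = self.decoder(seq)
        return self.vocab_out(dec_out), self.cls_head(code)

model = LSTMAutoencoderClassifier()
tokens = torch.randint(0, 5000, (8, 12))             # batch of 8 sentences
recon_logits, cls_logits = model(tokens)
# Training minimizes cross-entropy over recon_logits (vs. the input tokens)
# plus cross-entropy over cls_logits (vs. sentiment labels).
print(recon_logits.shape, cls_logits.shape)
```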

  • Activity-Based Mobility Profiling: A Purely Temporal Modeling Approach
    Authors: Shreya Ghosh, Soumya K Ghosh, Rahul Deb Das and Stephan Winter

    Keywords: Temporal modeling, Mobility trace, Human Activity, User profiling

    Abstract:
    Several studies have shown that the spatio-temporal mobility traces of human movements can be used to identify an individual. In contrast, this work presents a novel framework for activity-based mobility profiling of individuals using only the temporal information. The proposed framework models individuals’ activity patterns on a temporal scale, and quantifies uniqueness measures based on certain temporal features of the activity sequence.