The Next Generation Internet (NGI) initiative is a key priority in the EU H2020 work programme. Its aim is to ensure that the immense potential of AI, of our digital connection with the physical world, our mediated experience of immersive environments, and future data networks connecting people and machines are used to empower EU citizens to steer their lives and contribute to inclusive and sustainable societies. The NGI initiative aims to keep internet resources open and trustworthy. This effort engages all relevant stakeholders, from the public and private sectors to academia and civil society NGOs. The EC roadmap and the results of four NGI projects are presented. Two panellists discuss the ethical values and the geopolitical context that await the next generation of millennium internauts.
This panel is an excellent opportunity to gain insight into the running of flagship research projects. The panellists are project leaders or staff deeply involved in these projects. Beyond their command of the theoretical aspects, these researchers are practitioners with real hands-on experience.
From MultiJEDI to MOUSSE: Two ERC Projects for Innovating Multilingual Disambiguation and Semantic Parsing of Text
Authors: Valerio Basile and Roberto Navigli
Keywords: word sense disambiguation, entity linking, multilingual semantic parsing
The exponential growth of the Web is resulting in vast amounts of online content. However, the information expressed therein is not within easy reach: what we typically browse is only an infinitesimal part of the Web. And even if we had time to read all of the Web, we could not understand it, as most of it is written in languages we do not speak. For a machine, the key problem is not time but language comprehension, that is, transforming sentences, i.e., sequences of characters, into machine-readable semantic representations linked to existing meaning inventories such as computational lexicons and knowledge bases. In this paper we present two interrelated projects funded by the European Research Council (ERC) aimed at addressing and overcoming the current limits of lexical semantics: MultiJEDI and MOUSSE. We also present the results of Babelscape, a Sapienza spin-off company whose goal is to make the project outcomes sustainable in the long term.
CEDAR: Semantic Web Technology to Support Open Science
Authors: Mark Musen, Susanna-Asunta Sansone, Kei-Hoi Cheung, Steven Kleinstein, Morgan Crafts, Stephan Schürer and John Graybeal
Keywords: Semantic Web, Open Science, Open Data, Metadata, Ontology
There is an expectation that scientists will archive their experimental data online in public repositories to enable other investigators to verify their work and to re-explore their data in search of new discoveries. When left to their own devices, however, scientists do a terrible job creating the metadata that describe their datasets. The lack of standardization makes it extremely difficult for other investigators to find relevant datasets, to perform secondary analyses, and to integrate those datasets with other data. The Center for Expanded Data Annotation and Retrieval (CEDAR) was founded with the goal of enhancing the authoring of experimental metadata to make online datasets more useful to the scientific community. CEDAR technology includes Web-based methods for creating and managing libraries of templates for representing metadata to describe experimental datasets. The templates often are derived from standards proposed by different scientific communities. CEDAR’s templates interoperate with a repository of biomedical ontologies to standardize the way in which the templates may be filled out. CEDAR uses a repository of previously authored metadata from which it learns patterns that drive predictive data entry, making it easier for metadata authors to perform their work. CEDAR formats metadata as JSON or RDF, and uploads the metadata to online repositories automatically through Web services. Ongoing collaborations with several major research projects are allowing us to explore how CEDAR may ease access to scientific data sets stored in public repositories.
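The workflow described above (template, controlled terms, predictive entry, JSON output) can be illustrated with a minimal sketch. It does not use CEDAR's actual API or template format: the field names, the term identifiers, and the functions `validate`, `suggest` and `to_json` are all assumptions made for illustration, and the frequency count merely stands in for whatever pattern learning CEDAR applies.

```python
import json
from collections import Counter

# Hypothetical template: each field lists the controlled terms it accepts.
# Field names and term identifiers are illustrative, not taken from CEDAR.
TEMPLATE = {
    "organism": {"NCBITaxon:9606", "NCBITaxon:10090"},
    "tissue": {"UBERON:0002107", "UBERON:0000955"},
}


def validate(record, template=TEMPLATE):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, allowed in template.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif record[field] not in allowed:
            problems.append(f"{field}: {record[field]!r} is not a controlled term")
    return problems


def suggest(field, partial, history):
    """Rank candidate values for `field` by how often they appear in past
    records that agree with every value already entered in `partial`."""
    counts = Counter(
        past[field]
        for past in history
        if field in past and all(past.get(k) == v for k, v in partial.items())
    )
    return [value for value, _ in counts.most_common()]


def to_json(record):
    """Serialise a metadata record as JSON with a stable key order."""
    return json.dumps(record, sort_keys=True)
```

In this sketch an author who has already entered an organism would be offered tissue values ranked by how often earlier records paired them with that organism, and the record would be serialised only once `validate` returns an empty list.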
SoBigData: Social Mining & Big Data Ecosystem
Authors: Fosca Giannotti, Roberto Trasarti, Kalina Bontcheva and Valerio Grossi
Keywords: Big Data, Social Mining, Research Infrastructure, Platform
One of the most pressing and fascinating challenges scientists face today is understanding the complexity of our globally interconnected society. The big data arising from the digital breadcrumbs of human activities has the potential to provide a powerful social microscope, which can help us understand many complex and hidden socio-economic phenomena. Such a challenge clearly requires high-level analytics, modeling and reasoning across all social dimensions. There is a need to harness these opportunities for scientific advancement and for the social good, in contrast to the currently prevalent exploitation of big data for commercial purposes or, worse, social control and surveillance. The main obstacle to this accomplishment, besides the scarcity of data scientists, is the lack of a large-scale open ecosystem where big data and social mining research can be carried out. The SoBigData Research Infrastructure (RI) provides an integrated ecosystem for ethics-sensitive scientific discoveries and advanced applications of social data mining on the various dimensions of social life as recorded by “big data”. The research community uses the SoBigData facilities as a “secure digital wind-tunnel” for large-scale social data analysis and simulation experiments. SoBigData promotes repeatable and open science. Its mission is to support data science research projects by providing:
• An ever-growing, distributed data ecosystem for the procurement, access, curation and management of big social data, to underpin social data mining research within an ethics-sensitive context.
• An ever-growing, distributed platform of interoperable social data mining methods and associated skills: tools, methodologies and services for mining, analysing, and visualising complex and massive datasets, addressing the techno-legal barriers to the ethically safe deployment of big data for social mining.
• An ecosystem where protection of personal information and the respect for fundamental human rights can coexist with a safe use of the same information for scientific purposes of broad and central societal interest. SoBigData has a dedicated ethical and legal board, which is implementing a legal and ethical framework.
Copernicus App Lab: A Platform for Easy Data Access Connecting the Scientific Earth Observation Community with Mobile Developers
Authors: Manolis Koubarakis
Keywords: Earth observation data, linked geospatial data, Copernicus programme
The main objective of Copernicus App Lab is to make Earth Observation data produced by the Copernicus programme of the European Union available on the Web as linked data to aid their use by mobile developers.
Linked Data for Production (LD4P): A Multi-Institutional Approach to Technical Services Transformation
Authors: Philip Schreur
Keywords: Linked data for production, LD4P, Linked open data, BIBFRAME, MARC formats, Library technical services, Metadata production, Library workflows
Linked Data for Production (LD4P) is a collaboration between six institutions (Columbia, Cornell, Harvard, Library of Congress, Princeton, and Stanford) to begin the transition of technical services production workflows from a series of library-centric data formats (MARC) to ones based on Linked Open Data (LOD). This first phase of the transition focuses on developing the ability to produce metadata as LOD communally, enhancing the BIBFRAME ontology to encompass the multiple resource formats that academic libraries must process, and engaging the broader academic library community to ensure a sustainable and extensible environment. As its name implies, LD4P is focused on the immediate needs of metadata production, such as ontology coverage and workflow transition. The LD4P partners’ work will be based, in part, on a collection of existing tools, such as those developed by the Library of Congress. A cycle of use and enhancement requests fed back to the developers of these tools will allow them to be improved on the basis of use in an actual production environment. The six institutions involved will focus on materials ranging from art to rare books, from cartographic materials to music, from annotations to workflows. Tool development and enhancement will also be a key aspect of the project. By the end of the first phase of this project (Spring 2018), the partners will have developed the minimal tooling, workflows, and standards needed to begin the transformation from MARC to LOD in Phase 2 of the project.
FIESTA-IoT Project: Federated Interoperable Semantic IoT/cloud Testbeds and Applications
Authors: Martin Serrano, Hung Nguyen, Elias Tragos and Amelie Gyrard
Keywords: Experimentation, IoT Labs, IoT Platforms, Semantic Interoperability, Linked Data, Federation
The FIESTA-IoT project provides a blueprint experimental infrastructure, software tools, techniques, processes and best practices that enable IoT testbeds and platforms to interconnect their facilities in an interoperable way. The project enables the integration of IoT platforms’ resources, testbed infrastructure and their associated applications. FIESTA-IoT opens up new opportunities in the development and deployment of experiments using data from IoT testbeds. The FIESTA-IoT infrastructure enables experimenters to use a single EaaS API (i.e., the FIESTA-IoT EaaS API) for executing experiments over multiple federated IoT testbeds in a testbed-agnostic way, i.e., as if accessing a single large-scale virtualized testbed. The main goal of the FIESTA-IoT project is to open new horizons in the development and deployment of IoT applications and experiments at an EU (and global) scale, based on the interconnection and interoperability of diverse IoT platforms and testbeds. The project’s experimental infrastructure provides European experimenters in the IoT domain with the unique capability to access and share semantically annotated IoT datasets in a testbed-agnostic way. FIESTA-IoT enables the execution of experiments across multiple IoT testbeds, based on a single API for submitting the experiment and a single set of credentials for the researcher; the portability of IoT experiments across different testbeds; and the provision of interoperable, standards-based IoT/cloud interfaces over diverse IoT experimental facilities.
Domain-specific Insight Graph (DIG)
Authors: Mayank Kejriwal and Pedro Szekely
Keywords: Knowledge Graphs, DARPA MEMEX, Information Extraction, Semantic Web, Semantic Search, User Interfaces, Human Trafficking
The DARPA MEMEX program was established with the goal of funding research into building domain-specific search systems that integrate state-of-the-art focused crawling (‘domain discovery’), information extraction and semantic search, and that can be used by users and domain experts with no programming or technical experience. Domain-specific Insight Graphs (DIG) was proposed and funded under MEMEX and has led to an end-to-end search system currently being used by over 200 law enforcement agencies for combating human trafficking, by investigators from the Securities and Exchange Commission (SEC) in the US for investigating securities fraud, and in numerous other domains of a difficult, socially consequential (e.g., investigative) and unusual nature.
Information Recall Support for Elderly People in Hyper Aged Societies
Authors: Hsin-Hsi Chen and Manabu Okumura
Keywords: lifelogging, memory recall, personal big data
The number of elderly people has increased quickly over the last decade and is expected to grow fastest in the next 15 years. In hyper-aged societies, various kinds of problems arise as the population ages. Memory loss, commonly seen in elderly people, greatly affects their social interaction in daily life. This 3-year international project, jointly funded by Taiwan's Ministry of Science and Technology (MOST) and the Japan Science and Technology Agency (JST), investigates the crucial issues facing hyper-aged societies. We aim to develop technologies and systems that provide information recall support for elderly people at the right time and in the right place. The system will not only respond reactively to requests from elderly people but also take part proactively in human-human conversation. We will investigate lifelogging mechanisms to keep the digital traces generated by individuals, extract entities, properties, relations, and events from the personal big data along a timeline, construct a personal knowledge base, allow flexible knowledge base access, and present recall information.
AFEL – Analytics for Everyday Learning
Authors: Mathieu D’Aquin, Angela Fessl, Dominik Kowald and Stefan Thalmann
Keywords: Learning Analytics, Online learning, Social learning, Self-directed learning, Online activity data
The goal of AFEL is to develop, pilot and evaluate methods and applications that advance informal/collective learning as it surfaces implicitly in online social environments. The project follows a multi-disciplinary, industry-driven approach to the analysis and understanding of learner data in order to personalize, accelerate and improve informal learning processes. Learning Analytics and Educational Data Mining traditionally relate to the analysis and exploration of data coming from learning environments, especially to understand learners’ behaviours. However, studies have long demonstrated that learning activities also happen outside formal educational platforms. This includes informal and collective learning, usually associated, as a side effect, with other (social) environments and activities. Relying on real data from a commercially available platform, the aim of AFEL is to provide and validate the technological grounding and tools for exploiting learning analytics on such learning activities. This will be achieved in relation to cognitive models of learning and collaboration, which are necessary for understanding loosely defined learning processes in online social environments. Applying the skills available in the consortium to a concrete set of live, industrial online social environments, AFEL will tackle the main challenges of informal learning analytics through 1) developing the tools and techniques necessary to capture information about learning activities from (not necessarily educational) online social environments; 2) creating methods for the analysis of such informal learning data, based on combining feature engineering and visual analytics with cognitive models of learning and collaboration; 3) demonstrating the potential of the approach in improving the understanding of informal learning and the ways it can be better supported; and 4) evaluating all of the above in real-world, large-scale applications and platforms.