LATC Project Infrastructure

Describes the project infrastructure including project management platform and communication.

Project Fact Sheet

Summarises the project goals and main deliverables.

Deployment of Crawler and Indexer Module

Describes the capabilities and the deployment of the data Crawler and Indexer, including data acquisition, supported formats and synchronization of datasets.

Interface Definitions for 24/7 Platform

Describes the LATC 24/7 Platform design and design goals. It defines the Platform scope as well as the target users. The 24/7 Platform components are introduced, the interfaces between the components are defined and the workflow to generate links using the 24/7 Platform is described. See also the GitHub repository for the current state ...

Initial Data Publication & Consumption Tools Library

Gives an overview of the phases of the publication and consumption process of Linked Data. It furthermore gives open source tool recommendations for each of the phases, which establish the first version of the LATC Data Publication & Consumption Tools Library.

Initial Documentation & Tutorials for Tools Library

Documents the usage of the tools that are recommended in the LATC Data Publication & Consumption Tools Library and describes an example usage of it.

First Deployment of Data Source Inventory

Describes the requirements of the Dataset Inventory (DSI) and how the interface seeks to meet these requirements. It also discusses the data sources combined in the Metadata Store (MDS) and how the DSI is initially implemented.

First Deployment of Linking Engine

Describes the LATC Linking Engine and its setup as part of the LATC 24/7 Platform. It first introduces Silk, the Linking Engine for LATC, and then presents its integration in the 24/7 Platform by the LATC Runtime.
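
As a rough illustration of what a linking engine does, the following Python sketch compares the labels of resources from two datasets and emits an owl:sameAs link whenever the labels are sufficiently similar. It is not Silk's API or configuration language; the function names, the example.org URIs and the 0.9 threshold are assumptions made for this example only.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a, b):
    """Return a similarity score in [0, 1] for two labels, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def generate_links(source, target, threshold=0.9):
    """Compare every source label with every target label and link the close matches.

    `source` and `target` map resource URIs to labels (hypothetical input shape).
    """
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            if similarity(s_label, t_label) >= threshold:
                links.append((s_uri, OWL_SAME_AS, t_uri))
    return links


# Hypothetical example data, not taken from any LATC dataset.
source = {
    "http://example.org/a/berlin": "Berlin",
    "http://example.org/a/paris": "Paris",
}
target = {
    "http://example.org/b/1": "berlin ",
    "http://example.org/b/2": "Madrid",
}

links = generate_links(source, target)
for s, p, o in links:
    print(f"<{s}> <{p}> <{o}> .")
```

The pairwise comparison is quadratic in the number of resources, which is one reason why link generation at Web scale needs more elaborate techniques than this sketch.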

First Deployment of Quality Assurance Module

Describes how quality assurance (QA) works in the LATC 24/7 Platform. The methods presented in the report mainly involve detecting and assessing the quality of the generated links. We distinguish between internal quality assurance, which happens within the LATC platform, and external quality assurance, which involves crawling the Web of Data and computing a number of metrics to assess its quality.

First Report on Models for Distributed Computing

Explores different models of distributed computing that are or could be used for the LATC 24/7 Platform. Altogether, the Linking Engine from WP1T3 and the Quality Module from WP1T4 are expected to process billions of triples in order to create, evaluate and maintain millions of RDF links between the data sources. This report provides an overview of relevant techniques and reports on the one currently deployed.

Performance Measurements and Bottleneck Analysis

Based on the LATC 24/7 Platform interfaces and their interplay, this report identifies the main performance bottlenecks of the LATC 24/7 Platform and provides eight preventive actions for addressing them. We then give a detailed account of how we analysed the 24/7 Platform and identified the bottlenecks.

Report on the Publication of Business-related Datasets

Reports on the publication of business-related datasets produced by European Institutions as Linked Data on the Web. Besides the publication, the datasets were interlinked with corresponding datasets on the Web of Data. Procedures for keeping the data sources up to date with respect to the original sources have been established and reported on. See also the GitHub repository for the current state ...

Evaluation Report of Publication & Consumption Tools Library

Reports on the outcome of a survey we conducted on the initial version of the Publication & Consumption Tools Library in order to improve the library and the presented tools further. We describe which questions were asked in the survey and why, describe the execution of the survey and present the obtained results.

Initial Publication and Consumption Best Practice Guide

Provides a collection of best practice guides for Linked Data publishing and consuming.

Initial Sustainability Report

Outlines the sustainability of the output of the LATC Support Action, as a whole, and of its component parts, after the end of the project in 2012.

Schema.org support

In early June 2011, the three big search engines Bing, Google and Yahoo! introduced Schema.org, a collection of terms that webmasters can use to mark up their pages to improve the display of search results. The collection of terms was originally published in HTML only. LATC took the initiative and provided, within 24 hours, an initial RDFS version of the Schema.org terms via a new site, Schema.RDFS.org, with the aim of supporting Schema.org deployment and usage with a special focus on Linked Data.
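
To give an idea of what expressing a Schema.org term in RDFS involves, the following Python sketch serialises a class description as N-Triples. It is an illustration only and does not reproduce the actual mapping published at Schema.RDFS.org; the helper names (`describe_class`, `to_ntriples`) are hypothetical, while the RDF, RDFS and Schema.org namespace URIs are the standard ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SCHEMA = "http://schema.org/"


def uri(value):
    """Wrap a URI in angle brackets, as required by N-Triples."""
    return f"<{value}>"


def literal(text):
    """Render a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_class(name, parent=None):
    """Return RDFS triples declaring a Schema.org term as a class (illustrative mapping)."""
    subject = uri(SCHEMA + name)
    triples = [
        (subject, uri(RDF_TYPE), uri(RDFS + "Class")),
        (subject, uri(RDFS + "label"), literal(name)),
    ]
    if parent is not None:
        triples.append((subject, uri(RDFS + "subClassOf"), uri(SCHEMA + parent)))
    return triples


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# In Schema.org, Person is a subtype of Thing.
print(to_ntriples(describe_class("Person", parent="Thing")))
```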

Linked Data life cycles

Existing data management approaches assume control over schema, data and data generation, which is not the case in open, de-centralised environments such as the Web. The lack of control means that social processes are necessary to generate ’ordo ab chao’. Based on our experience in Linked Data publishing and consumption over the past years, we have identified the involved parties and fundamental phases, which provide for a multitude of so-called Linked Data life cycles.

5 ★ Open Data

5stardata.info

Tim Berners-Lee, the inventor of the Web and Linked Data initiator, suggested a 5-star deployment scheme for Open Data. Here, we give examples for each star level and explain the costs and benefits that come with it.
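
The cumulative nature of the scheme can be sketched in a few lines of Python: a dataset earns a star for each criterion it meets, in order, and stops at the first one it fails. The `Dataset` class and its field names are hypothetical and exist only for this illustration; the five criteria follow the scheme as commonly stated (open licence, structured data, non-proprietary format, URIs to denote things, links to other data).

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    open_licence: bool  # 1 star: available on the Web under an open licence
    structured: bool    # 2 stars: machine-readable structured data
    open_format: bool   # 3 stars: non-proprietary format
    uses_uris: bool     # 4 stars: URIs denote things
    linked: bool        # 5 stars: linked to other data to provide context


def star_rating(ds):
    """Count the stars: each level requires all previous levels to be met."""
    criteria = [ds.open_licence, ds.structured, ds.open_format, ds.uses_uris, ds.linked]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# Hypothetical examples for three of the levels.
scanned_pdf = Dataset(True, False, False, False, False)
csv_file = Dataset(True, True, True, False, False)
linked_rdf = Dataset(True, True, True, True, True)

for name, ds in [("scanned PDF", scanned_pdf), ("CSV file", csv_file), ("linked RDF", linked_rdf)]:
    print(name, star_rating(ds))
```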

Report on the Publication of European Institutions Data

This deliverable reports on the publication of EU Institutions data as Linked Data on the Web. It describes the conversion, publication, interlinking, and provision for updates of seven EU Institutions datasets, contributing to the EU data cloud.

Report on the Publication of Legal Datasets

This deliverable reports on the publication of legal datasets produced by European Institutions as Linked Data on the Web. The datasets published in this task are:

Final Release of P&C Library

Based on the experience acquired and the evaluations performed, this deliverable reports on the extension of the LATC Linked Data Publication and Consumption (P&C) Tools library. We also conducted a survey to better assess the total cost of ownership of the tools that are documented in this library. Our findings essentially show that these tools have low hardware and staff-training requirements and come with a limited amount of (mail-based) live support.

Final Release of Documentation and Tutorials

This deliverable documents usage examples that would benefit from applying the LATC Linked Data Publication and Consumption Tools to the existing data publishing techniques. It contains a list of real-life use cases that can benefit from using Linked Data techniques. One of them (the Real Estate Agency scenario) was used as the foundation for a detailed, step-by-step tutorial on what the Linked Data process can look like and how it should be performed. The use cases are presented in the form of an explanatory screencast, which will be available online and accessible via the Tools Library website.

Final Deployment of Data Source Inventory

Describes the updates to the Dataset Inventory (DSI) and the Metadata Store (MDS).

Final Deployment of Linking Engine

Describes the updates and extensions of the LATC Linking Engine based on the Silk Interlinking framework.

Final Deployment of QA Module

Describes the updates of the quality assurance (QA) module in the LATC 24/7 Platform.

Final Report on Models for Distributed Computing

Discusses choices we made and describes the techniques currently in use to ensure the LATC Interlinking Platform scales. We also motivate these choices with an explanation of the technology challenges behind dealing with Web data at scale.

Lessons Learned and Reporting Summary

Describes the reporting functionality of the LATC 24/7 Interlinking Platform (also known as the dashboard) as well as lessons learned gathered during the assembly, deployment and operation of the overall platform. The latter is especially useful for anyone who wants to set up and operate the Interlinking Platform independently of the SaaS offering deployed in LATC.

Final Best Practice Guide

A collection of best practices for Linked Data publishing and consuming. The LATC team has published several guides targeting the different aspects of the publication and consumption process of Linked Data. Besides the Linked Data Book by Tom Heath and Christian Bizer, the LATC team gave a tutorial at WWW2012 on Practical Cross‐Dataset Queries on the Web of Data and the EU Data Cloud session at the European Data Forum 2012. Finally, the LATC team provides a quality checklist for LATC‐published datasets.

Final Sustainability Report

Outlines the sustainability of the output of the LATC Support Action, as a whole, and of its component parts, after the end of the project in September 2012. An important aspect of the success of a project is its sustainability: how and where the project’s outcome will be maintained or even taken further. In this respect, no major obstacles are foreseen for sustaining the LATC 24/7 Platform, any of its components or data artefacts, which will all be continued after the end of the project through other projects, or the individual efforts of LATC project partners.