SemStats 2016 Call for Contributions

Authors: Sarven Capadisli¹; Franck Cotton²; Armin Haller³; Evangelos Kalampokis⁴; Monica Scannapieco⁵; Raphaël Troncy⁶

¹University of Bonn, Germany
²INSEE, France
³ANU, Australia
⁴CERTH, Greece
⁵Istat, Italy
⁶EURECOM, France

Document ID: http://semstats.org/2016/call-for-contributions

Published: 2016-04-24

Modified: 2016-07-04

License: CC BY 4.0

Keywords

Event: 4th International Workshop on Semantic Statistics co-located with the 15th International Semantic Web Conference (ISWC 2016)
Location: Kobe, Japan
Date: October 17, 2016 or October 18, 2016

Important dates

Submission deadline: July 15th, 2016, 23:59 Hawaii time
Notifications to authors: July 31st, 2016, 23:59 Hawaii time

Workshop Summary

The goal of this workshop is to explore and strengthen the relationship between the Semantic Web and statistical communities, to provide better access to the data held by statistical offices. It will focus on ways in which statisticians can use Semantic Web technologies and standards to formalize, publish, document and link their data and metadata, and also on how statistical methods can be applied to linked data. It is the fourth workshop in a series that started at the International Semantic Web Conference in 2013 (SemStats 2013) and has run every year since at ISWC (SemStats 2014 and SemStats 2015).

The statistical community shows more and more interest in the Semantic Web. In particular, initiatives have been launched to develop semantic vocabularies representing statistical classifications and discovery metadata. Tools are also being created by statistical organizations to support the publication of dimensional data conforming to the Data Cube W3C Recommendation. But statisticians see challenges in the Semantic Web: how can data and concepts be linked in a statistically rigorous fashion? How can we avoid fuzzy semantics leading to wrong analyses? How can we preserve data confidentiality?

The workshop will also cover the question of how to apply statistical methods or treatments to linked data, and how to develop new methods and tools for this purpose. Except for visualization techniques and tools, this question is relatively unexplored, but the subject will obviously grow in importance in the near future.

Motivation

The statistical community's interest in linked data has recently increased dramatically. Two illustrations of this trend can be mentioned:

The UNECE High-level group for the modernization of official statistics, a group of ten directors of national or international statistical institutes around the world, launched in 2016 a project on implementing statistical standards. Two work packages of this project aim at implementing linked statistical metadata systems (classifications, glossaries, statistical models).
Eurostat, as part of its "Vision 2020" strategic program, has started a major project focusing on digital communication, user analytics and innovative products. Work package 3 of this project contains different tasks related directly to linked data.

There is also a significant interest in exploiting linked statistical data inside public administrations in order to create innovative public services for citizens and businesses: see for example the new EU-funded H2020 OpenGovIntelligence project.

This growing interest is a tremendous opportunity for the SemStats community to leverage the work done in the previous years and to continue to elaborate the solutions that are needed for these initiatives.

Topics

The workshop will address topics related to statistics and linked data. This includes but is not limited to:

How to publish linked statistics?

What are the relevant vocabularies for the publication of statistical data?
What are the relevant vocabularies for the publication of statistical metadata (code lists and classifications, descriptive metadata, provenance and quality information, etc.)?
What are the existing tools? Can the usual statistical software packages (e.g. R, SAS, Stata) do the job?
How do we include linked data production and publication in the data lifecycle?
How do we establish, document and share best practices?

How to use linked data for statistics?

Where and how can we find statistics data: data catalogues, dataset descriptions, data discovery?
How do we assess data quality (collection methodology, traceability, etc.)?
How can we perform data reconciliation, ontology matching and instance matching with statistical data?
How can we apply statistical processes on linked data: data analysis, descriptive statistics, estimation, correction?
How to intuitively represent statistical linked data: visual analytics, results of data mining?

Submissions

This workshop is aimed at an interdisciplinary audience of researchers and practitioners involved or interested in Statistics and the Semantic Web. All contributions must represent original and unpublished work that is not currently under review. Contributions will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop.

At least one author of each accepted contribution is expected to attend the workshop. Workshop participation is available to ISWC 2016 attendees at an additional cost; see http://iswc2016.semanticweb.org/pages/attending.html for the details.

Full and Short articles (up to 12 and 6 pages)

The workshop will welcome long and short scientific articles related to the topics mentioned above. Long articles refer to mature research work, where ideas have been implemented and evaluated. Short articles refer to brave new ideas or position statements describing a vision for the Semantic Statistics community.

Challenge articles (up to 12 pages)

The workshop will also feature a data challenge based on a corpus of linked metadata that will be made available on the SemStats web site by the end of May. The challenge will consist of building mashups or visualizations, as well as comparing, aligning and enriching the data and concepts involved.

Application and Demo articles (up to 6 pages)

This year, the workshop calls for contributions more generally. This includes interactive demonstrations of applications, or useful and relevant software libraries and repositories, described in short articles. All application and demo articles should include a link where readers can experiment with the live software. Additional pointers, such as a source code repository, are also welcome.

Awards

This year, SemStats will award prizes, thanks to the generous sponsorship of CASD and Oracle:

The best full or short article will win 1000 EUR
The best challenge article will win 500 EUR
The best application or demo article will win 500 USD

Writing your contribution

All submissions must be written in English and must be formatted according to the information for LNCS Authors (see http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0). Please note that HTML+RDFa submissions are also welcome as long as the layout complies with the LNCS style. Authors are welcome to use dokieli (source) or similar systems. Submissions are not anonymous. Please submit your contributions through Easychair before July 15, 2016, 23:59 Hawaii time. All accepted articles will be archived in electronic proceedings published by CEUR-WS.org.

Submission date: July 15th, 2016
Notifications: July 31st, 2016

If you are interested in submitting a contribution but would like more preliminary information, please contact semstats2016@easychair.org.

Organizing Committee

Sarven Capadisli, University of Bonn, Germany
Franck Cotton, INSEE, France
Armin Haller, ANU, Australia
Evangelos Kalampokis, CERTH/ITI and University of Macedonia, Greece
Monica Scannapieco, Istat, Italy
Raphaël Troncy, EURECOM, France

Program Committee

Stefano Abruzzini, EC - DG Connect
Ghislain Auguste Atemezing, Mondeca
Carlo Batini, Bicocca, University of Milan
Chris Beer, Australian Bureau of Statistics
Oscar Corcho, Universidad Politécnica de Madrid
Richard Cyganiak, Digital Enterprise Research Institute, NUI Galway
Cinzia Daraio, University of Rome "La Sapienza"
Stefano De Francisci, ISTAT
Jay Devlin, Fidelity
Miguel Expósito Martín, Instituto Cántabro de Estadística
Dan Gillman, US Bureau of Labor Statistics
Tudor Groza, The Garvan Institute of Medical Research
Christophe Guéret, BBC
Paul Hermans, ProXML
Laurent Lefort, W3C Australia
Erik Mannens, iMinds - Ghent University - Multimedia Lab
Albert Meroño-Peñuela, VU University Amsterdam
Jindřich Mynarz, University of Economics, Prague
Marco Pellegrino, Eurostat
Dave Reynolds, Epimorphics Ltd
Bill Roberts, Swirrl IT Limited
Hideaki Takeda, National Institute of Informatics
Wendy Thomas, Minnesota Population Center
Joachim Wackerow, GESIS - Leibniz Institute for the Social Sciences