SemStats 2020 Call for Challenge
- Document ID
- http://semstats.org/2020/call-for-challenge
- Published
- Modified
- License
- CC BY 4.0
Keywords
- ISWC2020
- SemStats
- Data integration
- Linked Data
- Semantic Web
- Statistics
- Statistical database
- Event
- 8th International Workshop on Semantic Statistics co-located with the 19th International Semantic Web Conference (ISWC 2020)
- Location
- Athens, Greece
- Date
- or
Important dates
- Submission deadline: October 2nd, 2020, 23:59 Hawaii time
- Notifications to authors: October 12th, 2020
Challenge data
The SemStats data challenge is a competition open to all, based on datasets made available by statistical offices. Participants are invited to demonstrate an original and helpful use of the data in connection with Semantic Web technologies. The challenge is sponsored by the CASD, a facility offering researchers secure access to detailed statistical data. The CASD will offer prizes for the best contributions.
The SemStats 2020 data challenge offers four tracks:
- The French Permanent database of facilities (BPE)
- Results of the French Census
- The French business register (Sirene)
- Open track
These options are detailed below.
BPE track
The Permanent database of facilities (BPE, for Base permanente des équipements) provides information on the level of facilities and services that a territory offers its population. It lists more than 2.5 million facilities of many different types, with their main characteristics; most of these facilities are geolocated.
The resources provided for this track are:
- The 2018 edition of the BPE in RDF
- The corresponding data model in OWL
- The code lists used, expressed as SKOS
- Quality metadata on the geolocation
Suggested use cases for the BPE data include:
- Provide feedback on modelling, quality and usability of data
- Create links to other data sets
- Propose visualization tools
- Produce statistical studies on the data
Census track
Since 2004, the French population census has been based on annual surveys, with total national coverage achieved in five-year cycles. The goals of this new method are to produce results more regularly and to distribute the workload over time better than with the previous exhaustive census procedure (see details on Insee's web site). This new methodology also implies that, although new population figures are produced every year, valid temporal comparisons can only be made between results separated by 5 years.
The legal populations are amongst the most important data published from the census. Insee has been publishing the legal populations as RDF since 2010, so that full 5-year cycles are now available. All the data are available as Turtle on Insee's RDF site (reference years 2010, 2011, 2012, 2013, 2014, 2015 and 2016).
For this track, submissions are expected to demonstrate original and statistically meaningful uses of the legal population RDF data.
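As an illustration of the five-year rule mentioned above, the following minimal Python sketch lists which pairs of the published reference years can be compared. It assumes that "separated by 5 years" means a gap of exactly five years; the function name `comparable_pairs` and the `gap` parameter are hypothetical helpers introduced here, not part of any Insee tooling.

```python
from itertools import combinations

# Reference years listed in this call for the legal population data.
REFERENCE_YEARS = [2010, 2011, 2012, 2013, 2014, 2015, 2016]

# Length of one full census cycle, in years.
CYCLE_LENGTH = 5


def comparable_pairs(years, gap=CYCLE_LENGTH):
    """Return (earlier, later) year pairs exactly `gap` years apart.

    Assumption: only pairs separated by exactly one full cycle are kept.
    """
    return [(a, b) for a, b in combinations(sorted(years), 2) if b - a == gap]


pairs = comparable_pairs(REFERENCE_YEARS)
print(pairs)  # [(2010, 2015), (2011, 2016)]
```

Under this reading, a submission studying population change would compare 2010 with 2015 or 2011 with 2016, rather than consecutive years.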
Sirene track
Sirene is the official database of French enterprises (legal units) and establishments (local units). Created in 1973 and managed by Insee ever since, this database now includes more than 10 million units and is continuously updated. Sirene has been available as open data since the beginning of 2017.
The register can be downloaded by following the links on the Sirene web site. Detailed documentation and an FAQ are also available.
This track consists of proposing an RDF model for Sirene data. Reuse of existing vocabularies is encouraged. Submissions will be evaluated on the relevance of the proposed model and on the clarity of its documentation. Concrete use cases of the data that demonstrate the relevance of the proposed model are of course welcome.
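To give a rough idea of what reusing an existing vocabulary could look like, here is a minimal Python sketch that writes one legal unit and its establishments as N-Triples, using terms from the W3C Organization Ontology (`org:Organization`, `org:Site`, `org:hasSite`, `org:identifier`). Everything else is an assumption made for illustration: the `http://example.org/sirene/` base URI, the URI patterns, the helper names and the identifier values are hypothetical, and this is only one possible starting point, not a recommended model.

```python
# Existing vocabulary terms (W3C Organization Ontology and RDF).
ORG = "http://www.w3.org/ns/org#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URI, chosen only for this illustration.
BASE = "http://example.org/sirene/"


def iri(value):
    """Wrap an absolute IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes
    (enough for the short identifiers used here)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def unit_triples(unit_id, site_ids):
    """Describe one legal unit and its establishments as N-Triples lines."""
    unit = iri(f"{BASE}unit/{unit_id}")
    lines = [
        f"{unit} {iri(RDF_TYPE)} {iri(ORG + 'Organization')} .",
        f"{unit} {iri(ORG + 'identifier')} {literal(unit_id)} .",
    ]
    for site_id in site_ids:
        site = iri(f"{BASE}site/{site_id}")
        lines.append(f"{unit} {iri(ORG + 'hasSite')} {site} .")
        lines.append(f"{site} {iri(RDF_TYPE)} {iri(ORG + 'Site')} .")
    return lines


# Fictitious identifiers, not taken from the real register.
for line in unit_triples("000000000", ["00000000000017"]):
    print(line)
```

A real submission would need to cover far more of the register (activity codes, addresses, legal categories, history of changes) and justify each vocabulary choice in its documentation.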
Open track
An open track is also available for the SemStats challenge. For this track, submissions can be based on any publicly available data sets and propose any uses relevant to the SemStats workshop. Submissions in this track will be evaluated on novelty and sound methodology.
Submissions
See the main Call for Contributions page for details on challenge submissions, the submission process and the deadline. Challenge articles should be no longer than 12 pages. The challenge data can also serve as a basis for application and demo articles (up to 6 pages). Remember that these two categories of submissions are awarded specific prizes!
- Submission date
- October 2nd, 2020
- Notifications
- October 12th, 2020
If you are interested in submitting a contribution but would like more preliminary information, please contact semstats2020@easychair.org.