diff --git a/README.md b/README.md
index 24970c2..a6c691f 100644
--- a/README.md
+++ b/README.md
@@ -6,9 +6,9 @@
 of relating classifications.*
 
 We've been representing statisical classifications in CSV and then
-converting to SKOS Concept Schemes by way of table2qb and CSV2RDF. The
-RDF Data Cube vocabulary uses SKOS concepts as the values of
-dimensions of observations in a data cube. SKOS, by design, doesn't
+converting to SKOS Concept Schemes by way of [table2qb][table2qb] and
+CSV2RDF. The RDF Data Cube vocabulary uses SKOS concepts as the values
+of dimensions of observations in a data cube. SKOS, by design, doesn't
 provide much in the way of semantics, leaving it to an application to
 decide what skos:Concepts and relations between them logically mean.
 
@@ -26,15 +26,17 @@
 restrictions.
 
 By way of example, we've taken two overlapping breakdowns of
-geography, the [British Isles](https://en.wikipedia.org/wiki/Terminology_of_the_British_Isles) and the British Islands and created
-simple datasets about the populations of the various parts.
+geography, the [British
+Isles](https://en.wikipedia.org/wiki/Terminology_of_the_British_Isles)
+and the British Islands and created simple datasets about the
+populations of the various parts.
 
 ![An Euler diagram with an overview of the terminology (public domain, TWCarlson, via Wikipedia)](https://upload.wikimedia.org/wikipedia/commons/2/28/British_Isles_Euler_diagram_15.svg)
 
 This directory contains the following:
 
 [population-british-isles.csv](population-british-isles.csv) contains
-the observations as Tidy Data in the style acceptable to table2qb.
+the observations as Tidy Data in the style acceptable to [table2qb][table2qb].
 
 [population-british-islands.csv](population-british-islands.csv) contains
 the observations as Tidy Data in a simplified style.
@@ -43,7 +45,8 @@
 gives the CSVW needed to convert the data into an RDF data cube using
 the W3C standard csv2rdf.
 
-[population-british-islands.csv-metadata.json](population-british-islands.csv-metadata.json) is similar, with some changes to cope with the simpler representation.
+[population-british-islands.csv-metadata.json](population-british-islands.csv-metadata.json)
+is similar, with some changes to cope with the simpler representation.
 
 [owl_classification.py](owl_classification.py) takes a typical CSV
 file as above, representing a statistical classification, expected to
@@ -67,17 +70,48 @@
 
 [codelists/british-islands.csv](codelists/british-islands.csv) and
 [codelists/british-isles.csv](codelists/british-isles.csv) provide
-separate breakdowns of the two overlapping hierarchies in a table2qb
-style.
+separate breakdowns of the two overlapping hierarchies in a
+[table2qb][table2qb] style.
 
 [codelists-metadata.json](codelists-metadata.json),
 [columns.csv](columns.csv) and the blank
-[components.csv](components.csv) are configuration files used by table2qb.
+[components.csv](components.csv) are configuration files used by [table2qb][table2qb].
 
 [prefixes.ttl](prefixes.ttl) is used to make the Turtle files more
 readable.
 
 [skos.rdf](skos.rdf) is a copy of the SKOS ontology with some small
 changes to remove the "lints" that might break reasoners.
 
-[Makefile](Makefile) used to record the various steps used to create
+[Makefile](Makefile) is used to record the various steps used to create/target the following:
+
+[british-isles.ttl](british-isles.ttl) and
+[british-islands.ttl](british-islands.ttl) using [table2qb][table2qb]
+to create SKOS concept schemes, then some `sed` to reference Wikidata
+URIs and remove some straggling `skos:member` statements that
+shouldn't be there, and Apache Jena's `riot` to tidy things into
+readable Turtle.
+
+[population-british-isles.ttl](population-british-isles.ttl) and
+[population-british-islands.ttl](population-british-islands.ttl) using [csv2rdf.clj][csv2rdf.clj].
+
+[british-isles-owl.ttl](british-isles-owl.ttl) and [british-islands-owl.ttl](british-islands-owl.ttl) using `owl_classification.py`.
+
+test: checks the resulting data against the Data Cube Integrity Constraints.
+
+**Using Pellet**
+
+* as a general ontology lint tool (lint)
+
+* running queries (query)
+
+* checking unsatisfiability (unsat)
+
+* showing the class hierarchy (classify) and where the instances fit (realization)
+
+* materializing inferences (extract)
+
+* as a query engine in Fuseki to act as a small SPARQL server with reasoning.
+
+[table2qb]: https://github.com/Swirrl/table2qb
+[csv2rdf.clj]: https://github.com/Swirrl/csv2rdf