W3C Linked Data Glossary

https://www.w3.org/TR/ld-glossary/

Project glossary

administrator
- user with elevated rights (mainly the ability to import knowledge bases and monitor other users' tasks)
Austrian open data catalogs
- data portals that serve as the source of the test data
- https://www.data.gv.at
- additionally https://www.opendataportal.at/
cell disambiguation
- determining which RDF resource is represented by a literal string in the cell
columns classification
- annotates each NE-column with one concept or, in the case of literal columns, associates the column with one property of the concept assigned to the subject column of the table
conversion
- the same as 'task'
CSV meta-data
- CSV file description
- conforms to the JSON schema specified at https://www.w3.org/2013/csvw/wiki/Main_Page (see metadata)
- example here: http://data.opendataportal.at/dataset/kunstler-der-sammlung-des-mumok/resource/e25640f8-a3e4-46d2-8a4f-9be471b115d2
CSV schema
- see CSV meta-data
DBpedia.org
- one of the existing Linked Data knowledge bases
disambiguated entity
- result of disambiguation
disambiguation
- see cell disambiguation
execution
- processing of the input file according to the Odalic algorithm, which is based on TableMiner+
execution service
- module providing execution
feedback
- user input manually setting some of the result annotations or giving constraints to the algorithm in order to improve results in the next run
focused knowledge bases
- knowledge bases containing specialized, usually detailed information about a certain, well-defined domain
- opposite of general knowledge bases
Freebase
- one of the existing knowledge bases
- deprecated
general knowledge bases
- knowledge bases containing general information without restriction to a particular topic
- opposite of focused knowledge bases
input constraints
- see feedback
JSON-LD
- JSON-based format for encoding linked data
knowledge base's SPARQL endpoint
- web service allowing querying the knowledge base using SPARQL
knowledge bases
- published and accessible collection of datasets composed of RDF triples (or quads if named graphs are involved), together with the accompanying infrastructure that allows it to be queried, modified and searched
Linked Open Data Cloud
- http://lod-cloud.net/
literal column
- column containing data literals, e.g., plain strings or numbers
named graph
- extension of RDF
- created by extending triples to quads with additional information described by URI
- can be used to make the manipulation of RDF data sets more flexible (adopted by SPARQL)
- https://blog.ldodds.com/2009/11/05/managing-rdf-using-named-graphs/
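The step from triples to quads described under "named graph" can be sketched in Python as follows. This is only an illustration: the `Triple` and `Quad` types, the helper names and the example URIs are assumptions made for this sketch and are not part of Odalic or of any RDF library.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str  # URI naming the graph the triple belongs to


def to_quad(triple: Triple, graph_uri: str) -> Quad:
    """Extend a triple with the URI of the named graph that contains it."""
    return Quad(triple.subject, triple.predicate, triple.obj, graph_uri)


def group_by_graph(quads):
    """Split quads back into one list of triples per named graph."""
    graphs = defaultdict(list)
    for q in quads:
        graphs[q.graph].append(Triple(q.subject, q.predicate, q.obj))
    return dict(graphs)
```

Grouping by the fourth component is what makes it possible to load, replace or delete one set of triples without touching the others.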
NE-column
- column containing entities (currently, any column whose cells are strings containing letters is considered a named entity column)
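The "string with letters" rule for NE-columns might look roughly like the Python sketch below. It is not taken from the Odalic code base: the function names are hypothetical, and requiring every non-empty cell to contain a letter is just one possible reading of the rule.

```python
def looks_like_named_entity(cell: str) -> bool:
    """Return True if the cell contains at least one letter."""
    return any(ch.isalpha() for ch in cell)


def is_ne_column(cells: list[str]) -> bool:
    """Treat a column as an NE-column when every non-empty cell contains a letter.

    Assumption: empty cells are ignored, and a column with no non-empty
    cells is not considered an NE-column.
    """
    non_empty = [c for c in cells if c.strip()]
    return bool(non_empty) and all(looks_like_named_entity(c) for c in non_empty)
```

Under this reading, a column of city names is an NE-column, while a column of postal codes made up of digits only is treated as a literal column.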

ontology
- a model for describing the world
- consists of a set of types, properties, and relationship types
- description of taxonomy, classification network
OWL
- Web Ontology Language
- a family of languages for authoring ontologies
- http://www.cambridgesemantics.com/semantic-university/owl-101
owl:sameAs
- built-in OWL property that links an individual to an individual
- an owl:sameAs statement indicates that two URI references actually refer to the same thing
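Because owl:sameAs is symmetric and transitive, a set of such statements partitions URIs into groups that denote the same thing. The Python sketch below computes those groups with a union-find structure; the function name and the input format (pairs of URI strings) are assumptions chosen for illustration.

```python
def same_as_groups(pairs):
    """Group URIs connected by owl:sameAs statements.

    `pairs` is an iterable of (uri_a, uri_b) tuples, one per statement.
    Returns a list of sets, each holding URIs that refer to the same thing.
    """
    parent = {}

    def find(uri):
        # Register unseen URIs as their own representative.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for uri in parent:
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())
```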
predicate
- second part of an RDF statement
- defines the property for the subject of the statement
- always a URI
- establishes the relationship between a subject and an object and makes the object value a characteristic of the subject.
- visually connects subject and object in an RDF graph
primary KB
- KB marked as the primary one: during export, concepts from this KB are preferred, and all possible conflicts are resolved in favor of its higher priority.
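The preference for the primary KB during export can be pictured with the following Python sketch. It is an assumption-laden illustration, not Odalic's export logic: `pick_concept` is a hypothetical helper, and the fallback used when the primary KB offers no candidate (the alphabetically first KB name) is chosen only to keep the example deterministic.

```python
def pick_concept(candidates: dict[str, str], primary_kb: str) -> str:
    """Choose one concept URI from per-KB candidates for the same cell or column.

    `candidates` maps a KB name to the concept URI that KB proposed.
    """
    if not candidates:
        raise ValueError("no candidate concepts to choose from")
    if primary_kb in candidates:
        # The primary KB wins every conflict.
        return candidates[primary_kb]
    # Assumed fallback: deterministic choice among the remaining KBs.
    return candidates[min(candidates)]
```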
RDF data model
- a standard model for data interchange on the Web
- uses URIs to name the relationship between things as well as the two ends of the link (triple)
- allows for data merging even if the underlying schemas differ, and specifically supports the evolution of schemas over time without requiring all the data consumers to be changed
- http://www.cambridgesemantics.com/semantic-university/rdf-101
RDF data cube
- organization of RDF data indexed by its dimensions
- can be visualized as a hypercube
RDF data format serialization
- encoding of RDF triples
- allows them to be stored permanently or transported over a network
- many options (Turtle and similar, XML, JSON)
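As a concrete instance of serialization, the Python sketch below writes triples in N-Triples, a simple line-based format that is a subset of Turtle. It is a simplified illustration: the function names and the input tuple layout are assumptions, and only plain string literals and URIs are handled (no blank nodes, datatypes or language tags).

```python
def nt_term(value: str, is_uri: bool) -> str:
    """Render one term: URIs in angle brackets, literals as quoted strings."""
    if is_uri:
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object, object_is_uri) tuples, one per line."""
    lines = []
    for s, p, o, o_is_uri in triples:
        lines.append(f"{nt_term(s, True)} {nt_term(p, True)} {nt_term(o, o_is_uri)} .")
    return "".join(line + "\n" for line in lines)
```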
RDF export configuration
- configuration of how the classified and disambiguated data, together with the discovered relations, are exported to RDF. The template describing the format of the exported RDF data must be stored (e.g. in the form of SPARQL triple patterns, which can be applied to every row when the export is prepared).
- ```
?cell01 rdf:type ad:City; s:address ad:address. ad:address s:postalCode ?cell05
```
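Applying such a template to every row could be sketched in Python as shown below. The glossary does not define the placeholder semantics, so this sketch assumes that `?cellNN` stands for the value of the NN-th column (1-based) and that the caller supplies cell values already rendered as RDF terms; `apply_template` and `export_rows` are hypothetical names.

```python
import re

# Matches placeholders such as ?cell01 or ?cell05.
PLACEHOLDER = re.compile(r"\?cell(\d+)")


def apply_template(template: str, row: list[str]) -> str:
    """Replace each ?cellNN placeholder with the value from column NN of the row."""

    def substitute(match: re.Match) -> str:
        index = int(match.group(1)) - 1  # assumed 1-based column numbering
        return row[index]  # raises IndexError if the row is too short

    return PLACEHOLDER.sub(substitute, template)


def export_rows(template: str, rows: list[list[str]]) -> list[str]:
    """Apply the same template to all rows of the table."""
    return [apply_template(template, row) for row in rows]
```

For example, with the template `?cell01 rdf:type ad:City .` and a row whose first column holds `ad:Vienna`, the result is `ad:Vienna rdf:type ad:City .`.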
RDF store
- store keeping RDF data
RDF triple
- a statement consisting of a subject, a predicate and an object
RDFS
- RDF Schema
- semantic extension of RDF
- provides mechanisms for describing groups of related resources and the relationships between these resources
- operates with domains and ranges of properties
- http://www.cambridgesemantics.com/semantic-university/rdfs-introduction
RDFS/OWL
- http://www.cambridgesemantics.com/semantic-university/rdfs-vs-owl
relation label/predicate
- predicate assigned to particular binary relation between two NE-columns
relations discovery and creation
- identifies binary relations between NE-columns
- alternatively, in the case of one NE-column and a literal column and given that the NE-column is annotated by a specific concept, identifies a property of that concept that could explain the data literals
result preview
- limited (probably to a predefined number of processed rows) view of the result computed by the algorithm
- user may be able to provide feedback based on the preview
- Odalic allows the number of processed rows to be limited in the task configuration
Semantic Table Interpretation
- name of the problem that TableMiner+ and consequently Odalic try to solve
SPARQL
- semantic query language
- able to retrieve and manipulate RDF data
SPARQL endpoint
- a service that accepts SPARQL queries and returns their results
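A minimal Python sketch of talking to a SPARQL endpoint over HTTP is given below. It follows the convention of the SPARQL 1.1 Protocol of passing the query text in the `query` parameter of a GET request; the function names are hypothetical, and the public DBpedia endpoint at https://dbpedia.org/sparql is mentioned only as an example target, not as something Odalic is known to be configured with.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(endpoint: str, query: str) -> str:
    """Encode the query text into the `query` parameter of a GET URL."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_query(endpoint: str, query: str) -> dict:
    """Send the query and parse a JSON result (requires network access)."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `run_query("https://dbpedia.org/sparql", "SELECT * WHERE { ?s ?p ?o } LIMIT 1")` would then return the parsed result document, provided the endpoint is reachable and honors the requested result format.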
staged file
- uploaded file (either by direct upload or providing a link to a remote one) destined to be processed by the core algorithm
subject column
- assumed to exist in every processed table
- exactly one per table (unless statistical data processing is selected)
suggested class/concept
- the class which was suggested by the algorithm as being one of the possible classes
suggested winning class/concept
- the class which was suggested by the algorithm as being the best for the given column
TableMiner+
- predecessor of the core files and the base for the extensions
tabular data
- abstract data loaded from the provided CSV file
task
- a unit of processing
- defined by its input content (uploaded file) and task configuration
- has defined states and transitions between them
- from the application's point of view, each task is a unique one, even though they can be defined by the same input constraints, configurations and input files
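The two properties of a task noted above, explicit state transitions and a unique identity regardless of its inputs, can be illustrated with the Python sketch below. The state names and the allowed transitions are hypothetical (the glossary does not list them), and the `Task` class is not Odalic's actual model.

```python
import uuid
from dataclasses import dataclass, field

# Hypothetical states and transitions, chosen only for illustration.
ALLOWED_TRANSITIONS = {
    "ready": {"running"},
    "running": {"finished", "stopped"},
    "stopped": {"running"},
    "finished": {"running"},
}


@dataclass
class Task:
    input_file: str
    configuration: dict
    state: str = "ready"
    # Every task gets its own identifier, even if file and configuration match.
    task_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def transition(self, new_state: str) -> None:
        """Move to `new_state`, rejecting transitions that are not allowed."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
```

Two tasks created from the same file and configuration still receive different `task_id` values, which mirrors the application's view that each task is unique.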
task configuration
- set of options that allow the task to run without any further input apart from the actual file
- includes delimiters, knowledge bases used, input specification, ...
transformation
- see task
Turtle
- terse encoding format of RDF triples
UnifiedViews pipeline
- a set of connected UV DPU instances
UnifiedViews
- integration tool for linked data
- http://unifiedviews.eu
- https://www.semantic-web.at/unifiedviews
UnifiedViews DPU
- processing unit in the UnifiedViews pipeline
- provided as a plugin (programmable)
- examples: download, export, SPARQL query,...
URI
- used in the context of Linked Data to identify entities
user
- someone who is able to log in to the running Odalic instance
- can stage files, create and run tasks, stop them, run again
- can choose from the available knowledge bases to apply to a particular task
WikiData
- one of the existing knowledge bases
- https://www.wikidata.org/
YAGO
- one of the existing knowledge bases
- http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/

ADEQUATE : Glossary
