W3C Linked Data Glossary
https://www.w3.org/TR/ld-glossary/
Project glossary
- administrator
- user with elevated rights (can, among other things, import knowledge bases and monitor other users' tasks)
- Austrian open data catalogs
- data portals that serve as the source of the test data
- https://www.data.gv.at
- additionally https://www.opendataportal.at/
- cell disambiguation
- determining which RDF resource is represented by a literal string in the cell
- columns classification
- column classification annotates each NE-column with one concept; in the case of a literal column, it associates the column with one property of the concept assigned to the subject column of the table
- conversion
- the same as 'task'
- CSV meta-data
- CSV file description
- conforming to the JSON schema specified at https://www.w3.org/2013/csvw/wiki/Main_Page (see metadata)
- example here: http://data.opendataportal.at/dataset/kunstler-der-sammlung-des-mumok/resource/e25640f8-a3e4-46d2-8a4f-9be471b115d2
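As an illustration, CSVW-style meta-data can be sketched as a small JSON document describing the file and its columns (the file name and column titles below are made up; the `@context` URI is the standard CSVW one):

```python
import json

# A sketch of CSV meta-data in the CSVW style: a JSON document that
# describes a CSV file and its columns. The file name and column titles
# are illustrative; "@context" points at the real CSVW vocabulary.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "cities.csv",
    "tableSchema": {
        "columns": [
            {"titles": "city"},
            {"titles": "population", "datatype": "integer"},
        ]
    },
}

print(json.dumps(metadata, indent=2))
```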
- CSV schema
- see CSV meta-data
- DBpedia.org
- one of the existing Linked Data knowledge bases
- disambiguated entity
- result of disambiguation
- disambiguation
- see cell disambiguation
- execution
- processing of the input file according to the Odalic algorithm, which is based on TableMiner+
- execution service
- module providing execution
- feedback
- user input manually setting some of the result annotations or giving constraints to the algorithm in order to improve results in the next run
- focused knowledge bases
- knowledge bases containing specialized, usually detailed information about a certain, well-defined domain
- opposite of general knowledge bases
- Freebase
- one of the existing knowledge bases
- deprecated
- general knowledge bases
- knowledge bases containing general information without restriction to a particular topic
- opposite of focused knowledge bases
- input constraints
- see feedback
- JSON-LD
- JSON-based format for encoding Linked Data
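A minimal sketch of such a document, using only the standard library (the `@id` URI is illustrative; `"name"` is mapped to a real schema.org property):

```python
import json

# A minimal JSON-LD document: the "@context" maps short keys to URIs,
# which turns plain JSON into Linked Data. The "@id" URI is illustrative.
doc = {
    "@context": {"name": "http://schema.org/name"},
    "@id": "http://example.org/person/1",
    "name": "Alice",
}

serialized = json.dumps(doc)
parsed = json.loads(serialized)
print(parsed["@context"]["name"])  # the URI the short key "name" expands to
```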
- knowledge base's SPARQL endpoint
- web service allowing querying the knowledge base using SPARQL
- knowledge bases
- published and accessible collection of datasets composed of RDF triples (or quads if named graphs are involved) and the accompanying infrastructure allowing them to be queried, modified, and searched
- Linked Open Data Cloud
- a collection of interlinked open datasets published on the Web as Linked Data
- literal column
- column containing literal data, e.g., a plain string or a number
- named graph
- extension of RDF
- created by extending triples to quads with additional information described by URI
- can be used to make the manipulation of RDF datasets more flexible (adopted by SPARQL)
- https://blog.ldodds.com/2009/11/05/managing-rdf-using-named-graphs/
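The triple-to-quad extension can be sketched in a few lines (all URIs and values below are illustrative):

```python
# Sketch: a named graph extends an RDF triple to a quad by adding a
# graph URI as the fourth element. All URIs and the population value
# here are illustrative.
triple = (
    "http://example.org/vienna",      # subject
    "http://example.org/population",  # predicate
    "1897000",                        # object (a literal in this case)
)
graph_uri = "http://example.org/graphs/cities"
quad = triple + (graph_uri,)  # the graph name identifies the quad's context

print(quad)
```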
- NE-column
- column with entities (currently, any column whose cells are strings containing letters is considered a named-entity column)
- ontology
- a model for describing the world
- consists of a set of types, properties, and relationship types
- description of taxonomy, classification network
- OWL
- Web Ontology Language
- a family of languages for authoring ontologies
- http://www.cambridgesemantics.com/semantic-university/owl-101
- owl:sameAs
- built-in OWL property that links an individual to an individual
- an owl:sameAs statement indicates that two URI references actually refer to the same thing
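Such a statement can be sketched as a plain triple (the example.org URI is illustrative; the DBpedia URI is a real resource):

```python
# Sketch: an owl:sameAs statement asserting that two URI references
# refer to the same thing. The example.org URI is illustrative; the
# DBpedia URI is a real resource; OWL_SAMEAS is the real OWL property.
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"

statement = (
    "http://example.org/vienna",           # one URI for the thing
    OWL_SAMEAS,                            # the linking property
    "http://dbpedia.org/resource/Vienna",  # another URI for the same thing
)

print(statement[1])
```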
- predicate
- second part of an RDF statement
- defines the property for the subject of the statement
- always a URI
- establishes the relationship between a subject and an object and makes the object value a characteristic of the subject.
- visually connects subject and object in an RDF graph
- primary KB
- KB denoted as the primary one: during export, concepts from such a KB are preferred, and all possible conflicts are resolved with respect to its heightened priority
- RDF data model
- a standard model for data interchange on the Web
- uses URIs to name the relationship between things as well as the two ends of the link (triple)
- allows data merging even if the underlying schemas differ; specifically supports the evolution of schemas over time without requiring all data consumers to change
- http://www.cambridgesemantics.com/semantic-university/rdf-101
- RDF data cube
- organization of RDF data indexed by its dimensions
- can be visualized as a hypercube
- RDF data format serialization
- encoding of RDF triples
- enables storing them permanently or transporting them over a network
- many options (Turtle and similar, XML, JSON)
- RDF export configuration
- configuration of the way the classified and disambiguated data, together with the discovered relations, are exported to RDF
- the template describing the format of the exported data must be stored (e.g., in the form of SPARQL triple patterns, which may be applied to all rows as the export is prepared)
- example: ?cell01 rdf:type ad:City; s:address ad:address. ad:address s:postalCode ?cell05
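One way such a template could be applied per row is simple variable substitution: every `?cellNN` variable is replaced by the row's value for that column. A minimal sketch (the pattern, the `ad:`/`s:` prefixes, and the values are all illustrative):

```python
# Sketch: applying an RDF-export template of SPARQL-like triple patterns
# to one table row by substituting ?cellNN variables with cell values.
# The pattern, prefixes and values are illustrative.
template = "?cell01 rdf:type ad:City ; s:postalCode ?cell05 ."

def apply_row(template: str, row: dict) -> str:
    # Replace each ?cellNN variable with the row's value for that column.
    out = template
    for var, value in row.items():
        out = out.replace(var, value)
    return out

row = {"?cell01": "ad:Vienna", "?cell05": '"1010"'}
print(apply_row(template, row))
```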
- RDF store
- store keeping RDF data
- RDF triple
- a statement consisting of a subject, a predicate and an object
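A triple can be sketched as a plain tuple (rdf:type below is the real RDF predicate URI; the other URIs are illustrative):

```python
# Sketch: one RDF triple as a plain (subject, predicate, object) tuple.
# RDF_TYPE is the real rdf:type URI; the other URIs are illustrative.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

triple = (
    "http://example.org/vienna",  # subject: the resource being described
    RDF_TYPE,                     # predicate: always a URI
    "http://example.org/City",    # object: a resource or a literal
)

print(triple[1])
```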
- RDFS
- RDF Schema
- semantic extension of RDF
- provides mechanisms for describing groups of related resources and the relationships between these resources
- operates with domains and ranges of properties
- http://www.cambridgesemantics.com/semantic-university/rdfs-introduction
- RDFS/OWL
- relation label/predicate
- predicate assigned to particular binary relation between two NE-columns
- relations discovery and creation
- identifies binary relations between NE-columns
- alternatively, in the case of one NE-column and a literal column and given that the NE-column is annotated by a specific concept, identifies a property of that concept that could explain the data literals
- result preview
- view of the result computed by the algorithm, limited (probably to a predefined number of processed rows)
- user may be able to provide feedback based on the preview
- Odalic allows limiting the number of processed rows in the task configuration
- Semantic Table Interpretation
- name of the problem that TableMiner+ and consequently Odalic try to solve
- SPARQL
- semantic query language
- able to retrieve and manipulate RDF data
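A minimal SELECT query, built as a string, asks a knowledge base for every resource of a given class (the DBpedia class URI is real; the helper function is made up for illustration):

```python
# Sketch: building a minimal SPARQL SELECT query that asks for every
# resource typed as a given class. The helper name is made up; the
# rdf:type URI is real, and so is the DBpedia class URI used below.
def type_query(class_uri: str) -> str:
    return (
        "SELECT ?s WHERE { "
        f"?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <{class_uri}> ."
        " }"
    )

query = type_query("http://dbpedia.org/ontology/City")
print(query)
```

Such a query string would then be sent to a knowledge base's SPARQL endpoint (see the entry above).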
- SPARQL endpoint
- a web service accepting SPARQL queries and returning their results
- staged file
- uploaded file (either directly or by providing a link to a remote one) destined to be processed by the core algorithm
- subject column
- assumed to exist in every processed table
- exactly one per table (unless statistical data processing is selected)
- suggested class/concept
- the class which was suggested by the algorithm as being one of the possible classes
- suggested winning class/concept
- the class which was suggested by the algorithm as being the best for the given column
- TableMiner+
- the predecessor of the Odalic core and the base for its extensions
- tabular data
- abstract data loaded from provided CSV
- task
- a unit of processing
- defined by an input content (uploaded file) and a task configuration
- has defined states and transitions between them
- from the application's point of view, each task is unique, even though tasks can be defined by the same input constraints, configurations and input files
- task configuration
- set of options allowing the task to be run without any further input apart from the actual file
- includes delimiters, knowledge bases used, input specification, ...
- transformation
- see task
- Turtle
- terse encoding format of RDF triples
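For illustration, one triple serialized in Turtle: URIs go in angle brackets, `a` abbreviates rdf:type, and the statement ends with a period (the URIs themselves are made up):

```python
# Sketch: serializing a single class-membership triple in Turtle.
# In Turtle, URIs are wrapped in angle brackets, "a" abbreviates
# rdf:type, and "." terminates the statement. URIs are illustrative.
def to_turtle(subject_uri: str, class_uri: str) -> str:
    return f"<{subject_uri}> a <{class_uri}> ."

line = to_turtle("http://example.org/vienna", "http://example.org/City")
print(line)
```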
- UnifiedViews pipeline
- a set of connected UnifiedViews DPU instances
- UnifiedViews
- integration tool for linked data
- http://unifiedviews.eu
- https://www.semantic-web.at/unifiedviews
- UnifiedViews DPU
- processing unit in the UnifiedViews pipeline
- provided as a plugin (programmable)
- examples: download, export, SPARQL query,...
- URI
- used in the context of Linked Data to identify entities
- user
- someone who is able to log in to the running Odalic instance
- can stage files, create and run tasks, stop them, and run them again
- can choose which of the available knowledge bases to apply to a particular task
- WikiData
- one of the existing knowledge bases
- https://www.wikidata.org/
- YAGO
- one of the existing vocabularies
- http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/