The core structure used to assign semantic statements to ink is the grouping mechanism. In most cases, a single ink stroke carries no meaningful information; only a collection, or group, of strokes does. Figure 1 illustrates this: strokes s8, s9, s10, and s11 together form the word "digital". Groups are organized in industry-standard tree data structures whose nodes hold ink strokes and related sensor data. For example, in Figure 2, g4 is the word "hello", g5 the word "world", and g6 the punctuation symbol "!". These are combined in g2 to represent a text line, which is part of a paragraph. Rich-text editing uses a similar structuring mechanism; here it is applied to ink documents.
Figure 1: Example ink document with groups
Figure 2: Hierarchy of groups
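The grouping hierarchy can be sketched in a few lines of Python. This is only an illustration: the `Group` class, its fields, and the group name `word_digital` are hypothetical and not part of the Universal Ink Data format; only the stroke ids s8 to s11 and the group ids g2, g4, g5, and g6 come from the figures described above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Group:
    """Hypothetical group node: a label, child groups, and stroke ids."""
    name: str
    children: List["Group"] = field(default_factory=list)
    strokes: List[str] = field(default_factory=list)

    def walk(self) -> Iterator["Group"]:
        """Yield this group and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def all_strokes(self) -> List[str]:
        """Collect the stroke ids held anywhere in this subtree."""
        return [s for group in self.walk() for s in group.strokes]


# Figure 1: strokes s8..s11 form the word "digital" (group name is made up).
digital = Group("word_digital", strokes=["s8", "s9", "s10", "s11"])

# Figure 2: the text line g2 combines g4 ("hello"), g5 ("world"), g6 ("!").
g2 = Group("g2", children=[Group("g4"), Group("g5"), Group("g6")])

group_names = [g.name for g in g2.walk()]
```

Walking `g2` visits the text line first and then its three children in order, which mirrors how a text line is read from its words and punctuation.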
An ink group combines a list of elements, which are either:
- Sensor data sequences
- Ink paths
- Chunks of raw sensor data, or
- Chunks of ink paths
This basic concept is used in our Universal Ink Data format to store the semantic content for ink groups. Figure 3 illustrates how semantic statements are linked to ink groups. For the example introduced in Figure 1, the following semantic statements could be formulated:
- g2 - is_a TEXTLINE
- g2 - has_text "hello world!"
- g2 - is http://dbpedia.org/page/%22Hello,_World!%22_program
Applications that receive this semantically enriched ink can therefore understand the content of the ink document.
Figure 3: Example of semantics which are assigned to groups.
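As a rough illustration of how a receiving application might consume such statements, the sketch below stores the three statements about g2 as plain tuples and collects them per group. The `Statement` type and the `describe` helper are assumptions made for this example, not part of the format.

```python
from typing import Dict, List, NamedTuple


class Statement(NamedTuple):
    """Hypothetical container for one semantic statement about a group."""
    subject: str
    predicate: str
    obj: str


# The three statements listed above for the text line g2.
statements: List[Statement] = [
    Statement("g2", "is_a", "TEXTLINE"),
    Statement("g2", "has_text", "hello world!"),
    Statement("g2", "is", "http://dbpedia.org/page/%22Hello,_World!%22_program"),
]


def describe(group_id: str, stmts: List[Statement]) -> Dict[str, str]:
    """Return a predicate -> object mapping for a single group."""
    return {s.predicate: s.obj for s in stmts if s.subject == group_id}


g2_info = describe("g2", statements)
```

With this mapping, an application can look up `g2_info["has_text"]` to obtain the recognized text without touching the stroke data itself.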
The Ink Model may hold a logical tree (InkTree), which represents a structure of hierarchically organized paths or raw input data-frames. This tree is also referred to as main ink-tree or primary ink-tree.
The InkTree is defined as an ordered logical tree of nodes, called ink-nodes (InkNode). It is used to represent hierarchical structures of paths or raw input data-frames.
A single InkTree may refer to either Path or SensorData objects within its hierarchy. An ink tree instance is defined by its root node (see the definition of InkNode for details); therefore, an InkTree instance is serialized using the protobuf message Node.
An InkNode is a logical node of an ink-tree. In terms of a class hierarchy, the InkNode should be considered an abstract class, which is specialized into the following classes:
- PathNode - A leaf node, holding a reference to a Path, provided by the Path Repository
- SensorDataNode - A leaf node, holding a reference to a SensorData instance, provided by the SensorData Repository
- InkGroupNode (abstract) - A non-leaf node, used to group other ink-nodes; specialized into:
  - PathGroupNode - A non-leaf node, used to group ink-nodes of type PathNode and/or PathGroupNode
  - SensorDataGroupNode - A non-leaf node, used to group ink-nodes of type SensorDataNode and/or SensorDataGroupNode
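The class hierarchy above could be modeled roughly as follows. This is a minimal sketch, not the reference implementation: the constructor arguments, the `is_leaf` property, and the `allowed_child_types` and `add` methods are assumptions introduced here to show how the grouping rules might be enforced, and the sketch does not check that identifiers are unique.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple, Type


class InkNode(ABC):
    """Abstract logical node of an ink-tree."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id

    @property
    @abstractmethod
    def is_leaf(self) -> bool:
        """True for nodes that reference data, False for group nodes."""


class PathNode(InkNode):
    """Leaf node holding a reference to a Path."""

    def __init__(self, node_id: str, path_ref: str) -> None:
        super().__init__(node_id)
        self.path_ref = path_ref

    @property
    def is_leaf(self) -> bool:
        return True


class SensorDataNode(InkNode):
    """Leaf node holding a reference to a SensorData instance."""

    def __init__(self, node_id: str, sensor_data_ref: str) -> None:
        super().__init__(node_id)
        self.sensor_data_ref = sensor_data_ref

    @property
    def is_leaf(self) -> bool:
        return True


class InkGroupNode(InkNode):
    """Abstract non-leaf node that groups other ink-nodes."""

    def __init__(self, node_id: str) -> None:
        super().__init__(node_id)
        self.children: List[InkNode] = []

    @property
    def is_leaf(self) -> bool:
        return False

    @abstractmethod
    def allowed_child_types(self) -> Tuple[Type[InkNode], ...]:
        """Node types this group is allowed to contain."""

    def add(self, child: InkNode) -> InkNode:
        # Reject children that do not match the grouping rules.
        if not isinstance(child, self.allowed_child_types()):
            raise TypeError(f"{type(child).__name__} not allowed here")
        self.children.append(child)
        return child


class PathGroupNode(InkGroupNode):
    def allowed_child_types(self) -> Tuple[Type[InkNode], ...]:
        return (PathNode, PathGroupNode)


class SensorDataGroupNode(InkGroupNode):
    def allowed_child_types(self) -> Tuple[Type[InkNode], ...]:
        return (SensorDataNode, SensorDataGroupNode)
```

Keeping the allowed child types on each group class means a path group cannot accidentally receive sensor data nodes, which matches the separation described in the list above.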
The InkNode identifier is unique in the scope of the InkModel. An InkNode instance is serialized using the protobuf message Node.
The Ink Model may also hold a collection of views of the ink data (InkView), stored as a collection of named ink trees. These views represent different aspects of the Ink Model, for example, text segmentation view, named entity recognition view, etc.
An InkView is defined as a named ink-tree.
The InkView's instance name is unique in the scope of the InkModel and is encoded as a URI. For known view definitions refer to section C 3. Common InkModel Views. An InkView instance is serialized using the protobuf message View.
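The relationship between the model, its main ink-tree, and its named views might look like the sketch below. The `InkModelSketch` class and the view URI are placeholders invented for this example; the actual view names are defined in the Common InkModel Views section, and tree roots are represented here as plain dictionaries for brevity.

```python
from typing import Dict, List, Optional


class InkModelSketch:
    """Hypothetical container: one main ink-tree plus named views."""

    def __init__(self) -> None:
        self.ink_tree: Optional[dict] = None   # root of the main ink-tree
        self._views: Dict[str, dict] = {}      # view name (URI) -> root node

    def add_view(self, name: str, root: dict) -> None:
        """Register a view; names must be unique within the model."""
        if name in self._views:
            raise ValueError(f"view name already used: {name}")
        self._views[name] = root

    def view(self, name: str) -> dict:
        """Return the root node of the view with the given name."""
        return self._views[name]

    def view_names(self) -> List[str]:
        return sorted(self._views)


model = InkModelSketch()
model.ink_tree = {"id": "root", "children": []}

# Placeholder URI, not taken from the specification.
TEXT_VIEW = "urn:example:views:text-segmentation"
model.add_view(TEXT_VIEW, {"id": "g1", "children": [{"id": "g2", "children": []}]})
```

Keying the views by name makes the uniqueness rule explicit: a second `add_view` call with the same URI raises an error instead of silently replacing the existing tree.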
The Ink Model Specification provides a standard mechanism to describe relationships between different parts of the ink model, and/or between parts of the ink model and external entities. The Ink Model keeps an instance of an RDF-compliant triple store, referred to as "Knowledge Graph" within this documentation. This triple store holds a list of semantic triples to encode relationships between subject, predicate, and object, as defined in the RDF specification.
Using the knowledge graph, the nodes of the ink views contained within the ink model can be annotated with additional metadata that describes different aspects of the ink model, for example, text segmentation view, named entity recognition view, etc.
The Knowledge Graph is an RDF-compliant triple store, represented by a collection of SemanticTriple instances.
Nodes of the logical trees, contained within the InkModel, are identified by URIs in the triple store in accordance with the guidelines in section vocabulary. A Knowledge Graph instance is serialized using the protobuf message TripleStore.
A SemanticTriple is an RDF-compliant triple, which consists of a subject, a predicate, and an object.
The RDF data model is similar to classical conceptual modeling approaches (such as entity–relationship or class diagrams). It is based on the idea of making statements about resources (in particular web resources) in expressions of the form subject–predicate–object, known as triples. The subject denotes the resource, and the predicate, also known as the property of the triple, denotes traits or aspects of the resource, and expresses a relationship between the subject and the object.
A SemanticTriple instance is serialized using the protobuf message SemanticTriple.
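To make the subject-predicate-object structure concrete, the following sketch implements a tiny in-memory triple store with wildcard matching. It only illustrates the idea: the `TripleStore` class here is not the protobuf message of the same name, the `match` method is an assumption of this example, and the node URI `urn:example:node:g2` is a placeholder rather than a URI built according to the vocabulary guidelines.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SemanticTriple:
    """One statement: subject, predicate (property), and object."""
    subject: str
    predicate: str
    object: str


class TripleStore:
    """Hypothetical in-memory knowledge graph: a list of triples."""

    def __init__(self) -> None:
        self.statements: List[SemanticTriple] = []

    def add(self, subject: str, predicate: str, obj: str) -> SemanticTriple:
        triple = SemanticTriple(subject, predicate, obj)
        self.statements.append(triple)
        return triple

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[SemanticTriple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            t for t in self.statements
            if subject in (None, t.subject)
            and predicate in (None, t.predicate)
            and obj in (None, t.object)
        ]


# Placeholder node URI; real URIs follow the vocabulary guidelines.
G2 = "urn:example:node:g2"

graph = TripleStore()
graph.add(G2, "is_a", "TEXTLINE")
graph.add(G2, "has_text", "hello world!")

text_lines = [t.subject for t in graph.match(predicate="is_a", obj="TEXTLINE")]
```

A query such as `graph.match(subject=G2)` returns every statement about one node, while fixing the predicate and object instead finds all nodes of a given kind.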