Introduction

The Universal Ink Model is a data model describing ink-related data structures and metadata concepts to describe the semantic content of ink. This specification defines a language-neutral and platform-neutral data model for representing and manipulating digital ink data captured using an electronic pen or stylus, or using touch input.

The main aspects of the ink model are:

  • Interoperability of ink-based data models by defining a standardized interface with other systems
  • Biometric data storage mechanism
  • Spline data storage mechanism
  • Rendering configurations storage mechanism
  • Ability to compose spline/raw input-based logical trees, which are contained within the ink model
  • Portability, by enabling conversion to common industry standards
  • Extensibility, by enabling the description of ink data-related semantic metadata
  • Standardized serialization mechanism

This reference document defines a RIFF container and a Protocol Buffers schema for serializing ink models, as well as a standard mechanism to describe relationships between different parts of the ink model, and/or between parts of the ink model and external entities. For further details, see the encoding section.

The specified serialization schema is based on the following standards:

  • Resource Interchange File Format (RIFF) - A generic file container format for storing data in tagged chunks
  • Protocol Buffers v3 - A language-neutral, platform-neutral extensible mechanism for serializing structured data
  • Resource Description Framework (RDF) - A standard model for data interchange on the Web
  • OWL 2 Web Ontology Language (OWL2) - An ontology language for the Semantic Web with formally defined meaning
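
To make the container idea concrete, here is a minimal sketch of how generic RIFF chunks are laid out: a 4-byte chunk ID, a little-endian 32-bit payload size, the payload, and one pad byte when the payload length is odd. The form type "DEMO" and the chunk IDs "HEAD" and "DATA" are hypothetical placeholders, not identifiers taken from this specification, and the payloads are opaque bytes standing in for Protocol Buffers-encoded messages.

```python
import struct

def pack_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Pack one RIFF chunk: ID, little-endian uint32 size, payload, pad byte if odd."""
    if len(chunk_id) != 4:
        raise ValueError("chunk ID must be exactly 4 bytes")
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

def pack_riff(form_type: bytes, chunks) -> bytes:
    """Wrap (chunk_id, payload) pairs in a RIFF container with the given form type."""
    body = form_type + b"".join(pack_chunk(cid, data) for cid, data in chunks)
    return b"RIFF" + struct.pack("<I", len(body)) + body

def iter_chunks(blob: bytes):
    """Yield (chunk_id, payload) for each top-level chunk in a RIFF container."""
    if blob[:4] != b"RIFF":
        raise ValueError("not a RIFF container")
    (size,) = struct.unpack_from("<I", blob, 4)
    end = 8 + size
    offset = 12  # skip 'RIFF', the size field, and the form type
    while offset < end:
        chunk_id = blob[offset:offset + 4]
        (length,) = struct.unpack_from("<I", blob, offset + 4)
        start = offset + 8
        yield chunk_id, blob[start:start + length]
        offset = start + length + (length % 2)  # skip the pad byte, if any

# Hypothetical form type and chunk IDs, for illustration only.
blob = pack_riff(b"DEMO", [(b"HEAD", b"\x01\x00\x00"), (b"DATA", b"payload!")])
chunks = list(iter_chunks(blob))
```

Reading the container back returns the same two chunks that were written; the pad byte after the odd-length "HEAD" payload is skipped transparently.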

Metadata

Metadata is "data about data": information about one or more aspects of the data, used to summarize basic facts that make tracking and working with specific data easier. In the context of digital ink, several relevant items of information need to be shared:

  • Means of creation of the data
  • Purpose of the data
  • Time and date of creation
  • Creator or author of the data
  • Context for where the data was created
  • The environment in which the ink was captured, such as the operating system (OS) or firmware of the input device
  • Data quality
  • Source of the data

Each property is a simple key-value pair, and a list of properties can be assigned to several Universal Ink Model structures.

Environment

The properties that can be used to describe the Environment where the ink is recorded:

  • os.name - Name of the operating system
  • os.version.name - Name of the version
  • os.version.release - Release build number
  • wacom.ink.sdk.name - Name of the Wacom Ink technology
  • wacom.ink.sdk.version - Version number of the SDK

InkInputProvider

The properties which can be used to describe the InkInputProvider:

  • pen.type - Type of the pen device

InputDevice

The properties which can be used to describe the InputDevice:

  • dev.id - ID of the input device
  • dev.manufacturer - Manufacturer of the input device
  • dev.model - Model of the input device
  • dev.codename - Code name

Ink Model

The properties which can be used to describe the Ink Model:

  • model.created - Creation time of the model
  • model.last_modified - Last time the model has been modified
  • model.content.type - Description of the type of content (e.g. 'notes', 'sketch')
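
The property lists above can be sketched as plain key-value pairs. The following is a minimal illustration, not the specification's API: the helper `undocumented_keys`, the structure names used as dictionary keys, all property values, and the timestamp format are assumptions. Whether keys outside the documented lists are permitted is not stated here, so the check only reports them.

```python
# Property keys listed in this document, per structure.
DOCUMENTED_KEYS = {
    "Environment": {
        "os.name", "os.version.name", "os.version.release",
        "wacom.ink.sdk.name", "wacom.ink.sdk.version",
    },
    "InkInputProvider": {"pen.type"},
    "InputDevice": {"dev.id", "dev.manufacturer", "dev.model", "dev.codename"},
    "InkModel": {"model.created", "model.last_modified", "model.content.type"},
}

def undocumented_keys(structure, properties):
    """Return, sorted, the keys in `properties` that are not listed for `structure`."""
    known = DOCUMENTED_KEYS[structure]
    return sorted(key for key, _ in properties if key not in known)

# A list of (key, value) pairs; all values are hypothetical.
model_properties = [
    ("model.created", "2024-01-01T09:00:00Z"),        # assumed timestamp format
    ("model.last_modified", "2024-01-02T10:30:00Z"),  # assumed timestamp format
    ("model.content.type", "notes"),
]

device_properties = [
    ("dev.manufacturer", "Example Corp"),  # hypothetical value
    ("dev.colour", "red"),                 # deliberately not a documented key
]

unknown_model_keys = undocumented_keys("InkModel", model_properties)
unknown_device_keys = undocumented_keys("InputDevice", device_properties)
```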

The core structure used to assign semantic statements to ink is the grouping mechanism. In most cases, a single ink stroke does not contain meaningful information, only a collection or group of strokes does, as illustrated in Figure 1, where strokes s8, s9, s10, and s11 form the word "digital". The groups are handled in industry-standard tree data structures, holding a collection of nodes for ink strokes and related sensor data. For example, in Figure 2, g4 is the word "hello", g5 "world", and g6 is the punctuation symbol "!". These are combined in g2 to represent a text line, which is part of a paragraph. A similar structuring mechanism is used in Rich Text editing and here it is applied to ink documents.

Figure 1: Example ink document with groups

Figure 2: Hierarchy of groups
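
The grouping described for Figure 2 can be sketched as a small tree. This is an illustration only: the `Group` class is hypothetical, and the stroke IDs are invented because the figure's stroke assignments are not given in the text. Only the group names g2, g4, g5, and g6 and their meanings come from the description above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Group:
    """Hypothetical group: a name, a meaning, direct strokes, and child groups."""
    name: str
    label: str = ""
    strokes: List[str] = field(default_factory=list)
    children: List["Group"] = field(default_factory=list)

    def all_strokes(self) -> List[str]:
        """Collect stroke IDs depth-first, preserving order."""
        collected = list(self.strokes)
        for child in self.children:
            collected.extend(child.all_strokes())
        return collected

# Stroke IDs below are placeholders, not taken from Figure 2.
g4 = Group("g4", "hello", strokes=["s1", "s2"])
g5 = Group("g5", "world", strokes=["s3", "s4"])
g6 = Group("g6", "!", strokes=["s5"])
g2 = Group("g2", "text line", children=[g4, g5, g6])

line_text = " ".join(child.label for child in g2.children)
line_strokes = g2.all_strokes()
```

Walking g2 yields every stroke of the text line in order, which is the property that lets a statement about a group apply to all of the ink it contains.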

An ink group combines a list of elements, either:

  • Sensor data sequences
  • Ink Paths
  • Chunks of RAW sensor data, or
  • Chunks of ink path

This basic concept is used in the Universal Ink Data format to store the semantic content of ink groups. Figure 3 illustrates how semantic statements are linked to ink groups, using the example introduced in Figure 1.

Applications receiving this semantically enriched ink can therefore understand the content of the ink document.

Figure 3: Example of semantics that are assigned to groups.

The Ink Model may hold a logical tree (InkTree), which represents a structure of hierarchically organized paths or raw input data-frames. This tree is also referred to as the main ink-tree or primary ink-tree.

Ink Tree

The InkTree is defined as an ordered logical tree of nodes, called ink-nodes (InkNode). It is used to represent hierarchical structures of paths or raw input data-frames.

InkNode

An InkNode is a logical node of an ink-tree. In terms of a class hierarchy, the InkNode should be considered an abstract class, which is specialized into the following classes:

  • PathNode - A leaf node, holding a reference to a Path, provided by the Path Repository
  • SensorDataNode - A leaf node, holding a reference to a SensorData instance, provided by the SensorData Repository
  • InkGroupNode (abstract) - A non-leaf node, used to group other ink nodes; specialized into:
    • PathGroupNode - A non-leaf node, used to group ink nodes of type PathNode and/or PathGroupNode
    • SensorDataGroupNode - A non-leaf node, used to group ink nodes of type SensorDataNode and/or SensorDataGroupNode
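
The class hierarchy above can be sketched as follows. This is a minimal illustration under assumptions, not the specification's API: the method names `is_leaf`, `allowed_child_types`, and `add`, and the use of string IDs as repository references, are hypothetical. The sketch enforces the stated grouping rule that each group type accepts only its own leaf and group types.

```python
from abc import ABC, abstractmethod

class InkNode(ABC):
    """Abstract logical node of an ink-tree."""
    @abstractmethod
    def is_leaf(self) -> bool: ...

class PathNode(InkNode):
    def __init__(self, path_id: str):
        self.path_id = path_id  # reference into the Path Repository (assumed string ID)
    def is_leaf(self) -> bool:
        return True

class SensorDataNode(InkNode):
    def __init__(self, sensor_data_id: str):
        self.sensor_data_id = sensor_data_id  # reference into the SensorData Repository
    def is_leaf(self) -> bool:
        return True

class InkGroupNode(InkNode):
    """Abstract non-leaf node holding an ordered list of children."""
    def __init__(self):
        self.children = []
    @abstractmethod
    def allowed_child_types(self) -> tuple: ...
    def is_leaf(self) -> bool:
        return False
    def add(self, node: InkNode) -> InkNode:
        if not isinstance(node, self.allowed_child_types()):
            raise TypeError(
                f"{type(self).__name__} cannot contain {type(node).__name__}"
            )
        self.children.append(node)
        return node

class PathGroupNode(InkGroupNode):
    def allowed_child_types(self) -> tuple:
        return (PathNode, PathGroupNode)

class SensorDataGroupNode(InkGroupNode):
    def allowed_child_types(self) -> tuple:
        return (SensorDataNode, SensorDataGroupNode)

# Build a small path tree: a root group holding one nested group and one path.
root = PathGroupNode()
word = root.add(PathGroupNode())
word.add(PathNode("path-1"))
root.add(PathNode("path-2"))

try:
    root.add(SensorDataNode("sensor-1"))  # wrong node family for a path group
    mixed_rejected = False
except TypeError:
    mixed_rejected = True
```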

The Ink Model may also hold a collection of views of the ink data (InkView), stored as a collection of named ink trees. These views represent different aspects of the Ink Model, for example, text segmentation view, named entity recognition view, etc.

InkView

An InkView is defined as a named ink-tree.
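
As a rough sketch of "views as named ink trees", the model below keeps one main tree plus a dictionary of named trees that regroup the same paths in different ways. The `Node` and `InkModel` classes, the view names, and the path labels are hypothetical, and the uniqueness of view names is an assumption rather than something stated in this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Node:
    """Simplified ink node: a label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)

@dataclass
class InkModel:
    main_tree: Optional[Node] = None
    views: Dict[str, Node] = field(default_factory=dict)

    def add_view(self, name: str, root: Node) -> None:
        # Assumption: view names are unique within one model.
        if name in self.views:
            raise ValueError(f"view {name!r} already exists")
        self.views[name] = root

model = InkModel(
    main_tree=Node("root", [Node("path-1"), Node("path-2"), Node("path-3")])
)

# One view groups the paths into words; another groups them into one entity.
model.add_view(
    "text-segmentation",
    Node("line", [
        Node("word", [Node("path-1"), Node("path-2")]),
        Node("word", [Node("path-3")]),
    ]),
)
model.add_view(
    "named-entities",
    Node("entities", [
        Node("entity", [Node("path-1"), Node("path-2"), Node("path-3")]),
    ]),
)

view_names = sorted(model.views)
```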

Knowledge Graph

The Ink Model Specification provides a standard mechanism to describe relationships between different parts of the ink model, and/or between parts of the ink model and external entities. The Ink Model keeps an instance of an RDF-compliant triple store, referred to as "Knowledge Graph" within this documentation. This triple store holds a list of semantic triples to encode relationships between subject, predicate, and object, as defined in the RDF specification.

Using the knowledge graph, nodes of the ink views contained within the ink model can be annotated with additional metadata in order to describe different aspects of the ink model, for example a text segmentation view or a named entity recognition view.

The Knowledge Graph is an RDF-compliant triple store, represented by a collection of SemanticTriple instances.

Semantic Triple

An RDF-compliant triple, which consists of:

  • Subject
  • Predicate
  • Object

The RDF data model is similar to classical conceptual modeling approaches (such as entity–relationship or class diagrams). It is based on the idea of making statements about resources (in particular web resources) in expressions of the form subject–predicate–object, known as triples. The subject denotes the resource; the predicate, also known as the property of the triple, denotes traits or aspects of the resource and expresses a relationship between the subject and the object.
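
A triple store of this kind can be sketched in a few lines. The example below is an in-memory illustration, not the specification's data structure: the `KnowledgeGraph.match` method, the node identifiers, and the predicate names are hypothetical and do not come from any Wacom vocabulary.

```python
from typing import List, NamedTuple, Optional

class SemanticTriple(NamedTuple):
    subject: str
    predicate: str
    object: str

class KnowledgeGraph:
    """Minimal in-memory triple store with wildcard matching."""
    def __init__(self) -> None:
        self.triples: List[SemanticTriple] = []

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.append(SemanticTriple(subject, predicate, obj))

    def match(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[SemanticTriple]:
        """Return triples matching every given field; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.object == obj)
        ]

graph = KnowledgeGraph()
# Hypothetical identifiers and predicates, for illustration only.
graph.add("node:g4", "hasType", "Word")
graph.add("node:g4", "hasContent", "hello")
graph.add("node:g5", "hasType", "Word")
graph.add("node:g5", "hasContent", "world")

words = [t.subject for t in graph.match(predicate="hasType", obj="Word")]
g4_content = graph.match(subject="node:g4", predicate="hasContent")[0].object
```

Matching with wildcards is what lets an application ask either "what is known about this node?" or "which nodes carry this property?" from the same list of triples.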

Wacom Ontology Description Language

The Wacom Ontology Description Language (WODL) is a lightweight language for defining ink content schemas.

Ink Content Schemas

Ink content schemas describe the content written with ink.