Version: Wacom Ink SDK for documents

Documents SDK - IDML Format

caution

Important Note: This series of Wacom products is now discontinued and support for them is no longer offered.

1. Overview

The Ink Document Metadata file format defines active areas within a PDF document by using Extensible Metadata Platform (XMP) packets with a custom schema. Each active area within a document has a matching AcroForm object within the PDF. An XMP packet embedded in the metadata stream of the matching AcroForm object stores the data describing the behaviour and features of the active area. An XMP packet embedded in the metadata stream of the PDF page object describes page-specific features, such as the barcode ID, for each individual page. An XMP packet embedded in the metadata stream of the root PDF object stores the metadata related to the document as a whole.

Throughout the libraries and documentation there are references to 'Barbera' and 'Baxter' for the following reason: The Baxter SDKs were initially created to support CLB-Create and CLB-Paper applications developed for the Wacom Clipboard PHU-111, codenamed Barbera. The source of the name Baxter is ‘Barbera Extensible Translator’.

The SDKs can be used universally to extend the contents of PDF documents, as described in the Ink Document Metadata Format (aka Barbera File Format).

1.1 Schema Overview

The XML namespace for the Barbera file format is http://wacomgss.com/barbera/1.0/
XMP packets define the namespace in this way:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:wgss="http://wacomgss.com/barbera/1.0/"
/>
</rdf:RDF>
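As an illustration only, the sketch below shows one way a consumer might address elements in these namespaces using Python's standard xml.etree.ElementTree module. The helper names find_description and qname are hypothetical and are not part of any Wacom SDK; only the namespace URIs come from the example above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the packet example above.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "wgss": "http://wacomgss.com/barbera/1.0/",
}

SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:wgss="http://wacomgss.com/barbera/1.0/"
/>
</rdf:RDF>"""


def find_description(xml_text):
    """Return the first rdf:Description child of rdf:RDF, or None (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return root.find("rdf:Description", NS)


def qname(prefix, local):
    """Build the '{uri}local' form ElementTree uses for qualified names (hypothetical helper)."""
    return "{%s}%s" % (NS[prefix], local)
```

ElementTree stores every element and attribute name in the '{uri}local' form, so lookups depend on the namespace URI rather than on the wgss prefix chosen by the writer.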

The file format has three types of XMP packet:

  • Document Level Packet: Contains metadata covering the whole document
  • Page Level Packet: Contains metadata for a particular page
  • Object Level Packet: Contains metadata for a particular object

Before data capture, each XMP packet defines the characteristics and type of a particular area. After data capture, the XMP packet stores both raw and converted data alongside the area characteristics. The schemas outlined below are placed inside the rdf:Description element of the XMP packet.

1.2 Definitions

This section gives a high level overview of the terms used and the underlying technology.

1.2.1 PDF

PDF or ‘Portable Document Format’ is a file format used to present documents in a way that is independent of program, software and hardware, as defined by the ISO 32000–1 specification. The PDF format is a subset of COS (Carousel Object Structure). A COS tree mainly consists of objects, of which there are 8 types:

  • Boolean (true and false)
  • Number
  • String (delimited in parentheses, may contain 8 bit chars)
  • Name (starting with a ‘/’)
  • Array (ordered collection of objects delimited in square brackets)
  • Dictionary (Collections of objects indexed by Names, stored in << >> pairs)
  • Stream (Usually large amounts of data either compressed or binary)
  • Null

Objects can be direct (embedded within another object) or indirect. Indirect objects have an object number and a generation number (defined between the obj and endobj keywords). An index table (xref keyword) follows the main body of the data and gives the byte offset of each indirect object from the start of the file. At the end of the PDF, the file trailer (trailer keyword) contains a dictionary, an offset to the start of the index table and the %%EOF end-of-file marker. The dictionary contains a reference to the root object of the tree structure and the count of indirect objects in the index table. This dictionary is also used to store metadata for the document.
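The trailer layout described above can be illustrated with a small sketch. It assumes a classic, uncompressed cross-reference table and a single trailer; the functions find_startxref and build_tiny_pdf are hypothetical helpers, and the synthetic file is only complete enough to demonstrate the offsets (it is not a renderable PDF).

```python
import re


def build_tiny_pdf():
    """Assemble a synthetic byte string with a body, an xref table and a trailer."""
    body = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    xref_offset = len(body)  # the index table starts right after the body
    # Object 1 starts at byte 9, directly after the 9-byte header line.
    xref = b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
    trailer = (
        b"trailer\n<< /Size 2 /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % xref_offset
    )
    return body + xref + trailer, xref_offset


def find_startxref(pdf_bytes):
    """Return the byte offset of the index table recorded just before %%EOF."""
    tail = pdf_bytes[-1024:]  # the trailer sits at the end of the file
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        raise ValueError("no startxref entry found")
    return int(matches[-1])
```

A reader would typically jump to the returned offset, read the index table, and then follow the /Root reference in the trailer dictionary to reach the document tree.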

1.2.2 XMP

XMP or ‘Extensible Metadata Platform’ is an ISO standard for the creation, processing and interchange of custom metadata for digital documents and data sets. PDFs support XMP packets embedded within metadata streams of COS objects, allowing the data definitions for the IDML file format to be part of the same file used to print the form itself. Using XMP in this way gives the flexibility to store the IDML metadata and captured pen data in the digitally signed document, which remains readable in standard PDF readers.

1.2.3 AcroForm

An AcroForm (sometimes called an Acrobat form) is a collection of active form objects such as a checkbox, text input field or radio button added to a PDF to create an interactive form. The IDML file format uses AcroForms to bind active areas to underlying PDF objects.

1.2.4 Smart Pad

A smart pad refers to the Wacom device that captures pen data from an inking pen, with the intention of recording a digital copy of the real ink, drawn by the pen on paper. This includes the Barbera and Columbia devices.

1.2.5 Pen data

For the purposes of this document, pen data refers to the digital version of the data captured directly from the smart pad device. The pen data is a collection of strokes, with each stroke containing a collection of points. Each point contains x, y, pressure and time data. Each stroke also contains the ink colour for that particular stroke, and the ink thickness for the stroke.
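The pen data model described above could be represented in memory as follows. This is a minimal sketch: the class and field names are assumptions chosen for illustration and do not correspond to any published SDK type.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Point:
    """One sampled pen position (field names are illustrative)."""
    x: float
    y: float
    pressure: float
    time: float


@dataclass
class Stroke:
    """A run of points with one ink colour and one ink thickness."""
    ink_colour: str
    ink_thickness: float
    points: List[Point] = field(default_factory=list)


@dataclass
class PenData:
    """All strokes captured from the smart pad."""
    strokes: List[Stroke] = field(default_factory=list)

    def point_count(self):
        """Total number of points across every stroke."""
        return sum(len(stroke.points) for stroke in self.strokes)
```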

1.2.6 HWR (Hand writing recognition)

For certain active areas, it is expected that the client application will perform automatic handwriting recognition (HWR). This translated data will be stored in the field level XMP data. The appearance stream for the active area will be set to show the stroke data so that the visual appearance of the digital and analogue documents is the same.

1.2.7 Centripetal Catmull-Rom Spline

A Centripetal Catmull-Rom spline is a type of interpolating spline (a curve that goes through its control points), defined by four control points (P0 to P3) where the curve is only drawn between P1 and P2.
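To make the definition concrete, the sketch below evaluates one point on the P1 to P2 segment using the standard Barry-Goldman recursive formulation with a knot exponent of 0.5 (the centripetal case). It is a generic textbook evaluation, not the renderer used by the SDK, and the function names are hypothetical. It assumes consecutive control points are distinct, since coincident points would produce a zero-length knot interval.

```python
import math


def _next_knot(t_prev, p, q, alpha):
    """Advance the knot value by the distance between p and q raised to alpha."""
    return t_prev + math.hypot(q[0] - p[0], q[1] - p[1]) ** alpha


def _blend(p, q, ta, tb, t):
    """Linearly interpolate between p (at knot ta) and q (at knot tb)."""
    wa = (tb - t) / (tb - ta)
    wb = (t - ta) / (tb - ta)
    return (wa * p[0] + wb * q[0], wa * p[1] + wb * q[1])


def catmull_rom_point(p0, p1, p2, p3, u, alpha=0.5):
    """Return the curve point for u in [0, 1], where 0 maps to P1 and 1 to P2."""
    t0 = 0.0
    t1 = _next_knot(t0, p0, p1, alpha)
    t2 = _next_knot(t1, p1, p2, alpha)
    t3 = _next_knot(t2, p2, p3, alpha)
    t = t1 + u * (t2 - t1)
    a1 = _blend(p0, p1, t0, t1, t)
    a2 = _blend(p1, p2, t1, t2, t)
    a3 = _blend(p2, p3, t2, t3, t)
    b1 = _blend(a1, a2, t0, t2, t)
    b2 = _blend(a2, a3, t1, t3, t)
    return _blend(b1, b2, t1, t2, t)
```

To draw a whole stroke, a renderer would slide a window of four consecutive control points along the sequence and sample each middle segment at several values of u.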

2. Scope

This document defines the IDML file format. The format is a collection of XMP packets embedded within the metadata streams of a PDF document. There are three types of XMP packet: Document Level, Page Level and Object Level. Sections 4, 5 and 6 of this document define the schema for each packet, and section 7 contains a full document example.

3. Common Data Type Definitions

This section covers the XMP chunks that are common between sections, including:

  • XMP Preamble
  • Pen Data / Compressed Pen Data
  • WGSS Packet Type Marker

3.1 XMP Packet Wrapper

Every XMP packet is wrapped in a pair of XML processing instructions that enclose the rdf:RDF element and the rest of the packet data. The header processing instruction contains an identifier (the id attribute) and the Unicode character U+FEFF, used as a byte-order marker (￯ shows the 0xFEFF character):

<?xpacket begin="￯" id="W5M0MpCehiHzreSzNTczkc9d"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:xmp="http://ns.adobe.com/xap/1.0/" xmlns:wgss="http://wacomgss.com/barbera/1.0/">
<rdf:Description rdf:about="http://signature.wacom.eu"/>
</rdf:RDF>
<?xpacket end="w"?>

If the XMP block is writeable, the end attribute of the trailer should be ‘w’; for a read-only block it should be ‘r’.
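As a rough sketch of the wrapper rules above, the following helpers add the header and trailer processing instructions around an RDF payload and read back the end attribute. The function names are hypothetical; the id value is simply the one shown in the example.

```python
import re

XPACKET_ID = "W5M0MpCehiHzreSzNTczkc9d"  # id value shown in the example above
BOM = "\ufeff"  # U+FEFF, placed in the begin attribute


def wrap_packet(rdf_xml, writable=True):
    """Wrap an RDF payload in xpacket header and trailer instructions."""
    end = "w" if writable else "r"
    return '<?xpacket begin="%s" id="%s"?>\n%s\n<?xpacket end="%s"?>' % (
        BOM, XPACKET_ID, rdf_xml, end)


def is_writable(packet):
    """Return True when the trailer marks the packet as writeable."""
    match = re.search(r'<\?xpacket end="([wr])"\?>', packet)
    if match is None:
        raise ValueError("xpacket trailer not found")
    return match.group(1) == "w"
```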

3.2 Pen Data

The pen data sub-packet records input data received from the smart pad device. This is a collection of strokes, each of which is a collection of points. Each point represents a control point on a Catmull-Rom spline together with the stroke width at that point:

<wgss:PenData>
<rdf:Seq>
<rdf:li rdf:parseType="Resource">
<wgss:Stroke>
<rdf:Seq>
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="1" wgss:y="2" wgss:w="3" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="2" wgss:y="3" wgss:w="3" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="3" wgss:y="4" wgss:w="3" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="4" wgss:y="5" wgss:w="4" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="5" wgss:y="6" wgss:w="4" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="6" wgss:y="7" wgss:w="5" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="7" wgss:y="8" wgss:w="6" />
</rdf:Seq>
</wgss:Stroke>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<wgss:Stroke>
<rdf:Seq>
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="4" wgss:y="5" wgss:w="4" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="5" wgss:y="6" wgss:w="4" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="6" wgss:y="7" wgss:w="5" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="7" wgss:y="8" wgss:w="6" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="1" wgss:y="2" wgss:w="3" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="2" wgss:y="3" wgss:w="3" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="3" wgss:y="4" wgss:w="3" />
</rdf:Seq>
</wgss:Stroke>
</rdf:li>
</rdf:Seq>
</wgss:PenData>

The example above shows two strokes. When recording pen data, 'air strokes', where the pen is not in contact with the device, are discarded.

3.2.1 Stroke

A stroke is an RDF sequence of points. The color of the stroke is defined in the first point in the sequence. If the inkColor attribute is present in any point other than the first of the sequence, it will be ignored.

<wgss:Stroke>
<rdf:Seq>
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="1" wgss:y="2" wgss:w="3" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="2" wgss:y="3" wgss:w="3" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="3" wgss:y="4" wgss:w="3" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="4" wgss:y="5" wgss:w="4" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="5" wgss:y="6" wgss:w="4" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="6" wgss:y="7" wgss:w="5" />
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="7" wgss:y="8" wgss:w="6" />
</rdf:Seq>
</wgss:Stroke>

3.2.1.1 Point

Each point is an RDF list item representing a control point on a Catmull-Rom spline, along with the thickness of the stroke at that point. It can also carry the inkColor attribute.

<rdf:li wgss:inkColor="#FFFFFF" wgss:x="1" wgss:y="2" wgss:w="3" />

The point element has the following attributes describing the control point:

Attribute    Detail
x            The x coordinate of the point in device units
y            The y coordinate of the point in device units
w            The width of the stroke at this point
inkColor     The color of the point (and/or stroke)

x, y and w are expressed in device units.
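The pen data layout from section 3.2 can be read back with a short parser. The sketch below uses Python's standard xml.etree.ElementTree; the function name parse_pen_data and the dictionary shape it returns are assumptions made for illustration. The sample adds namespace declarations to the wgss:PenData element so that the fragment parses on its own; in a real packet they are declared on the enclosing rdf:RDF element.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
WGSS = "http://wacomgss.com/barbera/1.0/"
NS = {"rdf": RDF, "wgss": WGSS}

SAMPLE = """<wgss:PenData xmlns:rdf="%s" xmlns:wgss="%s">
<rdf:Seq>
<rdf:li rdf:parseType="Resource">
<wgss:Stroke>
<rdf:Seq>
<rdf:li wgss:inkColor="#FFFFFF" wgss:x="1" wgss:y="2" wgss:w="3" />
<rdf:li wgss:inkColor="#000000" wgss:x="2" wgss:y="3" wgss:w="3" />
</rdf:Seq>
</wgss:Stroke>
</rdf:li>
</rdf:Seq>
</wgss:PenData>""" % (RDF, WGSS)


def parse_pen_data(xml_text):
    """Return a list of strokes, each with its colour and (x, y, w) tuples."""
    root = ET.fromstring(xml_text)
    strokes = []
    for stroke in root.findall("rdf:Seq/rdf:li/wgss:Stroke", NS):
        points = stroke.findall("rdf:Seq/rdf:li", NS)
        # Only the first point's inkColor counts; later values are ignored.
        colour = points[0].get("{%s}inkColor" % WGSS) if points else None
        coords = [
            tuple(float(p.get("{%s}%s" % (WGSS, name))) for name in ("x", "y", "w"))
            for p in points
        ]
        strokes.append({"inkColor": colour, "points": coords})
    return strokes
```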

3.3 Compressed Pen Data

Sometimes it will be preferable to store the pen data in a compressed format in order to reduce the size of the metadata streams that have been added to the document. In these instances, any 'Pen Data' section within the file format can be substituted with a 'Compressed Pen Data' section. A compressed pen data section stores the stroke data as serialised, compressed, base64-encoded data. Serialisation and compression/decompression are handled by IDML, so data is presented uncompressed to developers.

For the purposes of encoding this point data into the meta data stream, the byte data is simply base64 encoded, and placed in the appropriate XMP tag:

<wgss:CompressedPenData>
<wgss:compressedStroke="MzQ1MTMwOTQ2NTEyMjkzNDc1ODI5MzY0NTgyMzQ2NzU5MjgzNDc2NTkyODM0NzY1MjA0NzY1MjAzODQ3NTYyOTgzNDc2NTIzODQ3NTYzODQ3NTYyMDgzNDc1NjI4MDM0NzU2MjM4NDdpNTYyODM0NzU2Mjg3MzQ1NjI4MzQ3Nmk1">
<wgss:compressedStroke="1YyMDgzgyMzQ2NzU5MjgzNDc2NTODI5MzY0NTMMzQ1MTMwOTQ2NTEyM3MzQ1NjI4MzQ3NjkzNDcjA0NzY1NTIzODQ3NTYzODQ3NTkyODM0NzY1NDc1NjIDc2DM0NzU2Mjgmk14MDM0NzU2MjM4NDdpNTYyOMjAzODQ3NTYyOTgzN">
</wgss:CompressedPenData>

3.4 WGSS Packet Type Marker

Each XMP Packet will have an identifying element so parsers can determine which Level the XMP Packet belongs to.

<wgss:PacketType wgss:level="document"/>

Level        Detail
document     The contents of this XMP packet define Document Level data
page         The contents of this XMP packet define Page Level data
field        The contents of this XMP packet define Field Level data
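A parser might use the marker as follows to decide how to interpret a packet. This is a sketch only; packet_level is a hypothetical helper, and rejecting unknown level values is a design choice made here rather than behaviour taken from the specification.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
WGSS = "http://wacomgss.com/barbera/1.0/"
NS = {"rdf": RDF, "wgss": WGSS}
VALID_LEVELS = ("document", "page", "field")

SAMPLE = (
    '<rdf:RDF xmlns:rdf="%s" xmlns:wgss="%s">'
    '<rdf:Description rdf:about="http://signature.wacom.eu">'
    '<wgss:PacketType wgss:level="page"/>'
    '</rdf:Description></rdf:RDF>' % (RDF, WGSS)
)


def packet_level(xml_text):
    """Return 'document', 'page' or 'field', or None if no marker is present."""
    root = ET.fromstring(xml_text)
    marker = root.find(".//wgss:PacketType", NS)
    if marker is None:
        return None
    level = marker.get("{%s}level" % WGSS)
    if level not in VALID_LEVELS:
        raise ValueError("unknown packet level: %r" % level)
    return level
```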

4. Document Level Definitions

This section defines features specified in the document level XMP packet, including:

  • Authoring tool version
  • Page ID list
  • Page ID order
  • Document completion time
  • Smart pad device ID
  • Smart pad device characteristics
  • Client application ID
  • Client device ID

4.1 Authoring Tool Version

This element records the version number and name of the authoring tool used to create the file:

<wgss:AuthoringTool wgss:version="1.0" wgss:toolname="Barbera Authoring Library"/>

The ‘version’ attribute denotes the version number of the authoring tool, and the ‘toolname’ attribute is the human readable name of the authoring tool used.

4.2 Page ID List

This element lists the page IDs (barcodes) for each active page in the document, and which PDF page the page ID relates to:

<wgss:PageIDList>
<rdf:Bag>
<rdf:li wgss:pdfPage="1" wgss:uuid="1234567" />
<rdf:li wgss:pdfPage="3" wgss:uuid="1234568" />
<rdf:li wgss:pdfPage="2" wgss:uuid="1234569" />
</rdf:Bag>
</wgss:PageIDList>

Attribute    Detail
pdfPage      The PDF page that the ID relates to
uuid         The barcode ID for the particular page in the PDF
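For illustration, the page ID list can be turned into a lookup from PDF page number to barcode ID. The sketch assumes Python's xml.etree.ElementTree and a hypothetical helper named parse_page_ids; namespace declarations are added to the sample element so that it parses standalone.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
WGSS = "http://wacomgss.com/barbera/1.0/"
NS = {"rdf": RDF, "wgss": WGSS}

SAMPLE = """<wgss:PageIDList xmlns:rdf="%s" xmlns:wgss="%s">
<rdf:Bag>
<rdf:li wgss:pdfPage="1" wgss:uuid="1234567" />
<rdf:li wgss:pdfPage="3" wgss:uuid="1234568" />
<rdf:li wgss:pdfPage="2" wgss:uuid="1234569" />
</rdf:Bag>
</wgss:PageIDList>""" % (RDF, WGSS)


def parse_page_ids(xml_text):
    """Map each PDF page number (int) to its barcode ID (str)."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for item in root.findall("rdf:Bag/rdf:li", NS):
        page = int(item.get("{%s}pdfPage" % WGSS))
        mapping[page] = item.get("{%s}uuid" % WGSS)
    return mapping
```

Because the entries live in an rdf:Bag, their order carries no meaning; the pdfPage attribute is what ties a barcode to a page.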

4.3 Page Completion Order

The page order element lists the order in which the PDF pages should be completed:

<wgss:PageCompletionOrder>
<rdf:Seq>
<rdf:li>1</rdf:li>
<rdf:li>2</rdf:li>
<rdf:li>3</rdf:li>
</rdf:Seq>
</wgss:PageCompletionOrder>

In the example above the pages are completed 1, 2 then 3. This is an optional element.

4.4 Document Completion Time

This element is set to the time when a document is sealed, and marks the time the document was completed:

<wgss:DocumentCompletionTime>2016-06-17T13:32:45.5316112Z</wgss:DocumentCompletionTime>

The time is specified in ISO–8601 format (“yyyy-MM-dd’T’HH:mm:ss.fffffffZ”) in UTC.
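The sample timestamp carries seven fractional digits, which is one more than Python's datetime type stores. The sketch below is one possible way to read it; parse_completion_time is a hypothetical helper, and truncating (rather than rounding) the extra digit is an assumption of this example.

```python
import re
from datetime import datetime, timezone

# Date and time, an optional fractional part of any length, then a literal Z.
_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$")


def parse_completion_time(text):
    """Parse a UTC completion time, keeping microsecond precision."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("unexpected timestamp format: %r" % text)
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    # Truncate to six digits (or pad shorter fractions with zeros).
    fraction = (match.group(2) or "")[:6].ljust(6, "0")
    return base.replace(microsecond=int(fraction), tzinfo=timezone.utc)
```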

4.5 Smart Pad Device ID

The ‘SmartPad’ element stores the ID of the last smart pad device used to complete the document:

<wgss:SmartPad wgss:id="1234567890" wgss:deviceName="Joss' Barbera Device"/>

Attribute     Detail
id            The device ID of the smart pad used
deviceName    The human readable name of the smart pad device (as configured by the user)

4.6 Smart Pad Device Characteristics

The ‘SmartPadCharacteristics’ element stores details about the smart pad used to capture data:

<wgss:SmartPadCharacteristics wgss:unit="inch" wgss:pointsPerUnit="200" wgss:width="10400" wgss:height="10400"/>

Attribute        Detail
unit             The unit of measurement for the characteristics (e.g. meter, inch)
pointsPerUnit    The number of device points per unit (the density of the sensor)
width            The width of the device sensor in device points
height           The height of the device sensor in device points
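These attributes allow device points to be converted into physical units by dividing by pointsPerUnit. The sketch below applies that arithmetic to the sample element; sensor_size and device_to_units are hypothetical helper names, and the namespace declaration is added to the sample so that it parses on its own.

```python
import xml.etree.ElementTree as ET

WGSS = "http://wacomgss.com/barbera/1.0/"

SAMPLE = (
    '<wgss:SmartPadCharacteristics xmlns:wgss="%s" wgss:unit="inch" '
    'wgss:pointsPerUnit="200" wgss:width="10400" wgss:height="10400"/>' % WGSS
)


def device_to_units(value_points, points_per_unit):
    """Convert a length in device points to the declared unit."""
    return value_points / points_per_unit


def sensor_size(xml_text):
    """Return (width, height, unit) of the sensor in physical units."""
    element = ET.fromstring(xml_text)

    def attr(name):
        return element.get("{%s}%s" % (WGSS, name))

    per_unit = float(attr("pointsPerUnit"))
    width = device_to_units(float(attr("width")), per_unit)
    height = device_to_units(float(attr("height")), per_unit)
    return (width, height, attr("unit"))
```

With the sample values, 10400 device points at 200 points per inch corresponds to 52 inches in each direction.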

4.7 Client App

The ‘ClientApp’ element stores information about the application that was used to capture data from the smart pad device:

<wgss:ClientApp wgss:version="1.0" wgss:os="iOS" wgss:applicationName="barbera for iOS"/>

Attribute          Detail
version            The client application version number
os                 The operating system for the client
applicationName    The name of the client application

4.8 Client Device

The ‘ClientDevice’ element stores information about the device that the client application is running on:

<wgss:ClientDevice wgss:id="1234567890" wgss:deviceClass="iPad" wgss:deviceName="Joss' iPad Pro"/>

Attribute      Detail
id             The device ID as returned by the host OS
deviceClass    The type of device (e.g. iPhone)
deviceName     The human readable name of the device as returned by the host operating system

4.9 Full document level XMP sample

This is the full document level XMP packet for a 3 page document that was completed on an iPad Pro:

<?xml version="1.0"?>
<?xpacket begin="￯" id="W5M0MpCehiHzreSzNTczkc9d"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:xmp="http://ns.adobe.com/xap/1.0/" xmlns:wgss="http://wacomgss.com/barbera/1.0/">
<rdf:Description rdf:about="http://signature.wacom.eu">
<wgss:PacketType wgss:level="document"/>
<wgss:PageIDList>
<rdf:Bag>
<rdf:li wgss:pdfPage="1" wgss:uuid="1234567" />
<rdf:li wgss:pdfPage="3" wgss:uuid="1234568" />
<rdf:li wgss:pdfPage="2" wgss:uuid="1234569" />
</rdf:Bag>
</wgss:PageIDList>
<wgss:PageCompletionOrder>
<rdf:Seq>
<rdf:li>1</rdf:li>
<rdf:li>2</rdf:li>
<rdf:li>3</rdf:li>
</rdf:Seq>
</wgss:PageCompletionOrder>
<wgss:AuthoringTool wgss:version="1.0" wgss:toolname="Barbera Authoring Library"/>
<wgss:DocumentCompletionTime>2016-06-17T13:32:45.5316112Z</wgss:DocumentCompletionTime>
<wgss:SmartPad wgss:id="1234567890" wgss:deviceName="Joss Barbera Device"/>
<wgss:SmartPadCharacteristics wgss:unit="inch" wgss:pointsPerUnit="200" wgss:width="10400" wgss:height="10400"/>
<wgss:ClientApp wgss:version="1.0" wgss:os="iOS" wgss:applicationName="barbera for iOS"/>
<wgss:ClientDevice wgss:id="1234567890" wgss:deviceClass="iPad" wgss:deviceName="Joss' iPad Pro"/>
</rdf:Description>
</rdf:RDF>
<?xpacket end="w"?>

5. Page Level Definitions Section

This section defines the fields that can be set in a page level XMP packet, including:

  • Page ID
  • Field ID list
  • Pen Data

These settings relate to a particular page within a document. The XMP packet is embedded into the metadata stream for the PDF page that it relates to.

5.1 Page ID

This element identifies the page that the XMP packet applies to within the document (i.e. the PDF page number and the barcode ID of that page):

<wgss:PageID wgss:pdfPage="1" wgss:uuid="1234567" wgss:uuid_type="Code128"/>

5.2 Field IDs for Page

This element lists the field IDs for the current page:

<wgss:FieldIDList>
<rdf:Bag>
<rdf:li>142532</rdf:li>
<rdf:li>142533</rdf:li>
<rdf:li>142534</rdf:li>
</rdf:Bag>
</wgss:FieldIDList>

Field IDs correspond to the PDFName (T:) of the underlying AcroForm objects.

5.3 Pen Data

This element contains all pen data captured outside active areas on the page:

<wgss:PenData>
<rdf:Seq>
<rdf:li rdf:parseType="Resource">
<wgss:Stroke>
<rdf:Seq>
<!-- sequence of points for stroke -->
<rdf:li rdf:parseType="Resource">
<wgss:Point wgss:x="10" wgss:y="13" wgss:w="53" wgss:inkColor="#FFFFFF"/>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<wgss:Point wgss:x="11" wgss:y="14" wgss:w="52"/>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<wgss:Point wgss:x="12" wgss:y="16" wgss:w="48"/>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<wgss:Point wgss:x="13" wgss:y="14" wgss:w="10"/>
</rdf:li>
</rdf:Seq>
</wgss:Stroke>
</rdf:li>
</rdf:Seq>
</wgss:PenData>

The format is defined in section 3.2 ‘Pen Data’.

5.4 Full Page Level XMP Sample

Here is a full sample of a page level XMP packet:

<?xml version="1.0"?>
<?xpacket begin="￯" id="W5M0MpCehiHzreSzNTczkc9d"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:wgss="http://wacomgss.com/barbera/1.0/">
<rdf:Description rdf:about="http://www.w3.org/">
<wgss:PacketType wgss:level="page"/>
<wgss:PageID wgss:pdfPage="1" wgss:uuid="1234567"/>
<wgss:FieldIDList>
<rdf:Bag>
<rdf:li>142532</rdf:li>
<rdf:li>142533</rdf:li>
<rdf:li>142534</rdf:li>
</rdf:Bag>
</wgss:FieldIDList>
<wgss:PenData>
<rdf:Seq>
<rdf:li rdf:parseType="Resource">
<wgss:Stroke>
<rdf:Seq>
<!-- sequence of points for stroke -->
<rdf:li rdf:parseType="Resource">
<wgss:Point wgss:x="10" wgss:y="13" wgss:w="53" wgss:inkColor="#FFFFFF"/>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<wgss:Point wgss:x="11" wgss:y="14" wgss:w="52"/>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<wgss:Point wgss:x="12" wgss:y="16" wgss:w="48"/>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<wgss:Point wgss:x="13" wgss:y="14" wgss:w="10"/>
</rdf:li>
</rdf:Seq>
</wgss:Stroke>
</rdf:li>
</rdf:Seq>
</wgss:PenData>
</rdf:Description>
</rdf:RDF>
<?xpacket end="w"?>

The sample defines the data for the first page of a PDF document, with a barcode of 1234567. The page has 3 active fields, and one stroke has been captured outside any active areas.

6. Object Level Definitions Section

This section defines the fields that can be set in an object level XMP packet, including:

  • Field ID
  • Field Location
  • Field type
  • Tag
  • Required
  • Pen Data
  • Completion Time

These settings relate to a particular object within a PDF document. The XMP packet is embedded into the metadata stream for the AcroForm object that it relates to.

6.1 Common attributes for object level fields

This section lists values that are common between all field types.

6.1.1 Field UUID

The UUID element specifies the UUID for the field:

<wgss:FieldUUID wgss:pdfID="5A234" wgss:fieldID="12345"/>

Attribute    Detail
pdfID        The Name of the underlying PDF object
fieldID      IDML specific field ID

6.1.2 Field Location

The field location element specifies the location of the active area on the page:

<wgss:FieldLocation wgss:x="123" wgss:y="345" wgss:w="100" wgss:h="300"/>

The location is a rectangle defined by an origin and a size element. The values are defined in PDF coordinates, with the following attributes:

Attribute    Detail
x            The X coordinate of the origin for the active area
y            The Y coordinate of the origin for the active area
w            The width for the active area
h            The height for the active area
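As a sketch, the location attributes can be converted into a corner-to-corner rectangle and used for hit testing. The helper names are hypothetical, and the code assumes that the origin is the corner with the smallest x and y values and that the width and height extend in the positive direction; the text above does not state which corner the origin refers to.

```python
import xml.etree.ElementTree as ET

WGSS = "http://wacomgss.com/barbera/1.0/"

SAMPLE = (
    '<wgss:FieldLocation xmlns:wgss="%s" wgss:x="123" wgss:y="345" '
    'wgss:w="100" wgss:h="300"/>' % WGSS
)


def field_rect(xml_text):
    """Return (x0, y0, x1, y1) assuming the origin is the minimum corner."""
    element = ET.fromstring(xml_text)
    x, y, w, h = (
        float(element.get("{%s}%s" % (WGSS, name))) for name in ("x", "y", "w", "h")
    )
    return (x, y, x + w, y + h)


def rect_contains(rect, px, py):
    """Return True when the point lies inside or on the edge of the rectangle."""
    x0, y0, x1, y1 = rect
    return x0 <= px <= x1 and y0 <= py <= y1
```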

6.1.3 Field Type

The field type element defines the field type of the active area:

<wgss:FieldType>Text</wgss:FieldType>