OSL Ideographic Variation

This page describes an experimental implementation, in OSL and Oracc fonts, of Unicode Ideographic Variation Sequences (IVSs) as defined in Unicode Technical Standard #37 (referred to below as TR37) [https://unicode.org/reports/tr37]. The original idea of using these with Unicode cuneiform is owed to Robin LeRoy.

Overview

Unicode IVSs provide a means for selecting glyph variants in a standardized way. They work as a character pair in which the second character is a selector for a glyph variant of the first. The selectors are in the range U+E0100-U+E01EF.
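
As a minimal sketch of the pairing described above, the following Python builds and splits an IVS and checks that the second character falls in the selector range. The function names are hypothetical and are not part of any Oracc tool; the only facts taken from this page are the range U+E0100-U+E01EF and the base-plus-selector structure.

```python
from typing import Tuple

# Variation selectors used by IVSs: U+E0100..U+E01EF (240 selectors).
IVS_SELECTOR_FIRST = 0xE0100
IVS_SELECTOR_LAST = 0xE01EF


def is_ivs_selector(ch: str) -> bool:
    """Return True if ch is one code point in the IVS selector range."""
    return len(ch) == 1 and IVS_SELECTOR_FIRST <= ord(ch) <= IVS_SELECTOR_LAST


def make_ivs(base: str, selector_index: int) -> str:
    """Pair a base character with the selector at a 0-based offset (hypothetical helper)."""
    if len(base) != 1:
        raise ValueError("base must be a single code point")
    code = IVS_SELECTOR_FIRST + selector_index
    if not IVS_SELECTOR_FIRST <= code <= IVS_SELECTOR_LAST:
        raise ValueError("selector index out of range")
    return base + chr(code)


def split_ivs(seq: str) -> Tuple[str, str]:
    """Split a two-code-point IVS into (base, selector)."""
    if len(seq) != 2 or not is_ivs_selector(seq[1]):
        raise ValueError("not a base character followed by an IVS selector")
    return seq[0], seq[1]
```

For instance, make_ivs("\U0001222C", 0) pairs the MU sign (U+1222C) with the first selector, U+E0100.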

The Unicode Ideographic Variation Database (IVD) defined in TR37 has a very simple structure: each collection must be registered with basic metadata, and a complementary file defines the sequences in the collection. Each sequence entry consists of a base character and a variation selector, along with the collection name and a name for the sequence.

Oracc_OSL Collection

The name of the Oracc OSL collection is Oracc_OSL. This implementation is presently unregistered, but the collection data would be of the form:

	Oracc_OSL;[A-Z]+[0-9]*(?:[@%&*.][A-Z]+[0-9]*)*_[A-Z]+[0-9]*(?:[@%&*.][A-Z]+[0-9]*)*;http://oracc.org/osl/ivs/
      

Oracc_OSL Identifier Structure

Oracc_OSL identifiers consist of a BASESIGN, an underscore character ('_'), and VARDATA. Both BASESIGN (the sign subject to variation) and VARDATA (the variation) consist of uppercase letters A-Z followed by optional digits 0-9, followed by optional compound sign parts. A compound sign part consists of a delimiter from the set @%&*. plus a sign name, which again consists of uppercase letters A-Z followed by optional digits 0-9.
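
The structure just described can be checked mechanically. The sketch below reuses the regular expression from the collection data above to split an identifier into its BASESIGN and VARDATA parts; the function name split_identifier is an assumption made for illustration, not an existing Oracc API.

```python
import re
from typing import Optional, Tuple

# One sign name: uppercase letters, optional digits, then optional compound
# parts, each introduced by a delimiter from the set @%&*.
SIGN = r"[A-Z]+[0-9]*(?:[@%&*.][A-Z]+[0-9]*)*"

# Full identifier: BASESIGN, underscore, VARDATA.
IDENTIFIER = re.compile(rf"({SIGN})_({SIGN})")


def split_identifier(identifier: str) -> Optional[Tuple[str, str]]:
    """Return (BASESIGN, VARDATA), or None if the identifier is malformed."""
    match = IDENTIFIER.fullmatch(identifier)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Because the underscore cannot occur inside a sign name, the split into BASESIGN and VARDATA is unambiguous.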

The BASESIGN is always derived from an OSL sign name translating subscript digits to regular digits and substituting SH for Š, e.g., MU, NI2, or SHU. For compound signs the sign name is simplified by omitting vertical bars and parentheses and mapping any TIMES sign to '*', e.g., |GA₂×AN| would be specified as GA2*AN.
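
The derivation rules in this paragraph can be sketched in a few lines of Python. This covers only the substitutions stated here (subscript digits, Š, vertical bars, parentheses, and the TIMES sign); the function name is hypothetical, and any other conventions of OSL sign names are not handled.

```python
import unicodedata

# Subscript digits U+2080..U+2089 map to ASCII digits 0..9.
_TABLE = {0x2080 + i: str(i) for i in range(10)}
_TABLE.update({
    ord("Š"): "SH",   # Š is written SH
    ord("×"): "*",    # TIMES sign becomes '*'
    ord("|"): None,   # vertical bars are omitted
    ord("("): None,   # parentheses are omitted
    ord(")"): None,
})


def basesign_from_osl_name(name: str) -> str:
    """Derive a BASESIGN from an OSL sign name (hypothetical helper)."""
    # Normalize first so that S + combining caron is treated as Š.
    return unicodedata.normalize("NFC", name).translate(_TABLE)
```

Applied to the examples in the text, |GA₂×AN| yields GA2*AN and ŠU yields SHU.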

The VARDATA is either an OSL sign name or a descriptive label for the variant selected via the sequence as described further in the next section.

Oracc_OSL Reference Glyphs

A table of Oracc_OSL reference glyphs will be maintained at XXX/osl/ivsglyphs.html.

Oracc_OSL use of Variation Selectors

Oracc_OSL defines IVSs to support research annotation of cuneiform texts where it is important to encode both the base sign and its paleographic variant forms.

For example, one variant of the MU sign, mostly associated with ED Adab, replaces the SHE-style component of the sign with a KASKAL-style component:

	# MU sign with KASKAL-style replacement for normal SHE-style component
	1222C E0100;Oracc_OSL;MU_KASKAL
      

In the case of variant forms, the VARDATA component of a label such as MU_KASKAL may qualify the variation rather than define the target form as a merged sign.
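
A sequence entry like the MU_KASKAL line above can be read with a small parser. The sketch below assumes only the layout visible in that example: two hexadecimal code points separated by a space, then the collection name and the identifier, separated by semicolons, with '#' introducing a comment line. The Sequence type and the function name are hypothetical.

```python
from typing import NamedTuple, Optional


class Sequence(NamedTuple):
    base: str        # base character, e.g. the MU sign
    selector: str    # variation selector character
    collection: str  # collection name, e.g. Oracc_OSL
    identifier: str  # sequence identifier, e.g. MU_KASKAL


def parse_sequence_line(line: str) -> Optional[Sequence]:
    """Parse one sequence line; return None for blank and comment lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Exactly three semicolon-separated fields are expected.
    codepoints, collection, identifier = (f.strip() for f in line.split(";"))
    # The first field holds the base and selector as hexadecimal code points.
    base_hex, selector_hex = codepoints.split()
    return Sequence(
        chr(int(base_hex, 16)),
        chr(int(selector_hex, 16)),
        collection,
        identifier,
    )
```

A line with the wrong number of fields raises ValueError rather than being silently skipped, which seems the safer default for a hand-maintained data file.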

Oracc_OSL Ideographic Variation Database

The Oracc_OSL Ideographic Variation Database, OIVD, is kept in 00etc/Oracc_OSL.txt in the Oracc OSL repo, and is available online at XXX.

The OIVD contains all possible applications of IVS in Oracc_OSL; not all script phases will exhibit all IVS traits, however.

Implementation Notes

Fonts

In Oracc cuneiform fonts, IVSs are treated as OpenType ligature (liga) entries [/osl/OraccCuneiformFonts#h_ligaturesliga].

Annotation

In Oracc's transliteration system, IVSs will be selectable using OSL Graphetics tags [/osl/OraccCuneiformFonts/OSLGraphetics].

 
 
CC BY-SA The OSL Project, 2014-
http://oracc.org/OraccCuneiformFonts/OSLIdeographicVariation/