This document describes how to create scores using Oracc.
Oracc divides scores into two types and two modes. The two types are the matrix and the synoptic; the two modes are parsed and unparsed. To specify that an ATF transliteration is some kind of score, you give one of the following @-protocols immediately after the &-line:
@score matrix parsed
@score matrix unparsed
@score synoptic parsed
@score synoptic unparsed
Both types of scores use a composite line--the reconstructed line which does not belong to a physical source--and exemplar lines, which are the individual manuscript witnesses. In a synopsis, the composite line may be empty; in either type the exemplars may be empty.
In a matrix, the composite line is used to establish a set of columns; the exemplar lines are given in schematic form, with each column of the composite having a corresponding entry in the exemplars' columns. Matrices may be specified at the grapheme or word level: in Sumerian, they are typically specified at the grapheme level. The actual rules for matrices are given later in this page, but an example will give the general idea:
1. šag4-ga-ne2 er2 im-si edin-še3 ba-ra-e3
N1,i_1: . . , + + + + , + + +
N2,1: + + + . . . . . . . .
N3,1: + . . . . . . . . . .
Ki4,1: . . . . . . , + + + +
Šad1,1: . + + + + + + + , . .
Su1,2': . . . . . ši + . . . .
X5,i_1: . . . . . . . . . . .
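The column correspondence in this example can be checked mechanically. The following is a minimal Python sketch, not part of Oracc: the function names are hypothetical, the composite text is passed without its line label, and graphemes are assumed to be separated by spaces and hyphens. It also assumes that `+` means the exemplar agrees with the composite, which the example suggests but this page does not state.

```python
import re


def split_graphemes(composite_text):
    """Split composite text into graphemes on spaces and hyphens (assumed separators)."""
    return [g for g in re.split(r"[\s-]+", composite_text.strip()) if g]


def check_columns(composite_text, exemplars):
    """Return the labels of exemplar rows whose column count differs from the composite."""
    width = len(split_graphemes(composite_text))
    return [label for label, row in exemplars.items() if len(row.split()) != width]


def expand_row(composite_text, row):
    """Fill one exemplar row column by column.

    '+' is replaced by the composite grapheme (assumed meaning); every other
    entry, whether a code such as '.' or ',' or literal transliteration,
    is kept exactly as the exemplar gives it.
    """
    graphemes = split_graphemes(composite_text)
    return [g if entry == "+" else entry for g, entry in zip(graphemes, row.split())]


composite = "šag4-ga-ne2 er2 im-si edin-še3 ba-ra-e3"
exemplars = {
    "N1,i_1": ". . , + + + + , + + +",
    "Su1,2'": ". . . . . ši + . . . .",
}
print(check_columns(composite, exemplars))
print(expand_row(composite, exemplars["Su1,2'"]))
```

In the `Su1,2'` row the literal `ši` stands in the column that holds `si` in the composite, and the following `+` picks up `edin`.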
The default matrix-level is graphemic. To specify that your matrix is word-level in a parsed score, add the token word at the end of the @score line (note that you don't need to do this with an unparsed matrix):
@score matrix parsed word
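To illustrate how the pieces of the @score line fit together, here is a small Python sketch that reads one. It is not the Oracc parser; `parse_score_line` is a hypothetical helper, and its handling of anything beyond the tokens described on this page is an assumption.

```python
def parse_score_line(line):
    """Split an @score line into (type, mode, word_level).

    Only the tokens described on this page are recognised: the type
    (matrix or synoptic), the mode (parsed or unparsed), and the
    optional trailing token 'word'.
    """
    tokens = line.split()
    if len(tokens) < 3 or tokens[0] != "@score":
        raise ValueError("not a complete @score line: %r" % line)
    score_type, mode = tokens[1], tokens[2]
    if score_type not in ("matrix", "synoptic"):
        raise ValueError("unknown score type: %r" % score_type)
    if mode not in ("parsed", "unparsed"):
        raise ValueError("unknown score mode: %r" % mode)
    # The matrix level defaults to graphemic; 'word' switches it to word-level.
    word_level = "word" in tokens[3:]
    return score_type, mode, word_level


print(parse_score_line("@score matrix parsed word"))
print(parse_score_line("@score synoptic unparsed"))
```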
In a synoptic score, the composite line and exemplar lines are both written out in full. The composite line may be empty, but this is not a recommended practice (it is allowed, as there is so much legacy data using this approach, but for new data the editor should, whenever possible, specify the composite text and ensure that it is properly used as the basis for translation).
In a parsed score, the composite and exemplar lines are validated and parsed according to the respective rules of the score types.
In an unparsed score, the composite line must be parsable, but the exemplars are treated as a blob of text which is opaque to Oracc's parsers. This is strictly a transitional feature, which allows legacy data to be presented to users until such time as the necessary changes can be made to bring the score in line with Oracc parsing conventions.
The composite and exemplar line-labeling conventions are shared by all types and modes of score.
The composite line label has the same format as any other transliteration line in ATF: it is a string terminated by a period and one or more space or tab characters.
An exemplar line label is basically a string terminated by a colon and one or more space or tab characters. This string, however, has its own internal structure, consisting of a siglum, or code for a witness, and an optional line label.
#link protocol:
#link: def A = P123456 = N 1
#link: def B = X000001 = N 1 = PBS 1, 29
#key: siglum-map FROM_SIGLUM=>TO_SIGLUM
#link: def A = cmawro:P445799 = KUB 37, 44-49
...
#key: siglum-map A₁=>A
#key: siglum-map A₂=>A
#key: siglum-map A₃=>A
...
A₁,obv_i_1′: ...
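The siglum-map lines in this example can be read into a simple lookup table. The Python sketch below is only an illustration under assumptions: the helper names are hypothetical, and it presumes that a mapping such as `A₁=>A` means a label using `A₁` should resolve to the witness defined as `A`, which the example implies but does not spell out.

```python
import re

# Matches "#key: siglum-map FROM_SIGLUM=>TO_SIGLUM"
SIGLUM_MAP = re.compile(r"^#key:\s*siglum-map\s+(\S+?)=>(\S+)\s*$")


def read_siglum_map(lines):
    """Collect FROM=>TO pairs from #key: siglum-map lines; other lines are ignored."""
    mapping = {}
    for line in lines:
        match = SIGLUM_MAP.match(line)
        if match:
            mapping[match.group(1)] = match.group(2)
    return mapping


def resolve_siglum(siglum, mapping):
    """Return the mapped siglum, or the siglum itself when no mapping exists."""
    return mapping.get(siglum, siglum)


header = [
    "#link: def A = cmawro:P445799 = KUB 37, 44-49",
    "#key: siglum-map A₁=>A",
    "#key: siglum-map A₂=>A",
    "#key: siglum-map A₃=>A",
]
mapping = read_siglum_map(header)
print(resolve_siglum("A₁", mapping))
```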
A,i_1: a
A,i_1;a: a
A,i_1_-_i_2b: a
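The label structure shown in these examples can be taken apart with a regular expression. This Python sketch is an illustration only: `parse_exemplar_line` is a hypothetical name, and the comma between siglum and line label is inferred from the examples on this page rather than from a stated rule.

```python
import re

# siglum, optional ",line-label", then a colon and one or more spaces or tabs
EXEMPLAR = re.compile(r"^([^\s,:]+)(?:,([^\s:]+))?:[ \t]+(.*)$")


def parse_exemplar_line(line):
    """Return (siglum, line_label, content), or None if the line has no exemplar label.

    line_label is None when the label consists of a siglum alone.
    """
    match = EXEMPLAR.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


for example in ("A,i_1: a", "A,i_1;a: a", "A,i_1_-_i_2b: a"):
    print(parse_exemplar_line(example))
```

A composite line such as `1. šag4-ga-ne2` is not matched, because its label ends in a period rather than a colon.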
Synoptic conventions for parsed line-content are simple: they are regular ATF.
Matrices have quite different conventions for composite and exemplar lines.
Composite lines in matrices are basically in ATF, with these additions:
. (period)
, (comma)
x (lowercase letter ex)
^ (caret)
# (hash)
# or . depending on whether it is likely that the exemplar originally may have had a corresponding sign
These are a subset of the standard ATF flags (# is not used as a flag in matrices):
; (semi-colon)
/ (forward slash)
& (ampersand)
Lines beginning with a $ are ATF $-lines. If the $-line comes immediately after the composite, it is included in the composite. If the $-line contains an exemplar label, it pertains to that witness:
Ki1: $ Omits.
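The two attachment rules for $-lines can be sketched as follows. This Python fragment is a hypothetical illustration, not Oracc's implementation; in particular, returning `None` for an unlabelled $-line that does not directly follow the composite is an assumption, since this page does not say how such a line is treated.

```python
import re

# Optional exemplar label, then "$" and the text of the $-line
DOLLAR = re.compile(r"^(?:([^\s:]+):[ \t]+)?\$[ \t]*(.*)$")
# Composite label: a string terminated by a period and spaces or tabs
COMPOSITE = re.compile(r"^[^\s:]+\.[ \t]+")


def attach_dollar_lines(lines):
    """Return (target, text) for each $-line in a block of score lines.

    target is the exemplar label when one is given, "composite" when the
    $-line immediately follows the composite line, and None otherwise.
    """
    results = []
    previous_was_composite = False
    for line in lines:
        match = DOLLAR.match(line)
        if match:
            label, text = match.groups()
            if label:
                target = label
            elif previous_was_composite:
                target = "composite"
            else:
                target = None
            results.append((target, text))
            previous_was_composite = False
        else:
            previous_was_composite = COMPOSITE.match(line) is not None
    return results


block = ["1. a-na", "$ single ruling", "Ki1: $ Omits.", "N1,1: + +", "$ blank"]
print(attach_dollar_lines(block))
```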
Oracc can compute transliterations from scores of both kinds by sorting on the labels, splitting and reassembling portions of lines as necessary. For parsed synopses this procedure is relatively straightforward as the transliterations are already ATF. For unparsed synopses the witness transliteration is assumed invalid by the Oracc processor and is presented as-is.
For matrices, only the parsed form can be turned into witness transliterations: unparsed matrices cannot be meaningfully handled by the processor.
In a matrix, the grapheme codes are simply replaced as appropriate with graphemes from the composite line. Where the exemplar contains transliteration, that text replaces the text in the composite line. This results in a number of preferred-practice recommendations to ensure that the witness generation is as accurate as possible:
The note marker ^[DIGITS]^ should be used in the matrix to handle insertions that belong in the witness transliteration but which overload the matrix structure. If the text immediately following the corresponding note label begins with a +, it is included with the matrix line's text when the source is rendered as a witness transliteration; this text must be terminated by a period followed by a space.
55: ... szu-na ba-...
N₂₁_r_5': + +^1^
#note: +{{NI?}} . Any note begins here ...
Is rendered as:
5. ... szu-na{{NI?}} ba-...
After the initial +, text is inserted literally--to get a space or hyphen before the text it must be given after the +. The same rules apply at the end of the inserted text, hence the space before the closing period in the example above.
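The insertion rule for notes can be sketched in Python as follows. The helper name `split_note` is hypothetical, and treating a note with no terminating period-plus-space as an ordinary note is an assumption, not something this page specifies.

```python
def split_note(note_text):
    """Split the text after '#note: ' into (inserted_text, remaining_note).

    If the note begins with '+', everything up to the first period that is
    followed by a space is inserted literally into the witness line; the
    rest is the note proper. Otherwise nothing is inserted.
    """
    if not note_text.startswith("+"):
        return "", note_text
    end = note_text.find(". ", 1)
    if end == -1:
        # No terminator found: treated as an ordinary note (assumption).
        return "", note_text
    return note_text[1:end], note_text[end + 2:]


inserted, note = split_note("+{{NI?}} . Any note begins here ...")
# The space before the closing period is part of the inserted text,
# so it separates the insertion from the next word.
print("szu-na" + inserted + "ba-...")
```

Because the text is taken literally, the trailing space in the inserted text is what yields `szu-na{{NI?}} ba-...` rather than `szu-na{{NI?}}ba-...`.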
Steve Tinney
Steve Tinney, 'Scores', Oracc: The Open Richly Annotated Cuneiform Corpus, Oracc, 2019 [http://oracc.museum.upenn.edu/doc/help/editinginatf/scores/]