If you are preparing texts for insertion directly into the CDLI repository you should read this document first.
Before learning any ATF it is useful to know a little about the history and current state of ATF. ATF was developed for use in CDLI, and was first defined as a relatively small specification which used only ASCII characters. Over time, two things have happened. Firstly, the range of texts encoded in ATF has grown, and ATF has grown with it. Secondly, ATF has been extended to allow Assyriologists to process legacy data more quickly and to type new texts in a format that is very close to the way things look on screen.
Because of the archival nature of the CDLI repository, we do not allow extended ATF to be used in the repository itself. Texts in Oracc's extended ATF will be converted to the archival core ATF format that we call Canonical ATF (C-ATF). Although C-ATF does not imitate the print versions of texts, C-ATF can be converted to a pretty-printed version using this webservice [http://oracc.museum.upenn.edu/doc/wwwhome/util/atfproc.html].
If you are typing texts to go directly into the CDLI repository you must follow the instructions in this document so that you create Canonical ATF (C-ATF) directly.
Oracc documentation is generally written using extended ATF; when you are writing C-ATF you need to be careful to adjust the examples appropriately.
E-mail cdli@cdli.ucla.edu if you wish to recommend values which are not listed there.
sudx(|SU.KUR|)
ab#, not [a]b.
A component of a compound grapheme such as |UR2x(A.HA)| is never qualified as damaged or broken, only the whole sign. Similarly, a damaged number notation, for instance [5(disz)]+4(disz), must be coded as 9(disz)#.
%a _lugal_, not %a LUGAL.
(%hit ...) _%a mu-u2_
3(disz), 4(u) etc.) except sexagesimal numbers in Place Value notation (PVN).
#atf: lang XXX, where XXX is a language code
#atf: use math, where PVN is to be used
#-sign ("hash"-sign) introduces comments about individual line content and always follows the commented line
$-sign introduces comments on text structure, never on line content
$-lines for breakage of uncertain length must conform to the following patterns:
$ broken (for instances of loss of full surface or column)
$ beginning broken (after this, use primes on subsequent line numbers; but where the length of the break is known, instead enter all line numbers and use [...] for the line content; beginning broken may also refer to some unknown number of columns missing, after which the first preserved column is to be qualified @column 1' and so on)
$ rest broken (see above for both missing lines and columns)
$ n lines broken (within column and surface; line numbering after resumption of preserved text is in sequence with the number preceding the break with, for example, 5'. following either 4. or 4'.)

&P100003 = AAS 015
#atf: lang sux
@tablet
@obverse
1. 1(disz) geme2 u4 1(disz)-sze3
2. ki dingir-ra-ta
3. da-da-ga
4. szu ba-ti
@reverse
1. mu ki-masz{ki} ba-hul
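The line types and $-line patterns described above can be sketched as a small Python classifier. This is only an illustration, not part of any CDLI or Oracc tool: the function names (classify_line, parse_text_id, is_known_break_line) and the category labels are hypothetical, and the sketch assumes that IDs have the form P plus digits, as in the sample text, and that the n in "$ n lines broken" is either a number or the literal letter n.

```python
import re

# Assumption: IDs look like "P" + digits, as in the sample "&P100003 = AAS 015".
TEXT_ID = re.compile(r"^&(P\d+)\s*=\s*(.+)$")
# A transliteration line: a number, an optional prime, a period, then content.
TEXT_LINE = re.compile(r"^\d+'?\.\s+\S")
# The four $-line patterns for breakage listed above.
# Assumption: "n" may be a number or the literal letter n.
BREAK_LINE = re.compile(
    r"^\$ (broken|beginning broken|rest broken|(n|\d+) lines broken)$"
)


def classify_line(line):
    """Return a hypothetical category label for one line of C-ATF."""
    line = line.strip()
    if line.startswith("&"):
        return "text-id"
    if line.startswith("#atf:"):
        return "protocol"
    if line.startswith("#"):
        return "comment"            # comment on the preceding line's content
    if line.startswith("$"):
        return "structure-comment"  # comment on text structure
    if line.startswith("@"):
        return "structure"          # @tablet, @obverse, @column ...
    if TEXT_LINE.match(line):
        return "text-line"
    return "unknown"


def parse_text_id(line):
    """Split an &-line into (ID, designation), or return None."""
    m = TEXT_ID.match(line.strip())
    return (m.group(1), m.group(2)) if m else None


def is_known_break_line(line):
    """True if a $-line matches one of the four breakage patterns."""
    return bool(BREAK_LINE.match(line.strip()))


SAMPLE = """&P100003 = AAS 015
#atf: lang sux
@tablet
@obverse
1. 1(disz) geme2 u4 1(disz)-sze3
2. ki dingir-ra-ta
3. da-da-ga
4. szu ba-ti
@reverse
1. mu ki-masz{ki} ba-hul"""

if __name__ == "__main__":
    for raw in SAMPLE.splitlines():
        print(classify_line(raw), "|", raw)
```

A check of this kind only looks at the first characters of each line; it does not validate the transliteration itself.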
The various ATF features illustrated here are:
&-line giving the ID and the text's designation according to the CDLI catalog; if your text is not yet in the catalog, e-mail cdli@cdli.ucla.edu to get the ID and designation.
#atf: lang sux
#atf: lang akk
@tablet
@object OBJECT_TYPE, e.g., @object head.
@obverse, @reverse
@left @right @top @bottom (but note that no physical surface of a tablet is to be included in C-ATF unless it, such as @left or in the case of occasional partial sums at the bottom of columns in Ur III administrative texts, assumes an explicit function in text format)
Determinatives are given in curly brackets.
Phonetic complements and glosses are marked with a + immediately after the first curly bracket; they are assumed to be in the same language as the rest of the word.
&P348658 = SpTU 2, 055
#atf: lang akk
@tablet
@obverse
1. t,up-pi _a-sza3_ ki-szub-ba#-[a ...]
2. {i7}har-ri sza2 {d}muati? x [...]
3. sza2 qe2-reb unu#[{ki}]
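The curly-bracket convention for determinatives, and the + marker for phonetic complements and glosses, can be picked out of a line with a short regular expression. The following is a minimal sketch under stated assumptions: the function name braced_elements and the two labels are hypothetical, nested curly brackets are not handled, and the line containing {+e} is an invented illustration rather than a reading from a real text.

```python
import re

# Matches {...}; an optional "+" directly after the opening bracket marks a
# phonetic complement or gloss. Assumption: curly brackets are never nested.
BRACED = re.compile(r"\{(\+?)([^{}]+)\}")


def braced_elements(line):
    """Return (kind, content) pairs for every {...} element in a line."""
    return [
        ("complement-or-gloss" if plus else "determinative", body)
        for plus, body in BRACED.findall(line)
    ]


if __name__ == "__main__":
    print(braced_elements("2. {i7}har-ri sza2 {d}muati? x [...]"))
    # Hypothetical line, for illustration of the "+" marker only:
    print(braced_elements("1. an{+e}"))
```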
There are no half-brackets in ATF: signs which are damaged are flagged with the hash-sign (#) after the grapheme.
Signs which are completely broken away are placed in square brackets; square brackets may not occur inside a grapheme, only before or after it. The ellipsis (...) may be used to indicate that an undeterminable number of signs is missing.
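The rule that square brackets may stand only before or after a grapheme lends itself to a simple automated check. The sketch below is a heuristic, not an official validator: the function name bracket_inside_grapheme is hypothetical, and it assumes that a bracket with a letter or digit directly on both sides lies inside a grapheme. It does not catch other disallowed forms, such as a partly bracketed number notation.

```python
import re

# Heuristic assumption: a square bracket with a letter or digit immediately
# on both sides is inside a grapheme, as in the disallowed form "[a]b".
INSIDE = re.compile(r"[A-Za-z0-9][\[\]][A-Za-z0-9]")


def bracket_inside_grapheme(token):
    """True if a square bracket appears to split a grapheme."""
    return bool(INSIDE.search(token))


if __name__ == "__main__":
    for token in ["[a]b", "ab#", "ki-szub-ba#-[a", "unu#[{ki}]"]:
        print(token, bracket_inside_grapheme(token))
```

Only the first of the four tokens is reported, because in the others each bracket is adjacent to a hyphen, a flag, or a curly bracket rather than sitting between two sign characters.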
Signs which cannot be identified are transliterated as x; when a number is missing the convention is to use n, as in n(disz). Both within and after the parentheses, further qualification of n as n(disz) is allowed.
Other flags are the question mark (?), which can be placed after a grapheme to indicate uncertainty of reading; the asterisk (*), which indicates a collated reading; and the exclamation mark, which indicates correction. After a corrected sign, the actual sign on the tablet may optionally be given, using sign names in upper case: a! or ki!(DI).
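The flags described above (#, ?, * and !) and the optional upper-case sign name after a correction can be separated from the grapheme value with one regular expression. This is a hedged sketch only: parse_flags and the meaning labels are hypothetical names, the expression assumes that flags directly follow the value and that the actual sign is written in upper-case letters and digits, and it deliberately returns None for anything else, including number notations such as 1(disz).

```python
import re

FLAG_MEANINGS = {
    "#": "damaged",
    "?": "uncertain",
    "*": "collated",
    "!": "corrected",
}

# value, then zero or more flags, then an optional upper-case sign name in
# parentheses. Assumption: no brackets or compound signs in the token.
GRAPHEME = re.compile(
    r"^(?P<value>[A-Za-z0-9,']+)(?P<flags>[#?*!]*)(?:\((?P<actual>[A-Z0-9]+)\))?$"
)


def parse_flags(token):
    """Split one grapheme into value, flag meanings and the actual sign."""
    m = GRAPHEME.match(token)
    if m is None:
        return None
    flags = m.group("flags")
    actual = m.group("actual")
    # The sign on the tablet is only given after a correction.
    if actual is not None and "!" not in flags:
        return None
    return {
        "value": m.group("value"),
        "flags": [FLAG_MEANINGS[f] for f in flags],
        "actual": actual,
    }


if __name__ == "__main__":
    for token in ["ki!(DI)", "a!", "ba#", "muati?", "1(disz)"]:
        print(token, parse_flags(token))
```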
Steve Tinney
Steve Tinney, 'CDLI ATF Primer', Oracc: The Open Richly Annotated Cuneiform Corpus, Oracc, 2019 [http://oracc.museum.upenn.edu/doc/help/editinginatf/cdliatf/]