This document provides an overview of language-specific annotation conventions for Sumerian syntax used in Oracc.
There is also a primer on linguistic annotation of Sumerian, which you should read before this page.
The syntax annotation conventions are developed and maintained in an adjunct project related to the Pennsylvania Sumerian Dictionary. This project is known as the PPCS: the Penn Parsed Corpus of Sumerian. Its documentation (which is in the process of being updated) is available on the PPCS website [http://psd.museum.upenn.edu/ppcs/].
The PPCS follows the Penn tradition of parsed corpora, which share a common vocabulary and common annotation conventions; this tradition has informed the descriptive vocabulary used for syntactic features. Readers interested in the broader context of this work can see the Penn Parsed Corpora of Historical English Home Page [http://www.ling.upenn.edu/hist-corpora/].
Syntax processing is carried out using a Sumerian-specific parser which understands enough about Sumerian syntax to enable it to work effectively on many texts with only minimal hints being given in the annotation. In the following sections we describe what the parser can and cannot do, and what it sometimes needs to be prevented from doing, at the same time as we describe the notations and features relevant to annotation in ATF files.
Parenthetic noun-phrase (NP-PRN) is the name used for appositional phrases. The parser normally recognizes parenthetic phrases automatically by considering the context and the semantic class of key words.
To force a lemma to be the head noun of a parenthetic phrase, use the notation +, (plus-sign followed by comma). To suppress parenthetic interpretation, use -, (minus-sign followed by comma).
In the following example, the parser defaults to understanding dijir as parenthetic, but that is incorrect; the annotation suppresses the parser's enthusiasm in this case:
1. {d}nin-jesz-zid-da
#lem: DN
2. dijir gu3-de2-a-kam
#lem: -, dijir[deity]; RN
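The effect of the two markers on a #lem: line can be sketched in a few lines of Python. This is only an illustration of the notation described above: the function name parenthetic_overrides and the splitting on semicolons are assumptions made for this sketch, not a description of how the Oracc parser reads ATF.

```python
from typing import List, Optional, Tuple

# Markers as described above: "+," forces, "-," suppresses.
FORCE, SUPPRESS = "+,", "-,"


def parenthetic_overrides(lem_line: str) -> List[Tuple[str, Optional[bool]]]:
    """Return (lemma, override) pairs for one '#lem:' line.

    override is True when the parenthetic reading is forced, False when it
    is suppressed, and None when no hint is given (the parser would then
    fall back on its own heuristics).
    Hypothetical helper: the real parser's tokenization may differ.
    """
    body = lem_line.strip()
    if body.startswith("#lem:"):
        body = body[len("#lem:"):]
    result = []
    for segment in body.split(";"):
        tokens = segment.split()
        if not tokens:
            continue
        override = None
        if FORCE in tokens:
            override = True
        elif SUPPRESS in tokens:
            override = False
        # Whatever is not a marker is treated as the lemma itself.
        lemma = " ".join(t for t in tokens if t not in (FORCE, SUPPRESS))
        result.append((lemma, override))
    return result


print(parenthetic_overrides("#lem: -, dijir[deity]; RN"))
# [('dijir[deity]', False), ('RN', None)]
```

Returning None for unhinted lemmata keeps the distinction between "the annotator said no" and "the annotator said nothing", which matters because the parser has defaults of its own.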
Because the parser defaults to understanding adjacent, otherwise unmarked NPs as parenthetic, it is necessary to mark phrasal conjunctions unless they are in the parser's built-in table.
The following entries are in the parser's table:
# List of N N conjunction pairs
# input must be a list of pairs of ePSD CFGWs
an[heaven] ki[earth]
dam[spouse] dumu[child]
kug[metal] zagin[lapis lazuli]
gud[ox] udu[sheep]
kugsig[gold] kugbabbar[silver]
Enki Ninki
Common cases should be added to this list; less common cases can be annotated using the convention +& (plus-sign ampersand). Similarly, an unwanted conjunction can be suppressed using the notation -& (minus-sign ampersand):
1. ku6 muszen
#lem: kud[fish]; +& muszen[bird]
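The interplay between the built-in table and the manual markers can be sketched as follows. The table entries are copied from the list above; everything else (the names load_pairs and is_conjunction, the regular expression, and the choice to treat pairs as ordered) is an assumption made for illustration and is not taken from the parser's source.

```python
import re
from typing import Optional, Set, Tuple

# Entries copied from the conjunction table above.
CONJ_TABLE = """
# List of N N conjunction pairs
an[heaven] ki[earth]
dam[spouse] dumu[child]
kug[metal] zagin[lapis lazuli]
gud[ox] udu[sheep]
kugsig[gold] kugbabbar[silver]
Enki Ninki
"""

# A CFGW here is a citation form plus an optional [guideword];
# guidewords may contain spaces, proper nouns have no guideword.
CFGW = re.compile(r"[^\s\[\]]+(?:\[[^\]]*\])?")


def load_pairs(text: str) -> Set[Tuple[str, str]]:
    """Read one pair per line, skipping blank lines and '#' comments."""
    pairs = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        items = CFGW.findall(line)
        if len(items) != 2:
            raise ValueError(f"expected a pair, got: {line!r}")
        pairs.add((items[0], items[1]))
    return pairs


PAIRS = load_pairs(CONJ_TABLE)


def is_conjunction(first: str, second: str, hint: Optional[str] = None) -> bool:
    """Manual hints win; without a hint, consult the table.

    Assumption: pairs are looked up in the order in which they are listed.
    """
    if hint == "+&":
        return True
    if hint == "-&":
        return False
    return (first, second) in PAIRS


print(is_conjunction("an[heaven]", "ki[earth]"))          # True
print(is_conjunction("kud[fish]", "muszen[bird]"))        # False
print(is_conjunction("kud[fish]", "muszen[bird]", "+&"))  # True
```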
Nouns can modify nouns in Sumerian and the parser supports this with a built-in table (derived mainly from third millennium royal inscriptions) as well as the notation +<. The current version of the table contains the following entries:
# List of N N modifier pairs; default is to assume post-mod
# for pre-mod put '+' before first element
#
# input must be a list of pairs of ePSD CFGWs; proper nouns should
# not have a guideword element
e[house] ub[corner]
šita[weapon] saŋ[head]
šita[weapon] ur[lion]
ur[lion] saŋ[head]
aga[rear] eren[cedar]
eš[shrine] Bagara
eš[shrine] Gutur
eš[shrine] Nibru
eš[shrine] Ŋirsu
gu[neck] Idigna
iri[city] Ŋirsu
egal[palace] Tiraš
hursaŋ[mountain] Uringiriaz
hursaŋ[mountain] Magan
kur[foreign land] Magan
kur[foreign land] Dilmun
abulla[gate] Kasurra
i[oil] nun[prince]
ir[scent] nun[prince]
mu[name] gilsa[treasure]
kar[harbor] zagin[lapis lazuli]
udu[sheep] i[oil]
udu[sheep] nita[male]
gur[unit] lugal[king]
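A reader for this table format might look like the sketch below. It follows the comments in the table header (one pair per line, post-modification by default, a leading '+' for pre-modification). The function names and the sample data are assumptions: three sample entries come from the table above, while the '+' entry is hypothetical and is included only to exercise the pre-mod convention. Reading "post" as "the second noun modifies the first" is likewise this sketch's interpretation.

```python
import re
from typing import Dict, Tuple

# Citation form plus optional [guideword]; guidewords may contain spaces.
CFGW = re.compile(r"[^\s\[\]]+(?:\[[^\]]*\])?")


def load_modifier_table(text: str) -> Dict[Tuple[str, str], str]:
    """Map each (first, second) pair to 'post' (default) or 'pre'."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        direction = "post"
        if line.startswith("+"):
            # '+' before the first element marks pre-modification.
            direction = "pre"
            line = line[1:].lstrip()
        items = CFGW.findall(line)
        if len(items) != 2:
            raise ValueError(f"expected a pair, got: {line!r}")
        table[(items[0], items[1])] = direction
    return table


def head_and_modifier(pair: Tuple[str, str], direction: str) -> Tuple[str, str]:
    """Assumed reading: 'post' means the modifier follows its head."""
    first, second = pair
    return (first, second) if direction == "post" else (second, first)


SAMPLE = """
# entries taken from the table above
e[house] ub[corner]
eš[shrine] Nibru
kur[foreign land] Dilmun
# hypothetical pre-mod entry, not part of the published table
+kug[pure] Inana
"""

MODIFIERS = load_modifier_table(SAMPLE)
for pair, direction in MODIFIERS.items():
    head, modifier = head_and_modifier(pair, direction)
    print(f"{direction}: head={head} modifier={modifier}")
```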
A manually annotated N-N modifier looks like this:
1. ab2 ti dara4
#lem: ab[cow]; +< ti[rib]; dara[brown]
#tr: brown-ribbed cow
Premodification, with the exception of kug inana, must be annotated manually using the notation +>:
1. kug ama {d}nansze
#lem: kug[pure] +> ; ama[mother]; DN
Annotation of subordinate clauses generally involves only giving the start of the clause using the ([WORD]) convention. Notes on individual clause types are given below.
The parser is usually able to determine the start of a relative clause. If necessary, however, the notation (S-REL) can be inserted before the first lemma of the clause:
1. lu2 e2 mu-du3-a
#lem: lu[person]; (S-REL) e[house]; du[build]
N.B.: This is only an example; relatives like this are handled correctly by the parser without hinting.
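Locating clause-start markers in a #lem: line can be sketched as below. The sketch assumes that a marker is a parenthesized label of the form S- plus capital letters, as in (S-REL), placed directly before the first lemma of the clause; the function name clause_starts and the regular expression are illustrative and not part of Oracc.

```python
import re
from typing import List, Tuple

# Assumed marker shape: "(S-" + capital letters + ")", e.g. "(S-REL)".
CLAUSE_MARKER = re.compile(r"^\((S-[A-Z]+)\)\s*")


def clause_starts(lem_line: str) -> List[Tuple[int, str, str]]:
    """Return (lemma index, clause label, lemma) for each marked clause start."""
    body = lem_line.strip()
    if body.startswith("#lem:"):
        body = body[len("#lem:"):]
    segments = [s.strip() for s in body.split(";") if s.strip()]
    found = []
    for index, segment in enumerate(segments):
        match = CLAUSE_MARKER.match(segment)
        if match:
            # The text after the marker is the clause's first lemma.
            found.append((index, match.group(1), segment[match.end():]))
    return found


print(clause_starts("#lem: lu[person]; (S-REL) e[house]; du[build]"))
# [(1, 'S-REL', 'e[house]')]
```

Only the start of the clause is recorded, which mirrors the annotation convention: the end of the clause is left for the parser to determine.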
The parser never tries to guess the start of a DE3-clause. As a result, it is always necessary to supply the start and type of the clause. The most common type is purpose (S-PRP); other DE3-clauses should be tagged as S-ADV:
1. e2 du3-u3-de3 i3-jen
#lem: (S-PRP) e[house]; du[build]; jen[go]

18 Dec 2019
Steve Tinney
Steve Tinney, 'Annotation of Sumerian syntax', Oracc: The Open Richly Annotated Cuneiform Corpus, Oracc, 2019 [http://oracc.museum.upenn.edu/doc/help/languages/sumerian/syntax/]