Grammar Engineering Frequently Asked Questions
What do the punctuation marks mean in the tdl files? (A very basic guide to tdl syntax.)
tdl (Type Description Language), like any computer language, has a particular syntax that must be followed. If you deviate from the accepted syntax, you'll either get an error as the LKB tries to load your grammar, or the LKB will interpret what you wrote differently from how you intended.
Typical type definitions look like this:
type-name := supertype1 & supertype2 &
  [ FEATURE1 value1,
    FEATURE2 [ FEATURE3 value2 ]].

foo := bar &
  [ BAZ #coref & quux,
    ZXC #coref ].

another-type := some-type &
  [ ZXC spqr &
        [ WOMBAT foo ]].
Some things to note:
- There can't be any spaces in the type name.
- The first thing after the type name is := (or :+, see below).
- After := comes at least one parent type (supertype), followed by an & sign. There may be as many parent types as you like, each followed by its own & sign.
- Next come the constraints, enclosed in square brackets.
- The square brackets must be "balanced" (as many right brackets as left ones).
- Each constraint consists of a feature and its value.
- By convention, feature names are written in ALL CAPS. Types are written in lowercase.
- Feature structures can be nested within each other. For example, the value of FEATURE2 in the example above is the feature structure [ FEATURE3 value2 ].
- If there are multiple feature-value pairs in the same feature structure, all but the last are followed by a comma.
- The end of a type definition is signalled by a period.
- The pound sign (#) is used to indicate that the values of two features (which could be at different levels of embedding in the feature structure) must be the same. # is followed by some string (in the example 'coref') that must be the same in both places. The same type definition can have three or more features that all have the same value, too, or two different sets of features with identified values. In the latter case, you'll need to make up different strings after the #. You can reuse those strings in other type definitions without trouble, however.
- Sometimes you want to provide multiple kinds of information about the value of a feature (what type it is, what it's identified with, and some specific feature-value constraints on it). In this case again, use & to separate the different kinds of information.
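As a sketch of the last two points, the hypothetical type below (every type and feature name in it is made up for illustration, not taken from the Matrix) uses two distinct coreference tags, one of which identifies three values, and combines a tag, a type, and a nested constraint on a single value with &:

```tdl
; Hypothetical names throughout; only the punctuation matters here.
example-type := some-type &
  [ FEATURE1 #first,
    FEATURE2 #first,
    FEATURE3 [ FEATURE4 #first & value1 &
                        [ FEATURE5 value2 ],
               FEATURE6 #second ],
    FEATURE7 #second ].
```

Here FEATURE1, FEATURE2 and FEATURE3.FEATURE4 must all have the same value, while FEATURE3.FEATURE6 and FEATURE7 are identified with each other independently.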
In addition to the basic syntax above, there are some further variations:
- If the type definition consists only of a declaration of parent types, the final & is replaced with a period:
type-name := supertype1 & supertype2 & supertype3.
- The comment character is ';'. That is, the LKB will ignore any line starting with a semicolon. At present, however, you cannot put comment lines in the middle of a type definition, only before or after it.
- If the feature you want to talk about is buried deep inside several nested feature structures, you can use the '.' path notation instead of lots of square brackets. In other words, the following are equivalent:
type := parent &
  [ FOO [ BAR [ BAZ [ QUUX quux ]]]].

type := parent &
  [ FOO.BAR.BAZ.QUUX quux ].
- While you can repeat the same path (or partially similar paths) for different features, it's generally better style not to. That is, the second of these two is preferred:
type := parent &
  [ FOO.BAR.BAZ.QUUX quux,
    FOO.BAR.BAZ.ZPC zpc ].

type := parent &
  [ FOO.BAR.BAZ [ QUUX quux,
                  ZPC zpc ]].
- Strings are indicated by double quotes ("). There are two places where strings show up: as the value of certain features (notably, the elements on the difference-list which is the value of STEM and the value of PRED inside a relation), and as documentation strings. Documentation strings can occur immediately after the list of parent types (including the final &) in a type definition or addendum, or after the :+ in a type addendum, if there are no parent types.
chocolate := lex-type &
  [ STEM < ! "chocolate" ! >,
    SYNSEM.LKEYS.KEYREL.PRED "_chocolate_n_rel" ].

typo := parent-type &
  "I think there's something wrong with this type".
- Type addendum statements allow you to add information to types that are already defined in the Matrix. They are written with :+ instead of :=, and don't require the presence of any supertypes. If there are no supertypes in the addendum, a [ or a " (for a documentation string) immediately follows the :+.
old-type :+ another-supertype &
  [ FEATURE7 quux ].

old-type :+ [ FEATURE quux ].

old-type :+ "I had to add some documentation here".
- The LKB provides abbreviations for lists and difference lists:
  - An empty list:
    [ FEATURE < > ]
  - A list with exactly one element:
    [ FEATURE < foo > ]
  - A list with exactly two elements:
    [ FEATURE < foo, bar > ]
  - A list with at least one element:
    [ FEATURE < foo, ... > ]
  - An empty difference list:
    [ FEATURE < ! ! > ]
  - A difference list with exactly one element:
    [ FEATURE < ! foo ! > ]
[NB: It's actually more customary to write the < and the ! for difference lists without a space in between. When I try to do that in HTML, however, they get interpreted as HTML comments, and nothing shows up.]
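To illustrate the comment rule described earlier, here is a minimal sketch with hypothetical type and feature names. The comment lines sit before and after the definition, never between the := and the final period:

```tdl
; A comment line before the definition is fine.
type-name := supertype1 &
  [ FEATURE1 value1 ].
; So is a comment line after the final period.
```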
- 02 Nov 2004
Topic revision: r1 - 2004-11-03 - 05:54:55 - TWikiGuest