Next: Compatibility with previous versions Up: The Term Processor Kimwitu Previous: Syntax of the Kimwitu

Structure File Encoding

This is a description of the ASCII SSL V3 format as used by Grammatech's Synthesizer Generator and the term processor Kimwitu (University of Twente). (Strictly speaking, both of these are generators of programs that use this representation.)

Not all features are described; for example, other atomic phyla are possible in Synthesizers, but not in combination with Kimwitu.

The structure file format is an ASCII encoding of a term, written in prefix notation. A file has two major components. The first component is a table of the operators that are used in the term; operators that do not appear in the term need not be listed in this table. The operators in this table are numbered, starting from 0. The second component of the file is a representation of the term (or object, as it is called). Each component starts with a begin marker, $operators and $object respectively, and each marker must be followed by one space at the end of its line.
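As a rough illustration of this two-component layout, the following Python sketch splits the text of a structure file into its operator table and its object section. The name `split_structure_file` is hypothetical and not part of Kimwitu. The sketch assumes that each begin marker stands alone on its line (apart from the trailing space) and that the line below $object holds the two table sizes.

```python
def split_structure_file(text):
    """Split a structure file into (operator_lines, counts, node_lines).

    Hypothetical helper for illustration only; not part of Kimwitu.
    """
    lines = text.splitlines()
    # The begin markers are followed by one space; rstrip() tolerates it.
    stripped = [line.rstrip() for line in lines]
    ops_at = stripped.index("$operators")
    obj_at = stripped.index("$object")
    operator_lines = lines[ops_at + 1:obj_at]
    # The line below $object gives the sizes of the two value tables.
    n_applications, n_strings = (int(f) for f in lines[obj_at + 1].split())
    node_lines = lines[obj_at + 2:]
    return operator_lines, (n_applications, n_strings), node_lines
```

Everything before the $operators marker (the magic word) is simply skipped by this sketch.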

Each line in the operator table describes the properties of one operator and has four fields, separated by one space. The fields are: the operator name, the number of its operands, the number of attributes (in files generated by the term processor this is always 0), and an indication of whether the operator belongs to an atomic phylum (1=yes, 0=no).
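The four fields of an operator table line could be read as in the sketch below. The `Operator` type and the function `parse_operator_line` are illustrative names, not something defined by Kimwitu or the Synthesizer Generator.

```python
from typing import NamedTuple


class Operator(NamedTuple):
    """One row of the operator table (illustrative type, not a Kimwitu API)."""
    name: str
    arity: int        # number of operands
    attributes: int   # always 0 in term-processor-generated files
    atomic: bool      # True if the operator belongs to an atomic phylum


def parse_operator_line(line):
    # Fields are separated by one space; split() also tolerates a trailing space.
    name, arity, attributes, atomic = line.split()
    return Operator(name, int(arity), int(attributes), atomic == "1")
```

For instance, `parse_operator_line("_Str 0 0 1")` yields an operator with no operands whose `atomic` flag is set. The position of a line in the table (starting from 0) is the operator's number.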

In the object section each line represents one node in the term. There are four types of these nodes: operator applications, string constants, integer constants, and pointers to shared nodes.

A line beginning with a number denotes an operator application; the number is an index into the operator table. The rest of this line, if any, can be ignored (it contains an alternative unparsing indication for some tools). An operator of an atomic phylum is followed, on the next line, by a representation of a value.
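Because the kinds of line in the object section start with different characters, a reader can tell them apart by looking at the first character only. The sketch below shows one way to do that. The function names are hypothetical. The `+` form and the pointer alphabet (`:` through `y`) are the ones described in the paragraphs that follow.

```python
def classify_node_line(line):
    """Return 'application', 'value' or 'pointer' for one object-section line."""
    if not line:
        raise ValueError("empty line in object section")
    first = line[0]
    if first in "0123456789":
        return "application"   # index into the operator table
    if first == "+":
        return "value"         # string (or integer written as a string)
    if ":" <= first <= "y":
        return "pointer"       # base-64 reference to a shared value
    raise ValueError("unrecognised line: %r" % line)


def operator_index(line):
    # Only the leading number matters; the rest of the line can be ignored.
    return int(line.split()[0])
```

The decimal digits lie below `:` in ASCII, so a pointer can never be mistaken for an operator index.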

A plus sign (+) denotes a string, and is followed by a number indicating the number of characters in the string, a separating space, and the string. In the string, all printable characters except the backslash (\) are represented as themselves. A backslash is doubled, and a non-printable character is represented as a backslash followed by the hexadecimal representation of its ASCII code, e.g. newline is \0a. An integer is represented as the string encoding of its decimal representation.
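The escaping rules can be sketched in Python as follows. Several details are assumptions rather than facts from this description: that "printable" means the ASCII range from space to `~`, that the hexadecimal code always has two lowercase digits, and that the leading count is the length of the unescaped string. The function names are illustrative.

```python
def encode_string(s):
    """Encode an ASCII string as a '+' line (sketch under the stated assumptions)."""
    out = []
    for ch in s:
        if ch == "\\":
            out.append("\\\\")              # a backslash is doubled
        elif " " <= ch <= "~":
            out.append(ch)                  # printable: written as itself
        else:
            out.append("\\%02x" % ord(ch))  # assumed: two hex digits
    # Assumption: the count is the number of characters before escaping.
    return "+%d %s" % (len(s), "".join(out))


def decode_string(line):
    """Decode a '+' line back into the string it represents."""
    if not line.startswith("+"):
        raise ValueError("not a string line: %r" % line)
    _count, _, body = line[1:].partition(" ")
    out = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            out.append(body[i])
            i += 1
        elif body[i + 1:i + 2] == "\\":
            out.append("\\")
            i += 2
        else:
            out.append(chr(int(body[i + 1:i + 3], 16)))
            i += 3
    return "".join(out)
```

An integer is written the same way, for example `encode_string(str(0))` gives the line `+1 0` that appears in the examples in this section.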

Pointers serve to share trees and strings (but not integers). A pointer is encoded in a base-64 representation in which the character : represents the 'digit' 0 and the character y represents the 'digit' 63; the digits in between are represented by the intermediate ASCII characters. For example, the string ;= denotes the value 67 (1 × 64 + 3). Conceptually at least, the result of each operator application is stored in one table, and each string is stored in a second table. The two numbers on the line below the begin marker $object give the sizes of the operator application table and the string table, respectively. The pointer value is a reverse index (counting back from the pointer) into the appropriate table: the last value of the appropriate kind that precedes the pointer in the file has number 1.
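The pointer arithmetic can be made concrete with a small Python sketch. The function names are illustrative; the digit order (most significant digit first) is inferred from the example ;= = 67 given above.

```python
def decode_pointer(text):
    """Turn a base-64 pointer such as ';=' into its numeric value."""
    value = 0
    for ch in text:
        digit = ord(ch) - ord(":")       # ':' is digit 0, 'y' is digit 63
        if not 0 <= digit <= 63:
            raise ValueError("not a pointer digit: %r" % ch)
        value = value * 64 + digit       # most significant digit first
    return value


def encode_pointer(value):
    """Inverse of decode_pointer for non-negative integers."""
    if value < 0:
        raise ValueError("pointer values are non-negative")
    digits = []
    while True:
        value, digit = divmod(value, 64)
        digits.append(chr(ord(":") + digit))
        if value == 0:
            break
    return "".join(reversed(digits))


def resolve_pointer(table, text):
    """Look up a pointer in the list of values of that kind seen so far."""
    back = decode_pointer(text)
    if not 1 <= back <= len(table):
        raise ValueError("pointer out of range: %r" % text)
    return table[-back]                  # 1 refers to the most recent value
```

Here `table` stands for either the operator application table or the string table, holding the values in the order in which they occur in the file up to the pointer.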

The following two examples illustrate the structure file format. Both examples show the same term; the first example contains no sharing, while in the second everything that can be shared is shared. Also, in the second example the operator table has been sorted in decreasing order of the number of applications of each operator. As a result, the references to this table tend to be smaller (lower numbers), taking fewer characters to represent.

A#S#C#S#S#L#V#3 ← magic word indicating file type
$operators ← begin marker operator table
CR_Spec 12 0 0 ← operator 0
CR_Label 1 0 0
_Str 0 0 1 ← operator of an atomic phylum
Nilcr_comment_list 0 0 0
CR_Specification_id 1 0 0
CR_Identifier 2 0 0
NoCaseStr 0 0 1
CR_DefExtension 1 0 0
Nilcr_gate_identifier_list 0 0 0
Nilcr_identifier_declaration_list 0 0 0
CR_Noexit_part 0 0 0
Nilcr_data_type_definition_list 0 0 0
CR_Definition_block 3 0 0
CR_Stop_expression 2 0 0
Nilcr_annotation_list 0 0 0
Nilcr_process_definition_list 0 0 0
CR_Booleans_NotChecked 0 0 0
CR_IS8807 0 0 0
$object ← begin marker object
24 4 ← number of operator applications; number of strings
0 ← operator application, index in table above
1
2
+10 specname_1 ← string representation
3
4
5
6
+8 specname
7
6
+1 0
8
9
10
11
3
12
13
1
2
+6 stop_0
14
11
15
16
14
17

A#S#C#S#S#L#V#3 ← magic word indicating file type
$operators ← begin marker operator table
CR_Label 1 0 0
NoCaseStr 0 0 1 ← operator of an atomic phylum
_Str 0 0 1 ← operator of an atomic phylum
Nilcr_identifier_declaration_list 0 0 0
CR_Spec 12 0 0
CR_IS8807 0 0 0
Nilcr_comment_list 0 0 0
CR_Booleans_NotChecked 0 0 0
CR_DefExtension 1 0 0
CR_Specification_id 1 0 0
CR_Definition_block 3 0 0
Nilcr_process_definition_list 0 0 0
CR_Identifier 2 0 0
Nilcr_gate_identifier_list 0 0 0
Nilcr_annotation_list 0 0 0
CR_Noexit_part 0 0 0
Nilcr_data_type_definition_list 0 0 0
CR_Stop_expression 2 0 0
$object ← begin marker object
21 4 ← number of operator applications; number of strings
4 ← operator application, index in table above
0
2
+10 specname_1 ← string representation
6 ← shared node referenced below as label_1
9
12
1
+8 specname
8
1
+1 0
13
3
15
16 ← shared node referenced below as label_2
D ← reference to label_1 above
10
17
0
2
+6 stop_0
14 ← shared node referenced below as label_3
@ ← reference to label_2 above
11
7
= ← reference to label_3 above
5
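To show how the pieces of the format fit together, the following Python sketch rebuilds a term as nested tuples from the lines of the two sections (without the explanatory annotations shown in the examples). It is a minimal illustration, not Kimwitu's own reader: the name `read_term` and the tuple representation are made up here, string escapes are left undecoded, and the sketch assumes that a pointer in place of a node refers to the operator application table while a pointer in place of an atomic value refers to the string table.

```python
def read_term(operator_lines, node_lines):
    """Rebuild a term as nested tuples (illustrative sketch only)."""
    ops = []
    for line in operator_lines:
        name, arity, _attributes, atomic = line.split()
        ops.append((name, int(arity), atomic == "1"))

    applications = []   # one slot per operator application, in file order
    strings = []        # one entry per string, in file order
    lines = iter(node_lines)

    def pointer_value(text):
        value = 0
        for ch in text.strip():
            value = value * 64 + (ord(ch) - ord(":"))
        return value

    def read_value():
        line = next(lines)
        if line.startswith("+"):
            text = line.partition(" ")[2]   # escapes are not decoded here
            strings.append(text)
            return text
        return strings[-pointer_value(line)]        # shared string

    def read_node():
        line = next(lines)
        if line[0] not in "0123456789":
            return applications[-pointer_value(line)]   # shared subtree
        name, arity, atomic = ops[int(line.split()[0])]
        slot = len(applications)
        applications.append(None)   # reserve the slot: numbering follows file order
        if atomic:
            node = (name, read_value())
        else:
            node = (name, tuple(read_node() for _ in range(arity)))
        applications[slot] = node
        return node

    return read_node()
```

Reserving the table slot before the operands are read keeps the reverse indices aligned with the order of lines in the file, which is how the references D, @ and = in the second example resolve to label_1, label_2 and label_3.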



2000-04-17