@V7 54 2 -5
@L3 COUK1247
80
@V9 0
@YCHAPTER 24 - MUSS USER MANUAL
@G@RCHAPTER 24 - COMPILER WRITER UTILITIES@G
@V10 1 9 345
@T% 10
@BThis section describes utilities to aid compiler
construction and development.
@S224.1 DISASSEMBLER
@
@
1) @JOUT.PROG(I,I,I)
@BThis procedure produces an assembler-like listing of a portion
of code as defined by the parameters. P1 is the segment number
containing the code, P2 is the starting byte within this segment
and P3 the finishing byte of the portion of code to be output.
Output is sent to the currently selected stream. If P3 is zero
then a suitable default is taken.
@S224.2 SYNTAB
@BSYNTAB is a Syntax Processing Package which allows a translator
written largely in MUSL to have its syntax phase defined by a
notation similar to that of BNF. First the package translates the
syntax specification into MUSL then sends the resulting code to a
file for compilation by the MUSL compiler. This MUSL code includes
a SCAN procedure which may be called within the compiler. When
entered this procedure attempts to match some text supplied in a
vector parameter with the automatically generated vector containing
the encoded form of the specified syntax. The SCAN procedure uses
a top-down fastback technique.
@S324.2.1 The Notation for describing Syntax
@BThe syntax specifications must appear at the beginning of the
translator and be delimited by:-
@
@Q 4
@
@M    BEGIN SYNTAX SPEC
@Nand END
@
@
Thus a translator has the form:-
@
@Q 6
@
@MBEGIN SYNTAX SPEC
@N<SYNTAX SPEC>
@NEND
@N<AUTOCODE PROGRAM>
@
@
There are five components in the SYNTAX SPEC, the main two of
which define the synes and cosynes.
@
@Q 7
@
@M<SYNTAX SPEC> =
@N<DELIMITER LIST>
@N<PROCEDURE LIST>
@N<COSYNE LIST>
@N<SYNE DEFINITIONS>
@N<COSYNE DEFINITIONS>
@BSyne definitions are syntactic rules of the language written
as in BNF except for these differences:-
@
@
1.@IStylistic differences. The word SYNE must precede each
definition and ::= is replaced by =.@
2.@IDifferences in ordering. The order of alternatives and
of elements within alternatives are both different. Syne formulae
are used by a left to right scanning algorithm which requires that:-@
(a)@IAny alternative which is a stem of another comes after it.@
(b)@IIf one alternative is a special case of another it
must come first.@
(c)@IIn recursive definitions there must be at least
one leftmost element not recursive.@
3.@IMetalinguistic Bracketing. Several alternatives may be
specified as an element of another by enclosing them in square
brackets.@
4.@IThose syne definitions which are to be referenced by the autocode
part of the compiler (e.g., as parameters of the scan routine) must be
preceded by a *. The result of this will be that a CONST declaration
is generated to associate the syne name with the position allotted to
it in a DATA VEC 'SYNES'.@
5.@IAlternatives within syne definitions must be less than 128
elements long.@
@
@IFor example, a definition of <STATEMENTS> might take the form:-@
@
@ISYNE <STATEMENTS> = <STATEMENT> [<STATEMENTS>|<NULL>]@
@
@Iwhere <NULL> indicates an empty string.
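@BThe ordering rules in 2(a) and 2(b) can be illustrated with a small
sketch in Python. It assumes that the scan commits to the first alternative
that matches and does not return to try later ones; the function name
first_match and the example strings are hypothetical, and matching is done
on characters here purely for brevity.

```python
def first_match(alternatives, text):
    """Return the first alternative that matches the start of text, or None.

    Assumption: once an alternative matches, later alternatives are never tried.
    """
    for alternative in alternatives:
        if text.startswith(alternative):
            return alternative
    return None


# "ON" is a stem of "ONTO". Listed first, it hides the longer alternative.
wrong_order = first_match(["ON", "ONTO"], "ONTO THE WALL")
# Listed after the longer alternative, as rule 2(a) requires, both are reachable.
right_order = first_match(["ONTO", "ON"], "ONTO THE WALL")
```

With the stem listed first the longer alternative can never be selected,
which is why the stem must come after it.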
@BFor both semantic and syntactic reasons it is convenient to be
able to interrupt the scanning algorithm at defined points and execute
code provided by the user. This is achieved by inserting an element
into the syntax which is referred to as if it were a syne but is in
fact defined as the name of a section of code (COSYNE) to be executed
when the scanning algorithm reaches that point.
@BTo make this facility more flexible the cosyne routine may have
one numeric parameter the value of which is defined explicitly
wherever the cosyne is used thus:-
@
@
@M<cosyne name (numeric parameter)>
@BCosynes are defined as labelled sequences of Autocode instructions
which operate within the SCAN procedure. Each sequence is labelled with
the name of the cosyne it represents. Cosynes may use the variables and
labels in the scan of which the following are the most useful:-
@
@
AS, WS, LINE, SYNTAX
@BDescriptors mapping the analysis record area, the working stack,
the current statement and the syntax tables, respectively.
@
@
AP, WP, SS, SY
@BPointers to the current position in AS, WS, LINE and SYNTAX, respectively.
@
@
TRUE, FALSE
@BLabels to which the cosynes may jump on completion.
@BIn a translator it is usual to itemise the input string before
applying the scan. This pre-scan pass may reduce delimiters to single
pseudo symbols. A similar transformation has to be applied to
delimiters appearing in the syntax. These delimiters and the value of
the code to be assigned to them must be listed thus:-
@U 6
@
      <DELIMITER LIST> =
                    DELIMITERS<NL>
                    <DELIMITER SPEC>
                    END<NL>
@
where <DELIMITER SPEC> = <INTEGER>/<STRING><NL>[<DELIMITER SPEC>|<NIL>]
@
@
<INTEGER>@Iis any integer in the range 0 - 255 and is the
code to be assigned to the delimiter specified by,@
<STRING>@Iwhich is any character string but all strings must
begin with the same symbol (e.g., ' for delimiters in quotes),@
<NL>@Irepresents a newline.@
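@BAs an illustration only, the constraints on a delimiter list can be
checked with a short Python sketch. The function name parse_delimiters and
the entry 129/'END' are hypothetical; only the form <INTEGER>/<STRING>, the
range 0 - 255 and the common leading symbol are taken from the description
above.

```python
def parse_delimiters(lines):
    """Parse '<INTEGER>/<STRING>' lines into a {string: code} table."""
    table = {}
    lead = None
    for line in lines:
        number, slash, string = line.partition("/")
        if not slash or not string:
            raise ValueError("expected <INTEGER>/<STRING>: " + repr(line))
        code = int(number)
        if not 0 <= code <= 255:
            raise ValueError("code out of range 0 - 255: " + repr(line))
        if lead is None:
            lead = string[0]            # every string must begin with this symbol
        elif string[0] != lead:
            raise ValueError("strings must begin with the same symbol: " + repr(line))
        table[string] = code
    return table


# The second entry is a made-up example.
delimiters = parse_delimiters(["128/'BEGIN'", "129/'END'"])
```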
@BEach procedure name used as the parameter of a cosyne will be
replaced by the integer which indexes its entry in the procedure list.
@
@
@U 4
Formally:-
@
           <PROCEDURE LIST> = PROCEDURE NAMES<NAME LIST><NL>
     where <NAME LIST> = <NAME>[,<NAME LIST>|<NIL>]
@BThe COSYNE LIST contains the names of all the cosynes used in
the syntax. Its form is:-
@
@
@MCOSYNE NAMES<NAME LIST><NL>
@U 52
@
                              FIGURE 1
@
@
                    Schematic Form of Input
@
BEGIN SYNTAX SPEC
DELIMITERS
128/'BEGIN'
.
.@
END
@
PROCEDURE NAMES name, name, .....
COSYNE NAMES name, name, .....
SYNE<name>=---
*SYNE<name>=---
.
.
COSYNES
.
.
END SYNTAX SPEC
@
                              FIGURE 2
@
@
                    Schematic Form of Output
                    for a MUSL Compiler
@
LITERAL synename=index;
.
.
DATAVEC SYNTAX($LO8)
.
.
END;
PSPEC SCAN(ADDR[$LO8],ADDR[$IN],ADDR[$IN],ADDR[$IN],$IN);
PROCEDURE SCAN(SYNTAX,LINE,AS,WS,START);
@
->PASSWITCH;
LOOP2:
SWITCH SYNTAX[SY]\
cosyne name,
.
.
cosyne name;
.
PASSWITCH:;
.
**IN -1
@S324.2.2 The Syntactic Scan Procedure
@BThe scan procedure has the specification:-
@
@
@MPROC SPEC SCAN (S, S, S, S, I32)
@
The parameters are:-
@
@T% 30
@
(i)@IThe string SYNTAX which contains the complete syntax to be
used in the scan and which is set up by the syntax processing phase.
The scan maintains a modifier SY (I32 variable) at the current position
in the syntax tables.@
@
(ii)@IThe vector LINE with 8-bit or 32-bit elements containing the
text to be recognised. Each element is compared with a byte in the
syntax tables. SS is maintained pointing to the last symbol recognised.
On entry SS = 0.@
@
(iii) and (iv)@ITwo vectors AS and WS with 8-bit or 32-bit elements as the
user requires. The modifiers within these vectors are AP
and WP respectively, zeroed on entry to the scan. AS and WS are for
use by the cosynes in generating an analysis record. AS or analysis
space is conventionally the area in which the final analysis record
is built up. WS or work space is the area used as temporary storage
for intermediate levels of the analysis record. These areas are used
for this purpose by the built-in cosynes VAL, NOVAL and IN. WS is
used for output information as follows:-@
@
@IWS[0] = -1 if the scan fails, WS[0] = 1 if it passes.@
@
@IConsistent use of VAL will ensure that WS[1] is a modifier in AS
pointing to the first element of the analysis record. The first free
space in WS, usually WS[2], is the value of the modifier in LINE (SS)
for the last symbol recognised. The analysis record created by the
user starts at the position given by WS[1]. AS[0] contains the number of elements in the
complete analysis record (i.e., if the last element is AS[4],
AS[0] = 5).@
@
(v)@IThe address of the start of the SYNE to be scanned for. This
is the I32 literal constant given by the name of a starred SYNE.
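@BThe output conventions for WS and AS might be read back as in the
following Python sketch. It is not part of SYNTAB: the function name
read_scan_result and the sample contents are hypothetical, and it assumes
that the first free space in WS is WS[2].

```python
def read_scan_result(ws, analysis):
    """Interpret WS and AS after a scan, following the conventions above.

    Returns None if the scan failed, otherwise a dict describing the record.
    """
    if ws[0] != 1:                          # WS[0] = -1 on failure, 1 on success
        return None
    return {
        "record_start": ws[1],              # index in AS of the first record element
        "last_symbol": ws[2],               # assumed: first free space in WS is WS[2]
        "used": analysis[:analysis[0]],     # AS[0] counts the elements in use
    }


# Arbitrary contents, chosen only to exercise the function.
ws_pass = [1, 1, 6]
as_pass = [5, 0, 2, 3, 1, 0, 0, 0]
result = read_scan_result(ws_pass, as_pass)
```

Here AS[0] = 5, so the elements in use are AS[0] to AS[4].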
@S324.2.3 Operation of the Scan Procedure
@S324.2.3.1 Analysis Records
@BObviously in BNF no record is kept of the way in which the
source string fits the syne definitions. Also there is no specific
way of recognising the end of an alternative. Both these functions
are performed by cosynes. A terminal cosyne is required at the end
of every alternative in a syne definition. This should generate the
required record and end by transferring control to a specified point
in the scanning routine, where the stack will be adjusted and the
scan continued in the syne definition one level up. Two terminal
cosynes are built into the system; they may be used as models for
others. The first is UP which causes the scan to continue at the
syne definition one level up and generates no analysis record. The
other, VAL, generates an analysis record which has a tree-structured
form. The parameter of the reference to the VAL cosyne is entered
as the first element of the current level of tree structure.
@BOn some occasions it may be necessary to record information in
the analysis record without creating a new level of tree structure.
This is achieved by use of the cosyne IN, which inserts its parameter
into the analysis record. Thus analysis records may be produced as
in Figure 3. The pointers are stored as indices in the vector
containing the analysis record.
@3
@U 20
@
SYNTAX:
  SYNE <SENTENCE> = THE[CAT<IN(1)>|DOG<IN(2)>]<VERB>THE WALL<VAL(0)>
  SYNE <VERB>     = SAT<PREP><VAL(1)>|RAN FROM <VAL(0)>
  SYNE <PREP>     = [ON<IN(0)>|UPON<IN(1)>]<UP>
@
@
INPUT:
               THE DOG SAT ON THE WALL
@
@
ANALYSIS RECORD:
               |
              @O|    @O
             |@O0|2|.@O|
                  |
                  @O|  @O
                 |@O1|0@O|
@
                              Figure 3. Analysis Records
@0
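@BThe effect of VAL, IN and UP in Figure 3 can be imitated by a small
Python model. This is a sketch under stated assumptions, not the SCAN
procedure itself: the input is taken as a list of words, nested lists stand
in for the pointers held in AS and WS, bracketed alternatives are assumed
not to contain terminal cosynes, and the names scan, match and GRAMMAR are
hypothetical.

```python
def scan(grammar, name, words, pos=0):
    """Try the alternatives of syne `name` in order; return (entries, pos) or None."""
    for alternative in grammar[name]:
        found = match(grammar, alternative, words, pos, [])
        if found is not None:
            return found
    return None


def match(grammar, elements, words, pos, level):
    """Match elements left to right, collecting entries for the current level."""
    for element in elements:
        if isinstance(element, str):            # terminal symbol
            if pos >= len(words) or words[pos] != element:
                return None
            pos += 1
        elif element[0] == "IN":                # insert parameter at this level
            level = level + [element[1]]
        elif element[0] == "VAL":               # close the level as a new sub-record
            return [[element[1]] + level], pos
        elif element[0] == "UP":                # close the level, no new sub-record
            return level, pos
        elif element[0] == "SYNE":              # scan a nested syne definition
            found = scan(grammar, element[1], words, pos)
            if found is None:
                return None
            entries, pos = found
            level = level + entries
        elif element[0] == "ALT":               # bracketed alternatives
            for choice in element[1]:
                found = match(grammar, choice, words, pos, level)
                if found is not None:
                    break
            else:
                return None
            level, pos = found
    return level, pos


GRAMMAR = {
    "SENTENCE": [["THE", ("ALT", [["CAT", ("IN", 1)], ["DOG", ("IN", 2)]]),
                  ("SYNE", "VERB"), "THE", "WALL", ("VAL", 0)]],
    "VERB": [["SAT", ("SYNE", "PREP"), ("VAL", 1)], ["RAN", "FROM", ("VAL", 0)]],
    "PREP": [[("ALT", [["ON", ("IN", 0)], ["UPON", ("IN", 1)]]), ("UP",)]],
}

entries, end = scan(GRAMMAR, "SENTENCE", "THE DOG SAT ON THE WALL".split())
record = entries[0]
```

For the input of Figure 3 the model yields a top level holding 0, 2 and a
reference to a second level holding 1, 0. Because <PREP> ends in UP, its
IN(0) entry is recorded on the <VERB> level.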
@S324.2.3.2 Cosynes
@BThese are labelled sections of code usually delimited by BEGIN and
END. Several are built into the scan procedure and these are:-
@
@
@MBRANCH, SYMBOL, SYNE, MERGE, UP, ENDFALSE, ENDTRUE, VAL, NOVAL, IN
@BThe VAL, UP and IN cosynes have already been discussed. The
cosyne NOVAL acts as VAL but has no parameter and does not precede the
analysis record level with a number. The other built-in cosynes are
of less interest to the user.
@BA reference to the UP cosyne occupies one 8-bit element of the
internal encoding of the syntax; a syne reference occupies 3 elements
and all other cosyne references occupy 2 elements.
@BThe second of the 2 elements occupied by a cosyne holds the
parameter and this may be accessed within the cosyne as:-
@
@
@MSYNTAX1[SY]
@BCosynes may insert information onto the current level of analysis
record by storing to:-
@
@
@MWS[WP]
@V3 1
@V4 0
@Band advancing the index WP. They may also access the next symbol to be
recognised as:-
@
@
@MLINE[SS + 1]
@BIf a symbol is recognised thus, then the index SS should be advanced.
@V3 2
@V4 3
@BCosynes should normally exit to one of the two built-in labels
TRUE or FALSE - according to whether recognition is to proceed or to
backtrack in the syntax.
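@BThe conventions for a user-written cosyne can be modelled in Python as
follows. This is an illustration, not Autocode: the class ScanState and the
cosyne digit_cosyne are hypothetical, the exits to TRUE and FALSE are
modelled as return values, and LINE[0] is assumed to be an unused
placeholder because SS = 0 on entry and the next symbol is LINE[SS + 1].

```python
class ScanState:
    """Hypothetical stand-in for the variables a cosyne may use."""

    def __init__(self, line, work_size=16):
        self.line = line                # LINE: the itemised input
        self.ss = 0                     # SS: index of the last symbol recognised
        self.ws = [0] * work_size       # WS: the working stack
        self.wp = 0                     # WP: current position in WS


def digit_cosyne(state):
    """Recognise one decimal digit and record its value at the current level."""
    nxt = state.ss + 1
    if nxt < len(state.line) and state.line[nxt].isdigit():
        state.ws[state.wp] = int(state.line[nxt])   # store to WS[WP]
        state.wp += 1                               # advance WP
        state.ss = nxt                              # symbol recognised: advance SS
        return True                                 # exit to TRUE
    return False                                    # exit to FALSE, nothing consumed


state = ScanState(["", "7", "X"])
first = digit_cosyne(state)     # recognises "7"
second = digit_cosyne(state)    # "X" is not a digit
```

On the failing call neither SS nor WP is changed, so the scan is free to
backtrack.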
@F
