
{     Right now I'm writing an interpreter for a language that I
developed, called "Isaac".  (It's physics oriented).  I'd be very
interested in you publishing this information regarding Pascal
compilers, though I would likely not have time to do the exercises
right away.

   OK, Gavin. I'll post the lister (not really anything exceptional,
   but it'll get this thing going in case anyone joins in late.)

   Here's the lister program:
}
{$I-}
Program Lister;

Uses Dos;

{$I PTypeS.inC}
{Located in the SOURCE\MISC Directory.}

Function LeadingZero(w:Word): String;{convert Word to String with a leading 0}
   Var s :String;
   begin
      Str(w:0,s);
      if Length(s) < 2 then s := '0'+s;   { pad single digits: 5 -> '05' }
      { longer values, such as a four-digit year, are returned unchanged }
      LeadingZero := s;
   end;


Function FormatDate :String; { get system date and pretty it up }
   Const
      months : Array[1..12] of String[9] =
      ('January', 'February', 'March', 'April', 'May', 'June', 'July',
       'August', 'September', 'October', 'November', 'December');
   Var s1,fn : String; y,m,d,dow : Word;
   begin
      GetDate(y,m,d,dow);
      s1 := leadingZero(y);
      fn := LeadingZero(d);
      s1 := fn+' '+s1;
      fn := months[m];
      s1 := fn+' '+s1;
      FormatDate := s1;
   end;

Function FormatTime :String; { get system time and pretty it up }
   Var s1, fn : String; h,m,s,s100 : Word;
   begin
      GetTime(h,m,s,s100);
      fn := LeadingZero(h);
      s1 := fn+':';
      fn := LeadingZero(m);
      FormatTime := s1+fn;
   end;

Procedure Init(name:String);
   Var t,d        :String;
   begin
      line_num := 0; page_num := 0; level := 0;
      line_count := MAX_LinES_PER_PAGE;
      source_name := name;
      Assign(F1, name);      { open source file - terminate on error }
      Reset(F1);
      if Ioresult>0 then
      begin
         Writeln('File error!');
         Halt(1);
      end;
      { set date/time String }
      d := FormatDate;
      t := FormatTime;
      date := d+'  '+t;
   end;

Procedure Print_Header;
   Var s, s1 :String;
   begin
      Writeln(F_FEED);
      Inc(page_num);
      Str(page_num, s1);
      s := 'Page '+s1+'   '+source_name+'  '+date;
      Writeln(s);
   end;

Procedure PrintLine(line :String);
   begin
      Inc(line_count);
      if line_count>MAX_LinES_PER_PAGE then
      begin
         print_header;
         line_count := 1;
      end;
      if ord(line[0])>MAX_PRinTLinE_LEN then
         line[0] := Chr(MAX_PRinTLinE_LEN);
      Writeln(line);
   end;


Function GetSourceLine :Boolean;
   Var print_buffer :String[MAX_SOURCELinE_LEN+9];
       s            :String;
   begin
      if not(Eof(F1)) then begin
         Readln(F1, source_buffer);
         Inc(line_num);
         Str(line_num:4, s);
         print_buffer := s+' ';
         Str(level, s);
         print_buffer := print_buffer+s+': '+source_buffer;
         PrintLine(print_buffer);
         GetSourceLine := True;
      end else GetSourceLine := False;
   end;


begin  { main }
   if ParamCount=0 then begin
      Writeln('Syntax: LISTER <Filename>');
      Halt(2);
   end;
   init(ParamStr(1));
   While GetSourceLine do;   { list every line until end of file }
   Close(F1);
end.

{
   Now that the task of producing a source listing is taken care of,
   we can tackle the scanner's main business: scanning. Our next job
   is to produce a scanner that, with minor changes, will serve us
   for the rest of this "course".

   The SCANNER will do the following tasks:

   - scan words, numbers, strings and special characters.
   - determine the value of a number.
   - recognize RESERVED WORDS.

   LOOKING FOR TOKENS

   SCANNING is reading the source file and breaking up the text of a
   program into its language components, such as words, numbers,
   and special symbols. These components are called TOKENS.

   You want to extract each token, in turn, from the source
   buffer and place its characters into an empty array, e.g.
   token_String.

   At the start of a word token, you fetch its first character and
   each subsequent character from the source buffer, appending each
   character to the contents of token_String. As soon as you fetch a
   character that is not a LETTER, you stop. All the letters in
   token_String make up the word token.
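
   A minimal sketch of that loop, assuming the source buffer is an
   ordinary String. The names buffer_pos, IsLetter and ExtractWord
   are made up for this illustration and are not part of the course
   code:

```pascal
program WordTokenSketch;

var
   source_buffer :String;    { one line of source text }
   token_String  :String;    { characters of the current token }
   buffer_pos    :Integer;   { hypothetical: index of the next unread character }

function IsLetter(c :Char) :Boolean;
   begin
      IsLetter := c in ['A'..'Z', 'a'..'z'];
   end;

{ Append letters to token_String until a non-letter (or the end of
  the buffer) is reached. }
procedure ExtractWord;
   begin
      token_String := '';
      while (buffer_pos <= Length(source_buffer))
            and IsLetter(source_buffer[buffer_pos]) do
      begin
         token_String := token_String + source_buffer[buffer_pos];
         Inc(buffer_pos);
      end;
   end;

begin
   source_buffer := 'begin x';
   buffer_pos := 1;
   ExtractWord;
   Writeln('[', token_String, ']');   { prints [begin] }
end.
```

   After the call, buffer_pos is left on the first character that is
   not a letter (here, the blank after "begin").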

   Similarly, at the start of a NUMBER token, you fetch the first
   digit and each subsequent digit from the source buffer. You
   append each digit to the contents of token_String. As soon as you
   fetch a character that is not a DIGIT, you stop. All digits
   within token_String make up the number token.

   Once you are done extracting a token, you have the first
   character after the token. This character tells you that you have
   finished extracting the token. If the character is a blank, you
   skip it and any subsequent blanks until you are again looking at
   a nonblank character. This character is the start of the next
   token.

   You extract the next token in the same way you extracted the
   previous one. This process continues until all the tokens have
   been extracted from the source buffer. Between extracting tokens,
   you must reset token_String to the null string to prepare it for
   the next token.
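
   Put together, the extract / skip / reset cycle might look like
   the sketch below. It is only an illustration: buffer_pos,
   CharSet, TakeWhile, SkipBlanks and NextToken are hypothetical
   names, and any character that is neither a letter nor a digit is
   simply treated as a one-character token.

```pascal
program TokenLoopSketch;

type
   CharSet = set of Char;

var
   source_buffer :String;
   token_String  :String;
   buffer_pos    :Integer;   { hypothetical: index of the next unread character }

{ Append characters to token_String while they belong to "allowed". }
procedure TakeWhile(allowed :CharSet);
   begin
      while (buffer_pos <= Length(source_buffer))
            and (source_buffer[buffer_pos] in allowed) do
      begin
         token_String := token_String + source_buffer[buffer_pos];
         Inc(buffer_pos);
      end;
   end;

procedure SkipBlanks;
   begin
      while (buffer_pos <= Length(source_buffer))
            and (source_buffer[buffer_pos] = ' ') do
         Inc(buffer_pos);
   end;

{ Assumes buffer_pos is on a nonblank character inside the buffer. }
procedure NextToken;
   begin
      token_String := '';                      { reset before each token }
      if source_buffer[buffer_pos] in ['A'..'Z', 'a'..'z'] then
         TakeWhile(['A'..'Z', 'a'..'z'])
      else if source_buffer[buffer_pos] in ['0'..'9'] then
         TakeWhile(['0'..'9'])
      else
      begin
         token_String := source_buffer[buffer_pos];
         Inc(buffer_pos);
      end;
   end;

begin
   source_buffer := 'count   42 end .';
   buffer_pos := 1;
   SkipBlanks;
   while buffer_pos <= Length(source_buffer) do
   begin
      NextToken;
      Writeln('[', token_String, ']');
      SkipBlanks;
   end;
end.
```

   Run on the sample line, this lists [count], [42], [end] and [.],
   one per line.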

   PASCAL TOKENS

   A scanner for a Pascal compiler must, of course, recognize Pascal
   tokens. The Pascal language contains several types of tokens:
   identifiers, reserved words, numbers, strings, and special
   symbols.

   This next exercise is a TOKENIZER that recognizes a limited
   subset of Pascal tokens. The program will read a source file and
   list all the tokens it finds. This first version will recognize
   only words, numbers, and the Pascal "end-of-file" period - but it
   provides the foundation upon which we will build a full Pascal
   scanner in the second version.

   WORD: A Pascal word is made up of a LETTER followed by zero or
   more LETTERS and DIGITS.

   NUMBER: For now, we'll restrict a number token to a Pascal
   unsigned integer, which is one or more consecutive digits. (We'll
   handle signs, decimals, fractions, and exponents later.) We'll
   also use the rule that an input file *must* have a period as
   its last token.
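
   One possible way to work out the value of such a number while
   guarding against overflow is sketched here. MAX_DIGIT_COUNT and
   ScanNumber are invented for the example, and the limit of 4 is
   only an assumption (any 4-digit number fits a 16-bit Integer).
   digit_count and count_error play the roles given to them in the
   exercise's Var section.

```pascal
program NumberValueSketch;

const
   MAX_DIGIT_COUNT = 4;      { hypothetical limit, safe for a 16-bit Integer }

var
   value       :Integer;     { value of the number scanned so far }
   digit_count :Integer;     { number of digits in number }
   count_error :Boolean;     { too many digits in number? }

{ Scan the leading digits of s, accumulating their value. }
procedure ScanNumber(s :String);
   var i :Integer;
   begin
      value := 0;
      digit_count := 0;
      count_error := False;
      i := 1;
      while (i <= Length(s)) and (s[i] in ['0'..'9']) do
      begin
         Inc(digit_count);
         if digit_count <= MAX_DIGIT_COUNT then
            value := 10*value + (Ord(s[i]) - Ord('0'))
         else
            count_error := True;   { keep scanning, but stop accumulating }
         Inc(i);
      end;
   end;

procedure Report;
   begin
      if count_error then
         Writeln('too many digits')
      else
         Writeln(value);
   end;

begin
   ScanNumber('1234');
   Report;                   { prints 1234 }
   ScanNumber('123456');
   Report;                   { prints too many digits }
end.
```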

   The tokenizer will print its output in the source listing.

   EXERCISE #2

   Use the following TYPES and CONSTANTS to create a SCANNER as
   described above:

-------------------------------------------------------------------

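Type
   Char_code    = (LETTER, DIGIT, SPECIAL, Eof_CODE);
   { note: as an enumeration constant, Word hides the predefined
     type Word for any declarations that follow }
   token_code   = (NO_toKEN, Word, NUMBER, PERIOD,
                   end_of_File, ERRor);

   literal_Type = (Integer_LIT, String_LIT);

   litrec = Record                 { variant record, tag field l }
      Case l :literal_Type of
         Integer_LIT: (value     :Integer);
         String_LIT:  (str_value :String);  { field names must be unique }
   end;

Const
   Eof_Char = #$7F;
   { a typed constant belongs in the Const section, not under Type }
   symb_Strings :Array[token_code] of String[13] =
                  ('<no token>','<Word>','<NUMBER>','<PERIOD>',
                   '<end of File>','<ERROR>');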

Var
   ch             :Char;        {current input Char}
   token          :token_code;  {code of current token}
   literal        :litrec;      {value of current literal}
   digit_count    :Integer;     {number of digits in number}
   count_error    :Boolean;     {too many digits in number?}
   Char_table     :Array[0..255] of Char_code;{ascii Character map}


The following code initializes the character map table (c is an
Integer loop variable):

For c := 0 to 255 do
   Char_table[c] := SPECIAL;
For c := ord('0') to ord('9') do
   Char_table[c] := DIGIT;
For c := ord('A') to ord('Z') do
   Char_table[c] := LETTER;
For c := ord('a') to ord('z') do
   Char_table[c] := LETTER;
Char_table[ord(Eof_Char)] := Eof_CODE;

-------------------------------------------------------------------
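
   To show what the table buys you, here is a small stand-alone
   sketch that fills it as above and then classifies a few
   characters with a single array lookup. The code_names table, the
   sample string and the loop variables are assumptions made for
   the demonstration only.

```pascal
program CharTableSketch;

type
   Char_code = (LETTER, DIGIT, SPECIAL, Eof_CODE);

const
   Eof_Char = #$7F;
   { hypothetical: printable names for the demo output }
   code_names :array[Char_code] of String[8] =
      ('LETTER', 'DIGIT', 'SPECIAL', 'Eof_CODE');

var
   Char_table :array[0..255] of Char_code;   { ascii character map }
   c, i       :Integer;
   sample     :String;

begin
   { everything is SPECIAL unless one of the later loops says otherwise }
   for c := 0 to 255 do
      Char_table[c] := SPECIAL;
   for c := Ord('0') to Ord('9') do
      Char_table[c] := DIGIT;
   for c := Ord('A') to Ord('Z') do
      Char_table[c] := LETTER;
   for c := Ord('a') to Ord('z') do
      Char_table[c] := LETTER;
   Char_table[Ord(Eof_Char)] := Eof_CODE;

   { classify each character: ascii code, then its class }
   sample := 'a1.' + Eof_Char;
   for i := 1 to Length(sample) do
      Writeln(Ord(sample[i]):3, ' ',
              code_names[Char_table[Ord(sample[i])]]);
end.
```

   A scanner can then decide how to start the next token by looking
   at Char_table[ord(ch)] instead of testing character ranges each
   time.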

   You can (and should) use the code from your source listing
   program to start your scanner. If you have just arrived, use my
   own code posted just previously.

