Adding actions to a DLG lexclass:

Actions can be added to DLG lexclass descriptions to perform various tasks. A typical action might count line numbers or change lexclass modes at the start or end of a source code comment block.
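
As a rough illustration of what such actions accomplish, the plain C sketch below does the same bookkeeping by hand: it counts newlines and switches between two modes at the start and end of a C-style comment. It is not DLG code and uses none of DLG's symbols; toy_state, toy_scan, MODE_START, and MODE_COMMENT are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* The two "modes" of this toy scanner (hypothetical names, not DLG's). */
enum toy_mode { MODE_START, MODE_COMMENT };

struct toy_state {
    enum toy_mode mode;  /* mode the scanner ended in */
    int line;            /* line number, starting at 1 */
};

/* Scan src, switching modes at the comment delimiters and counting
   newlines in both modes. Returns the number of characters seen
   outside comments. */
size_t toy_scan(const char *src, struct toy_state *st)
{
    size_t code_chars = 0;

    assert(src != NULL && st != NULL);
    st->mode = MODE_START;
    st->line = 1;

    for (size_t i = 0; src[i] != '\0'; i++) {
        if (src[i] == '\n')
            st->line++;                      /* the "count lines" action */

        if (st->mode == MODE_START && src[i] == '/' && src[i + 1] == '*') {
            st->mode = MODE_COMMENT;         /* action at comment start */
            i++;                             /* consume the second character */
        } else if (st->mode == MODE_COMMENT && src[i] == '*' && src[i + 1] == '/') {
            st->mode = MODE_START;           /* action at comment end */
            i++;
        } else if (st->mode == MODE_START) {
            code_chars++;                    /* ordinary input outside comments */
        }
    }
    return code_chars;
}
```

In a real grammar, DLG does the matching for you, and the two assignments to the mode and the line increment are the only parts you would write as actions.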

An action is enclosed in a pair of double angle brackets, << and >>, and is placed after the regular expression describing the token:


    #token K_BEGIN "begin" << printf("Action.\n"); >>

Actions can contain any C or C++ source code that can be placed directly into a C or C++ source file, and may contain multiple lines of text:


	#token	NL	"\n"
		<<	printf("Found a newline.\n");
				zzline++;
		>>

In the previous example, zzline was used to count the number of newlines. This variable is defined if the following line is included in a top-level action which precedes the lexclass:


	#define ZZCOL

You can add your own variables by declaring them in a top-level action preceding the lexclass:


	<<
	int numtokens = 0;
	>>


	#lexclass START
	#token  WS		"[\ \t]*"
	#token  ALLELSE		"~[\ \t]*"	<< numtokens++; >>
	#token  ENDOFFILE	"@"
			<<
				printf ( "Total tokens: %d\n", numtokens );
			>>

There is a special top-level action called the "header", which is included in each of the C files generated while building a PCCTS-based compiler. Because of this, you should NOT put declarations that allocate storage in the header action: every generated file would then define the same variable, and the link can fail with multiple-definition errors. You should, however, put extern declarations for variables, typedefs, and struct descriptions which are used throughout the compiler in the header action. The allocating definitions for those variables should then be placed in a non-header action:


	#header
	<<
	#define ZZCOL
	extern int numtokens;
	>>

	<<
	int numtokens = 0;

	int main (int argc, char *argv[])
	{
        	ANTLR (prog(), stdin);
        	return(0);
	}
	>>


	#lexclass START
	#token  WS		"[\ \t]*"
	#token  ALLELSE		"~[\ \t]*"	<< numtokens++; >>
	#token  ENDOFFILE	"@"
			<<
				printf ( "Total tokens: %d\n", numtokens );
			>>

Of course, placing an "extern" declaration in a file doesn't hurt, so you may want to place extern declarations in the header just to be sure the item is declared everywhere it is needed.
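
The difference between the two kinds of declaration can be seen in plain C, independent of PCCTS. The sketch below puts everything in one file for brevity: the extern lines stand in for what a header action would copy into every generated file, and the initialized line stands in for the single definition in a non-header action. count_token is a hypothetical helper playing the part of a lexical action body.

```c
#include <assert.h>
#include <stdio.h>

/* What a header action would contribute to every generated file:
   declarations only, which reserve no storage. */
extern int numtokens;
extern int numtokens;   /* repeating a declaration is harmless */

/* What exactly one non-header action contributes: the one definition
   that actually allocates the variable. */
int numtokens = 0;

/* Hypothetical stand-in for the body of a lexical action. */
int count_token(void)
{
    return ++numtokens;
}

/* Report the running total, as the ENDOFFILE action above does. */
void report_tokens(void)
{
    assert(numtokens >= 0);
    printf("Total tokens: %d\n", numtokens);
}
```

If the initialized definition were repeated in a second source file, the two files would each compile but the linker would normally reject the duplicate symbol, which is the situation the header rule is meant to avoid.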

Predefined symbols in lexical actions

Certain functions, variables, and macros are defined by DLG for use in lexical actions. They include functions for manipulating the recognized token text, appending characters onto the current token, skipping characters, and changing the data stream from which characters are read. Also included are global variables which point to the beginning, end, and text of the current token, count the number of lines read, keep the last read character, and hold other information. Other symbols affect the operation of the lexer when they are defined or used.

Here are a few excerpts from Table 16 in Terence Parr's book "Language Translation Using PCCTS and C++":

int zzline
The current line number being scanned by DLG. This variable must be maintained by the user, normally by incrementing it upon matching a newline character. Note that you must #define ZZCOL to use zzline.

zzmore (void)
This function merely sets a flag that tells DLG to continue looking for another token; future characters are appended to zzlextext.

zzskip (void)
This function merely sets a flag that tells DLG to continue looking for another token; future characters are not appended to zzlextext.

zzadvance (void)
Instruct DLG to consume another input character. zzchar will be set to this next character.

int zzchar
The most recently scanned character.

char *zzlextext
The entire lexical buffer containing all characters matched thus far since the last token type was returned. See zzmore() and zzskip().

ZZCOL
Define this preprocessor symbol to get DLG to track the column numbers.

zzmode (int m)
Set the lexical mode (i.e., lexical class or automaton) corresponding to a lex class defined in an ANTLR grammar with the #lexclass directive.

void (*zzerr) (char *)
You can set zzerr to point to a routine of your choosing to handle lexical errors (e.g., when the input does not match any regular expression).

These functions, variables, and symbols may be used as you would expect within lexical actions.
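
The descriptions of zzmore() and zzskip() differ only in what happens to the text buffer, which is easy to miss. The toy model below, in plain C, is one reading of those two descriptions and not DLG's implementation: toy_lexer, toy_match, and the TOY_* constants are invented for this sketch, and the caller supplies the matched pieces by hand instead of a real scanner.

```c
#include <assert.h>
#include <string.h>

/* What an action asks for after a match (hypothetical names). */
enum toy_request { TOY_RETURN, TOY_MORE, TOY_SKIP };

struct toy_lexer {
    char text[64];   /* plays the role of zzlextext */
    size_t len;      /* number of characters currently in text */
    int fresh;       /* nonzero: the next match starts an empty buffer */
};

void toy_init(struct toy_lexer *lx)
{
    assert(lx != NULL);
    lx->text[0] = '\0';
    lx->len = 0;
    lx->fresh = 1;
}

/* Record one matched piece of input, then apply the action's request.
   Returns 1 when a token is complete, 0 when scanning should continue. */
int toy_match(struct toy_lexer *lx, const char *piece, enum toy_request req)
{
    size_t n;

    assert(lx != NULL && piece != NULL);
    n = strlen(piece);

    if (lx->fresh) {                 /* previous text was returned or skipped */
        lx->len = 0;
        lx->text[0] = '\0';
    }
    if (lx->len + n >= sizeof lx->text)
        n = sizeof lx->text - 1 - lx->len;   /* truncate rather than overflow */
    memcpy(lx->text + lx->len, piece, n);
    lx->len += n;
    lx->text[lx->len] = '\0';

    /* Only a "more" request keeps the buffer for the next match. */
    lx->fresh = (req != TOY_MORE);
    return req == TOY_RETURN;
}
```

Feeding this model leading blanks with TOY_SKIP, then an opening quote and the string body with TOY_MORE, then the closing quote with TOY_RETURN leaves the whole quoted string in text with the blanks dropped. That is the usual division of labour: skip for whitespace and comments, more for tokens assembled from several matches.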