
Experiment 11

Case Study: Lex and Yacc


Aim: To study Lex and Yacc
Theory:
Lex - A Lexical Analyzer Generator
Lex helps write programs whose control flow is directed by instances of regular expressions in
the input stream. It is well suited for editor-script type transformations and for segmenting
input in preparation for a parsing routine.
Lex source is a table of regular expressions and corresponding program fragments. The table is
translated to a program which reads an input stream, copying it to an output stream and
partitioning the input into strings which match the given expressions. As each such string is
recognized the corresponding program fragment is executed. The recognition of the
expressions is performed by a deterministic finite automaton generated by Lex. The program
fragments written by the user are executed in the order in which the corresponding regular
expressions occur in the input stream.
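As a minimal sketch of such a table, a Lex source file pairs each pattern with an action between the two %% markers (the actions here are illustrative, not taken from a specific program):

```
%%
[0-9]+      { printf("number\n"); }
[a-zA-Z]+   { printf("word\n");   }
.|\n        { /* ignore everything else */ }
%%
```

Lex translates this table into a C function, yylex(), that scans the input and runs the action attached to whichever pattern matches.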
For example, a C program may contain something like:
{
int int;
int = 33;
printf("int: %d\n",int);
}

In this case, the lexical analyser would have broken the input stream into a series of "tokens", like this:
{
int
int
;
int
=
33
;
printf
(
"int: %d\n"
,
int
)
;
}
Yacc: Yet Another Compiler-Compiler
Computer program input generally has some structure; in fact, every computer program that
does input can be thought of as defining an ``input language'' which it accepts. An input
language may be as complex as a programming language, or as simple as a sequence of
numbers. Unfortunately, usual input facilities are limited, difficult to use, and often are lax
about checking their inputs for validity.
Yacc provides a general tool for describing the input to a computer program. The Yacc user
specifies the structures of his input, together with code to be invoked as each such structure is
recognized. Yacc turns such a specification into a subroutine that handles the input process;
frequently, it is convenient and appropriate to have most of the flow of control in the user's
application handled by this subroutine.
The program Yacc generates is known as a "parser". Its job is to analyse the structure of the input stream and operate on the "big picture". In the course of its normal work, the parser also verifies that the
input is syntactically sound.
Consider again the example of a C compiler. In the C language, a word can be a function name
or a variable, depending on whether it is followed by a ( or a =. There should be exactly one } for
each { in the program. Yacc stands for "Yet Another Compiler Compiler", because this
kind of analysis of text files is normally associated with writing compilers. However, as we will
see, it can be applied to almost any situation where text-based input is being used.
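As a sketch, a Yacc specification for a toy calculator might pair grammar rules with actions like this (illustrative only; a complete version would also need a matching Lex scanner and declarations for yylex and yyerror):

```
%token NUMBER
%%
line   : expr '\n'        { printf("= %d\n", $1); }
       ;
expr   : expr '+' term    { $$ = $1 + $3; }
       | term             { $$ = $1; }
       ;
term   : term '*' factor  { $$ = $1 * $3; }
       | factor           { $$ = $1; }
       ;
factor : NUMBER           { $$ = $1; }
       | '(' expr ')'     { $$ = $2; }
       ;
%%
```

Each rule describes one structure in the input; the action runs when that structure is recognized, with $1, $2, ... referring to the values of the rule's components.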
CODE AND OUTPUTS:
To Count Number of Words in a Sentence:
%{
#include<stdio.h>
#include<string.h>
int i = 0;
%}
/* Rules Section */
%%
[a-zA-Z0-9]+ {i++;} /* Rule for counting number of words */
"\n" {printf("%d\n", i); i = 0;}
%%
int yywrap(void) { return 1; }
int main()
{
// The function that starts the analysis
yylex();
return 0;
}
To Convert Lowercase to Uppercase (and Vice Versa):
%{
#include<stdio.h>
%}
lower [a-z]
CAPS  [A-Z]
space [ \t\n]
%%
{lower} {printf("%c", yytext[0] - 32);} /* lowercase -> uppercase */
{CAPS}  {printf("%c", yytext[0] + 32);} /* uppercase -> lowercase */
{space} ECHO;
.       ECHO;
%%
int yywrap(void) { return 1; }
int main()
{
yylex();
return 0;
}

To Extract Tokens from Code:

%{
#include<stdio.h>
%}
%%
"if"|"else"|"while"|"do"|"switch"|"case" {printf("Keyword\n");}
[a-zA-Z][a-zA-Z0-9]* {printf("Identifier\n");}
[0-9]+ {printf("Number\n");}
"+"|"-"|"/"|"*" {printf("Operator\n");}
"!"|"@"|"&"|"^"|"%"|"$"|"#" {printf("Special Character\n");}
[ \t\n] ; /* skip whitespace */
%%
int yywrap() {
return 1;}
int main()
{
printf("Enter a string of data\n");
yylex();
return 0;
}

Conclusion:
Thus we have successfully studied the Lex and Yacc tools for tokenizing input and parsing
grammars.
