|
Formal Languages and Computability
(csc360)
Fall 2020
Dr. Gur Saran Adhar
Module 3 (Text: Chapters 5, 6, and
7):
In this
module and the next, we define increasingly powerful models of computation,
more and more sophisticated devices for accepting and generating languages.
First, we study a very restricted model of computation called a finite
automaton . What makes it such a restricted model is that it has no
memory outside its fixed central processor. Strings are fed into the device
by means of an input tape , divided into squares, with one symbol
inscribed in each square. The finite control can be, at one moment, in
one of a finite number of states . The control can sense what symbols
are on the tape by means of a movable reading head . Initially, the
reading head is placed at the left-most square of the tape, and the control
is set in a designated initial state . At regular intervals the
automaton reads a symbol from the tape and enters a new state. A deterministic
finite automaton (dfa) is one in which the state entered is determined
only by the current state and symbol just read. After reading one symbol the
head moves to the right one square. This process is repeated until the whole
string is read. The automaton indicates its approval or disapproval of the
string in terms of the state at the end: if the final state it's in is an accepting
state , the string is accepted; otherwise it is rejected.
Adding non-determinism
to this computational model means that the automaton is allowed to change
its states in a way that is only partly determined. It may choose to go into
any one of a set of legal states. Non-deterministic finite automata (nfa) are
not realistic computational models, but they give us a simpler way to
describe acceptors. In fact, a somewhat remarkable result we will study is
that non-determinism is an inessential feature of computation, in the sense
that every nfa is equivalent to a dfa.
We have a
variety of powerful techniques to show that a language is regular.
We also have a technique to show that languages are not regular. This
technique is based on a property that all regular languages have, viz., that
every string of a regular language that is larger than a certain size must
have a part that can be repeated an arbitrary number of times, with the
result of this repetition, or pumping , also being a string in a
language. This corresponds intuitively to the fact that to accept a string of
this size, a path through a state diagram of a dfa or nfa must contain a
cycle. Repeating the same cycle any number of times cannot turn a string of a
language into one that is not. The Pumping Lemma states that all
regular languages have this property. To prove that a language is not
regular, therefore, we can show that it does not have this property.
The theory
of finite automata is rich and elegant, and understanding them puts us in a
better position to understand the effects of adding more memory. Another
reason for studying finite automata is their applicability to the design of
several common types of computer algorithms and programs, such as a lexical
analyzer of a compiler.
Not every
context-free language can be recognized by finite automaton, since some
context-free languages are not regular. Recall, for example, the non-regular
language {ww^R : w in {a,b}}. It would seem that to recognize this language,
a device must be added to a finite automaton that would "remember"
what it scanned in the first part of the string to check that the second half
has the right pattern. The required storage device need not be a
general-purpose one; a simple stack, or "pushdown store" would be enough.
The idea of an automaton with a stack as auxilliary storage is formalized in
the notion of a pushdown automaton , or PDAs. A PDA does what a NFA
does (scans symbols and changes state) and also "pushes" and
"pops" things from the stack. An NFA can be viewed trivially as a
PDA that never operates on its stack; in this sense, PDAs are more general.
In this
module we will show that PDAs are exactly what we need to accept arbitrary
context-free languages. This fact is of mathematical and practical
importance: mathematical because it connects together two views of the same
class of languages; and practical because it lays the foundation for the
study of syntax analysers, like Bison for
"real" context-free languages such as programming languages.
Infinite
context=free languages display periodicity in more subtle ways than do
regular languages. To understand this periodicity we examine the structure of
derivation trees of derivations from context-free grammars. This structure
allows us to formulate a pumping lemma for context-free languages. The
pumping lemma gives us a proof technique for showing that languages are not
context-free. Finally, we look at closure properties of context-free
languages, which give us a proof technique for showing that langauges are context-free.
By the end
of this module, the student should know the following:
- What finite automata are, how to transform and
simplify their descriptions.
- Relations between DFAs and regular languages.
- The pumping lemma for regular languages and its
application to proofs that languages are not regular.
- Pushdown automata, and their relation to context-free
languages.
- The pumping lemma for context free languages and its
application to proofs that languages are not context-free.
These are
some problems the student should focus on during this module:
- Building state diagrams of DFAs and NFAs from their
description (Sudkamp, p. 188, problem 2) Solution for problem 2: (c)
b*((a+b)+b+)* ; (d) b*((a+b)+b+)* U b*(a+b)+b+a+
- Drawing machine state diagrams from a description of
some string language Sudkamp, p. 189, problems 6, 10.
- Solution for problem 6. How many times does the
substring "aa" occur in the string "aaa"? On page
40, the concept of "substring" is defined. Based on this
definition, it makes sense to say that "aa" occurs twice in "aaa",
since there are two different decompositions of "aaa" such
that "aa" is part of the decomposition. So, in the solution,
you should include anystring containing "aaa" as part of the
language.
- Solution for problem 10. Consider the machine with
three states, q0 q1, and q2, with q0 being both the start and the only
terminal state. Let the transition function t be defined as follows:
- t[q0,a]=t[q0,c]=q0 ; t[q0,b]=q1.
- t[q1,a]=t[q1,b]=q2 ; t[q1,c]=q0
- t[q2,a]=t[q2,b]=t[q2,c]=q2
- Given an NFA-lambda, construct an equivalent DFA
(Sudkamp, p. 192, problem 30) (d) a U b* U ab+ U aba+b* Given a
DFA, construct an equivalent minimal state DFA (Sudkamp, p. 194, problem
39)
- Explore the closure properties of regular languages
(Sudkamp, p. 223, problem 7. (a) the set of strings in E = {a,b,c}
that contain an a can be represented by the regular expression {a u b u
c}*a{a u b u c}*. This language is therefore regular. Intersecting this
language with L is the language {w | w in L and w contains an a}. By the
closure properties of regular languages, this language is regular.
- Prove that languages are not regular (without using
pumping lemma). (Problem 10, p. 224). (b) To show that the set P of
even length palindrones is not regular, we need to find strings ui and
vi that satisfy
- uivi is in P
- uivj is not in P, i <> j.
Let ui = a^ib and vi = ba^i. Then
uivi is the palindrone a^ibba^i. For i <> j, the string uivj is not in
P. Therefore, by Corollary 7.2.2, P is not regular.
- Use the pumping lemma to show that a set of strings
is not regular. (b) Assume that L = {a^nb^m | n < m} is regular.
Let k be the number specified by the pumping lemma and llet z be the
string a^kb^k+1. This implies that z can be written uvw, where
- v <> lambda
- length(uv) =< k
- uv^iw is in L, for all i >= 0.
By the second condition, v
consists solely of a's. Pumping v produces the string uv^2w that contains at
least as many a's as b's, and therefore is not in L. So, L is not regular by
the pumping lemma.
- Build Pushdown Automata to accept context-free
languages (Problem 3, p. 252).
- Given a context-free grammar, create a PDA that
accepts it (Problem 12, p. 254) Constructing a grammar for a
language
- Use the pumping lemma for Context-free languages to
prove that a language is not context-free (problem 17, p. 255) (a)
Assume that the language L consisting of strings over {a} whose lengths
are a perfect square is a context-free language. By the pumping lemma,
there is a number k such that every string in L with length k or more
can be written uvwxy where
- length(vwx) =< k;
- v and x are not both null; and
- uv^iwx^iy is in L, for i greater than 0.
The string a^n, where n = k^2,
must have a decomposition that satisfies these three conditions. Consider the
length of the string z = uv^2wx^2y obtained by pumping uvwxy. The length of z
= length of uv^2wx^2y = length uvwxy + length of v + length of x = k^2 +
length of v + length of x, which is less than or equal to k^2 + k. This last
value is strictly less than (k+1)^2, the next perfect square after k^2. This
means that the length of z is strictly greater than k^2 and less than
(k+1)^2, which means it is not a perfect square, hence not in L. This
contradicts the pumping lemma, which means that L is not context free.
|