Home-Adhar

Formal Languages and Computability (csc360)

Fall 2020
Dr. Gur Saran Adhar

Module 3 (Text: Chapters 5, 6, and 7):

In this module and the next, we define increasingly powerful models of computation, more and more sophisticated devices for accepting and generating languages. First, we study a very restricted model of computation called a finite automaton . What makes it such a restricted model is that it has no memory outside its fixed central processor. Strings are fed into the device by means of an input tape , divided into squares, with one symbol inscribed in each square. The finite control can be, at one moment, in one of a finite number of states . The control can sense what symbols are on the tape by means of a movable reading head . Initially, the reading head is placed at the left-most square of the tape, and the control is set in a designated initial state . At regular intervals the automaton reads a symbol from the tape and enters a new state. A deterministic finite automaton (dfa) is one in which the state entered is determined only by the current state and symbol just read. After reading one symbol the head moves to the right one square. This process is repeated until the whole string is read. The automaton indicates its approval or disapproval of the string in terms of the state at the end: if the final state it's in is an accepting state , the string is accepted; otherwise it is rejected.

Adding non-determinism to this computational model means that the automaton is allowed to change its states in a way that is only partly determined. It may choose to go into any one of a set of legal states. Non-deterministic finite automata (nfa) are not realistic computational models, but they give us a simpler way to describe acceptors. In fact, a somewhat remarkable result we will study is that non-determinism is an inessential feature of computation, in the sense that every nfa is equivalent to a dfa.

We have a variety of powerful techniques to show that a language is regular.

We also have a technique to show that languages are not regular. This technique is based on a property that all regular languages have, viz., that every string of a regular language that is larger than a certain size must have a part that can be repeated an arbitrary number of times, with the result of this repetition, or pumping , also being a string in a language. This corresponds intuitively to the fact that to accept a string of this size, a path through a state diagram of a dfa or nfa must contain a cycle. Repeating the same cycle any number of times cannot turn a string of a language into one that is not. The Pumping Lemma states that all regular languages have this property. To prove that a language is not regular, therefore, we can show that it does not have this property.

The theory of finite automata is rich and elegant, and understanding them puts us in a better position to understand the effects of adding more memory. Another reason for studying finite automata is their applicability to the design of several common types of computer algorithms and programs, such as a lexical analyzer of a compiler.

Not every context-free language can be recognized by finite automaton, since some context-free languages are not regular. Recall, for example, the non-regular language {ww^R : w in {a,b}}. It would seem that to recognize this language, a device must be added to a finite automaton that would "remember" what it scanned in the first part of the string to check that the second half has the right pattern. The required storage device need not be a general-purpose one; a simple stack, or "pushdown store" would be enough. The idea of an automaton with a stack as auxilliary storage is formalized in the notion of a pushdown automaton , or PDAs. A PDA does what a NFA does (scans symbols and changes state) and also "pushes" and "pops" things from the stack. An NFA can be viewed trivially as a PDA that never operates on its stack; in this sense, PDAs are more general.

In this module we will show that PDAs are exactly what we need to accept arbitrary context-free languages. This fact is of mathematical and practical importance: mathematical because it connects together two views of the same class of languages; and practical because it lays the foundation for the study of syntax analysers, like Bison for "real" context-free languages such as programming languages.

Infinite context=free languages display periodicity in more subtle ways than do regular languages. To understand this periodicity we examine the structure of derivation trees of derivations from context-free grammars. This structure allows us to formulate a pumping lemma for context-free languages. The pumping lemma gives us a proof technique for showing that languages are not context-free. Finally, we look at closure properties of context-free languages, which give us a proof technique for showing that langauges are context-free.

By the end of this module, the student should know the following:

What finite automata are, how to transform and simplify their descriptions.
Relations between DFAs and regular languages.
The pumping lemma for regular languages and its application to proofs that languages are not regular.
Pushdown automata, and their relation to context-free languages.
The pumping lemma for context free languages and its application to proofs that languages are not context-free.

These are some problems the student should focus on during this module:

Building state diagrams of DFAs and NFAs from their description (Sudkamp, p. 188, problem 2) Solution for problem 2: (c) b*((a+b)+b+)* ; (d) b*((a+b)+b+)* U b*(a+b)+b+a+
Drawing machine state diagrams from a description of some string language Sudkamp, p. 189, problems 6, 10.

Solution for problem 6. How many times does the substring "aa" occur in the string "aaa"? On page 40, the concept of "substring" is defined. Based on this definition, it makes sense to say that "aa" occurs twice in "aaa", since there are two different decompositions of "aaa" such that "aa" is part of the decomposition. So, in the solution, you should include anystring containing "aaa" as part of the language.
Solution for problem 10. Consider the machine with three states, q0 q1, and q2, with q0 being both the start and the only terminal state. Let the transition function t be defined as follows:

t[q0,a]=t[q0,c]=q0 ; t[q0,b]=q1.
t[q1,a]=t[q1,b]=q2 ; t[q1,c]=q0
t[q2,a]=t[q2,b]=t[q2,c]=q2

Given an NFA-lambda, construct an equivalent DFA (Sudkamp, p. 192, problem 30) (d) a U b* U ab+ U aba+b* Given a DFA, construct an equivalent minimal state DFA (Sudkamp, p. 194, problem 39)
Explore the closure properties of regular languages (Sudkamp, p. 223, problem 7. (a) the set of strings in E = {a,b,c} that contain an a can be represented by the regular expression {a u b u c}*a{a u b u c}*. This language is therefore regular. Intersecting this language with L is the language {w | w in L and w contains an a}. By the closure properties of regular languages, this language is regular.
Prove that languages are not regular (without using pumping lemma). (Problem 10, p. 224). (b) To show that the set P of even length palindrones is not regular, we need to find strings ui and vi that satisfy

uivi is in P
uivj is not in P, i <> j.

Let ui = a^ib and vi = ba^i. Then uivi is the palindrone a^ibba^i. For i <> j, the string uivj is not in P. Therefore, by Corollary 7.2.2, P is not regular.

Use the pumping lemma to show that a set of strings is not regular. (b) Assume that L = {a^nb^m | n < m} is regular. Let k be the number specified by the pumping lemma and llet z be the string a^kb^k+1. This implies that z can be written uvw, where

v <> lambda
length(uv) =< k
uv^iw is in L, for all i >= 0.

By the second condition, v consists solely of a's. Pumping v produces the string uv^2w that contains at least as many a's as b's, and therefore is not in L. So, L is not regular by the pumping lemma.

Build Pushdown Automata to accept context-free languages (Problem 3, p. 252).
Given a context-free grammar, create a PDA that accepts it (Problem 12, p. 254) Constructing a grammar for a language
Use the pumping lemma for Context-free languages to prove that a language is not context-free (problem 17, p. 255) (a) Assume that the language L consisting of strings over {a} whose lengths are a perfect square is a context-free language. By the pumping lemma, there is a number k such that every string in L with length k or more can be written uvwxy where

length(vwx) =< k;
v and x are not both null; and
uv^iwx^iy is in L, for i greater than 0.

The string a^n, where n = k^2, must have a decomposition that satisfies these three conditions. Consider the length of the string z = uv^2wx^2y obtained by pumping uvwxy. The length of z = length of uv^2wx^2y = length uvwxy + length of v + length of x = k^2 + length of v + length of x, which is less than or equal to k^2 + k. This last value is strictly less than (k+1)^2, the next perfect square after k^2. This means that the length of z is strictly greater than k^2 and less than (k+1)^2, which means it is not a perfect square, hence not in L. This contradicts the pumping lemma, which means that L is not context free.

Formal Languages and Computability (csc360)

Fall 2020 Dr. Gur Saran Adhar

Module 3 (Text: Chapters 5, 6, and 7):

Fall 2020
Dr. Gur Saran Adhar