CSCI 301 L22 Notes

Lecture 22 - Notes

Goals

Know the definition of and notation for alphabet, symbol, string, and language.
Know how to interpret a diagram of a finite automaton (FA)
Know the formal definition of a FA
Know how to determine if a finite automaton is deterministic
Be able to construct a DFA that accepts simple languages

Announcements

A5 and Week 5 Survey due tomorrow night
A6 out circa Wednesday night, due the following Wednesday

Automata Theory

As we discussed last time, the theory of computation studies computability and complexity, both of which relate to whether and how efficiently a computer can solve a certain problem (or class of problems).

To study these, we need to formalize and define what we mean by both of these things: computer and problem.

We saw an example of a state machine, a kind of automaton that determines its state in response to some inputs, and reaches an “accept state” depending on its inputs. Automata like these comprise our mathematical model(s) of computers, and we’ll see several types of automata with varying levels of computational power.

To formalize our notion of problems, we will focus on one particular form of problem: language acceptance. Informally, this means that a machine will be able to process a sequence of characters (a string) and determine whether it belongs to a certain family of strings (a language).

For example, a machine might be designed to “accept” strings representing binary numbers that are odd (i.e., they end in the digit 1). In our toll gate example, the automaton accepts the language of strings representing sequences of coins that add up to at least $15\textcent$.

This may seem restrictive, and in some sense it is. However:

it gives us a basis to study questions of theoretical interest
many (or perhaps all) computable problems can be formulated as language acceptance problems
problems of this form do come up often in many practical applications and fields of computer science, including programming languages (parsers, compilers, interpreters), natural language and string processing, pattern matching, security, and so on.

Alphabets, Strings, and Languages

Let’s now get more formal about what these machines can do; we will abstract away the specifics of the problem by rephrasing it in terms of the language accepted by the machine. Some definitions:

In the context of strings, an alphabet is a finite set, and its members are called symbols. Symbols can be anything, really, but they are usually represented as numbers or letters.
- Examples: $\Sigma = \{0, 1\}$, and $\Sigma = \{a, b, c, \ldots, z\}$.
A string $w$ over an alphabet $\Sigma$ is a finite (ordered) sequence of symbols, where each symbol is an element of the alphabet. For example, $110$ and $0010$ are strings over the alphabet $\Sigma = \{0, 1\}$.
The length of a string $w$, written $|w|$, is the number of symbols in the string. For example, $|110| = 3$ and $|0010| = 4$.
The empty string, written $\epsilon$, is the string whose length is zero.
A language over an alphabet $\Sigma$ is a set of strings over $\Sigma$. For example: $\{1, 0, 01, 10\}$ is a language over the alphabet $\{0, 1\}$.
The set of all strings (of any length) that can be made from an alphabet $\Sigma$ is written $\Sigma^*$. So formally $L$ is a language over $\Sigma$ if $L \subseteq \Sigma^*$.
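The definitions above can be tried out in a few lines of Python. This is only a sketch under some assumptions that are not part of the notes: symbols are modeled as single-character Python strings, a string over $\Sigma$ is an ordinary `str`, $\epsilon$ is `""`, and the helper names `is_string_over` and `strings_up_to` are made up for illustration. Since $\Sigma^*$ is infinite, the code only enumerates its strings up to a chosen length.

```python
from itertools import product

def is_string_over(w, alphabet):
    """True if every symbol of w belongs to the alphabet."""
    return all(symbol in alphabet for symbol in w)

def strings_up_to(alphabet, max_len):
    """All strings over the alphabet of length 0..max_len (a finite slice of Sigma*)."""
    result = []
    for n in range(max_len + 1):
        for symbols in product(sorted(alphabet), repeat=n):
            result.append("".join(symbols))
    return result

sigma = {"0", "1"}
language = {"1", "0", "01", "10"}   # the example language from the notes

print(is_string_over("0010", sigma))                     # True
print(len("110"), len(""))                               # 3 0
print(strings_up_to(sigma, 2))                           # ['', '0', '1', '00', '01', '10', '11']
print(all(is_string_over(w, sigma) for w in language))   # True
```

The last line checks the condition $L \subseteq \Sigma^*$ for the example language: every string in it uses only symbols from $\Sigma$.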

Do Exercises Part A

Finite Automata

Given this, let’s formalize our toll gate automaton a bit. Here’s a diagram, where the input coins are represented as $n$ (nickel) for a $5\textcent$ coin and $d$ (dime) for a $10\textcent$ coin:

Let’s formally define the category of machines of which this state machine is an example.

A finite automaton (FA) is a 5-tuple $M = (Q, \Sigma, \delta, q, F)$, where

$Q$ is a finite set, whose elements are called states
$\Sigma$ is a finite set called the alphabet, whose elements are symbols
$\delta : Q \times \Sigma \rightarrow Q$ is a function, called the transition function
$q$ is an element of $Q$, called the start state
$F$ is a subset of $Q$, whose elements are called accept states

Let’s map this definition onto the machine above:

The set of states is $Q = \{s_0, s_5, s_{10}, s_{15+}\}$
The alphabet is $\Sigma = \{d, n\}$
(let’s wait on this one until we’ve done the rest)
The start state $q = s_0$
The set of accept states is $F = \{s_{15+}\}$

Let’s look at the transition function now, which encodes the arrows in the diagram. It maps:

From $(s, x) \in Q \times \Sigma$, a pair where $s$ is the current state and $x$ is the next input symbol
to $t \in Q$, the state that the machine is in after seeing the input $x$ while in state $s$.

Recall that the transition function $\delta$ is a function, which is a relation, which is a subset of $(Q \times \Sigma) \times Q$, so we could write it out as a set. In set notation, its first 3 elements would be $\{((s_0, n), s_5), ((s_0, d), s_{10}), ((s_5, n), s_{10}), \ldots\}$; the set has 8 elements but I’m too lazy to write them all out. Since this is tedious, we can also write $\delta$ as a table:

       n    d
 s0 |  s5  s10
 s5 | s10  s15+
s10 | s15+ s15+
s15+| s15+ s15+
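One way to make the 5-tuple concrete is to write the toll gate machine down as plain Python data. This is a sketch, not the only possible encoding: the state names `"s0"`, `"s5"`, `"s10"`, `"s15+"` stand in for $s_0, s_5, s_{10}, s_{15+}$, $\delta$ is stored as a dictionary keyed by (state, symbol) pairs, and `is_well_formed` is a hypothetical helper that checks the conditions in the definition. A dime in state $s_5$ is taken to lead to $s_{15+}$, the only state in $Q$ for totals of 15 or more.

```python
# The toll gate machine M = (Q, SIGMA, DELTA, START, F) as plain Python data.
Q = {"s0", "s5", "s10", "s15+"}
SIGMA = {"n", "d"}
DELTA = {
    ("s0", "n"): "s5",      ("s0", "d"): "s10",
    ("s5", "n"): "s10",     ("s5", "d"): "s15+",
    ("s10", "n"): "s15+",   ("s10", "d"): "s15+",
    ("s15+", "n"): "s15+",  ("s15+", "d"): "s15+",
}
START = "s0"
F = {"s15+"}

def is_well_formed(states, alphabet, delta, start, accept):
    """Check the conditions in the 5-tuple definition of a finite automaton."""
    # delta must be defined on every pair in Q x Sigma, and on nothing else.
    total = set(delta) == {(s, x) for s in states for x in alphabet}
    # Every transition must land in Q.
    closed = all(t in states for t in delta.values())
    return total and closed and start in states and accept <= states

print(len(DELTA))                                   # 8
print(is_well_formed(Q, SIGMA, DELTA, START, F))    # True
```

The dictionary has one entry per cell of the table, which is where the count of 8 comes from: 4 states times 2 symbols.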

Here’s one more definition.

A finite automaton is deterministic if $\delta$ is a function; this means that if the machine is in state $r$ and reads input $a$, then $\delta(r, a)$ sends the machine to exactly one new state. Later we’ll see nondeterministic finite automata, which loosen this requirement.
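The "exactly one new state" condition can be checked mechanically if the transitions are given as a relation, that is, a set of ((state, symbol), target) pairs. The sketch below assumes that representation; the function name `is_deterministic` and the two-state example machine are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def is_deterministic(states, alphabet, relation):
    """relation is a set of ((state, symbol), target) pairs.

    Deterministic means every (state, symbol) pair has exactly one target.
    """
    targets = defaultdict(set)
    for (state, symbol), target in relation:
        targets[(state, symbol)].add(target)
    return all(len(targets[(s, x)]) == 1 for s in states for x in alphabet)

states = {"a", "b"}
alphabet = {"0", "1"}
dfa_relation = {
    (("a", "0"), "a"), (("a", "1"), "b"),
    (("b", "0"), "a"), (("b", "1"), "b"),
}
# A second target for ("a", "1") breaks the "exactly one" condition.
not_a_function = dfa_relation | {(("a", "1"), "a")}

print(is_deterministic(states, alphabet, dfa_relation))     # True
print(is_deterministic(states, alphabet, not_a_function))   # False
```

Under this check, a relation that leaves some (state, symbol) pair with no target at all also fails, since the pair then has zero targets rather than one.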

Do Exercises Part B

(End of material covered in L22)

The Language Accepted by a FA

Informally: a finite automaton accepts a string if the automaton can begin in the start state, process each symbol in the string, and end in an accept state.

Formally:

Definition: Let $M = (Q, \Sigma, \delta, q, F)$ be a finite automaton and let $w = w_1 w_2 w_3 \ldots w_n$ be a string over $\Sigma$. Define a sequence of states $r_0, r_1, \ldots, r_n$ as follows:

$r_0 = q$ (the start state)
$r_{i+1} = \delta(r_i, w_{i+1})$ for $i = 0, 1, \ldots, n-1$

If $r_n \in F$, then $M$ accepts $w$.

If $r_n \not\in F$, then $M$ rejects (or does not accept) $w$.
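As a worked example of this definition, consider the toll gate machine from earlier (taking a dime in state $s_5$ to lead to $s_{15+}$, the only state in $Q$ for totals of 15 or more). For the string $w = nd$, of length 2, we have $w_1 = n$ and $w_2 = d$, and the sequence of states is:

\[ r_0 = s_0, \qquad r_1 = \delta(r_0, w_1) = \delta(s_0, n) = s_5, \qquad r_2 = \delta(r_1, w_2) = \delta(s_5, d) = s_{15+} \]

Since $r_2 = s_{15+} \in F$, the machine accepts $nd$. For the string $w = n$, of length 1, the sequence is $r_0 = s_0$, $r_1 = \delta(s_0, n) = s_5$, and since $s_5 \not\in F$, the machine rejects $n$.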

The language accepted by a machine $M$ is the set of all strings accepted by the machine: \[ L(M) = \{w : \text{$w$ is a string over $\Sigma$ and $M$ accepts $w$}\} \]
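The formal definition translates almost line for line into code. The sketch below is one possible rendering and uses a small machine that is not drawn in these notes: a hypothetical two-state DFA for the earlier example of odd binary numbers (strings over $\{0, 1\}$ that end in 1). The state names `"even"` and `"odd"` and the function names `run` and `accepts` are assumptions made for illustration.

```python
def run(delta, start, w):
    """Return the state sequence r_0, r_1, ..., r_n for the input string w."""
    states = [start]                                   # r_0 = q
    for symbol in w:
        states.append(delta[(states[-1], symbol)])     # r_{i+1} = delta(r_i, w_{i+1})
    return states

def accepts(delta, start, accept, w):
    """M accepts w exactly when the final state r_n is in F."""
    return run(delta, start, w)[-1] in accept

# Hypothetical DFA: the state records whether the last symbol read was a 1.
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "even",  ("odd", "1"): "odd",
}
start = "even"
accept = {"odd"}

print(run(delta, start, "101"))              # ['even', 'odd', 'even', 'odd']
print(accepts(delta, start, accept, "101"))  # True
print(accepts(delta, start, accept, "110"))  # False
print(accepts(delta, start, accept, ""))     # False
```

For the empty string the sequence is just $r_0$, so $\epsilon$ is accepted only when the start state is itself an accept state, which is not the case here.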

Do Exercises Part C