\documentclass[12pt]{article} \usepackage{amsfonts} %\documentstyle[12pt,amsfonts]{article} %\documentstyle{article} \setlength{\topmargin}{-.5in} \setlength{\oddsidemargin}{0 in} \setlength{\evensidemargin}{0 in} \setlength{\textwidth}{6.5truein} \setlength{\textheight}{8.5truein} %\input ../basicmath/basicmathmac.tex % %\input ../adgeomcs/lamacb.tex \input ../adgeomcs/mac.tex \input ../adgeomcs/mathmac.tex \def\fseq#1#2{(#1_{#2})_{#2\geq 1}} \def\fsseq#1#2#3{(#1_{#3(#2)})_{#2\geq 1}} \def\qleq{\sqsubseteq} % \begin{document} \begin{center} \fbox{{\Large\bf Spring, 2009 \hspace*{0.4cm} CIS 511}}\\ \vspace{1cm} {\Large\bf Introduction to the Theory of Computation\\ Jean Gallier \\ \vspace{0.5cm} Homework 2}\\[10pt] February 5, 2009; Due February 19, 2009\\ \end{center} ``A problems'' are for practice only, and should not be turned in. \medskip\noindent {\bf Problem A1.} Recall that two regular expressions $R$ and $S$ are equivalent, denoted as $R \cong S$, iff they denote the same regular language ${\cal L}[R] = {\cal L}[S]$. Show that the following identities hold for regular expressions: $$\eqaligneno{ R^{**} &\cong R^{*}\cr (R + S)^{*} &\cong (R^{*} + S^{*})^{*}\cr (R + S)^{*} &\cong (R^{*}S^{*})^{*}\cr (R + S)^{*} &\cong (R^{*}S)^{*}R^{*}\cr }$$ \vspace {0.25cm}\noindent {\bf Problem A2.} Recall that a homomorphism $\mapdef{h}{\Sigma^*}{\Delta^*}$ is a function such that $h(uv) = h(u)h(v)$ for all $u, v\in\Sigma^*$. Given any language, $L\subseteq \Sigma^*$, we define $h(L)$ as $$h(L) = \{h(w)\ |\ w\in L\}.$$ %Given any language $L'\subseteq \Delta^*$, we define $h^{-1}(L')$ as %$$h^{-1}(L') = \{w\in \Sigma^*\ | \ h(w) \in L'\}.$$ \medskip Prove that if $L\subseteq \Sigma^*$ is a regular language, then so is $h(L)$. \medskip\noindent {\bf Problem A3.} Construct an NFA accepting the language $L = \{aa, aaa\}^*$. Apply the subset construction to get a DFA accepting $L$. \vspace {0.4cm} ``B problems'' must be turned in. \vspace{0.25cm}\noindent {\bf Problem B1 (25 pts).} Let $\Sigma = \{a_1,\ldots, a_n\}$ be an alphabet of $n$ symbols. \medskip (a) Construct an NFA with $2n+1$ (or $2n$) states accepting the set $L_n$ of strings over $\Sigma$ such that, every string in $L_n$ has an odd number of $a_i$, for some $a_i\in\Sigma$. Equivalently, if $L_{n}^{i}$ is the set of all strings over $\Sigma$ with an odd number of $a_i$, then $L_n = L_{n}^{1}\cup \cdots \cup L_{n}^{n}$. \medskip (b) Prove that there is a DFA with $2^n$ states accepting the language $L_n$. \medskip (c) Prove that every DFA accepting $L_n$ has at least $2^n$ states. \medskip\noindent {\it Hint\/}: If a DFA $D$ with $k < 2^n$ states accepts $L_n$, show that there are two strings $u, v$ with the property that, for some $a_i\in\Sigma$, $u$ contains an odd number of $a_i$'s, $v$ contains an even number of $a_i$'s, and $D$ ends in the same state after processing $u$ and $v$. From this, conclude that $D$ accepts incorrect strings. \vspace{0.25cm}\noindent {\bf Problem B2 (25 pts).} (a) Let $T = \{0, 1, 2\}$, let $C$ be the set of $20$ strings of length three over the alphabet $T$, \[ C = \{u\in T^3 \mid u \notin \{110, 111, 112, 101, 121, 011, 211\}\}, \] let $\Sigma = \{0, 1, 2, c\}$ and consider the language \[ L_M = \{w \in \Sigma^* \mid w = u_1cu_2c \cdots cu_n,\, n \geq 1, u_i \in C\}. \] Prove that $L$ is regular. \medskip (b) The language $L_M$ has a geometric interpretation as a certain subset of $\reals^3$ (actually, $\rats^3$), as follows: Given any string, $w = u_1cu_2c \cdots cu_n \in L_M$, denoting the $j$th character in $u_i$ by $u_i^j$, where $j\in \{1, 2, 3\}$, we obtain three strings \begin{eqnarray*} w^1 & = & u_1^1u_2^1 \cdots u_n^1 \\ w^2 & = & u_1^2u_2^2 \cdots u_n^2 \\ w^3 & = & u_1^3u_2^3 \cdots u_n^3. \end{eqnarray*} For example, if $w = 012c001c222c122$ we have $w^1 = 0021$, $w^2 = 1022$, and $w^3 = 2122$. Now, a string $v\in T^+$ can be interpreted as a decimal real number written in base three! Indeed, if \[ v = b_1b_2\cdots b_k, \quad\hbox{where}\quad b_i\in \{0, 1, 2\} = T\>\> (1\leq i \leq k), \] we interpret $v$ as $n(v) = 0.b_1b_2\cdots b_k$, i.e., \[ n(v) = b_1 3^{-1} + b_2 3^{-2} + \cdots + b_k 3^{-k}. \] Finally, a string, $w = u_1cu_2c \cdots cu_n \in L_M$, is interpreted as the point, $(x_w, y_w, z_w)\in \reals^3$, where \[ x_w = n(w^1),\> y_w = n(w^2),\> z_w = n(w^3). \] Therefore, the language, $L_M$, is the encoding of a set of rational points in $\reals^3$, call it $M$. This turns out to be the rational part of a fractal known as the {\it Menger sponge\/}. \medskip Explain the best you can what are the recursive rules to create the Menger sponge, starting from a unit cube in $\reals^3$. Draw some pictures illustrating this process and showing approximations of the Menger sponge. \medskip\noindent {\bf Extra Credit (20 points).} Write a computer program to draw the Menger sponge (based on the ideas above). \vspace {0.25cm}\noindent {\bf Problem B3 (50 pts).} Recall that for two regular expressions $R$ and $S$, $R\cong S$ means that $\s{L}[R] = \s{L}[S]$, i.e. $R$ and $S$ denote the same language. \medskip (i) Show that if $R \cong S^{*}T$, then $R \cong SR + T$. \medskip (ii) Assume that $\epsilon$ (the empty string) is not in the language $\s{L}[S]$ denoted by the regular expression $S$. \medskip Show that if $R \cong SR + T$, then $R \cong S^{*}T$. \medskip\noindent {\it Hint\/}: Prove that $x \in \s{L}[R]$ if and only if $x \in \s{L}[S^{*}T]$, by observing that $R \cong SR + T$ implies that for every $k\geq 0$, $$R \cong S^{k+1}R + (S^{k} + S^{k-1} + \ldots + S^{2} + S + \epsilon)T,$$ and that since $\epsilon \notin \s{L}[S]$, every string in $\s{L}[S^{k+1}R]$ has length at least $k+1$. \medskip (iii) Consider the following system of equations where $X_1,\ldots,X_n$ are variables standing for regular expressions, and the $S_{i, j}$ and the $T_i$ are regular expressions: $$\eqaligneno{ X_1 &\cong S_{1, 1}X_1 + \cdots + S_{1, n}X_n + T_1,\cr \cdots &\cong \cdots \cr X_n &\cong S_{n, 1}X_1 + \cdots + S_{n, n}X_n + T_n.\cr }$$ If $\epsilon\notin\s{L}[S_{i, j}]$ for all $i, j$, $1\leq i, j \leq n$, prove that this system has a unique solution. Do you see any connection between this problem and the node elimination algorithm? \medskip\noindent {\it Hint\/}: Eliminate the $X_i$ one by one. \vspace {0.5cm}\noindent {\bf Problem B4 (20 pts).} Let $R$ be any regular language over some alphabet $\Sigma$. Prove that the language $$L = \{u\ \mid \exists v\in\Sigma^*,\, uv\in R,\, |u| = |v|\}$$ is regular \vspace {0.25cm} \noindent {\bf Problem B5 (60 pts).} let $\Sigma$ be an alphabet. For any language $L$ and any string $x\in\Sigma^*$, the {\it left derivative of $L$ w.r.t. $x$\/,} denoted by $x\backslash L$, or $D_x L$, or $\myfrac{dL}{dx}$, is the language \[D_x L = \{y\in \Sigma^*\ |\ xy\in L\}.\] \medskip (1) Prove the following identities for all languages $L$, $A$, $B$ over $\Sigma$: \[ \eqaligneno{ D_{xy} L &= D_y (D_x L),\cr D_{\epsilon} L &= L,\cr D_x(A\cup B) &= D_x A \cup D_x B,\cr } \] and for every symbol $a\in \Sigma$, \[ \eqaligneno{ D_{a} (AB) &= (D_a A) B \cup (A \cap \{\epsilon\}) D_a B,\cr D_a(L^*) &= (D_a L) L^*.\cr } \] Given a regular expression $R$ and a string $x\in\Sigma^*$, we define the (left) derivative $D_x R$ of $R$ w.r.t. $x$ so that \[\s{L}[D_x R] = D_x \s{L}[R].\] We let \[D_{\epsilon} R = R \quad\hbox{and}\quad D_{xa} R = D_a(D_x R)\] where $a\in \Sigma$ and $x\in \Sigma^*$, \[D_a \emptyset = \emptyset, \quad D_a \epsilon = \emptyset, \quad D_a a = \epsilon, \quad D_a b = \emptyset,\] for all $a, b\in \Sigma$, $a\not= b$, \[ \eqaligneno{ D_a((R + S)) &= (D_a R + D_a S),\cr D_a (R^*) &= (D_a R\> R^*),\cr D_a (RS) &= \cases{ (D_a R \> S) & if $\epsilon \notin \s{L}[R]$,\cr ((D_a R \> S) + D_a S)& if $\epsilon \in \s{L}[R]$,\cr } } \] where $R, S$ are any regular expressions. \medskip (2) Give a simple algorithm to decide whether $\epsilon \in \s{L}[R]$, where $R$ is any given regular expression. \medskip Prove that every regular expression has finitely many distinct derivatives (by distinct derivatives, we mean inequivalent derivatives). \medskip\noindent {\it Hint\/}. Use an induction on the number of occurrences of the symbols from $\Sigma \cup \{\epsilon, \emptyset, +, \cdot, *\}$. When $R = (S\cdot T)$, prove that $D_x R$ is equivalent to an expression of the form \[(D_x S\> T + D_{v_1} T + \cdots + D_{v_k} T),\] where for every $i$, $1\leq i \leq k$, there is some $u_i\in \s{L}[S]$ such that $x = u_iv_i$. When $R = S^*$, prove that $D_x R$ is equivalent to an expression of the form \[(D_{v_1} S + \cdots + D_{v_k} S)S^*,\] where for every $i$, $1\leq i\leq k$, there is some $u_i\in \s{L}[S^*]$ such that $x = u_iv_i$. \medskip (3) Assuming that $R$ has $n$ distinct derivatives, prove that every derivative of $R$ belongs to the finite set \[\{D_x R\ |\ x\in \Sigma^*, \ 0\leq |x| < n\}.\] Show that the upper bound on the number of derivatives is a product of towers of exponentials (in terms of the length of $R$). \medskip (4) Prove that if $D$ is a DFA accepting $\s{L}[R]$ and $D$ has $n$ states, then $R$ has at most $n$ distinct derivatives. \medskip If $\nu(R)$ is the number of occurrences in $R$ of the symbols from $\Sigma \cup \{\epsilon, \emptyset, +, \cdot, *\}$, prove that $R$ has at most \[2^{2\nu(R)} \leq 4^{|R|}\] distinct derivatives (where $|R|$ denotes the length of $R$). \medskip (5) If $L$ is any regular language over $\Sigma^*$, prove that the number of states of every minimal DFA for $L$ is equal to the number of distinct derivatives, $D_u(L)$, of $L$. \medskip (6) Prove that the regular expression \[R = (a + b)^* a \underbrace{(a + b) \cdots (a + b)}_{n}\] has $\nu(R) = 3n + 5$ (if we do not count $\cdot$, otherwise, $\nu(R) = 4n + 6$) and that $R$ has $2^{n+1}$ distinct derivatives. \medskip Prove that there is a $2$-state DFA accepting the language denoted by $(a + b)^* a$ and there is an $(n + 2)$-state DFA accepting the language denoted by $\underbrace{(a + b) \cdots (a + b)}_{n}$. \\ Yet, prove that any minimal DFA for the language denoted by $R$ above has $2^{n+1}$ states. \vspace{0.5cm}\noindent {\bf TOTAL: 180 + 20 points.} \end{document}