Context-Free Grammar

  • A context-free grammar (CFG) is a formal grammar in which each production rule is of the form , where and .
  • Let be a CFG.
    • If and is a rule, then we say that yields , written
    • The language (which is the set of all strings generated by ) is called the language generated by and is said to be a context-free language (CFL).
    • Given a string :
      • A derivation of in is a sequence of substitutions of the form:
        • , where each is in
        • If such a derivation for exists in , we say that generates (or derives ) and write
      • is said to be derived ambiguously in if it has two or more different leftmost derivations.
      • A derivation of is a leftmost derivation if at every step the leftmost remaining variable is the one replaced.
      • The following are equivalent:
        • There exists a parse tree for with respect to
      • A parse tree for with respect to is a rooted ordered tree in which:
        • The root is labeled by the start variable
        • The leaves are labeled by the symbols of
        • Each internal node is labeled by a variable and has children labeled if is a rule in
      • There is a bijection between the set of left-most derivations of and the set of parse trees for with respect to .
    • is ambiguous if it generates at least one string ambiguously.
      • A CFG is ambiguous iff it generates some string with two different parse trees.
    • A CFG is in Chomsky normal form if every rule is of the form , , or , where , , and is the start variable, and and are not .
      • If is a CFG in Chomsky normal form, and , then , where is the height of the parse tree for .
      • (problem 2.26) If is a CFG in Chomsky normal form, and , where , then every derivation of requires exactly steps.
      • (2.9) Every CFL is generated by a CFG in Chomsky normal form.
      • iff its CNF contains the rule .todo
  • A language is said to be a context-free language (CFL) if it is generated by some CFG.
    • A CFL is inherently ambiguous if all CFGs that generate it are ambiguous.
    • Every CFL is generated by a CFG in Chomsky normal form.
    • (2.32) Every regular language is context free.
  • For every language , the following are equivalent:
    • is a context-free language (CFL)
    • is generated by some CFG
    • (2.20) is recognized by some PDA
    • is in Chomsky normal formtodo
  • Two CFGs and are said to be equivalent if .
    • Remark: There can be multiple CFGs that generate the same language, moreoever, some of these may be ambiguous while others are unambiguous.
  • (2.34, Pumping lemma) If is a CFL, then there exists a number (the pumping length) such that any string with can be written as , where:
  • The class of CFLs is closed under: union, concatenation, Kleene star, reversal, and intersection with regular languages. However, it is not closed under intersection or complementation.

Pushdown Automata (PDA)

  • A pushdown automaton (PDA) is a 6-tuple , where , , , and are all finite sets, and
    • is the set of states
    • is the input alphabet
    • is the stack alphabet
    • is the transition function
    • is the start state
    • is the set of accept states
  • A pushdown automaton accepts a string if there exists a sequence of states and strings such that:
    • and
    • For , we have , where and for some and .
  • The set of strings accepted by is the language denoted by , and is called the language recognized by .
  • A PDA can be represented using a state diagram, where each transition is labeled by the notation "" to denote that the PDA:
    • Reads the symbol from the input (or read nothing if )
    • Pops the symbol from the stack (or pops nothing if )
    • Pushes onto the stack (or pushes nothing if )

CFL to CFG

  • Given a CFL , we can construct a CFG such that .todo

CFL to PDA

  • Given a CFL , we can construct a PDA such that .todo

CFG to CNF

  • (2.9) Given a CFG we can construct an equivalent CFG in Chomsky normal form
    • Add new start variable and new rule
    • For each an -rule, , where
      • Remove
      • For every rule , add the rules:
    • For each unit rule
      • Remove
      • For each rule , where
        • Add the rule , unless this was a unit rule previously removed
    • For each rule , where and ,
      • Replace with the rules, , where ‘s are new variables.
        • Repalce each terminal in the previous rule(s) with the new variable and add the rule

DFA to CFG

  • Let be a DFA. We can convert to a CFG as follows:
    1. Make a variable for each state
    2. Add a rule to the CFG if
    3. Add a rule if
    4. Make the start variable of the CFG, where is the start state of