Friday, March 2, 2012

Computability and the Underpinnings of Computer Science

This week I concluded a course I took this quarter that covered the theory behind computer science. How comprehensive the course's coverage of a whole field's underpinnings was, I'm not qualified to speculate (but my guess would be: not very). With that said, it did cover a lot of interesting material. The course covered:

  • The Chomsky hierarchy of formal languages.
    • Regular languages/(non-deterministic and deterministic) finite state automata/regular expressions/left-linear and right-linear context-free grammars.
      • Pumping lemma for proving non-regularity of languages
    • Deterministic push-down automata (and whatever class of languages these are equivalent to)
    • Context-free grammars/push-down automata
      • Ambiguity
      • Griebach Normal Form
      • Chomsky Normal Form and membership test Cocke-Younger-Kasami (CYK) algorithm.
      • Pumping lemma for proving language to be not context-free.
    • An offhand mention of context-sensitive language
    • Recursively enumerable languages/Turing Machines.
  • Computability (unfortunately we were not able to solve does P=NP and win that new car).
    • P time
    • NP time
    • NP-hard vs. NP-complete vs. NP
    • Decidability vs. intractability
Overall I enjoyed the class a lot. I felt the course was not rigorous in its mathematical treatment of a lot of topics, but the professor did provide an alternative practical viewpoint which I appreciated (he was a compiler's guy, so he found grammars to be particularly interesting and snuck in a few question about LL parsers).

What I was disappointed about was the treatment of computability and exactly what the heck it means to be P or NP. I came away feeling only a little less confused about the problem than I was when I started the class. 

As always, I went to google. I found a great explanation on stackoverflow (scroll to grom's answer) that was a good balance between layman's terms and formal terms I learned in class. 

From my understanding now:
  • P problems are a class of problems where there is a known algorithm that can provide an answer/solution to the problem in polynomial time. Polynomial time can be O(1), O(n), or O(n^100000000). It doesn't matter.
  • NP problems are a class of problems where we can guarantee there exists a solution, but we can't compute it in polynomial time; it can only be computed in non-deterministic polynomial time. What that means is that there exists a finite, polynomial amount of steps to find the solution but we can't describe a polynomial algorithm to reach it. We have to resort to an exhaustive, "brute-force" search for the solution. This is a problem because some NP problems are intractable which means even though it's possible to find an answer it may take so long it's not even worth it.
  • NP-complete problems are problems that are NP but have the special property that all other NP problems can be reduced to them in polynomial time. This "simply" means any NP problem can be transformed into the NP-complete problem using a polynomial time algorithm. 
  • NP-hard problems are problems that are at least as difficult to solve as the hardest NP problem(s). Note, this class of problems includes problems that are harder than any NP problem (which means they may not even be solvable in non-deterministic polynomial time).
So the big question is: does P=NP, meaning are all P problems equivalent to NP problems? If this is true, that means a lot of hard problems we haven't been able to reasonably compute would be solvable in an amount of time that's worthwhile. An example of this type of problem would be breaking cryptography and security codes which could be a severely good thing or bad thing (depending who you are). And so I guess that's why the Clay Mathematics Institute is willing to give any person who solves whether P=NP $1 million for their efforts.

No comments:

Post a Comment