Builds on PR#1447. Adds a simple heuristic that tells Hack files from PHP files by their opening tag (`<?hh` vs. any other `<?`).
Tested by verifying that the Hack example site was detected as 100% Hack and that Laravel was detected as 100% PHP. (Without the heuristic, Laravel is detected as about 50% Hack, which is essentially chance: PHP and Hack are very hard for the classifier to distinguish unless you actually parse the file and look for specific language features.)
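For illustration, the opening-tag check described above could be sketched roughly as follows. This is a standalone Python sketch, not the code in this PR; the function name `guess_php_dialect` and the choice to look only at the first `<?` tag in the file are assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical helper for illustration only; not the implementation in this PR.
# Matches the first "<?" in the file and captures the word right after it
# ("hh", "php", or empty for short tags such as "<?=").
_OPEN_TAG = re.compile(r"<\?(\w*)")


def guess_php_dialect(source: str) -> Optional[str]:
    """Return "Hack" for a `<?hh` opening tag, "PHP" for any other `<?`,
    and None when the source contains no opening tag at all."""
    match = _OPEN_TAG.search(source)
    if match is None:
        return None  # no tag: leave the decision to the classifier
    return "Hack" if match.group(1) == "hh" else "PHP"
```

Returning `None` when no tag is found leaves room to fall back to the sample-based classifier rather than guessing.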
Hack is Facebook's dialect of PHP: http://hacklang.org/. This adds support for detecting it via the ".hh" file extension; although that extension technically conflicts with C++ headers, the files look different enough that the existing classifier based on sample code has no trouble distinguishing them.
This diff deliberately does not deal with detecting ".php" as another valid extension for Hack code. That's much trickier since the code looks basically identical to PHP to the classifier, and needs a different approach.