Based on top of PR#1447. Adds a simple heuristic check for Hack files vs PHP files (`<?hh` vs other `<?`).
Tested by verifying that the Hack example site was detected as 100% Hack and that Laravel was detected as 100% PHP. (Without the heuristic, Laravel gets detected as about 50% Hack, just by randomness in the classifier since PHP and Hack are very hard to distinguish unless you actually parse the file and look for specific language features.)