Add detection for Hack files with ".php" file extension

Builds on PR #1447. Adds a simple heuristic that tells Hack files from PHP files by their opening tag (`<?hh` vs. any other `<?`).

Tested by verifying that the Hack example site is detected as 100% Hack and that Laravel is detected as 100% PHP. (Without the heuristic, Laravel is detected as about 50% Hack. That is essentially classifier noise: PHP and Hack are very hard to distinguish unless you actually parse the file and look for specific language features.)
Josh Watzman
2014-08-06 16:30:21 -07:00
parent b2cb74cabf
commit 9c044c5bd0
3 changed files with 43 additions and 0 deletions

@@ -25,6 +25,9 @@ module Linguist
if languages.all? { |l| ["Common Lisp", "OpenCL"].include?(l) }
result = disambiguate_cl(data, languages)
end
if languages.all? { |l| ["Hack", "PHP"].include?(l) }
result = disambiguate_hack(data, languages)
end
return result
end
end
@@ -88,6 +91,13 @@ module Linguist
matches
end
def self.disambiguate_hack(data, languages)
matches = []
matches << Language["Hack"] if data.include?("<?hh")
matches << Language["PHP"] if /<\?[^h]/.match(data)
matches
end
def self.active?
!!ACTIVE
end
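
The open-tag heuristic described in the commit message can be tried in isolation with a minimal sketch like the one below. It is not the Linguist implementation: the method name `disambiguate_hack_sketch` is hypothetical, and it returns plain strings rather than `Language` objects so that it runs without the linguist gem. Note that the `?` in the PHP pattern is escaped so that it matches a literal question mark; unescaped, `<?` would mean "an optional `<`" and the pattern would match almost any input.

```ruby
# Standalone sketch of the Hack-vs-PHP open-tag heuristic.
# Assumption: language names are returned as strings, not Linguist
# Language objects, so the example has no external dependencies.
def disambiguate_hack_sketch(data)
  matches = []
  # A Hack file contains the "<?hh" open tag.
  matches << "Hack" if data.include?("<?hh")
  # Any open tag not followed by "h" ("<?php", "<?=", "<? ") counts as PHP.
  # The "?" is escaped so it is treated as a literal character.
  matches << "PHP" if /<\?[^h]/.match?(data)
  matches
end

p disambiguate_hack_sketch("<?hh\nfunction f(): int { return 1; }")
p disambiguate_hack_sketch("<?php echo 'hello';")
p disambiguate_hack_sketch("no open tag at all")
```

A file that contains both kinds of open tag yields both candidates, which leaves the final decision to whatever runs after the heuristic.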