* Separate find_by_extension and find_by_filename
find_by_extension now takes a path as its argument rather than only the file extension.
Currently only find_by_extension is used as a strategy.
* Add find_by_filename as first strategy
* Remove deprecated find_by_shebang
* Remove deprecated ace_modes function
* Remove deprecated primary_extension function
Gists no longer have a language dropdown.
* Remove deprecated Linguist::Language.detect function
* Remove deprecated search_term field
* Generate language_id from language names
The language_id is generated from the SHA-256 hash of the language's name.
* Test the validity of language ids
All languages should have a positive 32-bit integer as an id.
* Update languages.yml header in set-language-ids
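A minimal sketch of how an id could be derived from a name's SHA-256 hash and then checked. The modulus and the accepted range (a non-negative signed 32-bit value) are assumptions for illustration, and generate_language_id and valid_language_id? are hypothetical helper names, not necessarily what set-language-ids uses:

```ruby
require "digest"

# Assumption: reduce the 256-bit digest with a modulus so the result
# always fits in a signed 32-bit integer.
def generate_language_id(name)
  Digest::SHA256.hexdigest(name).to_i(16) % (2**30 - 1)
end

# Assumption: a "valid" id is an Integer in 0..(2**31 - 1).
def valid_language_id?(id)
  id.is_a?(Integer) && id.between?(0, 2**31 - 1)
end
```

Because the id depends only on the name, regenerating it for an existing language always yields the same value.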
Remove our own license classification code
Add hashes for any project which does not have a standard license body
Add projects for which a license was not found to the whitelist
Requires Licensee v8.6.0 to correctly recognize TextMate bundles' .mdown README
Since v6.1.0, Licensee exposes the hash of the license.
We can use it to uniquely identify unrecognized licenses.
Thus, tests will fail if the content of an unrecognized license changes.
Projects for which no license was found are kept in the whitelist
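The hash check can be sketched roughly as follows. This does not call Licensee: license_fingerprint is a hypothetical stand-in for the hash Licensee exposes, and the project path and license text are made up for illustration:

```ruby
require "digest"

# Hypothetical stand-in for the hash Licensee exposes; the real value
# would come from Licensee, not from this helper.
def license_fingerprint(text)
  Digest::SHA1.hexdigest(text.strip)
end

# Hypothetical record of unrecognized licenses: project => expected hash.
EXPECTED_HASHES = {
  "vendor/grammars/example.tmbundle" =>
    license_fingerprint("You may use this bundle freely.")
}.freeze

# True when the project's license text still matches the recorded hash.
def license_unchanged?(project, current_text, expected = EXPECTED_HASHES)
  expected.key?(project) && expected[project] == license_fingerprint(current_text)
end
```

A test built on this idea fails as soon as the text of a recorded license is edited, which is the behavior described above.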
This is a rewrite of the regex that handles Emacs modeline matching. The
current one is a little flaky, causing some files to be misclassified as
"E", among other things.
It's worth noting malformed modelines can still change a file's language
in Emacs. Provided the -*- delimiters are intact, and the mode's name is
decipherable, Emacs will set the appropriate language mode *and* display
a warning about a malformed modeline:
    -*- foo-bar mode: ruby -*-  # Malformed, but understandable
    -*- mode: ruby--*-          # Completely invalid
The new pattern accommodates this leniency, making no effort to validate
a modeline's syntax beyond readable mode-names. In other words, if Emacs
accepts certain errors, we should too.
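As an illustration of this leniency, here is a simplified sketch of such a pattern. It is not necessarily the exact regex in this change, and EMACS_MODELINE_SKETCH and emacs_mode are hypothetical names:

```ruby
# Simplified sketch: accept anything between the delimiters as long as
# a readable mode name can be extracted.
EMACS_MODELINE_SKETCH = /
  -\*-                              # opening delimiter
  (?:
    \s* (?= [^:;\s]+ \s* -\*- )     # short form: -*- ruby -*-
  |
    (?: .*? [;\s] | (?<=-\*-) )     # optional preceding text or variables
    mode \s* : \s*                  # the mode variable itself
  )
  ([^:;\s]+)                        # captured mode name
  (?=
    [\s;]                           # ends at whitespace or a semicolon
  |
    (?<![-*]) -\*-                  # or touches the closing delimiter cleanly
  )
/xi

# Returns the captured mode name, or nil when no modeline is recognized.
def emacs_mode(line)
  match = EMACS_MODELINE_SKETCH.match(line)
  match && match[1]
end
```

With this sketch, the malformed `-*- foo-bar mode: ruby -*-` line still yields `ruby`, while `-*- mode: ruby--*-` never yields `ruby` as the mode.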