* Add test to demonstrate Perl syntax detection bug
A Perl 5 .pm file containing the word `module` or `class`, even with
an explicit `use 5.*` statement, is recognized as Perl 6 code.
* Improve Perl 5 and Perl 6 disambiguation
The heuristic for disambiguating Perl 5 and Perl 6 `.pm` files
searched for keywords that can appear in both languages (`class` and
`module`) in addition to checking the `use` statement.
Due to Perl 6 being tested first, code containing those words would
always be interpreted as Perl 6.
The test order is therefore reversed so that Perl 5 is checked first.
Since Perl 6 code never contains a `use 5.*` statement, this does no
harm to Perl 6 detection while fixing the problem for Perl 5.
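
The reordering can be sketched roughly as below. This is a minimal
illustration and not the heuristic code from the repository: the method
name `disambiguate_pm`, the returned symbols and both patterns are
assumptions made for the example.

```ruby
# Hypothetical patterns, written for this sketch only.
PERL5_USE = /^\s*use\s+v?5\./                                  # e.g. `use 5.014;`
PERL6_HINT = /^\s*(?:use\s+v6\b|(?:my\s+)?class\b|module\b)/   # ambiguous keywords

# Check the unambiguous Perl 5 marker first, then fall back to the
# keyword test, which can also match Perl 5 code.
def disambiguate_pm(source)
  return :perl5 if source.match?(PERL5_USE)
  return :perl6 if source.match?(PERL6_HINT)
  nil
end

perl5_source = "use 5.014;\nuse MooseX::Declare;\nclass Foo {\n}\n"
perl6_source = "use v6;\nclass Foo {}\n"

puts disambiguate_pm(perl5_source).inspect  # :perl5, despite the `class` keyword
puts disambiguate_pm(perl6_source).inspect  # :perl6
```

With the two checks in the opposite order, `perl5_source` would hit the
`class` keyword first and be reported as Perl 6.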
Fixes: #3637
* ash: interpreter only; the extension is more commonly used for
Kingdom of Loathing scripting, e.g. github.com/twistedmage/assorted-kol-scripts
* dash: interpreter only; the extension is more commonly used for
dashboarding-related stuff
* ksh: extension was already present
* mksh
* pdksh
* Separate find_by_extension and find_by_filename
find_by_extension now takes a path as argument and not only the file extension.
Currently only find_by_extension is used as a strategy.
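
A toy sketch of the split, assuming simple hash lookups. The tables, the
top-level method definitions and the `detect` helper are hypothetical
stand-ins and not the library's implementation; the point is only that
both lookups receive the full path.

```ruby
# Hypothetical lookup tables for illustration only.
FILENAMES  = { "Makefile" => "Makefile", "Rakefile" => "Ruby" }.freeze
EXTENSIONS = { ".rb" => "Ruby", ".pm" => "Perl" }.freeze

# Both lookups accept a full path rather than a bare name or extension.
def find_by_filename(path)
  FILENAMES[File.basename(path)]
end

def find_by_extension(path)
  EXTENSIONS[File.extname(path).downcase]
end

# One possible way to chain them: exact filename first, extension second.
def detect(path)
  find_by_filename(path) || find_by_extension(path)
end

puts detect("lib/foo.rb").inspect    # "Ruby"
puts detect("src/Makefile").inspect  # "Makefile"
puts detect("README").inspect        # nil
```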
* Add find_by_filename as first strategy
This is a rewrite of the regex that handles Emacs modeline matching. The
current one is a little flaky, causing some files to be misclassified as
"E", among other things.
It's worth noting malformed modelines can still change a file's language
in Emacs. Provided the -*- delimiters are intact, and the mode's name is
decipherable, Emacs will set the appropriate language mode *and* display
a warning about a malformed modeline:
-*- foo-bar mode: ruby -*- # Malformed, but understandable
-*- mode: ruby--*- # Completely invalid
The new pattern accommodates this leniency, making no effort to validate
a modeline's syntax beyond readable mode-names. In other words, if Emacs
accepts certain errors, we should too.
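
A lenient pattern along these lines might look like the sketch below. It
is an illustrative regex written for this note, not necessarily the one
committed, and the helper name `emacs_mode` is an assumption.

```ruby
# Illustrative pattern: tolerate junk before `mode:`, but require the
# mode name to be cleanly terminated before the closing -*-.
EMACS_MODELINE = /
  -\*-
  (?:
    \s*(?=[^:;\s]+\s*-\*-)               # short form: -*- ruby -*-
    |
    (?:.*?[;\s]|(?<=-\*-))mode\s*:\s*    # long form:  -*- foo: bar; mode: ruby -*-
  )
  ([^:;\s]+)                             # the mode name
  (?=[\s;]|(?<![-*])-\*-)                # ends at space or semicolon, or touches -*-
  .*?-\*-
/xi

# Returns the mode name, or nil when no usable modeline is found.
def emacs_mode(line)
  match = EMACS_MODELINE.match(line)
  match && match[1]
end

puts emacs_mode("# -*- foo-bar mode: ruby -*-").inspect  # "ruby"
puts emacs_mode("# -*- mode: ruby--*-").inspect          # nil
```

The lookbehind before the closing delimiter is what rejects `ruby--*-`:
a mode name may touch `-*-` directly, but not through an extra dash.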
The current expressions fail to match certain permutations of options:
vim: noexpandtab: ft=javascript:
vim: titlestring=foo\ ft=notperl ft=javascript:
Version-specific modelines are also unaccounted for:
vim600: set foldmethod=marker ft=javascript: # >= Vim 6.0
vim<600: set ft=javascript: # < Vim 6.0
See http://vimdoc.sourceforge.net/htmldoc/options.html#modeline
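
One way to cope with these permutations is to match the prefix first and
then walk the options one by one. The sketch below takes that two-step
route as an assumption for illustration; it is not the pattern used in
the repository, and `VIM_PREFIX` and `vim_filetype` are hypothetical
names. It also does not distinguish where a `set` form's options end.

```ruby
# Prefix: vi:, vim:, Vim:, ex:, optionally with a version test such as
# vim600: or vim<600:, and an optional `se` or `set` keyword.
VIM_PREFIX = /(?:^|\s)(?:vi|[vV]im(?:[<=>]?\d+)?|ex):\s*(?:set?\s+)?/
FILETYPE_KEYS = %w[ft filetype syntax].freeze

# Returns the filetype named in a modeline, or nil if none is found.
def vim_filetype(line)
  prefix = VIM_PREFIX.match(line)
  return nil unless prefix

  # Split on spaces or colons, but not on ones escaped with a backslash,
  # so `titlestring=foo\ ft=notperl` stays a single option.
  prefix.post_match.split(/(?<!\\)[\s:]+/).each do |option|
    key, value = option.split("=", 2)
    return value if FILETYPE_KEYS.include?(key) && !value.to_s.empty?
  end
  nil
end

puts vim_filetype('vim: noexpandtab: ft=javascript:').inspect
puts vim_filetype('vim: titlestring=foo\ ft=notperl ft=javascript:').inspect
puts vim_filetype('vim600: set foldmethod=marker ft=javascript:').inspect
puts vim_filetype('vim<600: set ft=javascript:').inspect
```

Each of the four calls prints `"javascript"`.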
* master: (168 commits)
ruby for example
Bumping version
Updating grammars
Grammar for Less from Atom package
Remove Less grammar
Updating to latest perl6 grammar
Adding Perl6-specific grammar.
Grammar for YANG from Atom package
Support for YANG language
Add detection of GrammarKit-generated files
Add .xproj to list of XML extensions
Test submodules are using HTTPS links
Improved vim modeline detection
Heuristic for Pod vs. Perl
Bumping to v4.7.4
Grammar update
Support .rs.in as a file extension for Rust files.
HTTPS links for submodules
Add the LFE lexer as an example of erlang .xrl
Add the Elixir parser as an example of erlang .yrl
...
TLDR: This greatly increases the flexibility of vim modeline detection
to manually set the language of a file.
In vim there are two forms of modelines:
[text]{white}{vi:|vim:|ex:}[white]{options}
examples: 'vim: syntax=perl', 'ex: filetype=ruby'
-and-
[text]{white}{vi:|vim:|Vim:|ex:}[white]se[t] {options}:[text]
examples: 'vim: set syntax=perl:', 'Vim: se ft=ruby:'
As you can see, there are many combinations. These changes should allow
most combinations to be used. The two most important additions are the
use of the keyword 'syntax', as well as the addition of the first form
(you now no longer need to use the keyword 'set' with a colon at the end).
The first form with 'syntax' is very common across GitHub:
https://github.com/search?l=ruby&q=vim%3A+syntax%3D&ref=searchresults&type=Code&utf8=%E2%9C%93
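
The two forms can be told apart with a pair of patterns, sketched here
under simplifying assumptions (no escaped separators, no version
prefixes). The constant names and `parse_modeline` are hypothetical and
not taken from the change itself.

```ruby
PREFIX = /(?:^|\s)(?:vi|vim|Vim|ex):\s*/
FILETYPE_KEYS = %w[ft filetype syntax].freeze

# Second form: `se[t] {options}:`, options end at the first colon.
SET_FORM = /#{PREFIX}set?\s+(?<opts>[^:]*):/
# First form: options separated by spaces or colons, up to end of line.
PLAIN_FORM = /#{PREFIX}(?<opts>.*)$/

# Returns [form, language], or nil when no filetype option is present.
def parse_modeline(line)
  if (m = SET_FORM.match(line))
    form = :set
    separator = /\s+/
  elsif (m = PLAIN_FORM.match(line))
    form = :plain
    separator = /[\s:]+/
  else
    return nil
  end

  m[:opts].split(separator).each do |option|
    key, value = option.split("=", 2)
    return [form, value] if FILETYPE_KEYS.include?(key) && !value.to_s.empty?
  end
  nil
end

puts parse_modeline("vim: syntax=perl").inspect   # [:plain, "perl"]
puts parse_modeline("ex: filetype=ruby").inspect  # [:plain, "ruby"]
puts parse_modeline("Vim: se ft=ruby:").inspect   # [:set, "ruby"]
```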
This change allows the filetype/language to be retrieved from more
complex vim modelines. The current regex only accepts a `set` line that
contains the filetype/ft parameter and nothing else.