123 Commits

Author SHA1 Message Date
Paul Chaignon
54ae7e7b4d Strategies take result from previous strategy into account (#4099)
Each strategy takes as candidates the languages output by the
previous strategy, if any. This was already the case for the
Classifier and Heuristic strategies as these couldn't generate new
candidate languages (as opposed to the Modeline, Filename, Shebang,
and Extension strategies).

In practice, this means that if, for example, the Shebang
strategy finds two possible languages for a given file (as is
currently possible with the perl interpreter), the next strategy, the
Extension strategy, will use this information and further reduce the
set of possible languages.
Currently, without this commit, the Extension strategy would discard
the results from the previous strategy and start anew, possibly
returning a different language from those returned by the Shebang
strategy.
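
The chaining described above can be sketched roughly as follows. This is a Python illustration only, not Linguist's Ruby code: the function names, the `(path, first_line, candidates)` signature, and the two lookup tables are assumptions made up for the example.

```python
import os

# Illustrative data only; the real language tables are much larger.
SHEBANG = {"perl": ["Perl", "Pod"]}
EXTENSION = {".pl": ["Perl", "Prolog"]}


def narrow(found, candidates):
    """Keep only languages the previous strategy also proposed, if it proposed any."""
    if not candidates:
        return found
    return [lang for lang in found if lang in candidates]


def shebang_strategy(path, first_line, candidates):
    if not first_line.startswith("#!"):
        return []
    parts = first_line[2:].split()
    if not parts:
        return []
    interpreter = os.path.basename(parts[0])
    if interpreter == "env" and len(parts) > 1:
        interpreter = os.path.basename(parts[1])
    return narrow(SHEBANG.get(interpreter, []), candidates)


def extension_strategy(path, first_line, candidates):
    ext = os.path.splitext(path)[1].lower()
    return narrow(EXTENSION.get(ext, []), candidates)


def detect(path, first_line, strategies):
    languages = []
    for strategy in strategies:
        found = strategy(path, first_line, languages)
        if len(found) == 1:
            return found[0]        # unambiguous answer, stop here
        if len(found) > 1:
            languages = found      # hand the candidates to the next strategy
    return None                    # still ambiguous or unknown
```

With these tables, a `.pl` file starting with `#!/usr/bin/perl` resolves to Perl because the Extension step only considers what the Shebang step returned; without a shebang the same file stays ambiguous.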
2018-04-17 10:02:57 +02:00
Juan Julián Merelo Guervós
a9ff59aef5 Additions to the Perl family of languages (#4066)
* Mainly fixing problems with Perl heuristics

And also adding a little bit of text to the README file to help with local use and test.

* Adds new sample

* Adds a couple of samples more, not represented before

* Moves installation instructions to CONTRIBUTING.md

Refs #2309 and also changes github.com to a uniform capitalization.

* Correcting error. Great job, CI

* Moving another file

* Adds samples and new checks for perl/perl6

* Stupid mistake

* Changing regex for perl5 vs perl6

Initial suggestion by @pchaigno, slightly changed to eliminate false positives such as "classes" or "modules" at the beginning of a line in the =pod

BTW, it would be interesting to just eliminate these areas for language detection.

* Eliminates Rexfile from Perl6

And adds .pod6

* Followup to #2709

I just found I had this sitting here, so I might as well follow
instructions to fix it.

* Adds example for pod6

* Eliminates .pod because it's its own language

* Removes bad directory

* Reverting changes that were already there

* Restored CONTRIBUTING.md from head

I see installation of cmake is advised in README.md

* Eliminates `.pod6`

To make way for #3366 or succeeding PRs.

* Removed by request, since we're no longer adding this extension

* Sorting by alphabetical order filenames

* Moved from sample to test fixtures
2018-04-11 17:32:26 +02:00
Colin Seymour
434023460e Revert "Check generated Jest snap file" (#3984)
* Revert "Remove Arduino as a language (#3933)"

This reverts commit 8e628ecc36.

* Revert "Check generated Jest snap file (#3874)"

This reverts commit ca714340e8.
2018-01-12 11:49:02 +00:00
Yuya Takeyama
ca714340e8 Check generated Jest snap file (#3874)
* Check generated Jest snap file

* Check file name rule first

ref: https://github.com/github/linguist/pull/3874/files#r146168309

* Check extension first

It must be cheaper
ref: https://github.com/github/linguist/pull/3874/files#r146168426
2018-01-11 09:25:13 +00:00
Cesar Tessarin
21babbceb1 Fix Perl 5 and 6 disambiguation bug (#3860)
* Add test to demonstrate Perl syntax detection bug

A Perl 5 .pm file containing the word `module` or `class`, even with
an explicit `use 5.*` statement, is recognized as Perl 6 code.

* Improve Perl 5 and Perl 6 disambiguation

The heuristic for disambiguating Perl 5 and 6 `.pm` files searched
for keywords which can appear in both languages (`class` and
`module`) in addition to checking the `use` statement.

Due to Perl 6 being tested first, code containing those words would
always be interpreted as Perl 6.

Test order was thus reversed, testing for Perl 5 first. Since Perl 6
code would never contain a `use 5.*` statement, this does no harm to
Perl 6 detection while fixing the problem for Perl 5.

Fixes: #3637
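
The effect of the reordering can be illustrated with a small Python sketch. The two patterns below are simplified stand-ins invented for this example, not the regexes in Linguist's heuristics; the point is only that the Perl 5 check runs first.

```python
import re

# Hypothetical, simplified patterns for illustration.
USE_PERL5 = re.compile(r"^\s*use\s+v?5\.", re.MULTILINE)
PERL6_HINT = re.compile(r"^\s*(?:use\s+v6\b|module\b|class\b)", re.MULTILINE)


def disambiguate_pm(source):
    # Perl 5 is tested first: an explicit `use 5.*` settles the question
    # even if `class` or `module` also appears in the file.
    if USE_PERL5.search(source):
        return "Perl"
    if PERL6_HINT.search(source):
        return "Perl 6"
    return None
```

Swapping the two `if` blocks reproduces the bug: any file with a line starting with `class` or `module` would be reported as Perl 6 regardless of its `use` statement.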
2017-10-23 10:16:56 +01:00
Santiago M. Mola
470a82d9f5 shell: add more interpreters (#3708)
* ash: only interpreter, extension is more commonly used for
  Kingdom of Loathing scripting, e.g. github.com/twistedmage/assorted-kol-scripts

* dash: only interpreter, extension is more commonly used for
  dashboarding-related stuff

* ksh: extension was already present

* mksh

* pdksh
2017-07-20 10:33:28 +01:00
Christoph Pojer
461c27c066 Revert "Added Jest snapshot test files as generated src (#3572)" (#3579)
This reverts commit f38d6bd124.
2017-04-22 14:20:54 +02:00
Hank Brekke
f38d6bd124 Added Jest snapshot test files as generated src (#3572) 2017-04-20 08:58:39 +01:00
sunderls
b36ea7ac9d Add yarn (#3432)
* add yarn.lock

* fix comment

* remove yarn test

* add test

* fix test

* try fix again

* try 3rd time

* check filename and firstline for yarn lockfile
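
A minimal Python sketch of a filename-plus-first-line check like the one named in the last bullet. The function name and signature are hypothetical; the header string is the comment Yarn v1 writes at the top of `yarn.lock`.

```python
import os

# First line written by Yarn v1 at the top of yarn.lock.
YARN_HEADER = "# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY."


def is_generated_yarn_lock(path, first_line):
    # Cheap filename comparison first; only then look at the content.
    if os.path.basename(path) != "yarn.lock":
        return False
    return YARN_HEADER in first_line
```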
2017-01-23 10:58:53 -08:00
Paul Chaignon
9b941a34f0 Use filenames as a definitive answer (#2006)
* Separate find_by_extension and find_by_filename
find_by_extension now takes a path as argument and not only the file extension.
Currently only find_by_extension is used as a strategy.

* Add find_by_filename as first strategy
2016-12-12 12:34:33 -08:00
Arfon Smith
4efc6f8c95 Merge branch 'master' into go-vendor 2016-10-26 18:34:02 -04:00
Arfon Smith
c8094d3775 Merge branch 'master' into 3227-local 2016-09-21 20:26:51 -07:00
Alhadis
697380336c Revise pattern for Emacs modeline detection
This is a rewrite of the regex that handles Emacs modeline matching. The
current one is a little flaky, causing some files to be misclassified as
"E", among other things.

It's worth noting malformed modelines can still change a file's language
in Emacs. Provided the -*- delimiters are intact, and the mode's name is
decipherable, Emacs will set the appropriate language mode *and* display
a warning about a malformed modeline:

    -*- foo-bar mode: ruby -*-   # Malformed, but understandable
            -*- mode: ruby--*-   # Completely invalid

The new pattern accommodates this leniency, making no effort to validate
a modeline's syntax beyond readable mode-names. In other words, if Emacs
accepts certain errors, we should too.
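
As a rough illustration of that leniency, here is a Python sketch of a pattern that accepts the first example above and rejects the second. It is an assumption-laden approximation, not the regex introduced by this commit, and it only handles the explicit `mode:` form (not the bare `-*- ruby -*-` shorthand).

```python
import re

EMACS_MODELINE = re.compile(
    r"-\*-"                                     # opening delimiter
    r"(?:[^\n]*?[ \t;])?"                       # leading junk or other variables
    r"mode[ \t]*:[ \t]*"                        # the mode key
    r"([A-Za-z0-9_+.]+(?:-[A-Za-z0-9_+.]+)*)"   # mode name, e.g. ruby or c++
    r"(?=[ \t;])"                               # name must end at whitespace or ';'
    r"[^\n]*?-\*-",                             # closing delimiter on the same line
    re.IGNORECASE,
)


def emacs_mode(text):
    """Return the mode name from an Emacs modeline, or None if none is readable."""
    match = EMACS_MODELINE.search(text)
    return match.group(1) if match else None
```

The pattern does not validate what precedes `mode:`; it only requires intact `-*-` delimiters and a mode name that ends cleanly, which is why `ruby--*-` is rejected.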
2016-09-17 19:45:43 +10:00
Alhadis
abf7bee464 Include tests for version-specific Vim modelines 2016-09-12 20:00:05 +10:00
Alhadis
e73a4ecd0e Allow " ex:" to match at beginning of file
Although unlikely to be valid syntax in most programming languages, such
a modeline is valid syntax in Vim, and will trigger any filetype modes.
2016-09-12 19:59:08 +10:00
Alhadis
22d4865c52 Revise patterns for Vim modeline detection
The current expressions fail to match certain permutations of options:

    vim: noexpandtab: ft=javascript:
    vim: titlestring=foo\ ft=notperl ft=javascript:

Version-specific modelines are also unaccounted for:

    vim600: set foldmethod=marker ft=javascript:   # >= Vim 6.0
    vim<600: set ft=javascript:                    # <  Vim 6.0

See http://vimdoc.sourceforge.net/htmldoc/options.html#modeline
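
The cases listed above can be handled by a small tokenizer rather than a single expression. The following Python sketch is one possible approach under simplifying assumptions, not the pattern this commit adds; `PREFIX`, `OPTION`, and `vim_filetype` are names invented for the example.

```python
import re

# "vi:", "vim:", "ex:", or a version-specific prefix such as "vim600:" or
# "vim<600:", optionally followed by "se " / "set ".
PREFIX = re.compile(
    r"(?:^|[ \t])(?:vi|[Vv]im(?:[<=>]?\d+)?|ex):[ \t]*(?P<set>set?[ \t]+)?"
)
OPTION = re.compile(r"(?:ft|filetype|syntax)=([\w+#-]+)")


def vim_filetype(line):
    """Return the filetype requested by a Vim modeline on this line, if any."""
    prefix = PREFIX.search(line)
    if not prefix:
        return None
    rest = line[prefix.end():]
    if prefix.group("set"):
        # "set" form: the options end at the first unescaped colon.
        parts = re.split(r"(?<!\\):", rest, maxsplit=1)
        if len(parts) < 2:
            return None
        rest = parts[0]
    filetype = None
    # Options are separated by unescaped whitespace or colons; the last
    # matching option wins.
    for option in re.split(r"(?<!\\)[ \t:]+", rest):
        found = OPTION.fullmatch(option)
        if found:
            filetype = found.group(1)
    return filetype
```

Because the split ignores separators preceded by a backslash, `titlestring=foo\ ft=notperl` stays a single token and does not shadow the real `ft=javascript` option.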
2016-09-11 00:51:03 +10:00
James Ko
c7868a95bc Merge pull request #2902 from jamesqo/patch-2
Add App.config + NuGet.config to the XML file list
2016-03-23 20:11:36 -06:00
Farbod Salamat-Zadeh
9bfbd0550c Move cars.csv from test/fixtures/Data to samples/CSV 2016-02-27 14:32:50 +00:00
Brandon Keepers
789607d9bc Merge branch 'master' into go-vendor
* master: (168 commits)
  ruby for example
  Bumping version
  Updating grammars
  Grammar for Less from Atom package
  Remove Less grammar
  Updating to latest perl6 grammar
  Adding Perl6-specific grammar.
  Grammar for YANG from Atom package
  Support for YANG language
  Add detection of GrammarKit-generated files
  Add .xproj to list of XML extensions
  Test submodules are using HTTPS links
  Improved vim modeline detection
  Heuristic for Pod vs. Perl
  Bumping to v4.7.4
  Grammar update
  Support .rs.in as a file extension for Rust files.
  HTTPS links for submodules
  Add the LFE lexer as an example of erlang .xrl
  Add the Elixir parser as an example of erlang .yrl
  ...
2016-02-18 20:12:27 -05:00
Brandon Keepers
d46530989c Only treat .go files in ^vendor/ as generated 2016-02-18 19:57:34 -05:00
chrisarcand
d87fad649c Improved vim modeline detection
TLDR: This greatly increases the flexibility of vim modeline detection
to manually set the language of a file.

In vim there are two forms of modelines:

[text]{white}{vi:|vim:|ex:}[white]{options}
examples: 'vim: syntax=perl', 'ex: filetype=ruby'

-and-

[text]{white}{vi:|vim:|Vim:|ex:}[white]se[t] {options}:[text]
examples: 'vim: set syntax=perl:', 'Vim: se ft=ruby:'

As you can see, there are many combinations. These changes should allow
most combinations to be used. The two most important additions are the
use of the keyword 'syntax', as well as the addition of the first form
(you now no longer need to use the keyword 'set' with a colon at the end).
The use of the first form with 'syntax' is very, very common across GitHub:

https://github.com/search?l=ruby&q=vim%3A+syntax%3D&ref=searchresults&type=Code&utf8=%E2%9C%93
2016-01-16 08:57:20 -05:00
Ammar Askar
4650368bc2 Make regex for vim modeline more lenient
This change allows the filetype/language to be retrieved from more complex vim modelines. The current regex only allows a set line which contains the filetype/ft parameter and nothing else.
2015-08-10 00:42:14 -05:00
Arfon Smith
21e97cc65c Merge pull request #2170 from pchaigno/mod-extension
.mod extension
2015-07-04 20:57:40 +01:00
Paul Chaignon
8c54f68040 Fix conflicts from merging master into 'mod-extension' 2015-07-04 18:01:56 +02:00
Arfon Smith
117735ffb9 Merge pull request #2179 from pchaigno/symlinks
Ignore symbolic links
2015-07-04 16:57:43 +01:00
Arfon Smith
2fac182a90 Improving Vim modeline regex 2015-05-12 16:49:14 -05:00
Paul Chaignon
e073e91d62 Detect GFortran module files as generated 2015-04-19 16:56:38 +02:00
Paul Chaignon
da9bda0e27 Detect KiCAD module files as generated 2015-04-19 16:19:52 +02:00
michael tesch
a5b0b333b0 Merge branch 'master' into tesch1-emacs-patch-1 2015-03-24 09:44:08 +01:00
なつき
67e4212f64 Test detecting generated source maps 2015-03-19 19:50:40 -07:00
michael tesch
fda0f2a042 detect emacs modeline for fundamental as Text 2015-03-14 23:53:17 +01:00
michael tesch
6af4ab6db1 harder test 2015-03-14 23:26:08 +01:00
michael tesch
a364e4a2dc tests for emacs modeline regex 2015-03-14 23:13:59 +01:00
Michael Tesch
1bb639617c Create seeplusplusEmacs1
one type of emacs modeline
2015-03-14 22:44:02 +01:00
Paul Chaignon
730be65514 Ignore symlinks in repository statistics 2015-02-28 16:08:16 +01:00
Arfon Smith
88e79cd3a8 Adding fixtures to test shebang strategy ordering 2015-02-07 10:24:03 -06:00
Arfon Smith
0db1d1c8ca Modifying some modeline fixtures to test case InSeNsItivitY 2015-02-06 08:48:59 -06:00
Lars Brinkhoff
2077fa3837 'Text' doesn't qualify as a valid modeline language. 2015-02-04 08:20:19 +01:00
Arfon Smith
119a8fff1e Emacs modeline fixtures 2015-01-26 15:38:19 -06:00
Arfon Smith
429c791377 Testing Vim modeline support 2015-01-26 14:39:07 -06:00
Paul Chaignon
f93272f0bd Move text files from fixtures to samples when possible 2014-12-10 20:09:14 -05:00
Paul Chaignon
93186947c2 Move binaries and text files from samples folder to fixtures 2014-12-04 23:48:05 -05:00
Paul Chaignon
77444284e3 Data folder in fixtures for files with no language 2014-12-04 19:14:44 -05:00
Arfon Smith
e70cd33323 Moving to fixtures 2014-09-17 08:37:00 -05:00
Joshua Peek
5521dd08a0 Move test fixtures to samples/ 2012-06-22 10:09:24 -05:00
Joshua Peek
540f2a0941 More matlab samples 2012-06-21 10:44:31 -05:00
Joshua Peek
4b9b8a5058 Remove matlab file with bogus keywords 2012-06-21 10:25:30 -05:00
Joshua Peek
a10e52a3c2 Revert removing some fixtures 2012-06-20 11:18:16 -05:00
Joshua Peek
6113e6d548 Remove ambiguous obj-c header example 2012-06-19 16:28:34 -05:00
Joshua Peek
4ea1e8aece Remove ambiguous c header example 2012-06-19 16:26:39 -05:00