948 Commits

Author SHA1 Message Date
Paul Chaignon
54ae7e7b4d Strategies take result from previous strategy into account (#4099)
Each strategy takes as candidates the languages output by the
previous strategy if any. This was already the case for the
Classifier and Heuristic strategies as these couldn't generate new
candidate languages (as opposed to the Modeline, Filename, Shebang,
and Extension strategies).

In practice, this means that if, for example, the Shebang
strategy finds two possible languages for a given file (as is
currently possible with the perl interpreter), the next strategy, the
Extension strategy, will use this information and further reduce the
set of possible languages.
Currently, without this commit, the Extension strategy would discard
the results from the previous strategy and start anew, possibly
returning a different language from those returned by the Shebang
strategy.
2018-04-17 10:02:57 +02:00
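
The chaining described in this commit can be sketched roughly as follows. This is a minimal Python illustration, not the project's code: the `Strategy` signature, the `narrow` and `detect` helpers, and both lookup tables are hypothetical, and the shebang parsing is deliberately simplified (it does not handle `/usr/bin/env`).

```python
from typing import Callable, Dict, List

# Hypothetical strategy signature: (filename, first_line, candidates) -> languages.
Strategy = Callable[[str, str, List[str]], List[str]]

# Illustrative lookup tables only; these are assumptions, not the project's data.
INTERPRETERS: Dict[str, List[str]] = {"perl": ["Perl", "Pod"]}
EXTENSIONS: Dict[str, List[str]] = {".pl": ["Perl", "Prolog"]}


def narrow(found: List[str], candidates: List[str]) -> List[str]:
    """Restrict `found` to the previous strategy's candidates, if there are any."""
    if not candidates:
        return found
    return [lang for lang in found if lang in candidates]


def shebang_strategy(filename: str, first_line: str, candidates: List[str]) -> List[str]:
    if not first_line.startswith("#!"):
        return []
    parts = first_line[2:].split()
    if not parts:
        return []
    interpreter = parts[0].rsplit("/", 1)[-1]
    return narrow(INTERPRETERS.get(interpreter, []), candidates)


def extension_strategy(filename: str, first_line: str, candidates: List[str]) -> List[str]:
    dot = filename.rfind(".")
    ext = filename[dot:] if dot != -1 else ""
    return narrow(EXTENSIONS.get(ext, []), candidates)


def detect(filename: str, first_line: str, strategies: List[Strategy]) -> List[str]:
    candidates: List[str] = []
    for strategy in strategies:
        found = strategy(filename, first_line, candidates)
        if len(found) == 1:
            return found            # unambiguous: stop early
        if found:
            candidates = found      # pass the reduced set to the next strategy
    return candidates
```

With these toy tables, the shebang narrows the file to two candidates and the extension then settles on the one language both strategies agree on, instead of starting anew.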
Brayden Banks
5363e045bb Teach Generated about Cargo lock files (#4100) 2018-04-15 12:04:19 +02:00
Juan Julián Merelo Guervós
a9ff59aef5 Additions to the Perl family of languages (#4066)
* Mainly fixing problems with Perl heuristics

And also adding a little bit of text to the README file to help with local use and test.

* Adds new sample

* Adds a couple of samples more, not represented before

* Moves installation instructions to CONTRIBUTING.md

Refs #2309 and also changes github.com to a uniform capitalization.

* Correcting error. Great job, CI

* Moving another file

* Adds samples and new checks for perl/perl6

* Stupid mistake

* Changing regex for perl5 vs perl6

Initial suggestion by @pchaigno, slightly changed to eliminate false positives such as "classes" or "modules" at the beginning of a line in the =pod

BTW, it would be interesting to just eliminate these areas for language detection.

* Eliminates Rexfile from Perl6

And adds .pod6

* Followup to #2709

I just found I had this sitting here, so I might as well follow
instructions to fix it.

* Adds example for pod6

* Eliminates .pod because it's its own language

* Removes bad directory

* Reverting changes that were already there

* Restored CONTRIBUTING.md from head

I see installation of cmake is advised in README.md

* Eliminates `.pod6`

To leave way for #3366 or succeeding PRs.

* Removed by request, since we're no longer adding this extension

* Sorting by alphabetical order filenames

* Moved from sample to test fixtures
2018-04-11 17:32:26 +02:00
Cyrille Le Clerc
51d3711faf Detect Maven wrapper "mvnw" (#4042)
* Detect Maven wrapper "mvnw"

* Fix build, filenames must be sorted in the "filenames" section of languages.yml, filenames cannot be grouped by topic

* Remove `mvnw` file from languages/Shell/filenames according to @Alhadis recommendation as we are sure that `mvnw` always starts with the shebang `#!/bin/sh`.

* Remove space chars added by mistake
2018-04-08 16:04:34 +01:00
Colin Seymour
f452612666 Update Licensee and Licensed gems (#3982)
* Update licensee version

This pulls in Licensed 0.10.0 too.

* Use a full path to the grammars

Licensed now enforces this as it's easier than guessing.

* Ensure full path

* Use new path for FSProject

* Starting to adjust tests

* require licensee again

* Fix grammar tests

* verify -> status

* whitelist -> allowed

* explicitly set cache_path in configuration

default for licensed v1.0 changed from `vendor/licenses` to `.licenses`

* load configuration from file location

default configuration file location changed from `vendor/licenses/config.yml` to `.licensed.yml`

* update gemspec for licensed 1.0.0

* Remove unused license hash
2018-04-03 16:35:24 +01:00
Paul Chaignon
0bf4b8a482 Remove samples/LANG/filenames as a source of truth (#4078)
All filenames must now be explicitly listed in languages.yml. A test
makes sure they are.
2018-04-02 11:09:06 +02:00
John Gardner
1b3cdda4f7 Add script to alphabetise submodule list (#4054) 2018-03-02 20:33:09 +11:00
Jason Malinowski
c2d3170064 Support VB.NET *.Generated.vb along with *.Generated.cs files (#4027) 2018-02-21 11:56:55 +00:00
Seppe Stas
8da6ddf9d9 Override languages being included by language statistics (#3807)
* Add detectable key to languages

This key allows overriding whether a language is included in the
language stats of a repository.

* Make detectable override-able using .gitattributes

* Mention `linguist-detectable` in README

* Remove detectable key from languages

Reverts changes in 0f7c0df5.

* Update commit hash to the one that was merged

PR #3806 changed the commit hash. The original commit was not
actually merged into the test/attributes branch.

* Fix check to ensure detectable is defined

* Add include in language stats tests when detectable set

* Ignore detectable when vendored, documentation or overridden

* Add documentation on detectable override in README

* Improve documentation on detectable override in README
2018-01-23 12:17:48 +00:00
Colin Seymour
434023460e Revert "Check generated Jest snap file" (#3984)
* Revert "Remove Arduino as a language (#3933)"

This reverts commit 8e628ecc36.

* Revert "Check generated Jest snap file (#3874)"

This reverts commit ca714340e8.
2018-01-12 11:49:02 +00:00
Yuya Takeyama
ca714340e8 Check generated Jest snap file (#3874)
* Check generated Jest snap file

* Check file name rule first

ref: https://github.com/github/linguist/pull/3874/files#r146168309

* Check extension first

It must be cheaper
ref: https://github.com/github/linguist/pull/3874/files#r146168426
2018-01-11 09:25:13 +00:00
Ashe Connor
5fbe9c0902 Allow classifier to run on symlinks as usual (#3948)
* Fixups for symlink detection, incl. test

* assert the heuristics return none for symlink
2018-01-08 09:01:16 +11:00
Vicent Martí
e335d48625 New Grammars Compiler (#3915)
* grammars: Update several grammars with compat issues

* [WIP] Add new grammar conversion tools

* Wrap in a Docker script

* Proper Dockerfile support

* Add Javadoc grammar

* Remove NPM package.json

* Remove superfluous test

This is now always checked by the grammars compiler

* Update JSyntax grammar to new submodule

* Approve Javadoc license

* grammars: Remove checked-in dependencies

* grammars: Add regex checks to the compiler

* grammars: Point Oz to its actual submodule

* grammars: Refactor compiler to group errors by repo

* grammars: Cleanups to error reporting
2017-11-30 16:15:48 +01:00
NachoSoto
4f46155c05 Add Carthage/Build to generated list so it doesn't show in PR diffs (#3920)
Equivalent to #3865, but for Carthage.
2017-11-29 14:26:23 +00:00
NachoSoto
38901d51d2 Changed Carthage vendor regex to allow folder in any subdirectory (#3921)
In monorepo projects, it's not uncommon for `Carthage` to not be in the root directory.
2017-11-29 14:25:04 +00:00
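
The difference between a root-only pattern and one that matches in any subdirectory can be illustrated with a short Python sketch. Both patterns below are assumptions written for illustration and are not copied from the project's vendor list.

```python
import re

# Hypothetical "before" pattern: only matches a Carthage folder at the repository root.
ROOT_ONLY = re.compile(r"^Carthage/")

# Hypothetical "after" pattern: matches a Carthage folder at the root or below any directory.
ANY_DEPTH = re.compile(r"(^|/)Carthage/")


def is_vendored(path: str, pattern: "re.Pattern[str]") -> bool:
    """Return True if the repository-relative path matches the vendor pattern."""
    return pattern.search(path) is not None
```

Anchoring on `(^|/)` rather than `^` keeps a directory such as `NotCarthage/` from matching while still accepting nested locations like `ios/Carthage/`.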
Shai Mishali
ded0dc74e0 Add Cocoapods to generated list so it doesn't show in PR diffs (#3865)
* Add Cocoapods to generated list so it doesn't show in PR diffs

* Removed Cocoapods from vendor.yml

* Enhance regex to match only Cocoapod's Pods folder

* Adds additional test cases for generated Pods folder
2017-11-28 11:04:18 +00:00
Colin Seymour
c5d1bb5370 Unvendor tools/ (#3919)
* Unvendor tools/

* Remove test
2017-11-28 10:52:02 +00:00
Ashe Connor
33be70eb28 Fix failing edges on leading commas in args (#3905) 2017-11-16 11:44:51 +11:00
Alex Arslan
0f4955e5d5 Update Julia definitions to use Atom instead of TextMate (#3871) 2017-11-09 19:39:37 +11:00
Cesar Tessarin
21babbceb1 Fix Perl 5 and 6 disambiguation bug (#3860)
* Add test to demonstrate Perl syntax detection bug

A Perl 5 .pm file containing the word `module` or `class`, even with
an explicit `use 5.*` statement, is recognized as Perl 6 code.

* Improve Perl 5 and Perl 6 disambiguation

The heuristic for disambiguating Perl 5 and 6 `.pm` files searched
for keywords which can appear in both languages (`class` and
`module`) in addition to the `use` statement check.

Due to Perl 6 being tested first, code containing those words would
always be interpreted as Perl 6.

Test order was thus reversed, testing for Perl 5 first. Since Perl 6
code would never contain a `use 5.*` statement, this does no harm to
Perl 6 detection while fixing the problem for Perl 5.

Fixes: #3637
2017-10-23 10:16:56 +01:00
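
The reordering described in this commit can be sketched as below. This is a Python illustration under stated assumptions: the two regular expressions and the `disambiguate_pm` function are hypothetical stand-ins, not the patterns used by the project.

```python
import re
from typing import Optional

# Illustrative patterns only (assumptions, not the project's regexes).
PERL5_USE = re.compile(r"^\s*use\s+v?5\.", re.MULTILINE)
PERL6_HINT = re.compile(r"^\s*(?:use\s+v6\b|(?:class|module)\s+\w)", re.MULTILINE)


def disambiguate_pm(source: str) -> Optional[str]:
    """Check the Perl 5 marker first, so shared keywords cannot win by default."""
    if PERL5_USE.search(source):
        return "Perl"
    if PERL6_HINT.search(source):
        return "Perl 6"
    return None  # undecided; leave it to a later strategy
```

Because an explicit `use 5.*` statement is tested before the ambiguous `class` and `module` keywords, a Perl 5 file that happens to contain those words is no longer claimed by the Perl 6 rule.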
Ashe Connor
93cd47822f Only recognise Pod for .pod files (#3863)
We uncomplicate matters by removing ".pod" from the Perl definition
entirely.
2017-10-17 19:05:20 +11:00
Codecat
44048c9ba8 Add Angelscript language (#3844)
* Add AngelScript scripting language

* Add AngelScript sample

* Initial implementation of Angelscript

* Update Angelscript tm_scope and ace_mode

* Move Angelscript after ANTLR

* Updated grammar list

* Alphabetical sorting for Angelscript

* Angelscript grammar license is unlicense

* Add ActionScript samples

* Added a heuristic for .as files

* Whitelist sublime-angelscript license hash

* Added heuristic test for Angelscript and Actionscript

* Remove .acs from Angelscript file extensions
2017-10-14 17:34:12 +01:00
John Gardner
63ff51e2ed Add test to keep grammar-list synced with submodules (#3793)
* Add test to check if grammar list is outdated

* Update grammar list

* Fix duplicate punctuation in error messages
2017-08-24 21:13:30 +10:00
James Dennes
3391dcce6a Make Language methods more resilient to non-String input (#3752)
* Add failing test for finding with non-String input

Show the failing behaviour of find_by_alias, find_by_name, and []
when non-String input is provided.

* Return nil rather than erroring on non-String input
2017-08-02 14:07:44 +02:00
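
The defensive behaviour this commit describes (returning nothing instead of raising on non-String input) could look like the following in Python. The `LANGUAGES_BY_NAME` table and the case-insensitive lookup are assumptions made for this sketch.

```python
from typing import Optional

# Hypothetical name index, keyed by lowercased language name.
LANGUAGES_BY_NAME = {"ruby": "Ruby", "python": "Python"}


def find_by_name(name: object) -> Optional[str]:
    """Look up a language by name, returning None for any non-string input."""
    if not isinstance(name, str):
        return None  # guard: do not call string methods on None, ints, etc.
    return LANGUAGES_BY_NAME.get(name.lower())
```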
John Gardner
25de4e0ae2 Add Printer Font ASCII to recognised PostScript extensions (#3734)
* Register Adobe Type 1 fonts as PostScript files

* Add logic for recognising generated PFA files

* Extend list of PostScript generators
2017-08-02 21:58:40 +10:00
Jared Harper
4dcf223c8e Support for C++ files generated by protobuf/grpc (#3640)
* Support for C++ files generated by protobuf/grpc

This changeset includes a sample generated file.

[grpc](http://grpc.io) is a high performance, open-source universal
RPC framework.

* Account for older gRPC protobuf plugin message
2017-07-22 14:20:55 +01:00
Santiago M. Mola
329f80d245 fix classifier tests (#3709)
test_classify_ambiguous_languages was not running any test, since
it was looking only for languages that are ambiguous on
filename for known filenames (rather than ambiguous for filename
or extension).

Note the test time and assertions.
Before:
  Finished in 0.149294s, 40.1892 runs/s, 46.8874 assertions/s.
After:
  Finished in 3.043109s, 1.9717 runs/s, 224.7702 assertions/s.
2017-07-22 14:20:15 +01:00
Santiago M. Mola
085604948e Add support for XPM. (#3706)
* .xpm and .pm extensions associated with XPM.

* .pm is disambiguated by searching the /* XPM */ string.
  This is how `file` performs detection and should work with
  every XPM3 file (most XPM generated by software later than 1991).

Added XPM samples:

* stick-unfocus.xpm: extracted from Fluxbox (MIT License)
  0c13ddc0c8/data/styles/Emerge/pixmaps/stick-unfocus.xpm

* cc-public_domain_mark_white.pm: public domain image from
  https://commons.wikimedia.org/wiki/File:Cc-public_domain_mark_white.svg
  converted to XPM with ImageMagick (convert input.svg output.xpm).
2017-07-22 14:19:22 +01:00
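
A content check of the kind described for `.pm` files might be sketched as follows. The function name and the fall-through return value are hypothetical; only the `/* XPM */` marker comes from the commit message.

```python
from typing import Optional

XPM_MARKER = "/* XPM */"  # marker string named in the commit message


def classify_pm_file(content: str) -> Optional[str]:
    """Return "XPM" when the marker is present, otherwise defer to other checks."""
    if XPM_MARKER in content:
        return "XPM"
    return None  # not XPM; another heuristic decides (e.g. a Perl check)
```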
Colin Seymour
e60384b018 Release v5.1.0 (#3725)
* sublime-spintools now has a license so no need for whitelist

* Bump version: 5.0.12

* Use the more apt release of v5.1.0
2017-07-22 14:16:16 +01:00
Santiago M. Mola
470a82d9f5 shell: add more interpreters (#3708)
* ash: only interpreter, extension is more commonly used for
  Kingdom of Loathing scripting, e.g. github.com/twistedmage/assorted-kol-scripts

* dash: only interpreter, extension is more commonly used for
  dashboarding-related stuff

* ksh: extension was already present

* mksh

* pdksh
2017-07-20 10:33:28 +01:00
John Gardner
128abe3533 Fix spelling of Perl 6 (#3672)
Resolves #3671.
2017-06-20 19:39:39 +10:00
Colin Seymour
ca6121e3ea Update MD5 digest for testing under Ruby 2.4 (#3643)
* Update md5 sums for Ruby 2.4

Ruby 2.4 unified Fixnum and Bignum into Integer. As a result, the integers in our tests have a class of Integer instead of Fixnum, which changes their MD5 digests, so we need to update the digests specifically for 2.4.

* Use Gem::Version for safer version comparison
2017-05-26 08:16:12 +01:00
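
The commit relies on Ruby's `Gem::Version` for the comparison. The reason a structured comparison is safer than comparing version strings can be shown with a small Python analogue; `parse_version` is a hypothetical helper that only handles purely numeric dotted versions.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted numeric version into integers so parts compare numerically."""
    return tuple(int(part) for part in version.split("."))


def at_least(current: str, minimum: str) -> bool:
    """True if `current` is the same as or newer than `minimum`."""
    return parse_version(current) >= parse_version(minimum)


# Plain string comparison is lexicographic, so "2.10.0" sorts before "2.4.0".
naive_result = "2.10.0" >= "2.4.0"
# Comparing parsed components treats 10 as greater than 4.
safe_result = at_least("2.10.0", "2.4.0")
```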
Simen Bekkhus
fba4babdcd Don't show npm lockfiles by default (#3611) 2017-05-10 15:55:16 +01:00
Santiago M. Mola
c0e242358a Fix heuristics after rename (#3556)
* fix Roff detection in heuristics

This affects extensions .l, .ms, .n and .rno.

Groff was renamed to Roff in 673aeb32b9851cc58429c4b598c876292aaf70c7,
but heuristic was not updated.

* replace FORTRAN with Fortran

It was already renamed in most places since 4fd8fce08574809aa58e9771e2a9da5d135127be
heuristics.rb was missing though.

* fix caseness of GCC Machine Description
2017-04-26 15:31:36 -07:00
Christoph Pojer
461c27c066 Revert "Added Jest snapshot test files as generated src (#3572)" (#3579)
This reverts commit f38d6bd124.
2017-04-22 14:20:54 +02:00
Hank Brekke
f38d6bd124 Added Jest snapshot test files as generated src (#3572) 2017-04-20 08:58:39 +01:00
Santiago M. Mola
e80b92e407 Fix heuristic for Unix Assembly with .ms extension (#3550) 2017-04-06 22:01:42 +10:00
Paul Chaignon
c59c88f16e Update grammar whitelist (#3510)
* Remove a few hashes for grammars with BSD licenses

There was an error in Licensee v8.8.2, which caused it to not
recognize some BSD licenses. v8.8.3 fixes it.

* Update submodules

Remove 2 grammars from the whitelist because their licenses were
added to a LICENSE file with a proper format (one that Licensee
detects).

MagicPython now supports all scopes that were previously supported
by language-python.
2017-03-13 17:19:06 -07:00
Paul Chaignon
9468ad4947 Fix grammar hashes (#3504)
* Update Licensee hashes for grammar licenses

Licensee v8.8 changed the way licenses are normalized, thus changing hashes for
some grammars

* Update Licensee

Prevent automatic updates to major releases
2017-03-09 23:57:35 -08:00
Eloy Durán
f1be771611 Disambiguate TypeScript with tsx extension. (#3464)
Using the technique as discussed in #2761.
2017-02-20 10:17:18 +00:00
Colin Seymour
01de40faaa Return early in Classifier.classify if no languages supplied (#3471)
* Return early if no languages supplied

There's no need to tokenise the data when attempting to classify without a limited language scope as no action will be performed when it comes to scoring anyway.

* Add test for empty languages array
2017-02-13 18:22:54 +00:00
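
The early return can be sketched like this in Python. The `classify` signature, including the injected `tokenize` and `score` callables, is an assumption made so the example is self-contained; it is not the project's interface.

```python
from typing import Callable, List, Sequence, Tuple


def classify(
    data: str,
    languages: Sequence[str],
    tokenize: Callable[[str], List[str]],
    score: Callable[[List[str], str], float],
) -> List[Tuple[str, float]]:
    """Rank candidate languages, skipping tokenisation when there is nothing to rank."""
    if not languages:
        return []  # no candidates: tokenising would be wasted work
    tokens = tokenize(data)
    scored = [(language, score(tokens, language)) for language in languages]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Placing the guard before `tokenize` is the point of the change: the expensive step runs only when at least one language will be scored.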
sunderls
b36ea7ac9d Add yarn (#3432)
* add yarn.lock

* fix comment

* remove yarn test

* add test

* fix test

* try fix again

* try 3rd time

* check filename and firstline for yarn lockfile
2017-01-23 10:58:53 -08:00
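
Checking both the filename and the first line, as the last bullet describes, might look like the sketch below. The function name is hypothetical, and the marker text is the header comment Yarn v1 writes at the top of its lockfiles; the exact string the project tests for may differ.

```python
import posixpath

# Header comment written by Yarn v1 at the top of yarn.lock (prefix only).
YARN_HEADER_PREFIX = "# THIS IS AN AUTOGENERATED FILE"


def is_generated_yarn_lock(path: str, content: str) -> bool:
    """Treat a file as a generated Yarn lockfile only if name and first line agree."""
    if posixpath.basename(path) != "yarn.lock":
        return False
    first_line = content.split("\n", 1)[0]
    return YARN_HEADER_PREFIX in first_line
```

Requiring both conditions avoids flagging an unrelated file that merely happens to be named `yarn.lock`.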
John Gardner
93ec1922cb Swap grammar used for CSS highlighting (#3426)
* Swap grammar used for CSS highlighting

* Whitelist license of Atom's CSS grammar

* Explicitly declare grammar as MIT-licensed

Source: https://github.com/atom/language-css/blob/5d4af/package.json#L14
2017-01-11 16:16:25 +11:00
Yuki Izumi
5d09fb67dd Allow for split(",") returning nil (#3424) 2017-01-10 11:44:24 +11:00
Brandon Black
a604de9846 replacing atom grammar due to ST2 compatibility change 2017-01-03 16:46:02 -08:00
Brandon Black
3e224e0039 updating grammars 2017-01-03 16:33:46 -08:00
Zach Brock
f98ab593fb Detect Javascript files generated by Protocol Buffers. 2017-01-03 16:07:26 -08:00
Nate Whetsell
48e4394d87 Add Jison-generated JavaScript to generated files (#3393)
* Fix typos

* Add Jison-generated JavaScript to generated files
2017-01-03 14:08:29 -08:00
yutannihilation
1c4baf6dc2 ignore roxygen2-generated files (#3373) 2017-01-03 13:31:04 -08:00
Arfon Smith
d8b91bd5c4 The grand language renaming bonanza (#3278)
* Removing FORTRAN samples because OS X case-insensitive filesystems :-\

* Adding Fortran samples back

* FORTRAN -> Fortran

* Groff -> Roff

* GAS -> Unix Assembly

* Cucumber -> Gherkin

* Nimrod -> Nim

* Ragel in Ruby Host -> Ragel

* Jade -> Pug

* VimL -> Vim script
2016-12-13 13:39:27 -08:00