3523 Commits

Author SHA1 Message Date
bruno cuconato
1bbcfa5683 add CoNLL-U format (#4029)
* * add CoNLL-U format
- add to languages.yml
- add textmate grammar
  - add to vendor/README
  - add to grammars.yml
- add samples

* rm other extensions as I couldn't find properly licensed examples of them in the wild

* substitute samples for something with appropriate license

* update grammar submodule so it finds the LICENSE

* add license to grammar

* * conllu
- readd other extensions
- abridge samples and add a new one
- update grammar submodule: correct extension of grammar file

* rm .conllx extension
2018-02-21 15:27:32 +00:00
Jason Malinowski
c2d3170064 Support VB.NET *.Generated.vb along with *.Generated.cs files (#4027) 2018-02-21 11:56:55 +00:00
Paul Chaignon
3769216c7a Associate .x extension to Linker Script language (#4040) 2018-02-19 10:50:05 +01:00
Nathaniel J. Smith
2abf488e65 Treat "python3" as an alias for "python" (#4026)
Pygments has separate highlighters for "python" (meaning Python 2) and "python3" (meaning Python 3). As a result, there are lots of files out there (especially reStructuredText) that contain code blocks whose language is explicitly given as "python3" or "py3". Currently these are unrecognized by linguist. Instead, we should use our Python highlighter for them (which works for both Python 2 and Python 3).

References:
  http://pygments.org/docs/lexers/#pygments.lexers.python.Python3Lexer
  https://github.com/github/markup/issues/1019
  https://github.com/python-trio/async_generator/pull/12
2018-02-08 09:52:21 +00:00
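The alias behaviour described in 2abf488e65 can be pictured as a case-insensitive lookup table. The following is a minimal sketch, not linguist's implementation: the table contents and the names `LANGUAGE_ALIASES`, `build_alias_index` and `find_by_alias` are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Illustrative alias table (an assumption, not the contents of languages.yml).
LANGUAGE_ALIASES: Dict[str, List[str]] = {
    "Python": ["python", "python3"],
}


def build_alias_index(aliases: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert {language: [aliases]} into {alias: language}, lowercased."""
    index: Dict[str, str] = {}
    for language, names in aliases.items():
        for name in names:
            index[name.lower()] = language
    return index


ALIAS_INDEX = build_alias_index(LANGUAGE_ALIASES)


def find_by_alias(name: str) -> Optional[str]:
    """Return the language for a code-block label, or None if unknown."""
    return ALIAS_INDEX.get(name.strip().lower())
```

With this table, a code block labelled `python3` resolves to the same language as one labelled `python`, and an unknown label yields `None`.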
Tobias V. Langhoff
812797b51d Add "asm" as alias for assembly (#4019) 2018-01-31 11:56:47 +00:00
Colin Seymour
a18ad1d489 Release v6.0.1 (#4016)
* Update grammar submodule refs

* Bump version to v6.0.1
2018-01-30 15:17:58 +00:00
Colin Seymour
15e2b74dec Release v6.0.0 (#4002)
* Update all submodules

* Ensure always using latest docker image

* Allow passing in GEM_VERSION from env

This is useful for building test gems in a cache-friendly way using:
`GEM_VERSION=$(git describe --tags 2>/dev/null | sed 's/-/./g' | sed 's/v//') bundle exec rake build_gem`

* Update submodules one last time

* Set version 6.0.0
2018-01-26 13:12:12 +00:00
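The `GEM_VERSION` pipeline quoted in 15e2b74dec applies two `sed` substitutions to the output of `git describe --tags`. As a rough sketch of what those substitutions do to the string (the function name and the sample tag are hypothetical):

```python
def gem_version_from_describe(describe_output: str) -> str:
    """Mimic `sed 's/-/./g' | sed 's/v//'` on one line of text."""
    # s/-/./g : replace every hyphen with a dot.
    dotted = describe_output.strip().replace("-", ".")
    # s/v//   : remove only the first "v" on the line.
    return dotted.replace("v", "", 1)
```

For a describe string such as `v6.0.0-3-gabc1234`, this returns `6.0.0.3.gabc1234`.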
Dylan Simon
60f748d47b Add .x as XDR/RPCGEN (#3472)
* Add .x as XDR/RPCGEN

XDR/RPC language as documented in RFC5531, RFC4506.
Samples are from glibc and RFCs.

* Add Logos samples

https://github.com/JonasGessner/NoCarrier/blob/master/NoCarrier.x - MIT
cf31f4e466/llvm-gcc-R3/gcc/testsuite/objc/execute/string1.x - GPL2
f6415578fa/perapp-plugin/Tweak.x - GPL3
d1b3e83888/NCHax.x - Apache

* Add disambiguate heuristics for .x

* Add RPC to vendor/README.md
2018-01-25 09:15:09 +00:00
Seppe Stas
8da6ddf9d9 Override languages being included by language statistics (#3807)
* Add detectable key to languages

This key allows overriding whether a language is included in the
language stats of a repository.

* Make detectable override-able using .gitattributes

* Mention `linguist-detectable` in README

* Remove detectable key from languages

Reverts changes in 0f7c0df5.

* Update commit hash to the one that was merged

PR #3806 changed the commit hash. The original commit was not
actually merged into the test/attributes branch.

* Fix check to ensure detectable is defined

* Add include in language stats tests when detectable set

* Ignore detectable when vendored, documentation or overridden

* Add documentation on detectable override in README

* Improve documentation on detectable override in README
2018-01-23 12:17:48 +00:00
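The bullet points of 8da6ddf9d9 suggest a precedence: vendored and documentation files stay excluded, an explicit `linguist-detectable` value wins next, and otherwise the language type decides. A minimal sketch of that rule follows. The function name, its parameters, and the assumption that programming and markup are the types counted by default are illustrative and are not taken from the commit.

```python
from typing import Optional

# Assumption: the language types counted in statistics by default.
DEFAULT_DETECTABLE_TYPES = {"programming", "markup"}


def include_in_language_stats(
    language_type: str,
    vendored: bool = False,
    documentation: bool = False,
    detectable: Optional[bool] = None,
) -> bool:
    """Decide whether a file counts towards repository language stats."""
    # Vendored and documentation files are excluded regardless of the override.
    if vendored or documentation:
        return False
    # An explicit linguist-detectable attribute takes precedence.
    if detectable is not None:
        return detectable
    # Otherwise fall back to the language's type.
    return language_type in DEFAULT_DETECTABLE_TYPES
```

Under these assumptions a data-type language is counted only when `detectable=True` is passed, and a programming language can be hidden with `detectable=False`.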
Brandon Elam Barker
512f077da8 adding the .kojo extension for Scala (#3960) 2018-01-13 09:38:34 +00:00
Josh Padnick
3260b06241 Format .tfvars file as HashiCorp Config Language. (#3885)
* Format .tfvars file as HashiCorp Config Language.

* Add sample terraform.tfvars file to demonstrate HCL rendering.
2018-01-12 17:27:41 +00:00
BRAMILLE Sébastien
ef3b0b6af3 Add solidity language (#3973)
* add solidity language

* add solidity color

* move samples to test fixtures

they're not used by the bayesian classifier

* Update languages.yml

* Rename RefundVault.sol to RefundVault.solidity

* Rename pygments-example.sol to pygments-example.solidity

* Change color from #383838 to #AA6746

`Color #383838 is too close to ["3F3F3F", "383838"]`

* Fix test

* Remove test/fixtures and add samples

* Remove extension

* Remove sample file
2018-01-12 17:26:51 +00:00
Colin Seymour
434023460e Revert "Check generated Jest snap file" (#3984)
* Revert "Remove Arduino as a language (#3933)"

This reverts commit 8e628ecc36.

* Revert "Check generated Jest snap file (#3874)"

This reverts commit ca714340e8.
2018-01-12 11:49:02 +00:00
oldmud0
8e628ecc36 Remove Arduino as a language (#3933)
* Remove Arduino as a language

* Move Arduino samples to C++

* Move .ino entry to its correct place
2018-01-11 10:48:19 +00:00
Yuya Takeyama
ca714340e8 Check generated Jest snap file (#3874)
* Check generated Jest snap file

* Check file name rule first

ref: https://github.com/github/linguist/pull/3874/files#r146168309

* Check extension first

It must be cheaper
ref: https://github.com/github/linguist/pull/3874/files#r146168426
2018-01-11 09:25:13 +00:00
Egor Zhdan
db1d4f7893 Add Materialize.css to the vendor list (#3943) 2018-01-11 09:48:49 +01:00
Paolo Di Tommaso
bee7e55618 Add Nextflow language support (#3870)
* Added nextflow language
* Added main.nf to list of filenames
* Fixed duplicate groovy scope
* Removed hello-world example
* Update grammar submodule
* Removed main.nf from filenames
* Added nextflow.config example
2018-01-09 12:47:59 +01:00
Ashe Connor
5fbe9c0902 Allow classifier to run on symlinks as usual (#3948)
* Fixups for symlink detection, incl. test

* assert the heuristics return none for symlink
2018-01-08 09:01:16 +11:00
Paul Chaignon
a840668599 perl6 alias for Perl 6 (#3977)
Many repositories rely on `perl6` as a Markdown key for code snippet
highlighting. The new Perl 6 name breaks this behavior as it requires
`perl-6` as the Markdown key.
2018-01-07 21:32:55 +01:00
Ashe Connor
d4c2d83af9 Do not traverse symlinks in heuristics (#3946) 2017-12-12 21:53:36 +11:00
Paul Chaignon
e4b9430024 Vendor CSS files in font-awesome directory (#3932) 2017-12-02 15:24:05 +01:00
Paul Chaignon
a76805e40d Improve Prolog .pro heuristic to avoid false positives (#3931)
The `[a:-b]` syntax for index selection in arrays is valid in IDL and
matches the heuristic for Prolog. Update the Prolog heuristic to
exclude `[`.
2017-12-02 15:08:19 +01:00
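The false positive described in a76805e40d can be reproduced with a pair of regular expressions. Both patterns below are hypothetical stand-ins, not the patterns from linguist's heuristics; they only show how excluding `[` before `:-` separates an IDL index expression from a Prolog rule.

```python
import re

# Hypothetical "before" pattern: any non-comment text followed by ":-".
PROLOG_LOOSE = re.compile(r"^[^#\n]+:-", re.MULTILINE)
# Hypothetical "after" pattern: additionally forbid "[" before ":-".
PROLOG_STRICT = re.compile(r"^[^\[#\n]+:-", re.MULTILINE)

IDL_LINE = "x = arr[a:-b]"
PROLOG_LINE = "parent(X, Y) :- father(X, Y)."


def looks_like_prolog(data: str, pattern: "re.Pattern[str]") -> bool:
    """Return True if the given pattern finds a Prolog-style rule."""
    return pattern.search(data) is not None
```

The loose pattern matches both sample lines, while the strict one matches only the Prolog rule.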
NachoSoto
4f46155c05 Add Carthage/Build to generated list so it doesn't show in PR diffs (#3920)
Equivalent to #3865, but for Carthage.
2017-11-29 14:26:23 +00:00
NachoSoto
38901d51d2 Changed Carthage vendor regex to allow folder in any subdirectory (#3921)
In monorepo projects, it's not uncommon for `Carthage` to not be in the root directory.
2017-11-29 14:25:04 +00:00
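The change in 38901d51d2 amounts to loosening a path anchor. A small sketch of the difference, using two hypothetical patterns rather than the actual entries in vendor.yml:

```python
import re

# Hypothetical "before": only a top-level Carthage directory.
CARTHAGE_ROOT_ONLY = re.compile(r"^Carthage/")
# Hypothetical "after": Carthage at the root or below any directory.
CARTHAGE_ANY_DEPTH = re.compile(r"(^|/)Carthage/")


def is_vendored(path: str, pattern: "re.Pattern[str]") -> bool:
    """Return True if the repository-relative path matches the pattern."""
    return pattern.search(path) is not None
```

A path such as `ios/Carthage/Build/Foo.framework/Foo` is matched only by the second pattern, and a directory that merely ends in "Carthage", such as `MyCarthage/`, is matched by neither.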
Shai Mishali
ded0dc74e0 Add Cocoapods to generated list so it doesn't show in PR diffs (#3865)
* Add Cocoapods to generated list so it doesn't show in PR diffs

* Removed Cocoapods from vendor.yml

* Enhance regex to match only Cocoapod's Pods folder

* Adds additional test cases for generated Pods folder
2017-11-28 11:04:18 +00:00
Colin Seymour
c5d1bb5370 Unvendor tools/ (#3919)
* Unvendor tools/

* Remove test
2017-11-28 10:52:02 +00:00
Andrey Sitnik
c8ca48856b Add PostCSS syntaxes support (#3916) 2017-11-26 16:21:10 +11:00
John Gardner
7be6fb0138 Test Perl before Turing when running heuristics (#3880)
* Test Perl before Turing when running heuristics

* Revise order of Perl 5 and 6 in `.t` heuristic

See: https://github.com/github/linguist/pull/3880#issuecomment-340319500

* Combine patterns for disambiguating Perl
2017-11-17 21:25:56 +11:00
wesdawg
8c516655bc Add YARA language (#3877)
* Add YARA language grammars

* Add YARA to languages.yml

* Add YARA samples

* Add YARA to README
2017-11-16 12:16:33 +11:00
Michael R. Crusoe
9dceffce2f Add the Common Workflow Language standard (#3902)
* Add the language for the Common Workflow Language standards

* add CWL grammar

* add MIT licensed CWL sample

* script/set-language-ids --update for CWL
2017-11-16 12:15:55 +11:00
Ashe Connor
33be70eb28 Fix failing edges on leading commas in args (#3905) 2017-11-16 11:44:51 +11:00
Jingwen
9c4dc3047c Add BUILD.bazel to Python filenames (#3907)
BUILD.bazel and BUILD are used by Bazel, and both are valid filenames. BUILD.bazel is used in favor of BUILD if it exists.

https://stackoverflow.com/a/43205770/521209
2017-11-15 10:04:36 +00:00
Pratik Karki
d8e5f3c965 Add color for LFE language. (#3895)
* 'Add color to LFE'

* Test passing color for LFE

* Let LFE be independent rather than grouping to Erlang
2017-11-14 07:35:12 +00:00
Ashe Connor
71bf640a47 Release v5.3.3 (#3903)
* Add self to maintainers

* bump to v5.3.3
2017-11-13 18:17:38 +11:00
Paul Chaignon
d968b0e9ee Improve heuristic for XML/TypeScript (#3883)
The heuristic for XML .ts files might match TypeScript generics
starting with TS
2017-11-04 11:16:44 +01:00
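The ambiguity noted in d968b0e9ee is that a bare `<TS` substring appears both in an XML `<TS ...>` element and in a TypeScript generic such as `<TSource>`. One way to tell them apart is a word boundary after `TS`. The sketch below assumes that approach; the pattern and the function name are illustrative and may differ from the actual heuristic.

```python
import re

# Assumption: require a word boundary after "TS" so "<TSource>" does not match.
XML_TS_ELEMENT = re.compile(r"<TS\b")


def classify_ts_file(data: str) -> str:
    """Guess whether a .ts file is an XML file or TypeScript source."""
    return "XML" if XML_TS_ELEMENT.search(data) else "TypeScript"
```

Here `<TS version="2.1">` is classified as XML, while `function first<TSource>(xs: TSource[])` falls through to TypeScript.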
Ashe Connor
1f5ed3b3fe Release v5.3.2 (#3882)
* update grammar submodules

* bump to 5.3.2
2017-11-01 10:01:03 +10:00
Robert Koeninger
297be948d1 Set color for Idris language (#3866) 2017-10-31 16:27:21 +00:00
Charles Milette
b4492e7205 Add support for Edje Data Collections (#3879)
* Add support for Edje Data Collections

Fixes #3876

* Add EDC in grammars list
2017-10-31 16:26:44 +00:00
Paul Chaignon
c05bc99004 Vendor a few big JS libraries (#3861) 2017-10-31 15:12:02 +01:00
Ashe Connor
99eaf5faf9 Replace the tokenizer with a flex-based scanner (#3846)
* Lex everything except SGML, multiline, SHEBANG

* Prepend SHEBANG#! to tokens

* Support SGML tag/attribute extraction

* Multiline comments

* WIP cont'd; productionifying

* Compile before test

* Add extension to gemspec

* Add flex task to build lexer

* Reentrant extra data storage

* regenerate lexer

* use prefix

* rebuild lexer on linux

* Optimise a number of operations:

* Don't read and split the entire file if we only ever use the first/last n
  lines

* Only consider the first 50KiB when using heuristics/classifying.  This can
  save a *lot* of time; running a large number of regexes over 1MiB of text
  takes a while.

* Memoize File.size/read/stat; re-reading in a 500KiB file every time `data` is
  called adds up a lot.

* Use single regex for C++

* act like #lines

* [1][-2..-1] => nil, ffs

* k may not be set
2017-10-31 11:06:56 +11:00
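The optimisations listed in 99eaf5faf9 (reading only the first lines, capping the text handed to heuristics at 50 KiB, and memoizing file reads) can be sketched together in one small class. This is an illustration of the ideas only: the class and method names are hypothetical, and the real change is in Ruby.

```python
import os
from functools import cached_property
from itertools import islice
from typing import List

# Only this many bytes are handed to heuristics and the classifier.
MAX_CLASSIFIED_BYTES = 50 * 1024


class LazyBlob:
    """File wrapper that caches expensive reads (hypothetical helper)."""

    def __init__(self, path: str) -> None:
        self.path = path

    @cached_property
    def size(self) -> int:
        # Computed once, then served from the instance cache.
        return os.path.getsize(self.path)

    @cached_property
    def data(self) -> bytes:
        # Read once; later accesses reuse the same bytes object.
        with open(self.path, "rb") as handle:
            return handle.read()

    def classifier_input(self) -> bytes:
        # Cap the input so regexes never scan more than 50 KiB.
        return self.data[:MAX_CLASSIFIED_BYTES]

    def first_lines(self, count: int) -> List[bytes]:
        # Stream just the first `count` lines instead of splitting the file.
        with open(self.path, "rb") as handle:
            return [line.rstrip(b"\r\n") for line in islice(handle, count)]
```

Capping the input bounds the cost of each regex regardless of file size, at the price of missing evidence that appears only after the first 50 KiB.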
Cesar Tessarin
21babbceb1 Fix Perl 5 and 6 disambiguation bug (#3860)
* Add test to demonstrate Perl syntax detection bug

A Perl 5 .pm file containing the word `module` or `class`, even with
an explicit `use 5.*` statement, is recognized as Perl 6 code.

* Improve Perl 5 and Perl 6 disambiguation

The heuristic for disambiguating Perl 5 and 6 `.pm` files searched for
keywords which can appear in both languages (`class` and `module`) in
addition to the `use` statement check.

Due to Perl 6 being tested first, code containing those words would
always be interpreted as Perl 6.

Test order was thus reversed, testing for Perl 5 first. Since Perl 6
code would never contain a `use 5.*` statement, this does no harm to
Perl 6 detection while fixing the problem for Perl 5.

Fixes: #3637
2017-10-23 10:16:56 +01:00
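The ordering fix in 21babbceb1 can be sketched as a two-step check in which the Perl 5 test runs first. The patterns below are simplified assumptions that only cover the cases named in the commit message; they are not the patterns used by linguist.

```python
import re
from typing import Optional

# Assumption: an explicit `use 5.x` (or `use v5.x`) marks Perl 5.
PERL5_HINT = re.compile(r"\buse\s+v?5\.")
# Assumption: `use v6`, or a line starting with `module` or `class`, hints Perl 6.
PERL6_HINT = re.compile(r"^\s*(?:use\s+v6\b|module\b|class\b)", re.MULTILINE)


def disambiguate_pm(data: str) -> Optional[str]:
    """Guess the language of a .pm file, testing Perl 5 before Perl 6."""
    if PERL5_HINT.search(data):
        return "Perl"
    if PERL6_HINT.search(data):
        return "Perl 6"
    return None
```

Because the Perl 5 check runs first, a file that declares `use 5.010;` is reported as Perl even if a later line begins with `class`. A file with `use v6;` is unaffected, since it never matches the Perl 5 pattern.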
Ashe Connor
9b942086f7 Release v5.3.1 (#3864)
* Fix Perl/Pod disambiguation
2017-10-17 19:31:20 +11:00
Ashe Connor
93cd47822f Only recognise Pod for .pod files (#3863)
We uncomplicate matters by removing ".pod" from the Perl definition
entirely.
2017-10-17 19:05:20 +11:00
Colin Seymour
ea3e79a631 Release v5.3.0 (#3859)
* Update grammars

* Update haskell scopes to match updated grammar

* Bump version to 5.3.0
2017-10-15 09:52:27 +01:00
Maickel Hubner
0af9a35ff1 Create association with OpenEdge .w files (#3648)
* Update heuristics.rb

* Update languages.yml

* Create consmov.w

* Create menu.w

* Switch out large samples for smaller ones

* Relax regex
2017-10-14 18:12:16 +01:00
Codecat
44048c9ba8 Add Angelscript language (#3844)
* Add AngelScript scripting language

* Add AngelScript sample

* Initial implementation of Angelscript

* Update Angelscript tm_scope and ace_mode

* Move Angelscript after ANTLR

* Updated grammar list

* Alphabetical sorting for Angelscript

* Angelscript grammar license is unlicense

* Add ActionScript samples

* Added a heuristic for .as files

* Whitelist sublime-angelscript license hash

* Added heuristic test for Angelscript and Actionscript

* Remove .acs from Angelscript file extensions
2017-10-14 17:34:12 +01:00
Chris Llanwarne
e51b5ec9b7 Add WDL language support (#3858)
* Add WDL language support

* Add ace mode
2017-10-14 17:12:47 +01:00
Dan Moore
a0b38e8207 Don't count VCL as Perl for statistics. (#3857)
* Don't count VCL as Perl for statistics.

While the Varnish-specific language was apparently inspired by C and Perl, there's no reason to group it as Perl for repo statistics.

* Re-adding color for VCL.

Which was accidentally removed as part of https://github.com/github/linguist/pull/2298/files#diff-3552b1a64ad2071983c4d91349075c75L3223
2017-10-12 15:42:31 -04:00
Ján Neščivera
0b9c05f989 added VS Code workspace files to vendored path (#3723) 2017-10-08 17:32:01 +01:00
Kerem
4cd558c374 Added natvis extension to XML (#3789)
* natvis extension added to xml.

* Added sample natvis file from the Chromium project.
2017-09-17 13:31:02 +01:00