Adam Roben 2c2c4740a8 Update all grammar submodules
This was performed via `git submodule update --remote`.

* vendor/grammars/Agda.tmbundle 784f435...68a218c (1):
  > Merge pull request #1 from aroben/patch-1

* vendor/grammars/IDL-Syntax 9473b7f...3baeaea (1):
  > Merge pull request #3 from aroben/patch-1

* vendor/grammars/NimLime 7a2fb4e...9cef4b6 (4):
  > Updated command names
  > Updated generated documentation
  > renamed more from nimrod to nim
  > Renamed several files

* vendor/grammars/SCSS.tmbundle d6188e5...4147502 (1):
  > Merge pull request #181 from redgluten/master

* vendor/grammars/Sublime-VimL 6ab7e19...366fdc6 (1):
  > Merge pull request #2 from yous/fix-single-quoted-string

* vendor/grammars/factor 2dc5590...2453a78 (38):
  > json.writer: make sure we make hex values two digits zero padded.
  > json.writer: support escaping unicode > 0x10000.  Thanks @jonenst!
  > mason.git: fix status check.
  > compiler.cfg.*: new unit test vocabs
  > compiler.cfg.*: more docs
  > compiler.cfg.*: refactoring away the compute-global-sets word
  > compiler.cfg.*: docs and more tests
  > compiler.cfg.stacks.local: refactoring making stack-changes and height-changes take and return stuff instead of using variables and the make building
  > compiler.cfg.parallel-copy: docs
  > compiler.cfg.stacks.height: these words are unused
  > compiler.cfg.*: more compiler docs
  > compiler.cfg.stacks.finalize: initial docs
  > io.launcher: fix stack effects.
  > io.launcher: fix docs for with-process-reader and with-process-writer.
  > io.launcher: add versions of with-process that preserve process and status.
  > mason.git: fix use.
  > mason.git: fix for rename.
  > io.launcher: cleanup public interface, make some things private or internal.
  > gopher: set 1 minute timeout by default.
  > brainfuck: cleanup tests.
  > json.writer: don't escape spaces, thats weird.
  > unix: some using cleanups.
  > python: rename startup/shutdown hooks.
  > math.extras: adding the Möbius function.
  > alien.c-types: move definitions of stdint.h from unix.types.
  > gopher: use contents now that it works.
  > io.ports: Make buffered-port not have a length because of Linux virtual files and TCP sockets. Related to issues #1256 and #1259.
  > tools.deploy.backend: add word for deleting cached staging images.
  > command-line: save the executable in a variable so that people don't use (command-line) directly if possible.
  > bootstrap: fix this use of (command-line).
  > tools.deploy.shaker: set the rest of the args to preserve current behavior.
  > vm: store full command-line including executable first argument.
  > gopher: fix bug where empty lines weren't printed properly in menus.
  > gopher: simplify.
  > gopher: change gopher-text to use split1.
  > io.encodings.detect: simplify prolog-tag.
  > gopher: add way to get result without converting to objects.
  > tools.disassembler: allow disassemble of compose and curry.

* vendor/grammars/fsharpbinding af755c8...d097476 (24):
  > Merge pull request #909 from cbowdon/issue877-vim-73-support
  > Merge pull request #913 from 7sharp9/Move_GetColourizations_toBg
  > Merge pull request #912 from 7sharp9/TryFind_opt
  > Merge pull request #911 from 7sharp9/FoldingParser
  > Merge pull request #908 from 7sharp9/TooltipOverhaul_AutoParamFix
  > Merge pull request #907 from 7sharp9/Movegetdefinestomodule
  > Merge pull request #906 from 7sharp9/tooltipfixforclosures
  > Merge pull request #905 from 7sharp9/ResolverProvider_singletimeout
  > Merge pull request #904 from fsharp/revert-903-ResolverProvider_singletimeout
  > Merge pull request #903 from 7sharp9/ResolverProvider_singletimeout
  > Merge pull request #902 from 7sharp9/ParameterCompletion_gatherTimeout
  > Merge pull request #901 from 7sharp9/Changed_invalidate_project
  > Merge pull request #900 from 7sharp9/Syntaxmode_removeextraoperation
  > Merge pull request #899 from 7sharp9/tooltips_ensureTimout
  > Merge pull request #898 from 7sharp9/pathextension_useAddRange
  > Merge pull request #897 from 7sharp9/resolverprovider_ensuretimout
  > Merge pull request #896 from 7sharp9/completion_ensuretimout
  > Merge pull request #895 from cbowdon/894-Vim-fix-for-no-completions-stacktrace
  > Merge pull request #890 from wangzq/gotodecl
  > Merge pull request #893 from 7sharp9/fixfortooltipvaltypes
  > Merge pull request #892 from 7sharp9/fixforprojecttypechecking
  > Added correct indentation
  > Merge pull request #891 from 7sharp9/ImproveImplementInterface
  > Merge pull request #888 from VincentDondain/master

* vendor/grammars/haxe-sublime-bundle 58cad47...e2613bb (4):
  > fixed goto definition / find type
  > clean
  > adaptations for toplevel completion
  > first test

* vendor/grammars/language-gfm c6df027...7b62290 (7):
  > Prepare 0.59.0 release
  > scoped-properties -> settings
  > Prepare 0.58.0 release
  > Merge pull request #67 from davidcelis/master
  > Prepare 0.57.0 release
  > Prepare 0.56.0 release
  > Merge pull request #64 from atom/mb-new-cpp-scope-name

* vendor/grammars/language-javascript 15dc5d1...6690feb (5):
  > Prepare 0.52.0 release
  > Merge pull request #82 from Hurtak/feature/snippets-for
  > Merge pull request #80 from Hurtak/feature/snippets-querySelector
  > Merge pull request #79 from Hurtak/feature/snippets-switch-indentation-fix
  > Merge pull request #81 from Hurtak/feature/snippets-iife

* vendor/grammars/language-python 476a353...f518e49 (5):
  > Prepare 0.28.0 release
  > Use trailing scope name
  > Merge pull request #48 from msabramo/patch-1
  > Prepare 0.27.0 release
  > Add pattern for nonlocal keyword

* vendor/grammars/language-sass 064a8b5...33efa33 (2):
  > Prepare 0.29.0 release
  > Allow + and - in selector argument

* vendor/grammars/language-shellscript e2d62af...cbec163 (2):
  > Prepare 0.11.0 release
  > Merge pull request #4 from hd-deman/patch-1

* vendor/grammars/latex.tmbundle 682c4b7...52b2251 (42):
  > Replaced `python` with `python2.7` in shebangs
  > Make the preferences compatible with Python 3
  > Handle manual spacing in “Reformat” (Table)
  > Fix: Reformatting of table containing empty cells
  > Use more descriptive variable names in `format`
  > Add documentation to `reformat`
  > Fix doctest in `refresh_viewer`
  > Add tests for `reformat`
  > Ignore “exit discard” status in `cramtests`
  > Remove print statements from `reformat` function
  > Fix: Close log window option ignored
  > Automatically scroll to bottom in “HTML Output”
  > Handle “\” signs in the notification window
  > Fix missing logname in default error message
  > Extend list of auxiliary files
  > Remove unused code from `latex_watch`
  > Display default message in notification window
  > Sort error messages by line number
  > Do not store duplicate error messages anymore
  > Close notification window on cleanup
  > Improve reopening of closed notification windows
  > Improve rewrap code in `texparser`
  > Improve readability of verbose log output
  > Only parse log file if there were changes
  > Remove unnecessary function call in “LaTeX Watch”
  > Properly close file in `guess_tex_engine`
  > Handle log messages containing double quotes
  > Left justify severity in notification window
  > Handle manual closing of notification window
  > Add additional information to notification window
  > Remove unused code from `texparser`
  > Close notification when typesetting succeeds
  > Add support for notifications to “LaTeX Watch”
  > Update bundle preference values instantly
  > Make “Reformat” (Table) compatible with Ruby 2
  > Ignore escaped ampersand `\&` in “Format” (Table)
  > Remove warnings reported by `RuboCop`
  > Format code for “Reformat” (Table)
  > Move code for “Reformat” into separate script
  > Save “Reformat” command with TextMate 2
  > Remove unused import
  > Use explicit import in “Itemize Lines In Selection”

* vendor/grammars/mercury-tmlanguage b5a4fd6...eaef0b0 (8):
  > Add require_* and some, all keywords
  > Highlight %f format specifiers, `` as op
  > Correct implementation of '''', """" and 0'<char>
  > README.md: Mention GitHub grammar compatability
  > README.md: add resources and demonstration
  > reformatted whitespace; added foreign mods; missing keywords
  > Highlight variables, determ decls, more pragmas
  > no highlighting of variables, function names, type names, inst's, etc.

* vendor/grammars/sublime-mask 2f59519...632ff3c (4):
  > v0.8.7
  > v0.8.7
  > + expression in component nodes
  < v0.8.6

* vendor/grammars/swift.tmbundle 81a0164...3c7eac5 (9):
  > Use constant scope for booleans
  > Use storage scope instead of keyword
  > Correct typo in include
  > Revamp string literal matching
  > Improve punctuation scopes
  > Allow for functions without a body
  > Add simple folding markers for swift
  > Improved matching of capture specifiers
  > Add Support for UInt, Int[8|16|32|64] & Float80
2015-01-06 10:09:53 -05:00

Linguist

We use this library at GitHub to detect blob languages, ignore binary files, suppress generated files in diffs, and generate language breakdown graphs.

Tips for filing issues and creating pull requests can be found in CONTRIBUTING.md.

Features

Language detection

Linguist defines a list of all languages known to GitHub in a YAML file.

Most languages are detected by their file extension. To disambiguate between files with common extensions, we first apply some common-sense heuristics to pick out obvious languages, then fall back to a statistical classifier. This process helps us tell apart, for example, .h files, which could be C, C++, or Objective-C.


Linguist::FileBlob.new("lib/linguist.rb").language.name #=> "Ruby"

Linguist::FileBlob.new("bin/linguist").language.name #=> "Ruby"

See lib/linguist/language.rb and lib/linguist/languages.yml.
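
As a rough illustration of the cascade described above (extension lookup, then heuristics, then a classifier), here is a minimal sketch. It is not Linguist's implementation: the `EXTENSIONS` and `HEURISTICS` tables, the `detect_language` method, and the placeholder `classifier` lambda are all hypothetical names invented for this example, and the real classifier is statistical rather than a fixed fallback.

```ruby
# Hypothetical, simplified detection cascade. Not Linguist's real code.

# Step 1 data: extension => candidate languages (illustrative subset).
EXTENSIONS = {
  ".rb" => ["Ruby"],
  ".h"  => ["C", "C++", "Objective-C"]
}.freeze

# Step 2 data: cheap patterns that settle the obvious cases.
HEURISTICS = {
  "Objective-C" => /^\s*@(interface|implementation|property|end)\b/,
  "C++"         => /^\s*(template\s*<|class\s+\w+|namespace\s+\w+)/
}.freeze

# The default classifier is a placeholder that returns the first candidate;
# a real one would score the source against trained samples.
def detect_language(path, source, classifier: ->(candidates, _source) { candidates.first })
  candidates = EXTENSIONS.fetch(File.extname(path), [])
  return candidates.first if candidates.size <= 1  # unambiguous or unknown

  HEURISTICS.each do |language, pattern|
    return language if candidates.include?(language) && source.match?(pattern)
  end

  classifier.call(candidates, source)              # last resort
end

detect_language("foo.h", "@interface Foo\n@end\n")  #=> "Objective-C"
detect_language("foo.h", "namespace bar {\n}\n")    #=> "C++"
```

Ordering the steps from cheapest to most expensive means the classifier only runs on files that are still ambiguous.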

Syntax Highlighting

Syntax highlighting in GitHub is performed using TextMate-compatible grammars. These are the same grammars that TextMate, Sublime Text and Atom use.

Every language in languages.yml is mapped to its corresponding TextMate scope. This scope is used when picking a grammar for highlighting. When adding a new language to Linguist, please add its corresponding scope too (assuming there's an existing TextMate bundle, Sublime Text package, or Atom package) so syntax highlighting works for it.

Stats

The Language stats bar that you see on every repository is built by aggregating the languages of each file in that repository. The top language in the graph determines the project's primary language.
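
The aggregation described above can be sketched as follows, assuming each file contributes its size in bytes to its language's total. The `language_breakdown` and `primary_language` methods and the `[path, language, bytes]` input shape are hypothetical, chosen only for this example.

```ruby
# Hypothetical sketch: sum file sizes per language, then pick the largest.

# files: array of [path, language, size_in_bytes] triples (assumed shape).
def language_breakdown(files)
  files.each_with_object(Hash.new(0)) do |(_path, language, bytes), totals|
    totals[language] += bytes
  end
end

# Returns the language with the largest total, or nil for an empty breakdown.
def primary_language(totals)
  top = totals.max_by { |_language, bytes| bytes }
  top && top.first
end

files = [
  ["lib/a.rb", "Ruby", 100],
  ["lib/b.rb", "Ruby", 50],
  ["ext/c.c",  "C",    120]
]
totals = language_breakdown(files)  #=> { "Ruby" => 150, "C" => 120 }
primary_language(totals)            #=> "Ruby"
```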

The repository stats API, accessed through #languages, can be used on a directory:

API UPDATE

Since version 3.0.0, Linguist expects a Git repository (in the form of a Rugged::Repository) to be passed when initializing Linguist::Repository.

require 'rugged'
require 'linguist'

repo = Rugged::Repository.new('.')
project = Linguist::Repository.new(repo, repo.head.target_id)
project.language       #=> "Ruby"
project.languages      #=> { "Ruby" => 119387 }

The linguist binary also prints these stats. With the --breakdown flag, it additionally lists the files detected for each language.

You can try running linguist on the root directory in this repository itself:

$ bundle exec linguist --breakdown

100.00% Ruby

Ruby:
Gemfile
Rakefile
bin/linguist
github-linguist.gemspec
lib/linguist.rb
lib/linguist/blob_helper.rb
lib/linguist/classifier.rb
lib/linguist/file_blob.rb
lib/linguist/generated.rb
lib/linguist/heuristics.rb
lib/linguist/language.rb
lib/linguist/lazy_blob.rb
lib/linguist/md5.rb
lib/linguist/repository.rb
lib/linguist/samples.rb
lib/linguist/tokenizer.rb
lib/linguist/version.rb
test/test_blob.rb
test/test_classifier.rb
test/test_heuristics.rb
test/test_language.rb
test/test_md5.rb
test/test_pedantic.rb
test/test_repository.rb
test/test_samples.rb
test/test_tokenizer.rb

Ignore vendored files

Checking third-party code into your Git repo is common practice, but it often inflates your project's language stats and may even cause your project to be labeled as another language. Linguist can identify some of these files and directories and exclude them.

Linguist::FileBlob.new("vendor/plugins/foo.rb").vendored? # => true

See Linguist::BlobHelper#vendored? and lib/linguist/vendor.yml.
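
One plausible way to implement such a check is to match each path against a list of regular expressions. The sketch below assumes that approach; the patterns are illustrative stand-ins rather than the contents of lib/linguist/vendor.yml, and `VENDORED_PATTERNS`, `VENDORED_REGEX`, and this standalone `vendored?` are hypothetical names.

```ruby
# Hypothetical sketch of path-based vendored detection.
# The patterns below are examples only, not the real vendor.yml entries.
VENDORED_PATTERNS = [
  %r{(^|/)vendor/},        # anything under a vendor/ directory
  %r{(^|/)node_modules/}   # anything under node_modules/
].freeze

# Combine all patterns into one regex so each path is scanned once.
VENDORED_REGEX = Regexp.union(VENDORED_PATTERNS)

def vendored?(path)
  path.match?(VENDORED_REGEX)
end

vendored?("vendor/plugins/foo.rb")  #=> true
vendored?("lib/linguist.rb")        #=> false
```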

Generated file detection

Not all plain text files are true source files. Generated files such as minified JavaScript and compiled CoffeeScript can be detected and excluded from language stats. As a bonus, these files are suppressed in diffs.

Linguist::FileBlob.new("underscore.min.js").generated? # => true

See Linguist::Generated#generated?.
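
To give a feel for how a generated-file check might work, here is a sketch of a single heuristic: treating a .js or .css file as minified when its average line length is very long. The threshold of 110 characters, the `MINIFIED_AVERAGE_LINE_LENGTH` constant, and the `minified?` method are assumptions made for this example; see Linguist::Generated for the checks actually applied.

```ruby
# Hypothetical sketch of one generated-file heuristic.
# The threshold is an assumed value chosen for illustration.
MINIFIED_AVERAGE_LINE_LENGTH = 110

# name: file name; data: file contents as a String.
def minified?(name, data)
  return false unless [".js", ".css"].include?(File.extname(name))

  lines = data.lines
  return false if lines.empty?

  average = lines.sum(&:length).to_f / lines.size
  average > MINIFIED_AVERAGE_LINE_LENGTH
end

minified?("underscore.min.js", "a" * 500)         #=> true
minified?("app.js", "var a = 1;\nvar b = 2;\n")   #=> false
```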

Overrides

Linguist supports custom overrides for language definitions and vendored paths. Add a .gitattributes file to your project using the keys linguist-language and linguist-vendored with the standard git-style path matchers for the files you want to override.

Please note that the overrides currently only affect the language statistics for a repository and not the syntax-highlighting of files.

$ cat .gitattributes
*.rb linguist-language=Java

$ linguist --breakdown
100.00% Java

Java:
ruby_file.rb

By default, Linguist treats all of the paths defined in lib/linguist/vendor.yml as vendored and therefore doesn't include them in the language statistics for a repository. Use the linguist-vendored attribute to vendor or un-vendor paths.

$ cat .gitattributes
special-vendored-path/* linguist-vendored
jquery.js linguist-vendored=false
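
The override files above follow a simple "pattern, then attributes" layout, which can be read with a few lines of Ruby. This is a sketch under stated assumptions, not how Linguist or Git resolves attributes: `parse_gitattributes` and `overrides_for` are hypothetical helpers, and `File.fnmatch` only approximates Git's path-matching rules.

```ruby
# Hypothetical sketch: read linguist-* overrides from .gitattributes text.

# Returns an array of [pattern, { attribute => value }] pairs.
def parse_gitattributes(text)
  lines = text.each_line.map(&:strip).reject { |l| l.empty? || l.start_with?("#") }
  lines.map do |line|
    pattern, *attrs = line.split
    overrides = attrs.each_with_object({}) do |attr, result|
      key, value = attr.split("=", 2)
      next unless key.start_with?("linguist-")
      result[key] = case value
                    when nil, "true" then true    # bare attribute means "set"
                    when "false"     then false
                    else value
                    end
    end
    [pattern, overrides]
  end
end

# Later rules win, mirroring how later lines override earlier ones.
def overrides_for(path, rules)
  rules.each_with_object({}) do |(pattern, overrides), merged|
    # Patterns without a slash are compared against the file name only.
    target = pattern.include?("/") ? path : File.basename(path)
    merged.merge!(overrides) if File.fnmatch(pattern, target)
  end
end

rules = parse_gitattributes("*.rb linguist-language=Java\njquery.js linguist-vendored=false\n")
overrides_for("ruby_file.rb", rules)  #=> { "linguist-language" => "Java" }
overrides_for("jquery.js", rules)     #=> { "linguist-vendored" => false }
```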

Installation

GitHub.com is usually running the latest version of the github-linguist gem that is released on RubyGems.org.

For development, however, you will want to check out the source. To get it, clone the repo and run Bundler to install its dependencies.

git clone https://github.com/github/linguist.git
cd linguist/
script/bootstrap

To run the tests:

bundle exec rake test

A note on language extensions

Linguist has a number of methods available to it for identifying the language of a particular file. The initial lookup is based on the file's extension; the possible file extensions are defined in an array called extensions. Take a look at this example for Perl:

Perl:
  type: programming
  ace_mode: perl
  color: "#0298c3"
  extensions:
  - .pl
  - .PL
  - .perl
  - .ph
  - .plx
  - .pm
  - .pod
  - .psgi
  interpreters:
  - perl

Any of the extensions defined are valid, but the first in this array should be the most popular.
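
To show how definitions like this could drive the initial lookup, the sketch below builds an extension-to-languages index from a small YAML excerpt. The excerpt (including the second language sharing .pl) is a made-up sample rather than the real languages.yml, and `LANGUAGES_YAML`, `extension_index`, and `primary_extension` are hypothetical names.

```ruby
require "yaml"

# Hypothetical excerpt in the same shape as the Perl entry above.
LANGUAGES_YAML = <<~YAML
  Perl:
    type: programming
    extensions:
    - .pl
    - .pm
  Prolog:
    type: programming
    extensions:
    - .prolog
    - .pl
YAML

# Build { extension => [language names] }; one extension may map to several.
def extension_index(languages)
  languages.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(name, attrs), index|
    Array(attrs["extensions"]).each { |ext| index[ext] << name }
  end
end

# The first listed extension is treated as the language's most popular one.
def primary_extension(languages, name)
  Array(languages.dig(name, "extensions")).first
end

languages = YAML.safe_load(LANGUAGES_YAML)
index = extension_index(languages)
index[".pl"]                          #=> ["Perl", "Prolog"]
primary_extension(languages, "Perl")  #=> ".pl"
```

An extension that maps to more than one language, like .pl here, is exactly the case where further disambiguation is needed.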

Testing

Sometimes getting the tests running can be too much work, especially if you don't have much Ruby experience. It's okay: be lazy and let our build bot Travis run the tests for you. Just open a pull request and the bot will start cranking away.

Here's our current build status, which is hopefully green: Build Status

Releasing

If you are the current maintainer of this gem:

  1. Create a branch for the release: git checkout -b cut-release-vxx.xx.xx
  2. Make sure your local dependencies are up to date: script/bootstrap
  3. Ensure that samples are updated: bundle exec rake samples
  4. Ensure that tests are green: bundle exec rake test
  5. Bump gem version in lib/linguist/version.rb. For example, like this.
  6. Make a PR to github/linguist. For example, #1238.
  7. Build a local gem: bundle exec rake build_gem
  8. Testing:
     1. Bump the Gemfile and Gemfile.lock versions for an app which relies on this gem
     2. Install the new gem locally
     3. Test behavior locally, branch deploy, whatever needs to happen
  9. Merge github/linguist PR
  10. Tag and push: git tag vx.xx.xx; git push --tags
  11. Push to rubygems.org -- gem push github-linguist-3.0.0.gem
Description
Language Savant. If your repository's language is being reported incorrectly, send us a pull request!