* Lex everything except SGML, multiline, SHEBANG
* Prepend SHEBANG#! to tokens
* Support SGML tag/attribute extraction
* Multiline comments
* WIP cont'd; productionifying
* Compile before test
* Add extension to gemspec
* Add flex task to build lexer
* Reentrant extra data storage
* regenerate lexer
* use prefix
* rebuild lexer on linux
* Optimise a number of operations:
  * Don't read and split the entire file if we only ever use the first/last n
    lines
  * Only consider the first 50KiB when using heuristics/classifying. This can
    save a *lot* of time; running a large number of regexes over 1MiB of text
    takes a while.
  * Memoize File.size/read/stat; re-reading in a 500KiB file every time `data` is
    called adds up a lot.
* Use single regex for C++
* act like #lines
* [1][-2..-1] => nil, ffs
* k may not be set
* fix benchmark
- require json for Hash.to_json
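
A rough sketch of the optimisations listed above (bounded reads, memoization, and the `[1][-2..-1]` slice gotcha). `LazyFile`, `HEAD_LIMIT`, `head`, `first_lines` and `last_lines` are hypothetical names made up for illustration; this is not Linguist's actual blob code.

```ruby
# Hypothetical wrapper, for illustration only.
class LazyFile
  HEAD_LIMIT = 50 * 1024 # 50KiB, the cap mentioned in the notes above

  def initialize(path)
    @path = path
  end

  # Memoized: the whole file is read from disk at most once.
  def data
    @data ||= File.binread(@path)
  end

  # Memoized: the size is looked up at most once.
  def size
    @size ||= File.size(@path)
  end

  # Only the leading bytes, e.g. for heuristics/classification.
  # binread returns nil for an empty file when a length is given.
  def head
    @head ||= File.binread(@path, HEAD_LIMIT) || ""
  end

  # First n lines, without reading or splitting the rest of the file.
  def first_lines(n)
    File.foreach(@path).first(n).map(&:chomp)
  end

  # Last n lines, read from a bounded window at the end of the file.
  # If the window starts mid-line and holds fewer than n full lines,
  # the first returned line may be partial.
  # Array#last(n) is used because a slice such as [1][-2..-1] is nil.
  def last_lines(n, window = HEAD_LIMIT)
    tail = File.open(@path, "rb") do |f|
      f.seek([f.size - window, 0].max)
      f.read
    end
    tail.lines.last(n).map(&:chomp)
  end
end
```

`last_lines` uses `Array#last` rather than a negative range so that files shorter than n lines yield the lines they have instead of nil.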
* better heuristic distinction of .d files
- properly recognize dtrace probes
- recognize \ in Makefile paths
- recognize single line `file.ext : dep.ext` make targets
- recognize D module, import, function, and unittest declarations
- add more representative D samples
D changed from 31.2% to 28.1%
DTrace changed from 33.5% to 32.5%
Makefile changed from 35.3% to 39.4%
See
https://gist.github.com/MartinNowak/fda24fdef64f2dbb05c5a5ceabf22bd3
for the scraper used to get a test corpus.
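
To give a feel for the kind of distinction described above, here is a minimal sketch. The regexes and the `guess_d_language` name are invented for illustration; they are much cruder than, and not copied from, the heuristics this change actually ships.

```ruby
# Illustrative patterns only; NOT the regexes used by Linguist.

# D: module, import and unittest declarations.
D_PATTERN = /^\s*(?:module\s+[\w.]+\s*;|import\s+[\w.]+|unittest\b)/

# DTrace: probe descriptions of the form provider:module:function:name,
# where the leading fields may be empty (e.g. syscall::read:entry).
DTRACE_PATTERN = /^[\w\$\*]*:[\w\$\*]*:[\w\$\*]*:[\w\$\*]+/

# Makefile: a single-line `file.ext : dep.ext` target; \ is allowed in paths.
MAKEFILE_PATTERN = /^[\w.\\\/-]+\s*:\s*[\w.\\\/-]+\s*$/

# Returns "D", "DTrace", "Makefile", or nil when nothing matches.
def guess_d_language(source)
  return "D" if D_PATTERN.match?(source)
  return "DTrace" if DTRACE_PATTERN.match?(source)
  return "Makefile" if MAKEFILE_PATTERN.match?(source)
  nil
end
```

For example, `guess_d_language("main.o : main.c\n")` falls through the first two patterns and is classified by the Makefile one.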
Now that all our grammars are licensed (or grandfathered in), we can
distribute them as part of the standard github-linguist gem. This makes
it easier for projects to get up and running with Linguist.
The purpose of this gem is to package up the language grammars that are
used for syntax highlighting on github.com. The grammars are TextMate,
Sublime Text, or Atom language grammars, converted to JSON and given the
filename SCOPE.json, where SCOPE is the language scope that the grammar
defines.
The github-linguist-grammars gem packages up all the grammars, and also
exports a Linguist::Grammars.path method to locate the directory
containing the grammars.
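
As a sketch of how a consumer might use that layout: the helper below looks up SCOPE.json in a grammars directory. `load_grammar` is a hypothetical name, and `grammars_dir` stands in for whatever directory Linguist::Grammars.path returns.

```ruby
require "json"

# Hypothetical helper, not part of the gem.
# grammars_dir: directory holding the SCOPE.json files.
# scope: a language scope such as "source.ruby".
# Returns the parsed grammar as a Hash, or nil if no such file exists.
def load_grammar(grammars_dir, scope)
  file = File.join(grammars_dir, "#{scope}.json")
  return nil unless File.file?(file)
  JSON.parse(File.read(file))
end
```

With the gem installed, the call would presumably look like `load_grammar(Linguist::Grammars.path, "source.ruby")`.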
To build the gem, simply run `rake build_grammars_gem`. The grammars.yml
file lists all the repositories we download grammars from, as well as
which scopes are defined by each repository. The
script/download-grammars script takes that list and downloads and
processes the grammars into the format expected by the gem.
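
A small sketch of reading such a list, assuming grammars.yml is a mapping from repository to the scopes it defines. The exact layout is an assumption, the sample entries are made-up placeholders, and the helper names are hypothetical.

```ruby
require "yaml"

# Assumed layout: repository => list of scopes. Placeholder entries only.
SAMPLE_GRAMMARS_YML = <<~YAML
  https://example.com/grammars/alpha:
  - source.alpha
  - source.alpha.embedded
  https://example.com/grammars/beta:
  - text.beta
YAML

# Parses the YAML text into a Hash of repository => [scopes].
def scopes_by_repository(yaml_text)
  YAML.safe_load(yaml_text) || {}
end

# Returns the repository that defines the given scope, or nil.
def repository_for_scope(index, scope)
  pair = index.find { |_repo, scopes| scopes.include?(scope) }
  pair && pair.first
end
```

A lookup like `repository_for_scope(scopes_by_repository(SAMPLE_GRAMMARS_YML), "text.beta")` answers which repository a given scope comes from.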
* origin/master: (42 commits)
its always greener
that new green shell
Removing stale extension
Update README.md
Add moon interpreter for MoonScript
Bumping version for 3.4.1 release
Use text.html.erb scope for HTML+ERB files
Add sample .dyalog file for file type APL
Added extra Papyrus sample files.
Add sample Papyrus script
Add Papyrus support
Add LOLCODE support
Add ProGuard config files to vendored files
Recognise *.dyalog as APL sources
Assign a bunch more TextMate scopes
CI step for samples
Add .command as a Shell file extension
CI config
Vendored gems
Update cibuild
...
Conflicts:
Rakefile