192 Commits

Author SHA1 Message Date
Todd Berman
88c74fa9c2 Convert from mode names to mimetypes for better usage. 2016-09-23 13:40:19 -07:00
Todd Berman
d6d7d38eb8 Fix w/ a test 2016-09-21 20:52:49 -07:00
Todd Berman
cc5f1c57ca Add Codemirror modes 2016-09-20 23:23:22 -07:00
Arfon Smith
a3227c2c27 Adding basic find_by_id functionality to Language 2016-09-13 11:09:05 -07:00
Arfon Smith
7cda13afcb A Language should know about its language_id 2016-09-12 22:02:49 -07:00
Arfon Smith
1efd4c83f9 Merge pull request #2341 from github/api-changes
Move Linguist::Language.detect to Linguist.detect
2016-03-16 21:15:50 -06:00
Arfon Smith
997c0fca10 Catching one more edge case 2015-08-11 06:48:54 +01:00
Arfon Smith
851c93a1f7 Don't blow up if empty string/nil passed to alias methods 2015-08-10 22:07:28 +01:00
Kevin Butler
3180c5d554 Allow delimiting by comma in the language name 2015-06-10 15:37:31 +01:00
Brandon Keepers
924fddf698 Move Linguist::Language.detect to Linguist.detect 2015-04-17 14:56:08 +12:00
Brandon Keepers
8a42f76f03 Remove .script! hack 2015-04-17 14:09:05 +12:00
Arfon Smith
1da425ae2f Merge pull request #2162 from github/instrumentation
Add instrumentation to detection and classification
2015-03-05 15:03:30 -06:00
Arfon Smith
a1010b8cf8 Actually return the strategy 2015-03-05 13:21:07 -06:00
Brandon Keepers
3dcdc11c1b Avoid passing block to detected instrumenter 2015-03-05 10:03:51 -08:00
Brandon Keepers
e8326529b5 Pass blob to instrumentation 2015-03-05 10:03:01 -08:00
Arfon Smith
4ef925d8be Merge pull request #2087 from pchaigno/case-sensitivity
Detection by extension made case-insensitive
2015-02-27 14:06:50 -06:00
Arfon Smith
9a86b9ad75 Instrument all calls and pass the blob, strategy and language candidates in the payload. 2015-02-26 15:27:33 -06:00
Charlie Somerville
fd7633518f add instrumentation to detection and classification 2015-02-25 12:34:07 +11:00
Adam Roben
2f5b49f4ae Merge pull request #2097 from github/detect-all-markup
Detect all markup languages when computing language statistics
2015-02-13 16:43:41 -05:00
Adam Roben
b2ee2cc7b8 Detect all markup languages when computing language statistics
Originally, only "programming" languages were included in repository
language statistics. In 33ebee0f6a we
started detecting a few selected "markup" languages as well. We didn't
include all "markup" languages because at the time formats like Markdown
and AsciiDoc were labeled as "markup" languages, and we thought that
including those prose (i.e., non-code) languages in repository
statistics on github.com was misleading for repositories that are
largely about code but also contain a lot of documentation (e.g.,
rails/rails).

This hand-picked set of whitelisted "markup" languages can cause strange
categorization for some repositories. For example, it includes CSS (and
some variants) but not HTML. This results in repositories that contain
the source code for a static website being classified as either a
JavaScript (programming) or CSS (markup) repository, with no mention of
HTML anywhere.

Fast-forward to today, and prose languages are no longer "markup"
languages; they're now "prose" languages. So now we can include all
"markup" languages in repository language statistics without worrying
about undesirable effects for documentation-heavy repositories.
2015-02-10 13:39:42 -05:00
Arfon Smith
4543c7a0b3 Use the shebang strategy first 2015-02-07 08:47:17 -06:00
Paul Chaignon
41e1b7bd4e Detection by extension made case-insensitive 2015-02-06 22:14:22 +01:00
Arfon Smith
e8e95f113c Modeline should come first (as it's an override) 2015-01-26 15:03:22 -06:00
Arfon Smith
e536eea5b6 Basic Vim modeline detection strategy 2015-01-26 14:22:09 -06:00
Paul Chaignon
5d0e9484ce Remove last mentions of lexer 2015-01-11 10:02:52 +01:00
Lars Brinkhoff
6ae39e50ae Fix #1731 to allow samples with multiple file extension segments. 2015-01-02 10:41:22 -05:00
Garen Torikian
4e5da23474 Add warn message indicating deprecation 2014-12-09 08:20:15 -08:00
Garen Torikian
ad778571a2 This reject is no longer necessary 2014-12-05 16:57:55 +02:00
Garen Torikian
ab61b06c34 Reject Ace modes that are lacking a mode 2014-12-05 16:25:14 +02:00
Brandon Keepers
878b321b89 Merge remote-tracking branch 'origin/master' into move-shebang
* origin/master:
  Tweak docs
2014-11-28 17:41:10 -06:00
Brandon Keepers
577fb95384 Tweak docs 2014-11-28 17:36:14 -06:00
Brandon Keepers
9020d7c044 Deprecate find_by_shebang
This class doesn’t need to know about shebangs.
2014-11-27 13:18:51 -05:00
Brandon Keepers
fd85f7f112 consolidate shebang logic 2014-11-27 12:18:23 -05:00
Brandon Keepers
bf4baff363 Move call method into existing Classifier class 2014-11-27 11:29:38 -05:00
Brandon Keepers
c1a9737313 Try strategies until one language is returned 2014-11-27 11:12:47 -05:00
Brandon Keepers
a4081498f8 Remove unneeded empty blob check 2014-11-27 10:55:03 -05:00
Brandon Keepers
9efd923382 Merge remote-tracking branch 'origin/master' into strategies
* origin/master: (165 commits)
  Add F# and GLSL samples.  Add Forth and GLSL extension .fs. Add heuristic to disambiguate between F#, Forth, and GLSL.
  byebug requires ruby 2.0
  Remove test for removed extension
  Fix typo in test
  add rake interpreter
  add python3 interpreter
  Remove old wrong_shebang.rb sample
  Add byebug
  Link to Lightshow in CONTRIBUTING.md
  Switch to a better F# grammar
  Bump Rugged again
  Checkout the master for testing
  Rugged 0.22.0b3
  Reordering
  Bump version to 4.0.3
  Add some docs for tm_scope
  Change NONE to none
  Checking other case for Chart.jS
  Test that all languages have grammars
  Fix RHTML's tm_scope
  ...

Conflicts:
	lib/linguist/language.rb
2014-11-27 10:52:44 -05:00
Arfon Smith
412af86cb8 Merge pull request #1538 from github/1233-local
Detection based on the shebang (updated)
2014-11-26 14:47:12 -06:00
Brandon Keepers
6a4bf3fa65 Merge pull request #1731 from github/multiple-ext-segments
Support for multiple file extension segments
2014-11-26 15:09:15 -05:00
Arfon Smith
208a3ff480 Merge branch 'master' into 1233-local
Conflicts:
	lib/linguist/language.rb
2014-11-25 17:04:43 -06:00
Brandon Keepers
d7fd12cb32 Remove deprecated method 2014-11-18 15:19:23 -05:00
Brandon Keepers
850ab6dedb #all_extensions already includes primary extension 2014-11-18 15:10:07 -05:00
Brandon Keepers
757801e32f Merge remote-tracking branch 'origin/master' into filename-matches-multiple-langages
* origin/master:
  Allow mime-types 2.x to be used with Linguist
  Upgrade to rugged 0.22.0b1
  Mention that languages need to be quite popular
  fix vendor/cache
  Gemfile.lock is no longer considered generated
  Tests for BlobHelper#empty?
  remove reference to empty.js
  Remove more empty samples
  Bail earlier if the file is empty.
  Moving comments
  Use heuristics earlier to inform the rest of the classification process
  Removing inconsistency of `find_by_heuristics` (was sometimes returning nil and sometimes returning an empty array)
  Removing unused array of candidate languages.
  Reworking most heuristics to only return one match
2014-11-18 14:09:15 -05:00
Arfon Smith
0443c4db2d Merge pull request #1674 from github/rework-heuristics
Rework heuristics
2014-11-18 10:43:01 -06:00
Brandon Keepers
cd7549390e Extensions aren't actually required 2014-11-17 20:00:09 -05:00
Brandon Keepers
6c106b88c0 Avoid using singular #extension 2014-11-17 15:47:21 -05:00
Brandon Keepers
c46667581d Use the first extension with languages defined 2014-11-17 15:15:39 -05:00
Brandon Keepers
3ca872cea8 Support for multiple file extension segments 2014-11-17 14:54:22 -05:00
Vicent Marti
4a10b27611 Remove Pygments 2014-11-14 17:37:12 +01:00
Adam Roben
160598b9ef Make it safe to pass nil to Language.find_by_name/alias again
This restores compatibility with v3.4.x.
2014-11-10 15:12:29 -05:00