Commit Graph

163 Commits

Author SHA1 Message Date
Arfon Smith
94be1ab277 documentation? should use path too 2015-02-24 15:20:58 -06:00
Arfon Smith
8561f2a6b0 Merge branch 'master' into revert-2014-revert-1976-path-for-fileblob
Conflicts:
	lib/linguist/version.rb
2015-02-24 14:54:14 -06:00
Adam Roben
6a86e8ea97 Add BlobHelper#include_in_language_stats?
This just extracts some logic from Repository#compute_stats and makes it
testable.
2015-02-13 14:27:20 -05:00
Adam Roben
066052ddd2 Exclude documentation files from language statistics
Documentation is an important part of a software project but is not
generally thought of as part of the code for that project. Repository
language statistics are used to quantify the project's code, so it makes
sense to exclude documentation from those computations.

Documentation files are recognized similarly to vendored files.
lib/linguist/documentation.yml contains regular expressions to match
common names for documentation files. A new linguist-documentation Git
attribute can be used to override those conventions.
2015-02-12 10:20:47 -05:00
Arfon Smith
01be9e68ee Revert "Revert "Use path for Generated?"" 2015-01-20 14:34:36 -06:00
Arfon Smith
36120a9122 Revert "Use path for Generated?" 2015-01-20 08:58:11 -06:00
Arfon Smith
f4c1cc576b Modifying BlobHelper and FileBlob to use path 2015-01-09 15:15:34 -06:00
Arfon Smith
ec28ea299f Use path for Generated? 2015-01-08 14:03:35 -06:00
Arfon Smith
0443c4db2d Merge pull request #1674 from github/rework-heuristics
Rework heuristics
2014-11-18 10:43:01 -06:00
Vicent Marti
4a10b27611 Remove Pygments 2014-11-14 17:37:12 +01:00
Brandon Keepers
df55043500 Bail earlier if the file is empty.
This will change behavior for empty files with unique extensions, returning nil instead of the language.
2014-11-06 14:49:24 -06:00
Vicent Marti
7dcc3b3edf Add tm_scope to the BlobHelper 2014-10-13 17:19:38 +02:00
Arfon Smith
61faea0298 Fixing up bin/linguist 2014-07-23 11:20:31 -05:00
Arfon Smith
8c7b54d6e3 Fixing BlobHelper loading issue 2014-07-22 12:26:21 -05:00
Vicent Marti
c4260ae681 Use Rugged when computing Repository stats 2014-06-24 17:41:16 +02:00
Brian Lopez
7e8be1293e Use the :ruby_encoding value from charlock 0.7.2 2014-06-04 15:51:33 -05:00
Andy Lindeman
aa5a94cc3e Handle case where newline chars don't transcode to detected encoding
We've seen cases where binary files are detected as encodings such as
ISO-8859-8-I. This usually happens when the binary files are short, so
while the detector is mistaken, there is also not very much data for use
in the detection algorithm in the first place so it's understandable
that the detector was wrong.

In these cases, the code to convert ASCII newline characters to
encodings such as ISO-8859-8-I fails because there is no conversion
between them.

We now simply assume that the data is all one line in those cases. In
reality the data is binary, but this obviously difficult to detect
reliably.
2014-06-03 12:26:23 -04:00
Andy Lindeman
09a33f8daa Takes a different approach 2014-05-21 15:11:06 -04:00
Andy Lindeman
185db0e8d5 Makes sure we do not fail if encoding == nil
It looks like it's valid to call this method even if `binary?` is true.
Encoding as 'ASCII-8BIT' should always succeed.
2014-05-21 13:36:39 -04:00
Andy Lindeman
85efbde3f7 Counts the number of lines correctly for files with certain multibyte encodings 2014-05-21 13:36:39 -04:00
Ted Nyman
69bfe73165 Not yet on the additional binary check 2014-02-16 19:43:33 -08:00
Ted Nyman
b0894e20ef Merge pull request #301 from andyli/binary
Do not detect language if it is a binary file.
2014-02-16 14:55:07 -08:00
Ted Nyman
7e178cc416 Place guards, checks for multiline shell hacks 2013-12-06 22:04:40 -08:00
Ted Nyman
f51c5e3159 Update documentation related to Pygments 2013-07-12 14:10:55 -07:00
Joshua Peek
032125b114 Axe indexable? 2013-06-10 11:06:18 -05:00
Joshua Peek
b1a137135e Axe colorize_without_wrapper 2013-06-10 10:58:33 -05:00
Joshua Peek
1a53d1973a ws 2013-06-10 10:39:59 -05:00
Joshua Peek
3e3fb0cdfe Say why 2013-06-09 21:02:55 -05:00
Joshua Peek
d907ab9940 Kill mac_format check, buggy 2013-06-09 21:02:11 -05:00
Joshua Peek
9c1d6e154c Always split lines on \n or \r 2013-06-09 21:01:03 -05:00
Joshua Peek
fa797df0c7 Note that BlobHelper is a turd 2013-06-09 20:51:26 -05:00
Joshua Peek
c7100be139 Make mac_format? private 2013-06-09 20:48:45 -05:00
Yaroslav Shirokov
b68732f0c7 Add detection for CSV 2013-04-04 14:01:09 -07:00
Garen Torikian
4148ff1c29 Add PDF detection 2013-03-25 15:45:58 -07:00
Ted Nyman
de94b85c0d Merge pull request #295 from yandy/patch-1
downcase extname when we determin whether it's a image
2013-03-10 15:39:55 -07:00
Pascal Borreli
70eafb2ffc Fixed typos 2013-03-03 21:26:31 +00:00
Mike Skalnik
1766123448 Fix typo in comment 2013-02-26 14:00:42 -08:00
Mike Skalnik
5ea039a74e Remove OBJ files as support solids 2013-02-26 14:00:29 -08:00
Ted Nyman
58a9b56f4d Merge pull request #253 from Tass/master
Binary mime type override if languages.yml says so
2013-02-21 21:49:09 -08:00
Mike Skalnik
041ab041ae Add binary & ascii STLs and OBJs 2013-01-17 14:15:01 -08:00
Andy Li
7c9e973082 Do not detect language if it is a binary file. 2012-11-26 21:54:43 +08:00
Michael Ding
97c998946b determine image with downcase extname 2012-11-22 20:30:59 +08:00
Michael Ding
8529c90a4d use downcase string for extname 2012-11-22 17:14:45 +08:00
Joshua Peek
31e33f99f2 Ensure lang is skipped on any binary file 2012-09-24 10:51:39 -05:00
Simon Hafner
b954d22eba Override for binary mime type based on languages.yml
If the extension already exists in languages.yml, it's probably not a
binary, but code.
2012-09-13 14:55:31 -05:00
Ryan Tomayko
887a050db9 Only search the first 4K chars for \r 2012-09-10 01:56:08 -07:00
Ryan Tomayko
2e49c06f47 Handle Mac Format when splitting lines 2012-09-10 01:05:48 -07:00
Scott J. Goldman
04394750e7 When testing if a blob is safe to colorize, check size first
Similar to e415a13
2012-09-02 00:08:37 -07:00
Scott J. Goldman
e415a1351b When testing if a blob is indexable, check size first
Otherwise, charlock_holmes will allocate another large binary
buffer for testing the encoding, which is a problem if the binary
blob is many hundreds of MB large. It'll just fail and crash ruby.
2012-08-31 22:47:19 -07:00
Joshua Peek
cfe496e9fc Drop mime type module
Closes #206
2012-08-20 11:40:32 -05:00