21 Commits

Author SHA1 Message Date
John Gardner
e7b5e25bf8 Add support for regular expression data (#3441) 2017-02-02 11:08:20 -08:00
Danila Malyutin
04c268e535 Add mysql extension for sql scripts (#3413) 2017-01-03 11:47:19 -08:00
John Gardner
95e83311b6 Add "README.1ST" to recognised text-file names (#3010)
* Add "README.1ST" as a recognised readme name

* Add a fixture for ".1st" readme files
2016-05-22 09:03:21 -05:00
Arfon Smith
f2694f3a74 Merge branch 'master' into 2427-local 2016-03-09 19:49:32 -06:00
Lars Brinkhoff
b032886c21 Add .me and other text filenames.
click.me by Bram Moolenaar; VIM license.
2016-03-09 13:27:59 +01:00
Paul Chaignon
eeedd53f32 Support for Text extension .no (Norwegian text) 2016-03-09 10:38:47 +01:00
Paul Chaignon
11a3b5b73c Support for Text extension .nb (Norwegian text) 2016-03-09 10:37:41 +01:00
rpavlick
2d392581e2 adding NCL language 2015-07-09 07:17:01 -07:00
Lars Brinkhoff
3957a11f25 Add to sample to show that a false positive goes away. 2015-01-08 19:35:02 +01:00
Paul Chaignon
db70630eaa Renamed text in Text 2014-12-11 12:51:09 -05:00
Paul Chaignon
93186947c2 Move binaries and text files from samples folder to fixtures 2014-12-04 23:48:05 -05:00
Brian Lopez
bc04232f87 add the fixture 2014-06-07 15:32:29 -05:00
Brian Lopez
203f6d1944 forgot to add the test fixture 2014-06-04 17:15:33 -05:00
Andy Lindeman
aa5a94cc3e Handle case where newline chars don't transcode to detected encoding
We've seen cases where binary files are detected as encodings such as
ISO-8859-8-I. This usually happens when the binary files are short, so
while the detector is mistaken, there is also not very much data for use
in the detection algorithm in the first place so it's understandable
that the detector was wrong.

In these cases, the code to convert ASCII newline characters to
encodings such as ISO-8859-8-I fails because there is no conversion
between them.

We now simply assume that the data is all one line in those cases. In
reality the data is binary, but this is obviously difficult to detect
reliably.
2014-06-03 12:26:23 -04:00
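
The fallback described in the commit message above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the function name `split_lines` and the placeholder encoding name `x-no-such-encoding` are assumptions made for the example, and the failure is simulated with an encoding Python cannot look up rather than with ISO-8859-8-I itself.

```python
from typing import List


def split_lines(data: bytes, encoding: str) -> List[bytes]:
    """Split raw bytes into lines using a newline encoded in `encoding`.

    Hypothetical helper: if the newline character cannot be converted
    to the detected encoding, assume the data is a single line.
    """
    try:
        # Convert the ASCII newline into the detected encoding.
        newline = "\n".encode(encoding)
    except (LookupError, UnicodeEncodeError):
        # No conversion exists (the detector was probably wrong and the
        # data is binary), so treat everything as one line.
        return [data]
    # Simplification: a plain byte-level split that ignores character
    # alignment in multibyte encodings.
    return data.split(newline)


if __name__ == "__main__":
    # A multibyte encoding where the newline is two bytes long.
    print(split_lines("a\nb".encode("utf-16-le"), "utf-16-le"))
    # An encoding with no usable conversion: the data stays one line.
    print(split_lines(b"\x00\x01\n\x02", "x-no-such-encoding"))
```

Returning the whole input as one line keeps line counting from raising an error on short binary files, at the cost of reporting a line count that has no real meaning for them.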
Andy Lindeman
85efbde3f7 Counts the number of lines correctly for files with certain multibyte encodings 2014-05-21 13:36:39 -04:00
Yaroslav Shirokov
b68732f0c7 Add detection for CSV 2013-04-04 14:01:09 -07:00
Mike Skalnik
5ea039a74e Remove OBJ files as supported solids 2013-02-26 14:00:29 -08:00
Mike Skalnik
041ab041ae Add binary & ascii STLs and OBJs 2013-01-17 14:15:01 -08:00
Ryan Tomayko
bda895eaae Test Mac Format detection and line splitting 2012-09-10 01:52:30 -07:00
Joshua Peek
16a67cb852 Move shebang detection into classifier
Fixes #203
2012-08-03 15:07:36 -05:00
Joshua Peek
7b6caa0f6c Rename samples subdirectories 2012-07-23 15:52:49 -05:00