295 Commits

Author SHA1 Message Date
Joshua Peek
7292bdc180 Change Classifier to accept language name Strings 2012-07-20 15:52:27 -05:00
Joshua Peek
bbc5225086 Pending samples work now 2012-07-20 15:36:48 -05:00
Joshua Peek
2637d8dc55 Add tokenize helper to Tokenize class 2012-07-20 15:14:58 -05:00
Joshua Peek
175d4244c2 Extract single and multi line comment parser 2012-07-20 15:06:21 -05:00
Joshua Peek
d063089430 Add coq comments 2012-07-20 14:45:19 -05:00
Joshua Peek
5521dd08a0 Move test fixtures to samples/ 2012-06-22 10:09:24 -05:00
Joshua Peek
2b712dc790 Guard against classify nil data 2012-06-21 11:47:32 -05:00
Joshua Peek
540f2a0941 More matlab samples 2012-06-21 10:44:31 -05:00
Joshua Peek
497da86262 Strip tex and matlab leading inline comments 2012-06-21 10:38:28 -05:00
Joshua Peek
4b9b8a5058 Remove matlab file with bogus keywords 2012-06-21 10:25:30 -05:00
Joshua Peek
5cdd5e206c Improve operator tokenizing 2012-06-20 17:16:53 -05:00
Joshua Peek
516a220d9f Verify classifer counts 2012-06-20 15:48:46 -05:00
Joshua Peek
f68e94f181 Skip number literals 2012-06-20 11:26:14 -05:00
Joshua Peek
e9eae4e008 Skip pending tests 2012-06-20 11:19:02 -05:00
Joshua Peek
e33d8f3685 Merge branch 'master' into bayesian 2012-06-20 11:18:47 -05:00
Joshua Peek
a10e52a3c2 Revert removing some fixtures 2012-06-20 11:18:16 -05:00
Joshua Peek
645a87d02b Remove dead fixture test 2012-06-19 16:34:13 -05:00
Joshua Peek
c114d710f8 Test classifier on ambiguous languages 2012-06-19 16:32:56 -05:00
Joshua Peek
c804d04072 Merge branch 'master' into bayesian 2012-06-19 16:29:01 -05:00
Joshua Peek
6113e6d548 Remove ambiguous obj-c header example 2012-06-19 16:28:34 -05:00
Joshua Peek
fdd81ce0be Merge branch 'master' into bayesian 2012-06-19 16:26:43 -05:00
Joshua Peek
4ea1e8aece Remove ambiguous c header example 2012-06-19 16:26:39 -05:00
Joshua Peek
fcd8c089dc Add some more c header examplesgst 2012-06-19 16:25:09 -05:00
Joshua Peek
9d555862c3 Merge branch 'master' into bayesian 2012-06-19 15:02:02 -05:00
Joshua Peek
79a473cf58 Add some more apex and openedge fixtures 2012-06-19 15:01:58 -05:00
Joshua Peek
ddf3ec4a5b Warn if classifier instance is out of date 2012-06-19 14:32:04 -05:00
Joshua Peek
d566b35020 Allow classifer languages to be scoped 2012-06-19 14:21:42 -05:00
Joshua Peek
8f85a447de Allow tokens to be passed directly to classify 2012-06-19 14:17:27 -05:00
Joshua Peek
d5fa8cbcb7 Refactor tokenizer test helper 2012-06-19 13:12:17 -05:00
Joshua Peek
ecb2397e59 Merge branch 'master' into bayesian 2012-06-19 11:43:48 -05:00
Michael Ficarra
93d0611b4e accidental hard tabs 2012-06-19 11:30:39 -05:00
Michael Ficarra
11166911dc Recognise that PEG.js-generated parsers are in fact generated 2012-06-19 11:18:51 -05:00
Joshua Peek
8a75d4d208 GC classifier db 2012-06-08 16:04:43 -05:00
Joshua Peek
62498cf0e9 Merge branch 'master' into bayesian 2012-06-08 15:46:48 -05:00
Joshua Peek
8a9d8a15af Building an army 2012-06-08 15:46:39 -05:00
Joshua Peek
6f6dd8bc38 Improve tokenizing sgml tags 2012-06-08 14:46:16 -05:00
Joshua Peek
9ecab364d1 Dump classifier results 2012-06-08 14:13:26 -05:00
Joshua Peek
0172623061 Add sample gathering class 2012-06-08 13:51:49 -05:00
Joshua Peek
e0c777d995 Fix test name 2012-06-08 13:43:37 -05:00
Joshua Peek
f747b49347 Add simple classifier 2012-06-07 17:10:28 -05:00
Joshua Peek
e0cbe815a3 Add basic Tokenizer 2012-06-07 14:55:11 -05:00
Joshua Peek
4df3199818 Reorg test fixtures 2012-06-07 12:17:24 -05:00
Joshua Peek
a708993388 Ensure all languages have unique primary extensions 2012-06-07 10:29:19 -05:00
Rob Sanheim
1c7b8ebe71 Make colorize safer:
  - don't try to colorize blobs that have a high ratio of
    long lines -- these are most likely minified files or something else
    strange that will blow up Pygments.rb
  - re github/github#3938
2012-05-21 11:33:01 -05:00
Joshua Peek
285c9b4c60 Fix xslt mime type 2012-05-09 10:59:00 -05:00
Joshua Peek
2cbf428176 Use XSLT lexer 2012-05-09 10:41:11 -05:00
Joshua Peek
35fe44549e Fix empty .m file 2012-05-09 09:52:14 -05:00
Andrew D. Horchler
354e1fc85e More robust heuristics for .m files and 3 new Matlab tests. Support for Obj-C detection fully intact; all tests pass. Detection of Obj-C keywords @implementation, @property, @interface, and @synthesize removed to avoid possible conflicts with user-created Matlab function handles. Only @end is needed, which is not valid in Matlab. Matlab class files supported. Comments preceded by whitespace detected for Obj-C and Matlab.
Signed-off-by: Andrew D. Horchler <adh9@case.edu>
2012-05-08 18:31:18 -04:00
Daniel Micay
be42a8411b add detection for Arch Linux PKGBUILDs 2012-05-08 07:39:48 -04:00
Joshua Peek
aa7c8497b1 Merge pull request #151 from aseemk/streamlinejs-support
Add ._js & ._coffee extensions for Streamline.js.
2012-05-07 19:44:14 -07:00