mirror of
https://github.com/KevinMidboe/linguist.git
synced 2026-01-06 01:15:33 +00:00
Exclude documentation files from language statistics
Documentation is an important part of a software project but is not generally thought of as part of the code for that project. Repository language statistics are used to quantify the project's code, so it makes sense to exclude documentation from those computations. Documentation files are recognized similarly to vendored files. lib/linguist/documentation.yml contains regular expressions to match common names for documentation files. A new linguist-documentation Git attribute can be used to override those conventions.
This commit is contained in:
@@ -236,6 +236,21 @@ module Linguist
|
||||
name =~ VendoredRegexp ? true : false
|
||||
end
|
||||
|
||||
documentation_paths = YAML.load_file(File.expand_path("../documentation.yml", __FILE__))
|
||||
DocumentationRegexp = Regexp.new(documentation_paths.join('|'))
|
||||
|
||||
# Public: Is the blob in a documentation directory?
|
||||
#
|
||||
# Documentation files are ignored by language statistics.
|
||||
#
|
||||
# See "documentation.yml" for a list of documentation conventions that match
|
||||
# this pattern.
|
||||
#
|
||||
# Return true or false
|
||||
def documentation?
|
||||
name =~ DocumentationRegexp ? true : false
|
||||
end
|
||||
|
||||
# Public: Get each line of data
|
||||
#
|
||||
# Requires Blob#data
|
||||
|
||||
18
lib/linguist/documentation.yml
Normal file
18
lib/linguist/documentation.yml
Normal file
@@ -0,0 +1,18 @@
|
||||
# Documentation files and directories are excluded from language
|
||||
# statistics.
|
||||
#
|
||||
# Lines in this file are Regexps that are matched against the file
|
||||
# pathname.
|
||||
#
|
||||
# Please add additional test coverage to
|
||||
# `test/test_blob.rb#test_documentation` if you make any changes.
|
||||
|
||||
## Documentation Conventions ##
|
||||
|
||||
- ^docs?/
|
||||
- ^Documentation/
|
||||
|
||||
- (^|/)CONTRIBUTING(\.|$)
|
||||
- (^|/)COPYING(\.|$)
|
||||
- (^|/)LICEN[CS]E(\.|$)
|
||||
- (^|/)README(\.|$)
|
||||
@@ -4,7 +4,7 @@ require 'rugged'
|
||||
|
||||
module Linguist
|
||||
class LazyBlob
|
||||
GIT_ATTR = ['linguist-language', 'linguist-vendored']
|
||||
GIT_ATTR = ['linguist-documentation', 'linguist-language', 'linguist-vendored']
|
||||
GIT_ATTR_OPTS = { :priority => [:index], :skip_system => true }
|
||||
GIT_ATTR_FLAGS = Rugged::Repository::Attributes.parse_opts(GIT_ATTR_OPTS)
|
||||
|
||||
@@ -37,6 +37,14 @@ module Linguist
|
||||
end
|
||||
end
|
||||
|
||||
def documentation?
|
||||
if attr = git_attributes['linguist-documentation']
|
||||
boolean_attribute(attr)
|
||||
else
|
||||
super
|
||||
end
|
||||
end
|
||||
|
||||
def language
|
||||
return @language if defined?(@language)
|
||||
|
||||
|
||||
@@ -159,7 +159,7 @@ module Linguist
|
||||
blob = Linguist::LazyBlob.new(repository, delta.new_file[:oid], new, mode.to_s(8))
|
||||
|
||||
# Skip vendored or generated blobs
|
||||
next if blob.vendored? || blob.generated? || blob.language.nil?
|
||||
next if blob.vendored? || blob.documentation? || blob.generated? || blob.language.nil?
|
||||
|
||||
if DETECTABLE_TYPES.include?(blob.language.type)
|
||||
file_map[new] = [blob.language.group.name, blob.size]
|
||||
|
||||
Reference in New Issue
Block a user