mirror of
https://github.com/KevinMidboe/linguist.git
synced 2025-10-29 17:50:22 +00:00
More documentation
@@ -14,7 +14,7 @@ For disambiguating between files with common extensions, we first apply
some common-sense heuristics to pick out obvious languages. After that, we use a
[Bayesian
classifier](https://github.com/github/linguist/blob/master/lib/linguist/classifier.rb).
For an example, this process us tell the difference between `.h` files which could be either C, C++, or Obj-C.
For an example, this process can help us tell the difference between `.h` files which could be either C, C++, or Obj-C.

In the actual GitHub app we deal with `Grit::Blob` objects. For testing, there is a simple `FileBlob` API.
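The two-stage process described in the README text above (cheap heuristics first, a statistical classifier as the fallback) can be sketched roughly as follows. This is an illustration only: the `Sketch` module, its method names, and the `MARKERS` table are hypothetical and are not part of Linguist, and the marker-counting scorer is a simplified stand-in for a trained Bayesian classifier.

```ruby
# Hypothetical stand-ins for illustration; this is not Linguist's API.
module Sketch
  # Assumed marker strings per language, used only by the toy scorer below.
  MARKERS = {
    "C"           => ["malloc", "typedef struct"],
    "C++"         => ["class ", "namespace", "template"],
    "Objective-C" => ["@interface", "@property", "#import"]
  }.freeze

  # Stage 1: a cheap heuristic. Returns an array of candidate names,
  # which is empty when the heuristic has nothing to say.
  def self.heuristics(data, candidates)
    if candidates.include?("Objective-C") && data.include?("@interface")
      ["Objective-C"]
    else
      []
    end
  end

  # Stage 2: placeholder for the classifier. It scores each candidate by
  # counting how many of its marker strings occur in the data. A real
  # Bayesian classifier would use trained token probabilities instead.
  def self.classify(data, candidates)
    candidates.max_by do |name|
      MARKERS.fetch(name, []).count { |marker| data.include?(marker) }
    end
  end

  # Try the heuristics first and fall back to the classifier only when
  # they do not single out exactly one language.
  def self.detect(data, candidates)
    matches = heuristics(data, candidates)
    return matches.first if matches.size == 1
    classify(data, candidates)
  end
end

candidates = ["C", "C++", "Objective-C"]
puts Sketch.detect("@interface Foo : NSObject\n@end", candidates)
puts Sketch.detect("namespace foo { class Bar {}; }", candidates)
```

The point of the ordering is that an unambiguous marker such as `@interface` short-circuits the more expensive statistical step.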
@@ -1,16 +1,14 @@
require 'linguist/tokenizer'

module Linguist
  # A collection of simple heuristics that can be used to better analysis languages.
  class Heuristics
    # Public: Given an array of String language names, a
    # apply all heuristics against the given data and return an array
    # Public: Given an array of String language names,
    # apply heuristics against the given data and return an array
    # of matching languages, or nil.
    #
    # data - Array of tokens or String data to analyze.
    # languages - Array of language name Strings to restrict to.
    #
    # Returns an array of language name Strings, or []
    # Returns an array of Languages or []
    def self.find_by_heuristics(data, languages)
      if languages.all? { |l| ["Objective-C", "C++"].include?(l) }
        disambiguate_h(data, languages)
@@ -19,6 +17,8 @@ module Linguist

    # .h extensions are ambigious between C, C++, and Objective-C.
    # We want to shortcut look for Objective-C.
    #
    # Returns an array of Languages or []
    def self.disambiguate_h(data, languages)
      matches = []
      matches << Language["Objective-C"] if data.include?("@interface")
@@ -26,4 +26,3 @@ module Linguist
    end
  end
end
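To see how the guard in `find_by_heuristics` behaves, here is a small standalone sketch modelled on the lines visible in the diff. It is not the actual class: `HeuristicsSketch` is a hypothetical name, language names are plain Strings rather than `Language` objects, and the `else` branch is an assumption because the diff does not show what happens when the guard fails.

```ruby
# Hypothetical stand-in for the class in the diff. Language names are plain
# Strings here instead of Linguist's Language objects.
module HeuristicsSketch
  H_CANDIDATES = ["Objective-C", "C++"].freeze

  def self.find_by_heuristics(data, languages)
    # Same guard shape as in the diff: every candidate must be in the list.
    if languages.all? { |l| H_CANDIDATES.include?(l) }
      disambiguate_h(data, languages)
    else
      [] # assumption: the diff does not show this branch
    end
  end

  # Looks only for the Objective-C marker, as the visible lines do.
  def self.disambiguate_h(data, _languages)
    matches = []
    matches << "Objective-C" if data.include?("@interface")
    matches
  end
end

p HeuristicsSketch.find_by_heuristics("@interface Foo", ["Objective-C", "C++"])
p HeuristicsSketch.find_by_heuristics("@interface Foo", ["C", "Objective-C"])
```

Because the guard uses `all?`, a candidate list that also contains `"C"` skips the `.h` heuristic entirely in this sketch, even when the data contains `@interface`.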