mirror of
https://github.com/KevinMidboe/linguist.git
synced 2026-01-08 10:25:31 +00:00
Some README updates
@@ -10,7 +10,11 @@ Linguist defines the list of all languages known to GitHub in a [yaml file](http

 Most languages are detected by their file extension. This is the fastest and most common situation.

-For disambiguating between files with common extensions, we use a [Bayesian classifier](https://github.com/github/linguist/blob/master/lib/linguist/classifier.rb). For an example, this helps us tell the difference between `.h` files which could be either C, C++, or Obj-C.
+For disambiguating between files with common extensions, we first apply
+some common-sense heuristics to pick out obvious languages. After that, we use a
+[Bayesian
+classifier](https://github.com/github/linguist/blob/master/lib/linguist/classifier.rb).
+For example, this process helps us tell the difference between `.h` files, which could be either C, C++, or Obj-C.

 In the actual GitHub app we deal with `Grit::Blob` objects. For testing, there is a simple `FileBlob` API.

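
The detection order described in this change (extension first, then heuristics, then a classifier) can be sketched roughly as below. This is a minimal illustration only, not Linguist's implementation: `CANDIDATES`, `HEURISTICS`, `TOKEN_COUNTS`, `classify`, and `detect_language` are hypothetical names, the regular expressions are guesses at what a "common-sense" heuristic might look like, and the token counts are made-up toy data standing in for a trained Bayesian classifier.

```ruby
# Hypothetical sketch of staged language detection. None of these names
# or data come from Linguist; they only illustrate the order of the stages.

# Stage 1: extension => candidate languages. Most extensions are unambiguous.
CANDIDATES = {
  ".rb" => ["Ruby"],
  ".py" => ["Python"],
  ".h"  => ["C", "C++", "Objective-C"]
}.freeze

# Stage 2: cheap heuristics for ambiguous extensions (assumed patterns).
# Each lambda returns a language name, or nil when it cannot decide.
HEURISTICS = {
  ".h" => ->(source) {
    if source.match?(/^\s*@(interface|protocol|end)\b/)
      "Objective-C"
    elsif source.match?(/^\s*(template\s*<|namespace\s+\w+|class\s+\w+)/)
      "C++"
    end
  }
}.freeze

# Stage 3: toy training data for a naive Bayes style scorer (invented counts).
TOKEN_COUNTS = {
  "C"           => { "malloc" => 3, "struct" => 3, "printf" => 2 },
  "C++"         => { "std" => 4, "class" => 3, "template" => 2 },
  "Objective-C" => { "NSString" => 4, "interface" => 3, "id" => 1 }
}.freeze

# Score each candidate by summed log-likelihood of the tokens, using
# add-one smoothing and uniform priors, and return the best candidate.
def classify(source, candidates)
  tokens = source.scan(/\w+/)
  vocab_size = TOKEN_COUNTS.values.flat_map(&:keys).uniq.size
  candidates.max_by do |lang|
    counts = TOKEN_COUNTS.fetch(lang, {})
    total = counts.values.sum
    tokens.sum { |t| Math.log((counts.fetch(t, 0) + 1.0) / (total + vocab_size)) }
  end
end

# Run the stages in order and stop at the first one that gives an answer.
def detect_language(filename, source)
  ext = File.extname(filename)
  candidates = CANDIDATES.fetch(ext, [])
  return candidates.first if candidates.size <= 1 # nil for unknown extensions

  heuristic = HEURISTICS[ext]
  guess = heuristic && heuristic.call(source)
  return guess if guess

  classify(source, candidates)
end
```

With this toy data, `detect_language("foo.h", "@interface Foo\n@end\n")` is settled by the heuristic stage, while a header containing only `malloc` and `struct` usage falls through to the classifier stage. Running the cheap stages first means the more expensive classifier is only consulted for files that are still ambiguous.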