Some README updates

Ted Nyman
2013-12-15 12:25:47 -08:00
parent 3bc17e822d
commit 0c668ee179

@@ -10,7 +10,11 @@ Linguist defines the list of all languages known to GitHub in a [yaml file](http
 Most languages are detected by their file extension. This is the fastest and most common situation.
-For disambiguating between files with common extensions, we use a [Bayesian classifier](https://github.com/github/linguist/blob/master/lib/linguist/classifier.rb). For an example, this helps us tell the difference between `.h` files which could be either C, C++, or Obj-C.
+For disambiguating between files with common extensions, we first apply
+some common-sense heuristics to pick out obvious languages. After that, we use a
+[Bayesian
+classifier](https://github.com/github/linguist/blob/master/lib/linguist/classifier.rb).
+For an example, this process helps us tell the difference between `.h` files which could be either C, C++, or Obj-C.
 In the actual GitHub app we deal with `Grit::Blob` objects. For testing, there is a simple `FileBlob` API.
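
The detection order described in this hunk (extension lookup, then heuristics, then a Bayesian classifier) can be sketched roughly as follows. This is a toy illustration, not Linguist's actual code: the `EXTENSIONS`, `HEURISTICS`, and `TOKEN_COUNTS` tables and the `detect` and `classify` methods are all hypothetical names with made-up data, and the classifier is a plain naive Bayes with uniform priors and add-one smoothing.

```ruby
# Hypothetical three-stage detector. Every name and number here is
# illustrative only and is NOT part of Linguist's real API.

# Stage 1: candidate languages per file extension.
EXTENSIONS = {
  ".rb" => ["Ruby"],
  ".py" => ["Python"],
  ".h"  => ["C", "C++", "Objective-C"]
}.freeze

# Stage 2: cheap pattern checks that settle the obvious cases.
HEURISTICS = {
  "Objective-C" => /^\s*@(interface|implementation|protocol|end)\b/,
  "C++"         => /^\s*(class\s+\w+|namespace\s+\w+|template\s*<)/
}.freeze

# Stage 3: toy training data, token counts per language.
TOKEN_COUNTS = {
  "C"           => { "struct" => 8, "typedef" => 6, "void" => 5 },
  "C++"         => { "std" => 9, "class" => 7, "void" => 4 },
  "Objective-C" => { "NSString" => 9, "id" => 5, "void" => 3 }
}.freeze

# Naive Bayes with uniform priors and add-one (Laplace) smoothing.
# Returns the candidate with the highest log-likelihood for the tokens.
def classify(data, candidates)
  tokens = data.scan(/\w+/)
  vocab  = TOKEN_COUNTS.values.flat_map(&:keys).uniq.size
  candidates.max_by do |lang|
    counts = TOKEN_COUNTS.fetch(lang, {})
    total  = counts.values.sum
    tokens.sum { |t| Math.log((counts.fetch(t, 0) + 1.0) / (total + vocab)) }
  end
end

# Runs the stages in order and stops at the first one that decides.
def detect(filename, data)
  candidates = EXTENSIONS.fetch(File.extname(filename), [])
  # An unambiguous (or unknown) extension needs no further work.
  return candidates.first if candidates.size <= 1

  # Ambiguous extension: try the heuristics before the classifier.
  candidates.each do |lang|
    pattern = HEURISTICS[lang]
    return lang if pattern && data =~ pattern
  end

  classify(data, candidates)
end
```

With these toy tables, `detect("foo.h", "@interface Foo : NSObject\n@end\n")` is settled by the heuristic stage, while a header containing only `typedef struct` declarations falls through to `classify`. Running the heuristics first keeps the classifier, which is the slowest stage, for the cases that are actually ambiguous.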