Some README updates

Ted Nyman
2013-12-15 12:25:47 -08:00
parent 3bc17e822d
commit 0c668ee179


@@ -10,7 +10,11 @@ Linguist defines the list of all languages known to GitHub in a [yaml file](http
Most languages are detected by their file extension. This is the fastest and most common situation.
For disambiguating between files with common extensions, we use a [Bayesian classifier](https://github.com/github/linguist/blob/master/lib/linguist/classifier.rb). For an example, this helps us tell the difference between `.h` files which could be either C, C++, or Obj-C.
For disambiguating between files with common extensions, we first apply
some common-sense heuristics to pick out obvious languages. After that, we use a
[Bayesian
classifier](https://github.com/github/linguist/blob/master/lib/linguist/classifier.rb).
For example, this process helps us tell the difference between `.h` files, which could be C, C++, or Obj-C.
In the actual GitHub app we deal with `Grit::Blob` objects. For testing, there is a simple `FileBlob` API.
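
The detection order described in the updated README text (extension first, then heuristics, then a classifier) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Linguist's implementation: every name here (`EXTENSIONS`, `TOKEN_HINTS`, `heuristic_guess`, `classifier_guess`, `detect_language`) is hypothetical, the regexes are made-up examples of "obvious" markers, and the final step is a simple token-count stand-in rather than a real Bayesian classifier.

```ruby
# Hypothetical sketch of a three-stage detection pipeline.
# None of these constants or methods are part of Linguist's API.

# Stage 1: map file extensions to candidate languages (assumed sample data).
EXTENSIONS = {
  ".rb" => ["Ruby"],
  ".h"  => ["C", "C++", "Objective-C"]
}.freeze

# Assumed hint tokens used by the toy scorer in stage 3.
TOKEN_HINTS = {
  "C"           => %w[struct typedef malloc],
  "C++"         => %w[std class template],
  "Objective-C" => %w[NSObject NSString id]
}.freeze

# Stage 2: heuristics that pick out obvious languages.
# Returns a language name, or nil when nothing obvious is found.
def heuristic_guess(source, candidates)
  if candidates.include?("Objective-C") && source =~ /^\s*@(interface|protocol|end)\b/
    return "Objective-C"
  end
  if candidates.include?("C++") && source =~ /^\s*(class\s+\w+|template\s*<|namespace\s+\w+)/
    return "C++"
  end
  nil
end

# Stage 3: toy stand-in for a classifier. It scores each candidate by how
# many source tokens appear in that language's hint list. A real Bayesian
# classifier would use probabilities learned from sample files.
def classifier_guess(source, candidates)
  tokens = source.scan(/\w+/)
  candidates.max_by do |lang|
    hints = TOKEN_HINTS.fetch(lang, [])
    tokens.count { |token| hints.include?(token) }
  end
end

# Runs the stages in order and returns a language name (or nil if the
# extension is unknown).
def detect_language(filename, source)
  candidates = EXTENSIONS.fetch(File.extname(filename), [])
  return candidates.first if candidates.size <= 1

  heuristic_guess(source, candidates) || classifier_guess(source, candidates)
end
```

With this sketch, a `.rb` file resolves at the extension stage, while a `.h` file falls through to the heuristics and only reaches the scoring step when no obvious marker is present.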