mirror of
https://github.com/KevinMidboe/linguist.git
synced 2025-10-29 17:50:22 +00:00
Merge pull request #3193 from Alhadis/grammar-scripts
Add script to add or replace grammars
@@ -17,7 +17,7 @@ To add support for a new extension:
|
||||
In addition, if this extension is already listed in [`languages.yml`][languages] then sometimes a few more steps will need to be taken:
|
||||
|
||||
0. Make sure that example `.yourextension` files are present in the [samples directory][samples] for each language that uses `.yourextension`.
|
||||
0. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.yourextension` files. (ping @arfon or @bkeepers to help with this) to ensure we're not misclassifying files.
|
||||
0. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.yourextension` files. (ping **@arfon** or **@bkeepers** to help with this) to ensure we're not misclassifying files.
|
||||
0. If the Bayesian classifier does a bad job with the sample `.yourextension` files then a [heuristic](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.rb) may need to be written to help.
|
||||
|
||||
|
||||
@@ -28,10 +28,7 @@ We try only to add languages once they have some usage on GitHub. In most cases
|
||||
To add support for a new language:
|
||||
|
||||
0. Add an entry for your language to [`languages.yml`][languages]. Omit the `language_id` field for now.
|
||||
0. Add a grammar for your language. Please only add grammars that have [one of these licenses](https://github.com/github/linguist/blob/257425141d4e2a5232786bf0b13c901ada075f93/vendor/licenses/config.yml#L2-L11).
|
||||
0. Add your grammar as a submodule: `git submodule add https://github.com/JaneSmith/MyGrammar vendor/grammars/MyGrammar`.
|
||||
0. Add your grammar to [`grammars.yml`][grammars] by running `script/convert-grammars --add vendor/grammars/MyGrammar`.
|
||||
0. Download the license for the grammar: `script/licensed`. Be careful to only commit the file for the new grammar, as this script may update licenses for other grammars as well.
|
||||
0. Add a grammar for your language: `script/add-grammar https://github.com/JaneSmith/MyGrammar`. Please only add grammars that have [one of these licenses][licenses].
|
||||
0. Add samples for your language to the [samples directory][samples] in the correct subdirectory.
|
||||
0. Add a `language_id` for your language using `script/set-language-ids`. **You should only ever need to run `script/set-language-ids --update`. Anything other than this risks breaking GitHub search :cry:**
|
||||
0. Open a pull request, linking to a [GitHub search result](https://github.com/search?utf8=%E2%9C%93&q=extension%3Aboot+NOT+nothack&type=Code&ref=searchresults) showing in-the-wild usage.
|
||||
@@ -39,7 +36,7 @@ To add support for a new language:
|
||||
In addition, if your new language defines an extension that's already listed in [`languages.yml`][languages] (such as `.foo`) then sometimes a few more steps will need to be taken:
|
||||
|
||||
0. Make sure that example `.foo` files are present in the [samples directory][samples] for each language that uses `.foo`.
|
||||
0. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.foo` files. (ping @arfon or @bkeepers to help with this) to ensure we're not misclassifying files.
|
||||
0. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.foo` files. (ping **@arfon** or **@bkeepers** to help with this) to ensure we're not misclassifying files.
|
||||
0. If the Bayesian classifier does a bad job with the sample `.foo` files then a [heuristic](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.rb) may need to be written to help.
|
||||
|
||||
Remember, the goal here is to try and avoid false positives!
|
||||
@@ -82,9 +79,9 @@ Here's our current build status: [
|
||||
- @larsbrinkhoff
|
||||
- @pchaigno
|
||||
- **@arfon** (GitHub Staff)
|
||||
- **@larsbrinkhoff**
|
||||
- **@pchaigno**
|
||||
|
||||
As Linguist is a production dependency for GitHub we have a couple of workflow restrictions:
|
||||
|
||||
@@ -113,5 +110,6 @@ If you are the current maintainer of this gem:
|
||||
|
||||
[grammars]: /grammars.yml
|
||||
[languages]: /lib/linguist/languages.yml
|
||||
[licenses]: https://github.com/github/linguist/blob/257425141d4e2a5232786bf0b13c901ada075f93/vendor/licenses/config.yml#L2-L11
|
||||
[samples]: /samples
|
||||
[new-issue]: https://github.com/github/linguist/issues/new
|
||||
|
||||
93
script/add-grammar
Executable file
@@ -0,0 +1,93 @@
|
||||
#!/usr/bin/env ruby
|
||||
|
||||
require "optparse"
|
||||
|
||||
ROOT = File.expand_path("../../", __FILE__)
|
||||
|
||||
|
||||
# Break a repository URL into its separate components
|
||||
def parse_url(input)
|
||||
hosts = 'github\.com|bitbucket\.org|gitlab\.com' # single quotes keep the backslashes, so each dot stays literal in the regexes below
|
||||
|
||||
# HTTPS/HTTP link pointing to recognised hosts
|
||||
if input =~ /^(?:https?:\/\/)?(?:[^.@]+@)?(?:www\.)?(#{hosts})\/([^\/]+)\/([^\/]+)/i
|
||||
{ host: $1.downcase(), user: $2, repo: $3.sub(/\.git$/, "") }
|
||||
# SSH
|
||||
elsif input =~ /^git@(#{hosts}):([^\/]+)\/([^\/]+)\.git$/i
|
||||
{ host: $1.downcase(), user: $2, repo: $3 }
|
||||
# provider:user/repo
|
||||
elsif input =~ /^(github|bitbucket|gitlab):\/?([^\/]+)\/([^\/]+)\/?$/i
|
||||
{ host: $1.downcase + ($1.downcase == "bitbucket" ? ".org" : ".com"), user: $2, repo: $3 } # expand the provider shorthand to a full host name
|
||||
# user/repo - Common GitHub shorthand
|
||||
elsif input =~ /^\/?([^\/]+)\/([^\/]+)\/?$/
|
||||
{ host: "github.com", user: $1, repo: $2 }
|
||||
else
|
||||
raise "Unsupported URL: #{input}"
|
||||
end
|
||||
end
|
||||
|
||||
# Isolate the vendor-name component of a submodule path
|
||||
def parse_submodule(name)
|
||||
name =~ /^(?:.*(?:vendor\/)?grammars\/)?([^\/]+)/i
|
||||
path = "vendor/grammars/#{$1}"
|
||||
unless File.exist?("#{ROOT}/" + path)
|
||||
warn "Submodule '#{path}' does not exist. Aborting."
|
||||
exit 1
|
||||
end
|
||||
path
|
||||
end
|
||||
|
||||
# Print debugging feedback to STDOUT if running with --verbose
|
||||
def log(msg)
|
||||
puts msg if $verbose
|
||||
end
|
||||
|
||||
|
||||
usage = """Usage:
|
||||
#{$0} [-v|--verbose] [--replace grammar] url
|
||||
Examples:
|
||||
#{$0} https://github.com/Alhadis/language-roff
|
||||
#{$0} --replace sublime-apl https://github.com/Alhadis/language-apl
|
||||
"""
|
||||
|
||||
$replace = nil
|
||||
$verbose = false
|
||||
|
||||
OptionParser.new do |opts|
|
||||
opts.banner = usage
|
||||
opts.on("-v", "--verbose", "Print verbose feedback to STDOUT") do
|
||||
$verbose = true
|
||||
end
|
||||
opts.on("-rSUBMODULE", "--replace=SUBMODULE", "Replace an existing grammar submodule.") do |name|
|
||||
$replace = name
|
||||
end
|
||||
end.parse!
|
||||
|
||||
|
||||
$url = ARGV[0]
|
||||
|
||||
# No URL? Print a usage message and bail.
|
||||
unless $url
|
||||
warn usage
|
||||
exit 1
|
||||
end
|
||||
|
||||
# Ensure the given URL is an HTTPS link
|
||||
parts = parse_url $url
|
||||
https = "https://#{parts[:host]}/#{parts[:user]}/#{parts[:repo]}"
|
||||
repo_new = "vendor/grammars/#{parts[:repo]}"
|
||||
repo_old = parse_submodule($replace) if $replace
|
||||
|
||||
if repo_old
|
||||
log "Deregistering: #{repo_old}"
|
||||
`git submodule deinit #{repo_old}`
|
||||
`git rm -rf #{repo_old}`
|
||||
end
|
||||
|
||||
log "Registering new submodule: #{repo_new}"
|
||||
`git submodule add -f #{https} #{repo_new}`
|
||||
exit 1 if $?.exitstatus > 0
|
||||
`script/convert-grammars --add #{repo_new}`
|
||||
|
||||
log "Confirming license"
|
||||
`script/licensed --module "#{repo_new}"`
|
||||
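The URL handling in `parse_url` above can be sketched as a standalone helper. This is a minimal illustration rather than the script itself: `normalise_grammar_url`, `HOSTS`, and `PROVIDER_HOSTS` are hypothetical names, and expanding a provider shorthand such as `gitlab:` to a full host name is an assumption about the intended behaviour.

```ruby
# Illustrative sketch only: these names are hypothetical, not part of Linguist.
# Single quotes keep the backslashes, so the dots are literal inside the regexes.
HOSTS = 'github\.com|bitbucket\.org|gitlab\.com'

# Assumed mapping from a provider shorthand to its full host name.
PROVIDER_HOSTS = {
  "github"    => "github.com",
  "bitbucket" => "bitbucket.org",
  "gitlab"    => "gitlab.com"
}.freeze

# Reduce any supported repository reference to a canonical HTTPS URL.
def normalise_grammar_url(input)
  if input =~ %r{\A(?:https?://)?(?:[^.@]+@)?(?:www\.)?(#{HOSTS})/([^/]+)/([^/]+)}i
    # HTTP(S) link, with or without scheme, pointing to a recognised host
    host, user, repo = $1.downcase, $2, $3.delete_suffix(".git")
  elsif input =~ %r{\Agit@(#{HOSTS}):([^/]+)/([^/]+)\.git\z}i
    # SSH remote
    host, user, repo = $1.downcase, $2, $3
  elsif input =~ %r{\A(github|bitbucket|gitlab):/?([^/]+)/([^/]+)/?\z}i
    # provider:user/repo shorthand
    host, user, repo = PROVIDER_HOSTS.fetch($1.downcase), $2, $3
  elsif input =~ %r{\A/?([^/]+)/([^/]+)/?\z}
    # user/repo, treated as GitHub shorthand
    host, user, repo = "github.com", $1, $2
  else
    raise ArgumentError, "Unsupported URL: #{input}"
  end
  "https://#{host}/#{user}/#{repo}"
end
```

With this sketch, `git@github.com:Alhadis/language-apl.git` and `Alhadis/language-apl` both reduce to `https://github.com/Alhadis/language-apl`, which is the form the script hands to `git submodule add`.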
@@ -4,6 +4,7 @@
|
||||
|
||||
require "bundler/setup"
|
||||
require "licensed/cli"
|
||||
require "optparse"
|
||||
|
||||
module Licensed
|
||||
module Source
|
||||
@@ -32,7 +33,14 @@ module Licensed
|
||||
end
|
||||
end
|
||||
|
||||
source = Licensed::Source::Filesystem.new("vendor/grammars/*/", type: "grammar")
|
||||
module_path = nil
|
||||
OptionParser.new do |opts|
|
||||
opts.on("-mPATH", "--module=PATH", "Cache license file for specific grammar") do |p|
|
||||
module_path = p
|
||||
end
|
||||
end.parse!
|
||||
|
||||
source = Licensed::Source::Filesystem.new(module_path || "vendor/grammars/*/", type: "grammar")
|
||||
config = Licensed::Configuration.new
|
||||
config.sources << source
|
||||
|
||||
@@ -43,4 +51,5 @@ else
|
||||
end
|
||||
|
||||
command.run
|
||||
`git checkout -- vendor/licenses/grammar/` if module_path
|
||||
exit command.success?
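The `--module` handling added to `script/licensed` can be tried in isolation. The sketch below assumes a hypothetical helper, `grammar_source_path`, and a hypothetical constant, `DEFAULT_GLOB`; it only shows how the option falls back to the glob over all grammars and does not touch the `licensed` gem.

```ruby
require "optparse"

# Hypothetical constant: the glob used when no single grammar is requested.
DEFAULT_GLOB = "vendor/grammars/*/"

# Hypothetical helper: return the path that would be handed to the license source.
def grammar_source_path(argv)
  module_path = nil
  OptionParser.new do |opts|
    opts.on("-mPATH", "--module=PATH", "Cache license file for specific grammar") do |path|
      module_path = path
    end
  end.parse!(argv.dup) # parse a copy so the caller's array is left untouched
  module_path || DEFAULT_GLOB
end
```

Calling `grammar_source_path(["--module", "vendor/grammars/language-roff"])` returns that single path, while an empty argument list returns the glob, matching the `module_path || "vendor/grammars/*/"` fallback in the diff.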
|
||||
|
||||