diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index cfba2222..80b20508 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -17,7 +17,7 @@ To add support for a new extension:
 In addition, if this extension is already listed in [`languages.yml`][languages] then sometimes a few more steps will need to be taken:
 
 0. Make sure that example `.yourextension` files are present in the [samples directory][samples] for each language that uses `.yourextension`.
-0. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.yourextension` files. (ping @arfon or @bkeepers to help with this) to ensure we're not misclassifying files.
+0. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.yourextension` files. (ping **@arfon** or **@bkeepers** to help with this) to ensure we're not misclassifying files.
 0. If the Bayesian classifier does a bad job with the sample `.yourextension` files then a [heuristic](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.rb) may need to be written to help.
 
 
@@ -28,10 +28,7 @@ We try only to add languages once they have some usage on GitHub. In most cases
 To add support for a new language:
 
 0. Add an entry for your language to [`languages.yml`][languages]. Omit the `language_id` field for now.
-0. Add a grammar for your language. Please only add grammars that have [one of these licenses](https://github.com/github/linguist/blob/257425141d4e2a5232786bf0b13c901ada075f93/vendor/licenses/config.yml#L2-L11).
-  0. Add your grammar as a submodule: `git submodule add https://github.com/JaneSmith/MyGrammar vendor/grammars/MyGrammar`.
-  0. Add your grammar to [`grammars.yml`][grammars] by running `script/convert-grammars --add vendor/grammars/MyGrammar`.
-  0. Download the license for the grammar: `script/licensed`. Be careful to only commit the file for the new grammar, as this script may update licenses for other grammars as well.
+0. Add a grammar for your language: `script/add-grammar https://github.com/JaneSmith/MyGrammar`. Please only add grammars that have [one of these licenses][licenses].
 0. Add samples for your language to the [samples directory][samples] in the correct subdirectory.
 0. Add a `language_id` for your language using `script/set-language-ids`. **You should only ever need to run `script/set-language-ids --update`. Anything other than this risks breaking GitHub search :cry:**
 0. Open a pull request, linking to a [GitHub search result](https://github.com/search?utf8=%E2%9C%93&q=extension%3Aboot+NOT+nothack&type=Code&ref=searchresults) showing in-the-wild usage.
@@ -39,7 +36,7 @@ To add support for a new language:
 In addition, if your new language defines an extension that's already listed in [`languages.yml`][languages] (such as `.foo`) then sometimes a few more steps will need to be taken:
 
 0. Make sure that example `.foo` files are present in the [samples directory][samples] for each language that uses `.foo`.
-0. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.foo` files. (ping @arfon or @bkeepers to help with this) to ensure we're not misclassifying files.
+0. Test the performance of the Bayesian classifier with a relatively large number (1000s) of sample `.foo` files. (ping **@arfon** or **@bkeepers** to help with this) to ensure we're not misclassifying files.
 0. If the Bayesian classifier does a bad job with the sample `.foo` files then a [heuristic](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.rb) may need to be written to help.
 
 Remember, the goal here is to try and avoid false positives!
@@ -82,9 +79,9 @@ Here's our current build status: [![Build Status](https://api.travis-ci.org/gith
 
 Linguist is maintained with :heart: by:
 
-- @arfon (GitHub Staff)
-- @larsbrinkhoff
-- @pchaigno
+- **@arfon** (GitHub Staff)
+- **@larsbrinkhoff**
+- **@pchaigno**
 
 As Linguist is a production dependency for GitHub we have a couple of workflow restrictions:
 
@@ -113,5 +110,6 @@ If you are the current maintainer of this gem:
 
 [grammars]: /grammars.yml
 [languages]: /lib/linguist/languages.yml
+[licenses]: https://github.com/github/linguist/blob/257425141d4e2a5232786bf0b13c901ada075f93/vendor/licenses/config.yml#L2-L11
 [samples]: /samples
 [new-issue]: https://github.com/github/linguist/issues/new
diff --git a/script/add-grammar b/script/add-grammar
new file mode 100755
index 00000000..25c12176
--- /dev/null
+++ b/script/add-grammar
@@ -0,0 +1,93 @@
+#!/usr/bin/env ruby
+
+require "optparse"
+
+ROOT = File.expand_path("../../", __FILE__)
+
+
+# Break a repository URL into its separate components
+def parse_url(input)
+  hosts = 'github\.com|bitbucket\.org|gitlab\.com'
+
+  # HTTPS/HTTP link pointing to recognised hosts
+  if input =~ /^(?:https?:\/\/)?(?:[^.@]+@)?(?:www\.)?(#{hosts})\/([^\/]+)\/([^\/]+)/i
+    { host: $1.downcase(), user: $2, repo: $3.sub(/\.git$/, "") }
+  # SSH
+  elsif input =~ /^git@(#{hosts}):([^\/]+)\/([^\/]+)\.git$/i
+    { host: $1.downcase(), user: $2, repo: $3 }
+  # provider:user/repo
+  elsif input =~ /^(github|bitbucket|gitlab):\/?([^\/]+)\/([^\/]+)\/?$/i
+    { host: $1.downcase(), user: $2, repo: $3 }
+  # user/repo - Common GitHub shorthand
+  elsif input =~ /^\/?([^\/]+)\/([^\/]+)\/?$/
+    { host: "github.com", user: $1, repo: $2 }
+  else
+    raise "Unsupported URL: #{input}"
+  end
+end
+
+# Isolate the vendor-name component of a submodule path
+def parse_submodule(name)
+  name =~ /^(?:.*(?:vendor\/)?grammars\/)?([^\/]+)/i
+  path = "vendor/grammars/#{$1}"
+  unless File.exist?("#{ROOT}/" + path)
+    warn "Submodule '#{path}' does not exist. Aborting."
+    exit 1
+  end
+  path
+end
+
+# Print debugging feedback to STDOUT if running with --verbose
+def log(msg)
+  puts msg if $verbose
+end
+
+
+usage = """Usage:
+  #{$0} [-v|--verbose] [--replace grammar] url
+Examples:
+  #{$0} https://github.com/Alhadis/language-roff
+  #{$0} --replace sublime-apl https://github.com/Alhadis/language-apl
+"""
+
+$replace = nil
+$verbose = false
+
+OptionParser.new do |opts|
+  opts.banner = usage
+  opts.on("-v", "--verbose", "Print verbose feedback to STDOUT") do
+    $verbose = true
+  end
+  opts.on("-rSUBMODULE", "--replace=SUBMODULE", "Replace an existing grammar submodule.") do |name|
+    $replace = name
+  end
+end.parse!
+
+
+$url = ARGV[0]
+
+# No URL? Print a usage message and bail.
+unless $url
+  warn usage
+  exit 1
+end
+
+# Ensure the given URL is an HTTPS link
+parts = parse_url $url
+https = "https://#{parts[:host]}/#{parts[:user]}/#{parts[:repo]}"
+repo_new = "vendor/grammars/#{parts[:repo]}"
+repo_old = parse_submodule($replace) if $replace
+
+if repo_old
+  log "Deregistering: #{repo_old}"
+  `git submodule deinit #{repo_old}`
+  `git rm -rf #{repo_old}`
+end
+
+log "Registering new submodule: #{repo_new}"
+`git submodule add -f #{https} #{repo_new}`
+exit 1 if $?.exitstatus > 0
+`script/convert-grammars --add #{repo_new}`
+
+log "Confirming license"
+`script/licensed --module "#{repo_new}"`
diff --git a/script/licensed b/script/licensed
index ea3f538f..68214d34 100755
--- a/script/licensed
+++ b/script/licensed
@@ -4,6 +4,7 @@
 require "bundler/setup"
 
 require "licensed/cli"
+require "optparse"
 
 module Licensed
   module Source
@@ -32,7 +33,14 @@ module Licensed
   end
 end
 
-source = Licensed::Source::Filesystem.new("vendor/grammars/*/", type: "grammar")
+module_path = nil
+OptionParser.new do |opts|
+  opts.on("-mPATH", "--module=PATH", "Cache license file for specific grammar") do |p|
+    module_path = p
+  end
+end.parse!
+
+source = Licensed::Source::Filesystem.new(module_path || "vendor/grammars/*/", type: "grammar")
 config = Licensed::Configuration.new
 config.sources << source
 
@@ -43,4 +51,5 @@ else
 end
 
 command.run
+`git checkout -- vendor/licenses/grammar/` if module_path
 exit command.success?
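
The URL handling that `parse_url` performs in `script/add-grammar` can be tried in isolation with a sketch like the one below. It is not part of the patch: `normalise_repo_url`, `KNOWN_HOSTS` and `HOSTS` are hypothetical names, the userinfo prefix that `parse_url` tolerates is left out for brevity, and expanding the `provider:` shorthand to a full hostname is an assumption about the intended result rather than what the script does.

```ruby
# Hypothetical helper, not part of script/add-grammar: turn the repository
# references the script accepts into a canonical HTTPS URL.
KNOWN_HOSTS = {
  "github"    => "github.com",
  "bitbucket" => "bitbucket.org",
  "gitlab"    => "gitlab.com",
}.freeze

# Alternation of the recognised hostnames, with the dots escaped.
HOSTS = KNOWN_HOSTS.values.map { |h| Regexp.escape(h) }.join("|")

def normalise_repo_url(input)
  case input
  # HTTP(S) link, with or without the scheme
  when %r{\A(?:https?://)?(?:www\.)?(#{HOSTS})/([^/]+)/([^/]+)}i
    host, user, repo = $1.downcase, $2, $3.sub(/\.git\z/, "")
  # SSH form: git@host:user/repo.git
  when %r{\Agit@(#{HOSTS}):([^/]+)/([^/]+)\.git\z}i
    host, user, repo = $1.downcase, $2, $3
  # provider:user/repo (assumption: expand the provider to its hostname)
  when %r{\A(github|bitbucket|gitlab):/?([^/]+)/([^/]+)/?\z}i
    host, user, repo = KNOWN_HOSTS.fetch($1.downcase), $2, $3
  # user/repo, treated as GitHub shorthand
  when %r{\A/?([^/]+)/([^/]+)/?\z}
    host, user, repo = "github.com", $1, $2
  else
    raise ArgumentError, "Unsupported URL: #{input}"
  end
  "https://#{host}/#{user}/#{repo}"
end
```

Under these assumptions, `normalise_repo_url("git@github.com:Alhadis/language-roff.git")` and `normalise_repo_url("Alhadis/language-roff")` both resolve to the same HTTPS URL, which is the form the script passes to `git submodule add`.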