Changed tokenizer number literals to be more encompassing

Number-literal skipping now covers hexadecimal and C-style literals.
2026-06-20 11:19:32 +00:00 · 2015-02-20 14:08:39 +11:00
parent d28f5e87c0
commit 885b5aab41
2 changed files with 5 additions and 1 deletions
--- a/lib/linguist/tokenizer.rb
+++ b/lib/linguist/tokenizer.rb
@@ -94,7 +94,7 @@ module Linguist
          end

        # Skip number literals
-        elsif s.scan(/(0x)?\d(\d|\.)*/)
+        elsif s.scan(/(0x\h(\h|\.)*|\d(\d|\.)*)([uU][lL]{0,2}|([eE][-+]\d*)?[fFlL]*)/)

        # SGML style brackets
        elsif token = s.scan(/<[^\s<>][^<>]*>/)
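
To make the effect of the change concrete, the sketch below runs the old and new patterns from this hunk against a few sample literals using `StringScanner`. The constant names `OLD_NUMBER` and `NEW_NUMBER`, the helper `scanned_prefix`, and the sample inputs are illustrative assumptions; they are not part of Linguist.

```ruby
require 'strscan'

# Patterns copied from the hunk above; the constant names are hypothetical.
OLD_NUMBER = /(0x)?\d(\d|\.)*/
NEW_NUMBER = /(0x\h(\h|\.)*|\d(\d|\.)*)([uU][lL]{0,2}|([eE][-+]\d*)?[fFlL]*)/

# Hypothetical helper: returns the leading part of `source` that `pattern`
# consumes when scanning from the start, or nil if nothing matches there.
def scanned_prefix(pattern, source)
  StringScanner.new(source).scan(pattern)
end

# Sample inputs chosen for illustration only.
%w[0x1F 42UL 1.5e-3f 1e10].each do |literal|
  old_match = scanned_prefix(OLD_NUMBER, literal)
  new_match = scanned_prefix(NEW_NUMBER, literal)
  puts format('%-8s old=%-7p new=%p', literal, old_match, new_match)
end
```

With the old pattern, `0x1F` is consumed only up to `0x1` and the suffixes in `42UL` and `1.5e-3f` are left behind as separate tokens; the new pattern consumes all three in full. Note that the exponent part requires an explicit sign (`[-+]`), so an unsigned exponent such as `1e10` is still consumed only up to `1`.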