mirror of
https://github.com/ruby/ruby.git
synced 2026-01-29 05:24:23 +00:00
Fixes ticket:68.
NOTE that this involves an API change! Entity declarations in the doctype
now generate events that carry two, not one, arguments.
Implements ticket:15, using gwrite's suggestion. This allows Element to be
subclassed.
Two unrelated changes, because Subversion doesn't do block-level commits:
1) Fixed a typo bug in previous change for ticket:15
2) Fixed namespaces handling in XPath and element.
***** Note that this is an API change!!! *****
Element.namespaces() now returns a hash of namespace mappings which are
relevant for that node.
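
A minimal sketch of what "a hash of namespace mappings relevant for that node" could look like, assuming that "relevant" means the mappings declared on the node plus those inherited from its ancestors. SketchNode and its fields are hypothetical stand-ins for illustration, not REXML's Element class:

```ruby
# Hypothetical stand-in for an element; NOT REXML's Element class.
# `declared` holds the prefix => URI mappings declared on this node.
SketchNode = Struct.new(:name, :declared, :parent) do
  # Merge the mappings inherited from ancestors with the ones declared
  # here; a local declaration overrides an inherited prefix.
  def namespaces
    inherited = parent ? parent.namespaces : {}
    inherited.merge(declared)
  end
end

root  = SketchNode.new("root",  { "xmlns" => "urn:default", "a" => "urn:a" }, nil)
child = SketchNode.new("child", { "a" => "urn:a2", "b" => "urn:b" }, root)

p child.namespaces
```

Callers that previously expected a different return type would need to switch to hash lookups such as child.namespaces["a"].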
Fixes a bug in multiple decodings
The changeset 1230:1231 was bad. The default behavior is *not* to use the
native REXML encodings, but rather to use ICONV. I know that this will piss
some people off, but defaulting to the pure Ruby version isn't the correct
solution, and it breaks other encodings, so I've reverted it.
* Fixes ticket:61 (xpath_parser)
* Fixes ticket:63 (UTF-16; UNILE decoding was bad)
* Cleans up some tests, removing opportunities for test corruption
* Improves parsing error messages a little
* Adds the ability to override the encoding detection in Source construction
* Fixes an edge case in Functions::string, where document nodes weren't
correctly converted
* Fixes Functions::string() for Element and Document nodes
* Fixes some problems in entity handling
Addresses ticket:66
Fixes ticket:71
Addresses ticket:78
NOTE that this also fixes what is technically another bug in REXML. REXML's
XPath parser used to allow exponential notation in numbers. The XPath spec
is specific about what a number is, and scientific notation is not included.
Therefore, this has been fixed.
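
To illustrate the rule, here is a small sketch that checks a literal against the XPath 1.0 Number production, Digits ('.' Digits?)? | '.' Digits. The constant and method names are hypothetical and this is not the code REXML's parser uses:

```ruby
# Hypothetical helper, not part of REXML.
# XPath 1.0: Number ::= Digits ('.' Digits?)? | '.' Digits
# A leading minus sign is a unary operator in XPath, not part of Number.
XPATH_NUMBER = /\A(?:\d+(?:\.\d*)?|\.\d+)\z/

def xpath_number?(text)
  XPATH_NUMBER.match?(text)
end

%w[12 3.14 5. .5 1e3 1.5E-2].each do |literal|
  puts "#{literal}: #{xpath_number?(literal)}"
end
```

Under this grammar "1e3" and "1.5E-2" are rejected, which matches the behavior change described above.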
Cross-ported a fix for ticket:88 from CVS.
Fixes ticket:80
Documentation cleanup. Ticket:84
Applied Kou's fix for an un-trac'ed bug.
------------------------------------------------------------------------
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@11548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
67 lines
2.0 KiB
Ruby
# -*- mode: ruby; ruby-indent-level: 2; indent-tabs-mode: t; tab-width: 2 -*- vim: sw=2 ts=2
module REXML
  module Encoding
    @encoding_methods = {}
    def self.register(enc, &block)
      @encoding_methods[enc] = block
    end
    def self.apply(obj, enc)
      @encoding_methods[enc][obj]
    end
    def self.encoding_method(enc)
      @encoding_methods[enc]
    end

    # Native, default format is UTF-8, so it is declared here rather than in
    # an encodings/ definition.
    UTF_8 = 'UTF-8'
    UTF_16 = 'UTF-16'
    UNILE = 'UNILE'

    # ID ---> Encoding name
    attr_reader :encoding
    def encoding=( enc )
      old_verbosity = $VERBOSE
      begin
        $VERBOSE = false
        enc = enc.nil? ? nil : enc.upcase
        return false if defined? @encoding and enc == @encoding
        if enc and enc != UTF_8
          @encoding = enc
          raise ArgumentError, "Bad encoding name #@encoding" unless @encoding =~ /^[\w-]+$/
          @encoding.untaint
          begin
            require 'rexml/encodings/ICONV.rb'
            Encoding.apply(self, "ICONV")
          rescue LoadError, Exception
            begin
              enc_file = File.join( "rexml", "encodings", "#@encoding.rb" )
              require enc_file
              Encoding.apply(self, @encoding)
            rescue LoadError => err
              puts err.message
              raise ArgumentError, "No decoder found for encoding #@encoding. Please install iconv."
            end
          end
        else
          @encoding = UTF_8
          require 'rexml/encodings/UTF-8.rb'
          Encoding.apply(self, @encoding)
        end
      ensure
        $VERBOSE = old_verbosity
      end
      true
    end

    def check_encoding str
      # We have to recognize UTF-16, LSB UTF-16, and UTF-8
      return UTF_16 if /\A\xfe\xff/n =~ str
      return UNILE if /\A\xff\xfe/n =~ str
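      # Groups: 1 = quote around the version, 2 = quote around the encoding,
      # 3 = the declared encoding name.
      str =~ /^\s*<\?xml\s*version=(['"]).*?\1\s*encoding=(["'])(.*?)\2/um
      return $3.upcase if $3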
      return UTF_8
    end
  end
end
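
The register/apply pair above acts as a small registry: each encodings/ file registers a block under its name, and encoding= later applies that block to the object that needs decode/encode methods. A self-contained sketch of that pattern follows. SketchEncoding, the "UPCASE-DEMO" name, and the decode method are assumptions made for illustration, not REXML's actual encoders:

```ruby
# Hypothetical module mirroring the register/apply pattern; it does not
# load or depend on REXML.
module SketchEncoding
  @encoding_methods = {}

  # Store a block under an encoding name.
  def self.register(enc, &block)
    @encoding_methods[enc] = block
  end

  # Call the registered block with the target object; raises KeyError
  # if nothing was registered under that name.
  def self.apply(obj, enc)
    @encoding_methods.fetch(enc).call(obj)
  end
end

# An "encoding" installs methods on whatever object it is applied to.
SketchEncoding.register("UPCASE-DEMO") do |obj|
  obj.define_singleton_method(:decode) { |str| str.upcase }
end

source = Object.new
SketchEncoding.apply(source, "UPCASE-DEMO")
puts source.decode("abc")
```

Unlike the original apply, this sketch uses fetch so that an unregistered name fails with a KeyError instead of a NoMethodError on nil.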