ConstructingParser does not accept U+00B7 character in namespace prefix #9060

Closed
scabug opened this issue Dec 22, 2014 · 2 comments

scabug commented Dec 22, 2014

The Unicode character U+00B7 (MIDDLE DOT, ·) is explicitly allowed by the XML spec as a character in a namespace prefix, but a document that uses it in a prefix fails to parse.

This problem is in the ConstructingParser, but does not exist in the regular XML loader.

Here's an XML doc that illustrates the problem:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Note that in b· that middle dot is exactly that. Unicode 0xB7. -->
<!-- This middle dot character is expressly allowed by XML 1.0 syntax for namespace prefixes. -->
<имен:schema 
xmlns:имен="http://www.w3.org/2001/XMLSchema" 
xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/" 
xmlns:b·="http://example.com" 
targetNamespace="http://example.com">

  <имен:include schemaLocation="xsd/built-in-formats.xsd"/>
      
  <имен:annotation>
    <имен:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format ref="b·:daffodilTest1" separator="" alignment="1" alignmentUnits="bytes" lengthUnits="bytes"
        trailingSkip="0" initiator="" terminator="" leadingSkip='0' textTrimKind="none" initiatedContent="no"
        ignoreCase="no" representation="text" textNumberRep="standard" encoding="ASCII"/>
    </имен:appinfo>
  </имен:annotation>
  
    <имен:simpleType name="simTyp" dfdl:lengthKind="delimited" dfdl:initiator="1:">
      <имен:restriction base="имен:int"/>
    </имен:simpleType>

    <имен:element name="one" type="b·:simTyp"/>
  
</имен:schema>
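
A smaller reproduction may make the difference between the two parsers easier to see. The following is only a sketch: the one-element document and the object name MiddleDotRepro are made up for illustration, and it assumes scala-xml is on the classpath (it was part of the standard library in 2.10.x). It makes no claim about how the failure surfaces; depending on the version, ConstructingParser may throw, print a syntax error to stderr, or return a malformed tree.

```scala
import scala.io.Source
import scala.util.Try
import scala.xml.XML
import scala.xml.parsing.ConstructingParser

object MiddleDotRepro {
  // Hypothetical minimal document: the prefix "b·" has U+00B7 as its second character.
  val doc: String =
    "<b\u00B7:root xmlns:b\u00B7=\"http://example.com\"/>"

  // Path 1: the regular loader, which delegates to the JDK's SAX parser.
  def viaLoader: Try[scala.xml.Elem] = Try(XML.loadString(doc))

  // Path 2: the pure-Scala ConstructingParser named in this report.
  def viaConstructingParser: Try[scala.xml.Document] =
    Try(ConstructingParser.fromSource(Source.fromString(doc), preserveWS = true).document())

  def main(args: Array[String]): Unit = {
    // Print both outcomes side by side; also watch stderr for parser messages.
    println(s"XML.loadString:     $viaLoader")
    println(s"ConstructingParser: $viaConstructingParser")
  }
}
```

Running main and comparing the two lines (plus anything on stderr) should show whether a given version is affected.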

scabug commented Dec 22, 2014

Imported From: https://issues.scala-lang.org/browse/SI-9060?orig=1
Reporter: Michael Beckerle (mbeckerle.dfdl)
Affected Versions: 2.10.4


scabug commented Dec 23, 2014

@som-snytt said:
I see, when you say it's "explicitly allowed" by the spec, the spec actually says, "COLON ... and MIDDLE DOT are explicitly permitted."

That's some spec they have over there.

scala/scala-xml#44
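
For reference, the rule under discussion can be sketched as a predicate. This is a hand transcription of the NameStartChar and NameChar productions from XML 1.0 (Fifth Edition), with the colon removed because a namespace prefix is an NCName. It is not the code used by ConstructingParser or by the linked change, and the names XmlNames and isNCName are hypothetical. Note that U+00B7 is allowed inside a prefix but not as its first character.

```scala
object XmlNames {
  // NameStartChar ranges from XML 1.0 (Fifth Edition), excluding ':' and '_'.
  private val startRanges: Seq[(Int, Int)] = Seq(
    ('A'.toInt, 'Z'.toInt), ('a'.toInt, 'z'.toInt),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF)
  )

  private def in(cp: Int, lo: Int, hi: Int): Boolean = cp >= lo && cp <= hi

  // First character of an NCName: NameStartChar minus the colon.
  def isNCNameStartChar(cp: Int): Boolean =
    cp == '_'.toInt || startRanges.exists { case (lo, hi) => in(cp, lo, hi) }

  // Any later character: adds '-', '.', digits, U+00B7 and two combining ranges.
  def isNCNameChar(cp: Int): Boolean =
    isNCNameStartChar(cp) || cp == '-'.toInt || cp == '.'.toInt ||
      in(cp, '0'.toInt, '9'.toInt) || cp == 0xB7 ||
      in(cp, 0x0300, 0x036F) || in(cp, 0x203F, 0x2040)

  // Checks a whole prefix, working on code points rather than UTF-16 units.
  def isNCName(s: String): Boolean = {
    val cps = s.codePoints().toArray
    cps.nonEmpty && isNCNameStartChar(cps.head) && cps.tail.forall(isNCNameChar)
  }
}
```

Under this sketch, XmlNames.isNCName("b\u00B7") holds, while a prefix that starts with the middle dot does not.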
