scala.xml.pull.XMLEventReader not keeping up with InputStream #4599
Comments
Imported From: https://issues.scala-lang.org/browse/SI-4599?orig=1 |
J (tumeow) said: |
@dcsobral said: |
J (tumeow) said:
I very much doubt that. This is the way XML streams are. XML does not permit multiple root nodes. Even if it were permitted, XMLEventReader is capable of detecting other nodes it comes across that have been opened and not closed; without the ability to do the same at the root level, this class is useless for XML streams. |
J (tumeow) said: |
huynhjl said:
val src = new io.Source {
  val iter = "<stream><features><foo/></features>".iterator ++
    Iterator.continually{ println("wait 5s"); Thread.sleep(5000); '\n' }.take(1) ++
    "</stream>".iterator
}
val reader = new xml.pull.XMLEventReader(src)
while (reader.hasNext) System.out.println(reader.next)
One would hope that
EvElemStart(null,stream,,)
EvElemStart(null,features,,)
EvElemStart(null,foo,,)
EvElemEnd(null,foo)
wait 5s
EvElemEnd(null,features)
EvText(
)
EvElemEnd(null,stream)
Not sure if it is something inherent to the XML parser. |
J (tumeow) said:
It will block forever before I can determine the final tag in the stream is closed... at best I can't make my second request knowing I've got the full information to respond to. |
@dcsobral said:
def xEndTag(startName: String) {
  xToken('/')
  if (xName != startName)
    errorNoEnd(startName)
  xSpaceOpt
  xToken('>')
}
Since that tag is properly sent, it runs up to
def xToken(that: Char) {
  if (ch == that) nextch
  else xHandleError(that, "'%s' expected instead of '%s'".format(that, ch))
}
Evidently, it calls
def nextch = {
  if (curInput.hasNext) {
    ch = curInput.next
    pos = curInput.pos
  } else {
The point here is that it needs to know if there is a next or not. Since this is a stream, it blocks until the stream is closed or something else is sent. Unfortunately, the whole parser is based on having |
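Editor's illustration: a minimal sketch of why an eager lookahead has to touch the input once more after the final '>' of a closing tag. The names CountingChars, EagerCursor and eat are hypothetical and are not part of scala.xml; the cursor only mimics the shape of the nextch excerpt quoted above, under the assumption that `ch` always holds an already-fetched character.

```scala
// Hypothetical stand-in for a socket: counts how many characters were pulled.
final class CountingChars(data: String) extends Iterator[Char] {
  private var i = 0
  var pulled = 0
  def hasNext: Boolean = i < data.length // on a real socket this call may block
  def next(): Char = { val c = data.charAt(i); i += 1; pulled += 1; c }
}

// Eager cursor in the style of the nextch excerpt: `ch` always holds a
// character that has already been fetched, so every advance touches the input.
final class EagerCursor(in: Iterator[Char]) {
  var ch: Char = 0.toChar
  def nextch(): Unit = { ch = if (in.hasNext) in.next() else 0.toChar }
  nextch() // prime the first character

  // Consume `token` character by character, like repeated xToken calls.
  def eat(token: String): Unit =
    token.foreach { c =>
      require(ch == c, s"'$c' expected instead of '$ch'")
      nextch()
    }
}
```

With the input "</stream>\n", calling eat("</stream>") leaves the counter at 10: consuming '>' already fetched the '\n' that follows it. On a live connection where nothing follows the closing tag yet, that extra fetch is the point where the parser would wait.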
@dcsobral said: My idea is to move nextch's logic into ch and add a flag that ch checks to decide whether to execute that logic or return the last read character. A var is also added to hold that last read character. nextch is then modified to call ch (in case nextch is called twice in a row) and set the flag to true. Finally, nextch is changed to return Unit instead of Char, and both places where its return value was used are changed to call ch afterwards. Unfortunately, nextch and "var ch" are both public, so this changes their API, however unlikely it is that anyone is using them directly. |
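Editor's illustration: a rough sketch of the lazy scheme described in this comment, not the actual patch. CountedInput, LazyCursor, eat, lastRead and pending are hypothetical names; the real change was made inside MarkupParser and may differ in detail. Here a small private helper plays the role of "nextch calls ch", which is assumed to be equivalent for the purpose of the demonstration.

```scala
// Hypothetical stand-in for a socket: counts how many characters were pulled.
final class CountedInput(data: String) extends Iterator[Char] {
  private var i = 0
  var pulled = 0
  def hasNext: Boolean = i < data.length // on a real socket this call may block
  def next(): Char = { val c = data.charAt(i); i += 1; pulled += 1; c }
}

// Lazy cursor: nextch only marks the current character as consumed;
// the input is touched again only when somebody actually asks for ch.
final class LazyCursor(in: Iterator[Char]) {
  private var lastRead: Char = 0.toChar // last character fetched from the input
  private var pending = true            // true => a fetch is owed before answering

  private def fetchIfPending(): Unit =
    if (pending) {
      lastRead = if (in.hasNext) in.next() else 0.toChar
      pending = false
    }

  def ch: Char = { fetchIfPending(); lastRead }

  def nextch(): Unit = {
    fetchIfPending() // covers two nextch calls in a row
    pending = true   // defer fetching the following character
  }

  // Consume `token` character by character, like repeated xToken calls.
  def eat(token: String): Unit =
    token.foreach { c =>
      require(ch == c, s"'$c' expected instead of '$ch'")
      nextch()
    }
}
```

With the input "</stream>\n", eat("</stream>") pulls exactly 9 characters; the '\n' is fetched only when ch is read afterwards. The cost is the API change mentioned in the comment: nextch no longer returns the character it advanced to.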
J (tumeow) said:
Does this mean it will take a while to hit the repository? I don't mind using the bleeding-edge version of Scala, but if it won't even land in the repository for a while, perhaps I should abandon this class so I can continue my project. |
@dcsobral said: It does fix the issue reported, and it doesn't seem to break anything else. However, Scala's test suite is a bit low on XML tests. |
J (tumeow) said (edited on May 20, 2011 3:34:04 PM UTC):
Fixing xEndTag here alone would be enough to at least get the most usual cases fixed. |
@dcsobral said: |
J (tumeow) said: |
@dcsobral said: |
J (tumeow) said: |
J (tumeow) said:
:1:244: '/' expected instead of ''
^
It seems to be matching the end of the stream (but not EOF) as '', the empty string. Of course it cannot find a closing tag for stream:stream, since, well, it is a stream. |
J (tumeow) said: |
Commit Message Bot (anonymous) said: Makes MarkupParser.nextch lazy, only reaching out for the next char Contributed by Daniel Sobral, no review. |
=== What steps will reproduce the problem (please be specific and use wikiformatting)? ===
The following code is used to read XMPP events from talk.google.com
=== What is the expected behavior? ===
I should receive all of the XML items from the input stream.
=== What do you see instead? ===
I miss the final </stream:features> tag. I verified with tcpdump that it is present. If I send further data (to trigger more data from the server), I eventually see the closing tag, but then the tags at the end of the new data are missing again.
=== Additional information ===
Sometimes more than the last tag is missing. I've tried not using ".buffered" and without Buffered{Input,Output}Stream too. All the same results.
=== What versions of the following are you using? ===