Numeric Character References (NCRs):
This is a problem that came up in microsoft.public.dotnet.xml.
The message is
here.
I have a stylesheet (given below) whose sole purpose
is to transform a UTF-8 document to us-ascii using Numeric Character
References (NCRs). This allows our software that's not Unicode
compatible to at least work with it - to some degree.
However, *ALL I GET* is UTF-8 - it's like .NET XslTransform is
somehow ignoring the new encoding. I've tried changing the
XmlTextWriter encoding from UTF8 to ASCII but neither helps.
Please Help!
Here's my code and stylesheet:
XPathDocument xmlDoc;
XslTransform xslT;
XmlTextWriter myWriter;
myWriter = new XmlTextWriter(
    Server.MapPath( "outXML.xml" ),
    System.Text.Encoding.UTF8 );
xmlDoc = new XPathDocument(
    Server.MapPath( "XMLCompareFiles\\s2v21chp8f11529.xml" ) );
xslT = new XslTransform();
xslT.Load( Server.MapPath( "Transform2NCR.xslt" ) );
xslT.Transform( xmlDoc, null, myWriter );
myWriter.Close();
<!----- STYLESHEET ---------->
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output omit-xml-declaration="no" method="xml"
      media-type="text/xml" indent="no" encoding="iso8859-1" />
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
The original source was UTF-8, and the goal was to output the
document as NCRs because the system the document needed to be sent
to did not support Unicode.
We tried about five different ways to create a .NET solution
that would not destroy the output. Even with the encoding attribute in
the XSLT set to iso8859-1, the output was:
<?xml version="1.0" encoding="iso-8859-1"?>
<objects><customers><name>?????</name>
<otherInfo>??????????????</otherInfo></customers></objects>
The correct output should be:
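<?xml version="1.0" encoding="iso8859-1"?>
<objects><customers><name>&#31532;&#23416;&#26399;&#31532;&#31456;</name>
<otherInfo>&#26377;&#38283;&#27231;&#31995;&#32113;&#21629;&#20196;
&#36335;&#30001;&#22120;&#26371;&#21040;&#21738;&#35041;</otherInfo>
</customers></objects>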
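For reference, an NCR replaces a character that the target character set cannot represent with &#N;, where N is the character's Unicode code point; 第 (U+7B2C), for example, becomes &#31532;. The sketch below shows that mapping by hand. NcrEscaper and ToNcr are hypothetical names, not part of the original post or of any .NET library, and the helper assumes it is given text content only, since it would also escape non-ASCII characters appearing in element or attribute names.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only; not from the original post.
public static class NcrEscaper
{
    // Returns a copy of 'text' in which every character outside the
    // US-ASCII range is replaced by a decimal numeric character reference.
    public static string ToNcr(string text)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c < 128)
            {
                // Plain ASCII passes through unchanged.
                sb.Append(c);
                continue;
            }

            int codePoint = c;
            // Combine a surrogate pair into a single code point so that
            // characters beyond U+FFFF yield one reference, not two.
            if (char.IsHighSurrogate(c) && i + 1 < text.Length
                && char.IsLowSurrogate(text[i + 1]))
            {
                codePoint = char.ConvertToUtf32(c, text[i + 1]);
                i++;
            }

            sb.Append("&#");
            sb.Append(codePoint.ToString(CultureInfo.InvariantCulture));
            sb.Append(';');
        }
        return sb.ToString();
    }
}
```

Calling NcrEscaper.ToNcr("第學期第章") returns "&#31532;&#23416;&#26399;&#31532;&#31456;", which is pure ASCII and decodes back to the original characters in any XML parser.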
All of the following .NET methods were tried; each produced the
output seen in the first example.
public string MemoryStream()
{
    XslTransform Xsl = new XslTransform();
    string Result;
    //string FullPath = xmlUrl;
    MemoryStream ms = new MemoryStream();
    XmlTextWriter tr = new XmlTextWriter(ms, System.Text.Encoding.GetEncoding(1252));
    //Load the stylesheet.
    Xsl.Load("Transform2NCR.xslt");
    //Create a new XPathDocument and load the XML data to be transformed.
    XPathDocument mydata = new XPathDocument("SampleXML.xml");
    Xsl.Transform(mydata, null, tr);
    tr.Flush();
    ms.Position = 1;
    StreamReader sr = new StreamReader(ms);
    Result = sr.ReadToEnd();
    tr.Close();
    return (Result);
}
public string StringWriter()
{
    XslTransform myXslTransform;
    XPathDocument myXPathDocument = new XPathDocument("SampleXML.xml");
    myXslTransform = new XslTransform();
    myXslTransform.Load("Transform2NCR.xslt");
    System.Globalization.CultureInfo ci = new System.Globalization.CultureInfo(0x0409);
    System.IO.StringWriter stWrite = new System.IO.StringWriter(ci);
    myXslTransform.Transform(myXPathDocument, null, stWrite);
    return (stWrite.ToString());
}
public void usingFileStream()
{
    XslTransform Xsl = new XslTransform();
    //string Result;
    //string FullPath = xmlUrl;
    Stream fs = new FileStream("outputFileStream.xml", FileMode.Create, FileAccess.Write);
    //XmlTextWriter tr = new XmlTextWriter(ms,System.Text.Encoding.ASCII);
    //Load the stylesheet.
    Xsl.Load("Transform2NCR.xslt");
    //Create a new XPathDocument and load the XML data to be transformed.
    XPathDocument mydata = new XPathDocument("SampleXML.xml");
    Xsl.Transform(mydata, null, fs);
    fs.Flush();
    //fs.Position = 1;
    //StreamReader sr = new StreamReader(fs);
    //Result = sr.ReadToEnd();
    fs.Close();
    //return (Result);
}
public void WriteToFile()
{
    XslTransform Xsl = new XslTransform();
    //Load the stylesheet.
    Xsl.Load("Transform2NCR.xslt");
    Xsl.Transform("SampleXML.xml", "outFileToFile.xml");
}
As you can tell, we were getting pretty desperate.
Then we tried to interop with MSXML 4.0, performing the
transformation using XSLTemplate40 and setting the output to a
second MSXML document. When you moused over the .XML property of the
DOMDocument, the output was correct, but when you read the
result into a StreamWriter, the result was messed up again.
OK, so not really having the time to care whether it's an interop problem,
I tried a different method. I transformed the document into an
ADODB.Stream object, setting the encoding to ISO-8859-1, then
returned the result using the ReadText method. IT WORKS!!!
The only hack we could figure out was to interop with MSXML and
ADODB and return the result from a function call.
Here is the sample code that returned the correct result.
Public Function MsXMLADOStreamVB() As String
    Dim oXML As New MSXML2.DOMDocument40()
    Dim oXSL As New MSXML2.DOMDocument40()
    Dim oStream As New ADODB.Stream()
    Dim bLoaded As Boolean
    Dim Result As String
    oXML.async = False
    oXSL.async = False
    oXML.validateOnParse = False
    oXSL.validateOnParse = False
    bLoaded = oXML.load("SampleXML.xml")
    If Not bLoaded Then
        MsgBox("XML Not Loaded")
    End If
    bLoaded = oXSL.load("Transform2NCR.xslt")
    If Not bLoaded Then
        MsgBox("XSL Not Loaded")
    End If
    Try
        oStream.Charset = "iso8859-1"
        oStream.Open()
        oXML.transformNodeToObject(oXSL, oStream)
    Catch e As Exception
        MsgBox(e.ToString)
    End Try
    oStream.Position = 0
    Result = oStream.ReadText()
    Return (Result)
End Function
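For comparison, here is a rough sketch of how the same effect might be reached without COM interop. It assumes a framework version that provides XslCompiledTransform and XmlWriter.Create (both arrived in .NET 2.0, so this is not one of the approaches tried above). The class name, the method name and the inline identity stylesheet are made up for the illustration. The idea is to hand the transform an XmlWriter that writes to a stream with a US-ASCII encoding; in my understanding such a writer falls back to numeric character references for characters the encoding cannot represent, but treat that as something to verify on your own framework version.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical sketch; names and stylesheet are illustrative assumptions.
public static class AsciiTransformSketch
{
    // Inline identity stylesheet so the example is self-contained.
    private const string IdentityXslt =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>" +
        "<xsl:template match='node()|@*'>" +
        "<xsl:copy><xsl:apply-templates select='node()|@*'/></xsl:copy>" +
        "</xsl:template></xsl:stylesheet>";

    public static string TransformToAscii(string inputXml)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(IdentityXslt)))
        {
            xslt.Load(styleReader);
        }

        // The writer, not the stylesheet, decides the output encoding here.
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = Encoding.ASCII;

        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                xslt.Transform(input, writer);
            }
            // The stream holds ASCII bytes only, so decoding is lossless.
            return Encoding.ASCII.GetString(ms.ToArray());
        }
    }
}
```

The writer is created over a Stream on purpose: an XmlWriter created over a StringBuilder or TextWriter works in UTF-16 and has no reason to escape anything.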
If anyone knows the real solution, or knows if this is a bug, please
email me at nntp@fesersoft.com.
Thank you,
Joe