Invalid <META> tag in Transformer output with Java

I have a Java String containing XHTML that is well-formed XML. First I parse it into a Document with:

	public Document getDocumentFromString(String xml) throws SAXException, IOException, ParserConfigurationException {
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		DocumentBuilder db = dbf.newDocumentBuilder();
 
		return db.parse(new InputSource(new StringReader(xml)));
	}

Later I transform the Document back into a String with:

	public String getStringFromDocument(Document doc) throws TransformerException {
		StringWriter writer = new StringWriter();
		StreamResult result = new StreamResult(writer);
 
		TransformerFactory tf = TransformerFactory.newInstance();
		Transformer transformer = tf.newTransformer();
 
		DOMSource source = new DOMSource(doc);
 
		transformer.transform(source, result);
 
 
		return writer.toString();
	}

The problem is that the output in my StringWriter contains an extra

<META http-equiv="Content-Type" content="text/html; charset=utf-8">

tag, which is not valid XHTML because it has no closing tag. This happens regardless of whether the source already includes a valid meta tag.

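After some research I found that setting the following output properties on the Transformer fixes this and gives me the original String back. The likely cause is that, with no output method set, the Transformer falls back to the XSLT default, which selects the html method when the root element is named html and has no namespace, and the html method adds a META element to the head.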

	public static String getStringFromDocument(Document doc) throws TransformerException {
		StringWriter writer = new StringWriter();
		StreamResult result = new StreamResult(writer);
 
		TransformerFactory tf = TransformerFactory.newInstance();
		Transformer transformer = tf.newTransformer();
		transformer.setOutputProperty(OutputKeys.METHOD, "xml"); // make sure Transformer does not add an invalid <META> tag.
		transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes"); // do not prepend an <?xml ...?> declaration to the output
 
		DOMSource source = new DOMSource(doc);
 
		transformer.transform(source, result);
 
 
		return writer.toString();
	}
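
To compare the two behaviours side by side, here is a minimal, self-contained sketch of the round trip. The class name RoundTripDemo, the helper names parse and serialize, the forceXmlMethod flag, and the sample markup are all made up for illustration and are not part of the code above. The sketch also wraps the checked exceptions in unchecked ones to keep the demo short. The exact output of the first call depends on the TransformerFactory found on the classpath, so no expected output is shown.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class: parses a String into a DOM Document and serializes it back.
class RoundTripDemo {

    // Parse an XML String into a Document using the default (non namespace-aware) factory.
    static Document parse(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Serialize a Document; when forceXmlMethod is true, apply the two output
    // properties from the post so the html output method is never chosen.
    static String serialize(Document doc, boolean forceXmlMethod) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            if (forceXmlMethod) {
                transformer.setOutputProperty(OutputKeys.METHOD, "xml");
                transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            }
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        // Sample markup, assumed for illustration only.
        String xhtml = "<html><head><title>t</title></head><body><p>hi</p></body></html>";
        Document doc = parse(xhtml);

        System.out.println("Default output method:");
        System.out.println(serialize(doc, false));

        System.out.println("Output method forced to xml:");
        System.out.println(serialize(doc, true));
    }
}
```

Running main prints both variants, so you can check whether your own Transformer implementation adds the META tag in the first case and leaves it out in the second.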
