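I need to parse an XML stream. Since I only need to make a single pass to build my Java objects, SAX looks like the natural choice. I'm extending DefaultHandler and implementing the startElement, endElement and characters methods, with members in my class that hold the currently read value (set in the characters method).
I have no problem doing what I need, but my code has become quite complex, and I'm sure there is no reason for that and that I could do things differently. The structure of my XML is something like this: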
<players>
  <player>
    <id></id>
    <name></name>
    <teams total="2">
      <team>
        <id></id>
        <name></name>
        <start-date>
          <year>2009</year>
          <month>9</month>
        </start-date>
        <is-current>true</is-current>
      </team>
      <team>
        <id></id>
        <name></name>
        <start-date>
          <year>2007</year>
          <month>11</month>
        </start-date>
        <end-date>
          <year>2009</year>
          <month>7</month>
        </end-date>
      </team>
    </teams>
  </player>
</players>
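My problem started when I realized that the same tag names are used in several areas of the file. For example, id and name exist for both a player and a team. I want to create instances of my Java classes Player and Team. While parsing, I keep boolean flags that tell me whether I am in the teams section, so that in endElement I know the name is a team's name, not a player's name, and so on.
My code looks like this: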
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyParser extends DefaultHandler {
    // Text of the most recent characters() call
    private String currentValue;
    // Flag used to tell a team's id/name from a player's id/name
    private boolean inTeamsSection = false;
    private Player player;
    private Team team;
    private List<Team> teams;

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        currentValue = new String(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
        if (name.equals("player")) {
            player = new Player();
        }
        if (name.equals("teams")) {
            inTeamsSection = true;
            teams = new ArrayList<Team>();
        }
        if (name.equals("team")) {
            team = new Team();
        }
    }

    @Override
    public void endElement(String uri, String localName, String name) throws SAXException {
        if (name.equals("id")) {
            if (inTeamsSection) {
                team.setId(currentValue);
            } else {
                player.setId(currentValue);
            }
        }
        if (name.equals("name")) {
            if (inTeamsSection) {
                team.setName(currentValue);
            } else {
                player.setName(currentValue);
            }
        }
        if (name.equals("team")) {
            teams.add(team);
        }
        if (name.equals("teams")) {
            player.setTeams(teams);
            inTeamsSection = false;
        }
    }
}
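In my real scenario a player has more child nodes besides teams, and those nodes also contain tags such as name and id. As a result I ended up with several booleans similar to inTeamsSection, and my endElement method has become long and full of conditions.
What should I do differently? How can I know, for instance, what a name tag belongs to?
Thanks!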
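There is one neat trick when writing a SAX parser: you are allowed to change the ContentHandler of an XMLReader while parsing. This lets you split the parsing logic for different elements across multiple classes, which makes the parser more modular and reusable. When a handler sees its own end element, it switches back to its parent. How many handlers you implement is up to you. The code would look like this: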
import java.util.LinkedList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class RootHandler extends DefaultHandler {
    private XMLReader reader;
    private List<Team> teams;

    public RootHandler(XMLReader reader) {
        this.reader = reader;
        this.teams = new LinkedList<Team>();
    }

    // Called by TeamHandler once a complete team element has been parsed
    public void addTeam(Team team) {
        teams.add(team);
    }

    @Override
    public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
        if (name.equals("team")) {
            // Switch handler to parse the team element
            reader.setContentHandler(new TeamHandler(reader, this));
        }
    }
}

public class TeamHandler extends DefaultHandler {
    private XMLReader reader;
    private RootHandler parent;
    private Team team;
    private StringBuilder content;

    public TeamHandler(XMLReader reader, RootHandler parent) {
        this.reader = reader;
        this.parent = parent;
        this.content = new StringBuilder();
        this.team = new Team();
    }

    // characters can be called multiple times per element so aggregate the content in a StringBuilder
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        content.append(ch, start, length);
    }

    // Not called for the team start tag itself (the parent consumed it), only for its children
    @Override
    public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException {
        content.setLength(0);
    }

    @Override
    public void endElement(String uri, String localName, String name) throws SAXException {
        if (name.equals("name")) {
            team.setName(content.toString());
        } else if (name.equals("team")) {
            parent.addTeam(team);
            // Switch handler back to our parent
            reader.setContentHandler(parent);
        }
    }
}
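To show how this could apply to the players/teams structure from the question, here is a minimal self-contained sketch. The names HandlerSwitchDemo, PlayerHandler and parse, the field-only Player and Team classes, and the sample data are assumptions made for illustration; the real model classes are not shown in the question. The point is that id and name are assigned to whichever object the currently active handler owns, so no boolean flags are needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class HandlerSwitchDemo {

    // Hypothetical minimal model classes, standing in for the real Player and Team
    static class Team { String id; String name; }
    static class Player { String id; String name; List<Team> teams = new ArrayList<>(); }

    // Handles player-level elements and delegates each team subtree to TeamHandler
    static class PlayerHandler extends DefaultHandler {
        private final XMLReader reader;
        private final List<Player> players = new ArrayList<>();
        private final StringBuilder content = new StringBuilder();
        private Player player;

        PlayerHandler(XMLReader reader) { this.reader = reader; }

        List<Player> getPlayers() { return players; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            content.setLength(0);
            if (qName.equals("player")) {
                player = new Player();
            } else if (qName.equals("team")) {
                // From here until </team>, events go to the team handler
                reader.setContentHandler(new TeamHandler(reader, this, player));
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            content.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Only player-level id/name reach this handler
            if (qName.equals("id")) player.id = content.toString();
            else if (qName.equals("name")) player.name = content.toString();
            else if (qName.equals("player")) players.add(player);
        }
    }

    // Handles the children of one team element, then hands control back to the parent
    static class TeamHandler extends DefaultHandler {
        private final XMLReader reader;
        private final PlayerHandler parent;
        private final Player owner;
        private final Team team = new Team();
        private final StringBuilder content = new StringBuilder();

        TeamHandler(XMLReader reader, PlayerHandler parent, Player owner) {
            this.reader = reader;
            this.parent = parent;
            this.owner = owner;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            content.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            content.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("id")) team.id = content.toString();
            else if (qName.equals("name")) team.name = content.toString();
            else if (qName.equals("team")) {
                owner.teams.add(team);
                reader.setContentHandler(parent); // switch back to the parent handler
            }
        }
    }

    // Parses an XML string using the default (non-namespace-aware) JAXP SAX parser
    static List<Player> parse(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        PlayerHandler root = new PlayerHandler(reader);
        reader.setContentHandler(root);
        reader.parse(new InputSource(new StringReader(xml)));
        return root.getPlayers();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<players><player><id>7</id><name>Ann</name><teams total=\"2\">"
                + "<team><id>1</id><name>Reds</name></team>"
                + "<team><id>2</id><name>Blues</name></team></teams></player></players>";
        Player p = parse(xml).get(0);
        System.out.println(p.name + " plays for " + p.teams.get(0).name + " and " + p.teams.get(1).name);
    }
}
```

With the made-up sample data above, this should print "Ann plays for Reds and Blues". Further child nodes of a player that also contain id and name could each get their own handler in the same way.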