Get nodes where child node contains an attribute

Suppose I have the following XML:

<book category="CLASSICS">
<title lang="it">Purgatorio</title>
<author>Dante Alighieri</author>
<year>1308</year>
<price>30.00</price>
</book>


<book category="CLASSICS">
<title lang="it">Inferno</title>
<author>Dante Alighieri</author>
<year>1308</year>
<price>30.00</price>
</book>


<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>


<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>


<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>

I would like to write an XPath expression that returns all book nodes that have a title node with a lang attribute of "it".

My attempt looked something like this:

//book[title[@lang='it']]

But that didn't work. I expect to get back the nodes:

<book category="CLASSICS">
<title lang="it">Purgatorio</title>
<author>Dante Alighieri</author>
<year>1308</year>
<price>30.00</price>
</book>


<book category="CLASSICS">
<title lang="it">Inferno</title>
<author>Dante Alighieri</author>
<year>1308</year>
<price>30.00</price>
</book>

Any hints?


Try

//book[title/@lang = 'it']

This reads:

  • get all book elements
    • that have at least one title
      • which has an attribute lang
        • with a value of "it"
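As a quick way to check that reading, here is a minimal sketch using the JDK's built-in javax.xml.xpath API. The <bookstore> root element, the class name, and the countMatches helper are assumptions made for illustration; the data is a trimmed copy of the question's sample.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BookQueryDemo {
    // Trimmed copy of the question's data, wrapped in a hypothetical <bookstore> root.
    static final String XML =
        "<bookstore>"
        + "<book category=\"CLASSICS\"><title lang=\"it\">Purgatorio</title></book>"
        + "<book category=\"CLASSICS\"><title lang=\"it\">Inferno</title></book>"
        + "<book category=\"CHILDREN\"><title lang=\"en\">Harry Potter</title></book>"
        + "</bookstore>";

    // Returns how many nodes the XPath expression selects in the given XML.
    static int countMatches(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Both forms of the predicate should select the same two books.
        System.out.println(countMatches(XML, "//book[title/@lang = 'it']"));
        System.out.println(countMatches(XML, "//book[title[@lang='it']]"));
    }
}
```

On a conforming XPath 1.0 engine, both calls should report the two Italian-language books.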

You may find the article "XPath in Five Paragraphs" by Ronald Bourret helpful.

But in all honesty, //book[title[@lang='it']] and the above should be equivalent, unless your XPath engine has "issues." So it could be something in the code or sample XML that you're not showing us. For example, your sample is an XML fragment. Could it be that the root element has a namespace, and you aren't accounting for that in your query? Also, you only told us that it didn't work; you didn't tell us what results you did get.
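To illustrate the namespace possibility, here is a hedged sketch with the JDK's javax.xml.xpath API. The namespace URI http://example.com/books, the <bookstore> root, and the helper names are made up for the example; nothing in the question confirms that a namespace is actually involved.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceDemo {
    // Hypothetical document: the root declares a default namespace,
    // so every element below it is in that namespace as well.
    static final String XML =
        "<bookstore xmlns=\"http://example.com/books\">"
        + "<book category=\"CLASSICS\"><title lang=\"it\">Purgatorio</title></book>"
        + "<book category=\"CLASSICS\"><title lang=\"it\">Inferno</title></book>"
        + "<book category=\"WEB\"><title lang=\"en\">Learning XML</title></book>"
        + "</bookstore>";

    // Parses with namespace awareness and counts the nodes selected by expr.
    static int countMatches(String xml, String expr) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // In XPath 1.0, unprefixed names only match elements in no namespace.
        System.out.println(countMatches(XML, "//book[title[@lang='it']]"));
        // Matching on local-name() ignores the namespace entirely.
        System.out.println(countMatches(XML,
            "//*[local-name()='book'][*[local-name()='title'][@lang='it']]"));
    }
}
```

The local-name() form is only one workaround; binding a prefix to the namespace through XPath.setNamespaceContext is the other common approach.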

//book[title[@lang='it']]

is actually equivalent to

 //book[title/@lang = 'it']

I tried it using vtd-xml, and both expressions produce the same result. Which XPath processing engine did you use? I guess it has a conformance issue. Below is the code:

import com.ximpleware.*;

public class test1 {
    public static void main(String[] s) throws Exception {
        VTDGen vg = new VTDGen();
        // Parse the file with namespace awareness enabled.
        if (vg.parseFile("c:/books.xml", true)) {
            VTDNav vn = vg.getNav();
            AutoPilot ap = new AutoPilot(vn);
            ap.selectXPath("//book[title[@lang='it']]");
            // Equivalent: ap.selectXPath("//book[title/@lang='it']");

            int i;
            // evalXPath() returns the index of each matching node, or -1 when done.
            while ((i = ap.evalXPath()) != -1) {
                System.out.println("index ==>" + i);
            }
        }
    }
}

Try to use this xPath expression:

//book/title[@lang='it']/..

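That should give you all book nodes whose title has a lang attribute of "it". The expression first selects the matching title elements, then steps back up to their parent with "..".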
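A small sketch of what the trailing ".." changes, using the JDK's javax.xml.xpath API. The <bookstore> root, the class name, and the selectedNames helper are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParentStepDemo {
    // Trimmed copy of the question's data under a hypothetical <bookstore> root.
    static final String XML =
        "<bookstore>"
        + "<book category=\"CLASSICS\"><title lang=\"it\">Purgatorio</title></book>"
        + "<book category=\"CLASSICS\"><title lang=\"it\">Inferno</title></book>"
        + "<book category=\"CHILDREN\"><title lang=\"en\">Harry Potter</title></book>"
        + "</bookstore>";

    // Returns the element name of every node the expression selects.
    static List<String> selectedNames(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(hits.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Without the parent step, the title elements themselves are selected.
        System.out.println(selectedNames(XML, "//book/title[@lang='it']"));
        // With "..", the result is the enclosing book elements.
        System.out.println(selectedNames(XML, "//book/title[@lang='it']/.."));
    }
}
```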

I would think your own expression is correct; however, the XML is not well-formed, because it has no single root element. If you run //book[title[@lang='it']] against <root>[your XML here]</root>, the free online XPath testers will return the expected result.
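The effect of the missing root element can be reproduced with the JDK's own parser. This is a sketch under assumptions: the class name, the countMatches helper, and its -1 return convention for unparseable input are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class RootWrapperDemo {
    // Several sibling <book> elements with no enclosing root, as in the question.
    static final String FRAGMENT =
        "<book category=\"CLASSICS\"><title lang=\"it\">Purgatorio</title></book>"
        + "<book category=\"CLASSICS\"><title lang=\"it\">Inferno</title></book>"
        + "<book category=\"WEB\"><title lang=\"en\">Learning XML</title></book>";

    // Counts matches, or returns -1 (a convention of this sketch)
    // when the input cannot be parsed as a well-formed document.
    static int countMatches(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (SAXException notWellFormed) {
            return -1;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String expr = "//book[title[@lang='it']]";
        // The bare fragment is rejected by the parser before XPath ever runs.
        System.out.println(countMatches(FRAGMENT, expr));
        // Wrapped in a single root element, the same expression can be evaluated.
        System.out.println(countMatches("<root>" + FRAGMENT + "</root>", expr));
    }
}
```

The JDK parser may also log a fatal-error message to standard error for the bare fragment; that is expected here.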

Years later, but a useful option would be to use XPath axes (https://www.w3schools.com/xml/xpath_axes.asp). More specifically, you are looking for the descendant axis.

I believe this example would do the trick:

//book[descendant::title[@lang='it']]

This allows you to select all book elements that contain a descendant title element (regardless of how deeply it is nested) whose lang attribute equals 'it'.
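The difference from the child-based predicate only shows up when a title is nested more deeply. The sketch below uses the JDK's javax.xml.xpath API; the <info> wrapper element, the <bookstore> root, and the helper names are hypothetical and do not appear in the question's data.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DescendantAxisDemo {
    // The second book nests its title inside a hypothetical <info> element.
    static final String XML =
        "<bookstore>"
        + "<book category=\"CLASSICS\"><title lang=\"it\">Purgatorio</title></book>"
        + "<book category=\"CLASSICS\"><info><title lang=\"it\">Inferno</title></info></book>"
        + "<book category=\"WEB\"><title lang=\"en\">Learning XML</title></book>"
        + "</bookstore>";

    // Returns how many nodes the XPath expression selects in the given XML.
    static int countMatches(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Child axis (the default): only a title directly under book counts.
        System.out.println(countMatches(XML, "//book[title[@lang='it']]"));
        // Descendant axis: a title at any depth below book counts.
        System.out.println(countMatches(XML, "//book[descendant::title[@lang='it']]"));
    }
}
```

For the question's own data, where every title is a direct child of book, the two expressions select the same nodes.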

XPath axes have been part of the language since XPath 1.0 (1999), so this answer would have applied in 2009 as well. I have found them extremely useful for XPath navigation, and I am sure you will too.