Introduction

XPath (XML Path Language) uses a path-like syntax to identify and navigate nodes in an XML document, and it can also be applied to HTML documents. XPath is a W3C Recommendation.

Nodes

XPath has seven kinds of nodes:

  • element
  • attribute
  • text
  • namespace
  • processing-instruction
  • comment
  • root (document) node

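Syntax

Selecting Nodes

Expression  Description
nodename    Selects all nodes named nodename (child elements of the current node)
/           Selects from the root node
//          Selects matching nodes at any depth below the current node (below the document root when the expression starts with //)
.           Selects the current node
..          Selects the parent of the current node
@           Selects attributes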

For example:

Expression       Result
bookstore        Selects all nodes with the name "bookstore"
/bookstore       Selects the root element bookstore
bookstore/book   Selects all book elements that are children of bookstore
//book           Selects all book elements no matter where they are in the document
bookstore//book  Selects all book elements that are descendants of the bookstore element, no matter where they are under the bookstore element
//@lang          Selects all attributes that are named lang
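The difference between child and descendant selection can be tried outside the browser as well. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module, which implements only a subset of XPath: paths are relative to an element, and attribute nodes cannot be returned directly. The miniature document and the `shelf` element are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature bookstore; <shelf> is added only to create nesting.
XML = """
<bookstore>
  <book><title lang="en">Everyday Italian</title></book>
  <book><title lang="en">Harry Potter</title></book>
  <shelf><book><title>Learning XML</title></book></shelf>
</bookstore>
"""

root = ET.fromstring(XML)  # root is the <bookstore> element

# "book" relative to <bookstore>: direct children only (like bookstore/book).
children = [b.find("title").text for b in root.findall("book")]

# ".//book": book elements at any depth (like bookstore//book).
descendants = [b.find("title").text for b in root.findall(".//book")]

# ElementTree cannot return attribute nodes (//@lang), so select the
# elements that carry the attribute and read its value with .get().
langs = [t.get("lang") for t in root.findall(".//title[@lang]")]

print(children)     # ['Everyday Italian', 'Harry Potter']
print(descendants)  # ['Everyday Italian', 'Harry Potter', 'Learning XML']
print(langs)        # ['en', 'en']
```

A full XPath 1.0 engine would accept the expressions from the table unchanged; the relative paths here are a workaround for the ElementTree subset.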

Predicates

Path Expression                     Result
/bookstore/book[1]                  Selects the first book element that is a child of the bookstore element
/bookstore/book[last()]             Selects the last book element that is a child of the bookstore element
/bookstore/book[last()-1]           Selects the last but one book element that is a child of the bookstore element
/bookstore/book[position()<3]       Selects the first two book elements that are children of the bookstore element
//title[@lang]                      Selects all the title elements that have an attribute named lang
//title[@lang='en']                 Selects all the title elements that have a "lang" attribute with a value of "en"
/bookstore/book[price>35.00]        Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title  Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00
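The predicates above can be sketched in Python as follows. This assumes the standard `xml.etree.ElementTree` module, which supports positional predicates such as `[1]`, `[last()]` and `[last()-1]` and attribute tests such as `[@lang='en']`, but not `position()<3` or numeric comparisons like `price>35.00`; those two are emulated in plain Python here. The helper name `titles` is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
"""
root = ET.fromstring(XML)


def titles(books):
    """Return the title text of each book element."""
    return [book.find("title").text for book in books]


# Positional predicates supported by ElementTree (positions are 1-based).
first = titles(root.findall("book[1]"))
last = titles(root.findall("book[last()]"))
last_but_one = titles(root.findall("book[last()-1]"))

# book[position()<3] is not supported: emulate it with a slice.
first_two = titles(root.findall("book")[:2])

# book[price>35.00] is not supported: emulate it with a Python filter.
expensive = titles(
    b for b in root.findall("book") if float(b.findtext("price")) > 35.00
)

# Attribute-value predicate, supported directly.
english = [t.text for t in root.findall(".//title[@lang='en']")]

print(first, last, last_but_one)
print(first_two)
print(expensive)  # ['XQuery Kick Start', 'Learning XML']
```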

Selecting Unknown Nodes

Wildcard  Description
*         Matches any element node
@*        Matches any attribute node
node()    Matches any node of any kind

Path Expression  Result
/bookstore/*     Selects all the child element nodes of the bookstore element
//*              Selects all elements in the document
//title[@*]      Selects all title elements which have at least one attribute of any kind
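As a rough illustration of the wildcards, the sketch below again assumes Python's `xml.etree.ElementTree`. It supports `*` but not `@*`, so the "has at least one attribute" test is emulated by checking each element's `attrib` dictionary. The sample document is a hypothetical fragment.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <book category="web">
    <title lang="en">Learning XML</title>
    <year>2003</year>
  </book>
</bookstore>
"""
root = ET.fromstring(XML)

# Like /bookstore/*: every child element of <bookstore>.
child_tags = [e.tag for e in root.findall("*")]

# ".//*" yields every element below <bookstore>. Unlike //* in full XPath,
# the starting element itself is not included.
descendant_tags = [e.tag for e in root.findall(".//*")]

# Emulation of //*[@*]: elements carrying at least one attribute.
with_attributes = [e.tag for e in root.iter() if e.attrib]

print(child_tags)       # ['book']
print(descendant_tags)  # ['book', 'title', 'year']
print(with_attributes)  # ['book', 'title']
```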

Selecting Several Paths

The | (union) operator combines several paths in one expression.

Path Expression                  Result
//book/title | //book/price      Selects all the title AND price elements of all book elements
//title | //price                Selects all the title AND price elements in the document
/bookstore/book/title | //price  Selects all the title elements of the book element of the bookstore element AND all the price elements in the document
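Python's `xml.etree.ElementTree` has no `|` operator, so the sketch below emulates a union by running each path separately and merging the matches. The helper `union` is hypothetical. It returns results in document order, which is how most XPath engines typically present a node-set, and it drops duplicates because a node-set cannot contain the same node twice.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>Harry Potter</title><price>29.99</price></book>
</bookstore>
"""
root = ET.fromstring(XML)


def union(start, *paths):
    """Emulate path1 | path2: merge matches without duplicates, in document order."""
    wanted = {id(e) for path in paths for e in start.findall(path)}
    return [e for e in start.iter() if id(e) in wanted]


# Rough equivalent of //book/title | //book/price
nodes = union(root, ".//book/title", ".//book/price")
result = [(e.tag, e.text) for e in nodes]
print(result)
```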

Axes

An axis describes a relationship to the current node and is used to locate related nodes in the tree.

AxisName            Result
ancestor            Selects all ancestors (parent, grandparent, etc.) of the current node
ancestor-or-self    Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself
attribute           Selects all attributes of the current node
child               Selects all children of the current node
descendant          Selects all descendants (children, grandchildren, etc.) of the current node
descendant-or-self  Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself
following           Selects everything in the document after the closing tag of the current node
following-sibling   Selects all siblings after the current node
namespace           Selects all namespace nodes of the current node
parent              Selects the parent of the current node
preceding           Selects all nodes that appear before the current node in the document, except ancestors, attribute nodes and namespace nodes
preceding-sibling   Selects all siblings before the current node
self                Selects the current node

Location Path Expressions

  • An absolute path starts with /
  • A location step has the syntax axisname::nodetest[predicate]
    • axisname: the axis to search along
    • nodetest: the nodes to select on that axis
    • predicate: refines the selection further

Example                 Result
child::book             Selects all book nodes that are children of the current node
attribute::lang         Selects the lang attribute of the current node
child::*                Selects all element children of the current node
attribute::*            Selects all attributes of the current node
child::text()           Selects all text node children of the current node
child::node()           Selects all children of the current node
descendant::book        Selects all book descendants of the current node
ancestor::book          Selects all book ancestors of the current node
ancestor-or-self::book  Selects all book ancestors of the current node, and the current node as well if it is a book node
child::*/child::price   Selects all price grandchildren of the current node
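To make the axis idea concrete, here is a minimal sketch that emulates two axes, `ancestor` and `following-sibling`, on top of Python's `xml.etree.ElementTree`. That module has no axis syntax and its elements keep no parent pointer, so the sketch first builds a child-to-parent map. The names `parent_of`, `ancestors` and `following_siblings` are hypothetical helpers, not part of any library.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
  <book>
    <title>Everyday Italian</title>
    <price>30.00</price>
  </book>
</bookstore>
"""
root = ET.fromstring(XML)

# Map every element to its parent so the tree can be walked upwards.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """Emulate ancestor::* (nearest ancestor first)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


def following_siblings(node):
    """Emulate following-sibling::* (siblings after the node, in order)."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


title = root.find("book/title")
ancestor_tags = [e.tag for e in ancestors(title)]
sibling_tags = [e.tag for e in following_siblings(title)]
print(ancestor_tags)  # ['book', 'bookstore']
print(sibling_tags)   # ['price']
```

In an engine with full XPath 1.0 support the same selections would be written directly as `ancestor::*` and `following-sibling::*`.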

Operators

An XPath expression can return a value of one of the following types:

  • node-set
  • string
  • number
  • boolean

Operator  Description                   Example
|         Union of two node-sets        //book | //cd
+         Addition                      6 + 4
-         Subtraction                   6 - 4
*         Multiplication                6 * 4
div       Division                      8 div 4
=         Equal                         price=9.80
!=        Not equal                     price!=9.80
<         Less than                     price<9.80
<=        Less than or equal to         price<=9.80
>         Greater than                  price>9.80
>=        Greater than or equal to      price>=9.80
or        Logical or                    price=9.80 or price=9.70
and       Logical and                   price>9.00 and price<9.90
mod       Modulus (division remainder)  5 mod 2
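Two of the arithmetic operators behave differently from their closest Python counterparts, which matters when checking XPath results by hand. In XPath 1.0, `div` is floating-point division and `mod` is the remainder of a truncating division, so the result takes the sign of the left operand. The sketch below mirrors that with hypothetical helpers `xpath_div` and `xpath_mod`; it does not model XPath's Infinity and NaN results for a zero divisor.

```python
import math


def xpath_div(a, b):
    """XPath 'div': floating-point division (zero divisors are not handled here)."""
    return a / b


def xpath_mod(a, b):
    """XPath 'mod': remainder of truncating division, like math.fmod."""
    return math.fmod(a, b)


print(xpath_div(8, 4))   # 2.0
print(xpath_mod(5, 2))   # 1.0
print(xpath_mod(-5, 2))  # -1.0, whereas Python's -5 % 2 gives 1
```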

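Functions

Function                    Purpose and example
name()                      Returns the name of a node: //*[starts-with(name(), 'h')]
text()                      Selects text nodes (strictly a node test rather than a function): //button[text()="Submit"]
lang(str)                   Returns true if the language of the current node (as set by xml:lang) matches str
count()                     Counts the nodes in a node-set: //table[count(tr)=1]
position()                  Returns the position of the current node: //ol/li[position()=2]
number()                    Converts its argument to a number
boolean()                   Converts its argument to a boolean
not()                       Negates a boolean: button[not(starts-with(text(),"Submit"))]
contains()                  Tests whether a string contains a substring: font[contains(@class,"head")]
starts-with()               Tests whether a string starts with a prefix: font[starts-with(@class,"head")]
ends-with()                 Tests whether a string ends with a suffix: font[ends-with(@class,"head")]. This is an XPath 2.0 function and is not available in XPath 1.0 engines such as the browser's
concat(x, y)                Concatenates its arguments into one string
substring(str, start, len)  Returns len characters of str starting at position start (positions are 1-based)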
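The string functions are easy to misread because XPath counts character positions from 1. The sketch below reproduces `substring` for integer arguments in plain Python and uses it to show the usual XPath 1.0 workaround for the missing `ends-with`. The helpers `xpath_substring` and `xpath_ends_with` are hypothetical, and XPath's rounding rules for non-integer arguments are not modelled.

```python
def xpath_substring(s, start, length):
    """Emulate substring(s, start, length) for integer arguments.

    XPath keeps the characters whose 1-based position p satisfies
    start <= p < start + length.
    """
    return "".join(
        ch for pos, ch in enumerate(s, start=1) if start <= pos < start + length
    )


def xpath_ends_with(s, suffix):
    """Emulate the XPath 1.0 workaround for ends-with:

    substring(s, string-length(s) - string-length(suffix) + 1) = suffix
    """
    start = len(s) - len(suffix) + 1
    return xpath_substring(s, start, len(suffix)) == suffix


print(xpath_substring("12345", 2, 3))      # 234
print(xpath_substring("12345", 0, 3))      # 12
print(xpath_ends_with("subhead", "head"))  # True
print(xpath_ends_with("header", "head"))   # False
```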

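Browser Usage Examples

There are two ways to run XPath in the Chrome browser:

  • DevTools Elements panel: press CTRL+F and type the expression into the search box
  • DevTools Console: call $x(xpath_syntax)

Example                          Explanation
/bookstore/book/title            Selects the title of every book
/bookstore/book[1]/title         Selects the title of the first book
/bookstore/book/price[text()]    Selects the price of every book (price elements that contain a text node)
/bookstore/book[price>35]/price  Selects the price nodes of books priced above 35
/bookstore/book[price>35]/title  Selects the title nodes of books priced above 35

Sample console session ($x returns an array of the matching nodes):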
$x('//bookstore/book/title')
(4) [title, title, title, title]
$x('//bookstore/book[1]/title')
[title]
$x('//bookstore/book[position() > 2]/title')
(2) [title, title]
$x('//bookstore/book[price > 30]/title')
(2) [title, title]

$x('.//bookstore/book[1]/price')[0].textContent
'30.00'
$x('.//bookstore/book[1]/price[text()]')[0].textContent
'30.00'
$x('.//bookstore/book[1]/price/text()')[0].textContent
'30.00'

$x('.//bookstore/book[@id="testid"]/title')[0].textContent
'Everyday Italian'
$x('.//bookstore/book[@class="testclass"]/title')[0].textContent
'Harry Potter'
$x('.//bookstore/book[contains(@class, "test")]/title')[0].textContent
'Harry Potter'

XPath test document (open it in the browser and use the developer tools to try the expressions above):
<?xml version="1.0" encoding="UTF-8"?>

<bookstore>

<book id="testid" category="cooking">
  <title lang="en">Everyday Italian</title>
  <author>Giada De Laurentiis</author>
  <year>2005</year>
  <price>30.00</price>
</book>

<book class="testclass" category="children">
  <title lang="en" href="test.pdf">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

<book category="web">
  <title lang="en" href="/test">XQuery Kick Start</title>
  <author>James McGovern</author>
  <author>Per Bothner</author>
  <author>Kurt Cagle</author>
  <author>James Linn</author>
  <author>Vaidyanathan Nagarajan</author>
  <year>2003</year>
  <price>49.99</price>
</book>

<book category="web">
  <title lang="en" href="https://www.google.com">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>2003</year>
  <price>39.95</price>
</book>

</bookstore>
