[email protected] wrote:
We got all rows by:
library(XML)
doc = htmlParse('http://www.statcan.gc.ca/daily-quotidien/090520/t090520b1-eng.htm')
rows = xpathSApply(doc, '//table/tbody/tr')
The last row is:
row_last = rows[15]
row_last
[[1]]
<tr><td id="t1stub17" class="stub1 RGBShade"><b>Unsmoothed composite
leading indicator</b></td>
<td align="right" headers="hdt1r1c2 t1stub17"
class="data"><b>221.8</b></td>
<td align="right" headers="hdt1r1c3 t1stub17"
class="data"><b>218.4</b></td>
<td align="right" headers="hdt1r1c4 t1stub17"
class="data"><b>217.1</b></td>
<td align="right" headers="hdt1r1c5 t1stub17"
class="data"><b>211.2</b></td>
<td align="right" headers="hdt1r1c6 t1stub17"
class="data"><b>209.4</b></td>
<td align="right" headers="hdt1r1c7 t1stub17"
class="data"><b>210.5</b></td>
<td align="right" headers="hdt1r1c8 t1stub17"
class="data"><b>0.5</b></td>
</tr>
How can I extract these <b> entries: Unsmoothed composite leading indicator,
221.8, 218.4, 217.1, 211.2, 209.4, 210.5, 0.5?
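Use XPath again and restrict the search for the <b> nodes to
this 15th row. Since rows[15] is a one-element list, pass the
node itself with [[1]]:
cells = xpathSApply(row_last[[1]], ".//b", xmlValue)
as.numeric(cells[-1])
The first entry is the text label, so drop it before converting;
as.numeric() would otherwise turn it into NA with a warning.
Note the . at the beginning of the XPath expression,
which anchors the search at the <tr> in row_last.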
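Here is a minimal sketch of the same idea that runs without network
access. The inline HTML is a made-up, shortened stand-in for the real
table, and the names html, cells, label, values and all_b are only
illustrative:

```r
library(XML)

# Hypothetical two-row stand-in for the Statistics Canada table,
# so the example does not depend on the page still being online.
html <- '<table><tbody>
<tr><td><b>Composite leading indicator</b></td><td><b>1.5</b></td><td><b>2.5</b></td></tr>
<tr><td><b>Unsmoothed composite leading indicator</b></td><td><b>221.8</b></td><td><b>218.4</b></td><td><b>0.5</b></td></tr>
</tbody></table>'

doc  <- htmlParse(html, asText = TRUE)
rows <- getNodeSet(doc, "//table/tbody/tr")

# [[ ]] returns the <tr> node itself; [ ] would return a one-element list.
row_last <- rows[[length(rows)]]

# ".//b" is relative: only <b> nodes underneath row_last are matched.
cells  <- xpathSApply(row_last, ".//b", xmlValue)
label  <- cells[1]                # row label (character)
values <- as.numeric(cells[-1])   # remaining cells as numbers

# For contrast, "//b" is an absolute path, so it is evaluated against
# the whole document even though a row node is supplied.
all_b <- xpathSApply(row_last, "//b", xmlValue)
```

Comparing length(cells) with length(all_b) shows the effect of the
leading dot.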
D.
Thanks,
-james
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.