Counting Blog Posts using xidel
I wanted to write 100 posts in 2021, and I am nowhere close to that. I tried to look at the posts by year and see how I have performed over the years. Of course, I could have done that manually by looking at the year archive count or running a query on the database. But recently, I have started using Xidel, so why not use it? :)
So its the number inside a span with class sya_yearcount
. I am getting xpath3. text()
gets the content of the node.
xidel https://thejeshgn.com/archives \
--xpath3="//span[@class='sya_yearcount']/text()"
I also wanted the years, which is the value inside an anchor tag where id
contains yearXXXX
, where XXXX
is an year. So I am going to add year2
based filtering because all blogs where done post 2000. xidel can run multiple falterings on the same page in the same command. Also you can use contains
to check if the string contains a value.
xidel https://thejeshgn.com/archives \
--xpath3="//a[contains(@id,'year2')]/@id" \
--xpath3="//span[@class='sya_yearcount']/text()"
I want a JSON output. xidel outputs simple text by default but it supports other formats like xml, html, json etc
xidel https://thejeshgn.com/archives \
--xpath3="//a[contains(@id,'year2')]/@id" \
--xpath3="//span[@class='sya_yearcount']/text()" \
--output-format json-wrapped
And then I don't want logs. I want only JSON output, so I am adding --silent
I want text output to be numbers and not text
xidel https://thejeshgn.com/archives \
--xpath3="//a[contains(@id,'year2')]/@id" \
--xpath3="//span[@class='sya_yearcount']/text()" \
--output-format json-wrapped
--silent
I want text output to be numbers and not text. So we convert them into numbers
xidel https://thejeshgn.com/archives \
--xpath3="//a[contains(@id,'year2')]/@id" \
--xpath3="//span[@class='sya_yearcount']/text() / number(.)" \
--output-format json-wrapped \
--silent
And the final output looks like this.
[
[
"year2021",
"year2020",
"year2019",
"year2018",
"year2017",
"year2016",
"year2015",
"year2014",
"year2013",
"year2012",
"year2011",
"year2010",
"year2009",
"year2008",
"year2007",
"year2006",
"year2005",
"year2004",
"year2003"
],
[
51,
34,
33,
37,
53,
54,
57,
40,
25,
34,
46,
48,
73,
71,
203,
20,
13,
3,
4
]
]
Now you can use that json with any graphing library that you love. A bit of over engineering that's not needed. But its one way to learn.
Hello Thejesh,
You may find this command useful:
$ xidel -s https://thejeshgn.com/archives -e ‘
let $a:=//div[@class=”sya_container”]
for $x at $i in $a/text()
return
string-join(($x,$a/span[$i]))
‘
2022 (5)
2021 (60)
2020 (34)
[…]