Counting Blog Posts using xidel

I wanted to write 100 posts in 2021, and I am nowhere close to that. So I decided to look at my post counts by year to see how I have done over time. Of course, I could have done that manually by reading the counts off the yearly archive or by running a query on the database. But I have recently started using Xidel, so why not use it? :)

On the archives page, each year's count is the number inside a span with the class sya_yearcount. I select it with the --xpath3 option; text() returns the text content of the node.

xidel https://thejeshgn.com/archives \
--xpath3="//span[@class='sya_yearcount']/text()"

I also wanted the years. Each year sits in an anchor tag whose id has the form yearXXXX, where XXXX is the year. All my posts were written after 2000, so I filter on ids that contain year2. xidel can run multiple queries on the same page in a single command, and the XPath function contains() checks whether a string contains a given value.

xidel https://thejeshgn.com/archives \
--xpath3="//a[contains(@id,'year2')]/@id" \
--xpath3="//span[@class='sya_yearcount']/text()"
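
To make the two selections concrete, here is a minimal Python sketch that mimics them on a tiny inline snippet. The markup below is a hypothetical, simplified stand-in for the archives page (the real page structure may differ), and it uses Python's standard library rather than xidel, so treat it only as an illustration of what the two expressions pick out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the archives page markup.
SAMPLE = """
<div class="sya_container">
  <a id="year2021" href="#">2021</a> <span class="sya_yearcount">51</span>
  <a id="year2020" href="#">2020</a> <span class="sya_yearcount">34</span>
  <a id="about" href="#">About</a>
</div>
"""

root = ET.fromstring(SAMPLE)

# Rough equivalent of //span[@class='sya_yearcount']/text()
counts = [span.text for span in root.iterfind(".//span[@class='sya_yearcount']")]

# ElementTree's XPath subset has no contains(), so the substring
# test on the id attribute is done in plain Python instead.
year_ids = [a.get("id") for a in root.iter("a") if "year2" in a.get("id", "")]

print(year_ids)  # ['year2021', 'year2020']
print(counts)    # ['51', '34']
```

The anchor with id "about" is skipped because its id does not contain year2, which is the same effect the contains() filter has in the xidel command.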

I want JSON output. xidel prints plain text by default, but it supports other formats such as XML, HTML, and JSON.

xidel https://thejeshgn.com/archives \
--xpath3="//a[contains(@id,'year2')]/@id" \
--xpath3="//span[@class='sya_yearcount']/text()" \
--output-format json-wrapped

I also don't want the log messages, only the JSON output, so I am adding --silent.

xidel https://thejeshgn.com/archives \
--xpath3="//a[contains(@id,'year2')]/@id" \
--xpath3="//span[@class='sya_yearcount']/text()" \
--output-format json-wrapped \
--silent

I want the counts in the output to be numbers, not strings, so I convert them with number().

xidel https://thejeshgn.com/archives \
--xpath3="//a[contains(@id,'year2')]/@id" \
--xpath3="//span[@class='sya_yearcount']/text() / number(.)" \
--output-format json-wrapped \
--silent
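
Why the conversion matters is easiest to see in the serialized JSON. The following is a small Python sketch, not xidel itself: the three sample values are taken from the counts on the page, and int() stands in for XPath's number() purely for illustration.

```python
import json

# Text nodes come back as strings.
texts = ["51", "34", "33"]

# Without conversion, the JSON array holds quoted strings.
as_text = json.dumps(texts)

# After conversion, the JSON array holds bare numbers.
as_numbers = json.dumps([int(t) for t in texts])

print(as_text)     # ["51", "34", "33"]
print(as_numbers)  # [51, 34, 33]
```

A graphing library can consume the second form directly, without a parsing step on its side.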

And the final output looks like this.

[
    [
        "year2021",
        "year2020",
        "year2019",
        "year2018",
        "year2017",
        "year2016",
        "year2015",
        "year2014",
        "year2013",
        "year2012",
        "year2011",
        "year2010",
        "year2009",
        "year2008",
        "year2007",
        "year2006",
        "year2005",
        "year2004",
        "year2003"
    ],
    [
        51,
        34,
        33,
        37,
        53,
        54,
        57,
        40,
        25,
        34,
        46,
        48,
        73,
        71,
        203,
        20,
        13,
        3,
        4
    ]
]
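
The output is two parallel arrays, so a consumer has to pair them up by position. Here is a minimal Python sketch of that step, using the data above pasted in as a string; the variable names are my own and nothing here is specific to xidel.

```python
import json

# The JSON printed by the command above, pasted in compact form.
raw = """
[
  ["year2021", "year2020", "year2019", "year2018", "year2017",
   "year2016", "year2015", "year2014", "year2013", "year2012",
   "year2011", "year2010", "year2009", "year2008", "year2007",
   "year2006", "year2005", "year2004", "year2003"],
  [51, 34, 33, 37, 53, 54, 57, 40, 25, 34,
   46, 48, 73, 71, 203, 20, 13, 3, 4]
]
"""

year_ids, counts = json.loads(raw)

# Strip the "year" prefix from each id and pair it with its count.
posts_by_year = {
    int(year_id[len("year"):]): count
    for year_id, count in zip(year_ids, counts)
}

total = sum(posts_by_year.values())
best_year = max(posts_by_year, key=posts_by_year.get)
shortfall_2021 = 100 - posts_by_year[2021]

print(total)           # 899
print(best_year)       # 2007
print(shortfall_2021)  # 49
```

The pairing relies on both queries returning their results in the same page order, which holds for the output shown here.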

Now you can use that JSON with any graphing library you love. It is a bit of over-engineering that isn't really needed, but it's one way to learn.



1 Response

  1. Reino says:

    Hello Thejesh,

    You may find this command useful:

    $ xidel -s https://thejeshgn.com/archives -e '
    let $a:=//div[@class="sya_container"]
    for $x at $i in $a/text()
    return
    string-join(($x,$a/span[$i]))
    '

    2022 (5)
    2021 (60)
    2020 (34)
    […]
