Counting My RSS Feed Subscribers

by Thejesh GN · April 29, 2022

How many subscribers does my blog have? It’s a difficult question to answer. A while back, I moved from Feedburner to hosting the feed myself, so I can now make an estimate. It’s based on the following assumptions.

Some hosted multi-user feedreaders report their subscriber count in the user agent. We can extract that and use it.
Self-hosted cloud-based ones usually don’t, so I count each of them as “1” per IP address.
Folks using clients on their phone/PC without any cloud component are counted as “1” per IP address.
Some cloud-hosted multi-user feedreaders don’t report subscribers. Currently, I count each of them as one subscriber. I need a better way to figure this out.
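
The counting rule in these assumptions can be sketched in a few lines. This is only an illustration: the regular expression and the function name `count_from_user_agent` are my assumptions about what such a rule might look like, and the second user agent is a made-up example.

```python
import re

# Assumed pattern: hosted readers such as Feedly put "N subscribers"
# somewhere in their user agent string.
SUBSCRIBERS_RE = re.compile(r"(\d+) subscribers")


def count_from_user_agent(user_agent):
    """Return the reported subscriber count, or 1 when none is reported."""
    match = SUBSCRIBERS_RE.search(user_agent)
    return int(match.group(1)) if match else 1


feedly = (
    "Feedly/1.0 (+http://www.feedly.com/fetcher.html; "
    "101 subscribers; like FeedFetcher-Google)"
)
print(count_from_user_agent(feedly))
# Hypothetical desktop client that reports nothing: counted as one subscriber.
print(count_from_user_agent("SomeDesktopReader/2.0"))
```

Anything that does not advertise a count falls back to one subscriber, which matches the per-IP rule for self-hosted and local clients.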

These counts cover only the requests that go through my script, which replaced Feedburner. Some folks use the blog’s built-in feed. I have not counted those yet; I need to figure that out.

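import hashlib
import json
import re
import urllib.request
from datetime import datetime

# Assumed pattern (its definition is not shown in this snippet):
# captures the number in "N subscribers" from a user agent string.
subscribers_re = re.compile(r"(\d+) subscribers")

# DB_URL and AUTH_KEY are configuration values defined elsewhere.


def subscribers_log(event):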
    event_data = {}
    event_data["event"] = "subscriber"
    # Take the timestamp once so "datetime" and "date" cannot disagree around midnight
    now = datetime.utcnow().isoformat()
    event_data["datetime"] = now
    event_data["date"] = now[:10]

    if "requestContext" in event:
        requestContext = event["requestContext"]
        if "path" in requestContext:
            event_data["feed"] = requestContext["path"]
        if "identity" in requestContext:
            identity = requestContext["identity"]
            if "sourceIp" in identity:
                event_data["source_ip"] = identity["sourceIp"]
            if "userAgent" in identity:
                user_agent = identity["userAgent"]
                simplified_user_agent = subscribers_re.sub("X subscribers", user_agent)
                event_data["user_agent"] = user_agent
                event_data["simplified_user_agent"] = simplified_user_agent
                match = subscribers_re.search(user_agent)
                if match:
                    event_data["count"] = int(match.group(1))
                else:
                    event_data["count"] = 1

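        # Insert only if you have data
        required = ("source_ip", "simplified_user_agent", "feed")
        if not all(field in event_data for field in required):
            return

        if event_data["count"] > 1:
            # Readers that report a count get one key per day, whatever IP they fetch from
            k = (event_data["simplified_user_agent"] + event_data["feed"]).encode(
                "utf8"
            )
        else:
            # Everyone else is keyed per IP address
            k = (
                event_data["source_ip"]
                + event_data["simplified_user_agent"]
                + event_data["feed"]
            ).encode("utf8")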

        # create unique key
        h = hashlib.md5(k).hexdigest()
        event_data["_id"] = event_data["date"] + "_" + str(h)
        print("**----------------------**")
        print(event_data)
        try:
            req = urllib.request.Request(DB_URL)
            req.add_header("Authorization", AUTH_KEY)
            req.add_header("Content-Type", "application/json; charset=utf-8")
            jsondataasbytes = json.dumps(event_data).encode("utf8")
            req.add_header("Content-Length", len(jsondataasbytes))
            response = urllib.request.urlopen(req, jsondataasbytes)
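        except Exception as e:
            # A duplicate _id is rejected by CouchDB with a 409 Conflict, which also lands here
            print("Error posting data:", e)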

Above is the script I use to extract the subscription and add it to CouchDB. It borrows from Simon Willison’s script. Since _id is the primary key in CouchDB, a duplicate insert for the same day is rejected as a conflict, and the script simply ignores that error.

{
  "_id": "2022-04-29_04e730996daaa95df14a71e01c9ae326",
  "_rev": "1-d1680d2ec633dffd17c17b70adde0b35",
  "event": "subscriber",
  "datetime": "2022-04-29T03:46:30.710509",
  "date": "2022-04-29",
  "feed": "/thejeshgn",
  "source_ip": "8.29.198.27",
  "user_agent": "Feedly/1.0 (+http://www.feedly.com/fetcher.html; 101 subscribers; like FeedFetcher-Google)",
  "simplified_user_agent": "Feedly/1.0 (+http://www.feedly.com/fetcher.html; X subscribers; like FeedFetcher-Google)",
  "count": 101
}
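
To see why repeated fetches on the same day collapse into one document like the one above, here is a small sketch of the key derivation on its own. The helper name `make_doc_id` and its signature are hypothetical, introduced only for illustration; the logic mirrors the hashing step in the script.

```python
import hashlib


def make_doc_id(date, simplified_user_agent, feed, source_ip=None):
    """Build a per-day document id.

    source_ip is passed only for readers counted per IP address;
    readers that report a subscriber count leave it out.
    """
    parts = simplified_user_agent + feed
    if source_ip is not None:
        parts = source_ip + parts
    return date + "_" + hashlib.md5(parts.encode("utf8")).hexdigest()


ua = "Feedly/1.0 (+http://www.feedly.com/fetcher.html; X subscribers; like FeedFetcher-Google)"

# Two fetches on the same day produce the same id, so the second insert conflicts.
first = make_doc_id("2022-04-29", ua, "/thejeshgn")
second = make_doc_id("2022-04-29", ua, "/thejeshgn")
print(first == second)

# The next day gets a fresh id, so one document per reader per day is stored.
next_day = make_doc_id("2022-04-30", ua, "/thejeshgn")
print(first == next_day)
```

Because the reported number is replaced with "X" before hashing, a change in the subscriber count during the day does not create a second document.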

Then once a day, I pull the previous day’s data and aggregate it. I also pull the follower and email subscriber counts from WordPress and add them to the aggregate JSON document.

{
    "_id": "2022-04-29_subscriber_count",
    "event": "subscriber_count",
    "date": "2022-04-29",
    "feed": "/thejeshgn",
    "data":
    [
        {
            "provider": "feedly",
            "count": 101,
            "type": "rss"
        },
        {
            "provider": "theoldreader",
            "count": 23,
            "type": "rss"
        },
        {
            "provider": "bloglovin",
            "count": 3,
            "type": "rss"
        },
        {
            "provider": "inoreader",
            "count": 2,
            "type": "rss"
        },
        {
            "provider": "independent",
            "count": 70,
            "type": "rss"
        },
        {
            "provider": "wordpress",
            "count": 1015,
            "type": "follow"
        },
        {
            "provider": "wordpress",
            "count": 1132,
            "type": "email"
        }
    ]
}
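
The daily aggregation that produces a document like the one above could look roughly like this. It is a sketch under assumptions, not my actual job: the substring-based provider mapping, the function names, and the choice to sum counts per provider are all illustrative, and the WordPress follow and email numbers (which come from a different source) are left out.

```python
from collections import Counter

# Hypothetical mapping from a user-agent substring to a provider name.
KNOWN_PROVIDERS = {
    "feedly": "feedly",
    "theoldreader": "theoldreader",
    "inoreader": "inoreader",
    "bloglovin": "bloglovin",
}


def provider_for(user_agent):
    """Classify a user agent, falling back to "independent"."""
    lowered = user_agent.lower()
    for needle, provider in KNOWN_PROVIDERS.items():
        if needle in lowered:
            return provider
    return "independent"


def aggregate(docs, date, feed):
    """Sum the per-document counts by provider for one day and one feed."""
    totals = Counter()
    for doc in docs:
        if doc["date"] == date and doc["feed"] == feed:
            totals[provider_for(doc["user_agent"])] += doc["count"]
    return {
        "_id": date + "_subscriber_count",
        "event": "subscriber_count",
        "date": date,
        "feed": feed,
        "data": [
            {"provider": provider, "count": count, "type": "rss"}
            for provider, count in totals.most_common()
        ],
    }


# Sample input: the Feedly document from earlier plus two made-up independent readers.
docs = [
    {"date": "2022-04-29", "feed": "/thejeshgn", "count": 101,
     "user_agent": "Feedly/1.0 (+http://www.feedly.com/fetcher.html; 101 subscribers; like FeedFetcher-Google)"},
    {"date": "2022-04-29", "feed": "/thejeshgn", "count": 1, "user_agent": "ExampleReader/1.0"},
    {"date": "2022-04-29", "feed": "/thejeshgn", "count": 1, "user_agent": "AnotherReader/3.2"},
]
summary = aggregate(docs, "2022-04-29", "/thejeshgn")
print(summary["data"])
```

Readers that are keyed per IP address each contribute a count of 1, so the "independent" total is simply the number of distinct IP and user agent pairs seen that day.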

The next step is to observe the script for a couple of days and weed out any bugs, and then update my subscription page and widget to reflect these numbers. My final goal is to have more subscribers than I have Twitter followers, and I know it’s not easy.
