Author: Thejesh GN


Programmatically Creating Embulk Configuration Files

Embulk needs a YAML configuration file for each data load. The format is simple and very human-readable. But there are cases where I want the YAML files to be generated dynamically. Embulk does support an experimental feature that uses Liquid templates, but my team is well versed in Python and Jinja2, so that is what we use.
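As a rough illustration of the idea, here is a minimal sketch of rendering an Embulk config with Jinja2. The `render_config` helper, the path prefix, and the column names are hypothetical and only for illustration; the `in`/`out` layout with a `file` input, a `csv` parser, and a `stdout` output follows Embulk's usual config shape, but this is not necessarily how our actual templates look.

```python
from jinja2 import Environment

# trim_blocks / lstrip_blocks keep the {% ... %} lines from leaving
# blank lines or stray indentation in the rendered YAML.
env = Environment(trim_blocks=True, lstrip_blocks=True)

# Hypothetical template: a CSV file input printed to stdout.
CONFIG_TEMPLATE = env.from_string(
    """\
in:
  type: file
  path_prefix: {{ path_prefix }}
  parser:
    type: csv
    columns:
    {% for col in columns %}
    - name: {{ col.name }}
      type: {{ col.type }}
    {% endfor %}
out:
  type: stdout
"""
)


def render_config(path_prefix, columns):
    """Return the Embulk YAML config as a string (hypothetical helper)."""
    return CONFIG_TEMPLATE.render(path_prefix=path_prefix, columns=columns)


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(
        render_config(
            "/data/users_",
            [
                {"name": "id", "type": "long"},
                {"name": "email", "type": "string"},
            ],
        )
    )
```

The rendered text can be written to a `.yml` file and passed to Embulk like any hand-written config.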

Echo resting under the tree.

Weekly Notes 51/2022

We are in Thrissur. It took us eight hours to reach Thrissur from Bangalore, which is a bit faster than last year or so, primarily due to the Kuthiran tunnel. The tolls have also increased quite a bit. I will have a page to track them.


My Boring Yet Modern Data Stack

We have a data stack that we have been using for years now. We have used it with medium to large customers, and it has worked very well. The goal has always been simple, stable, composable tools that can be used on the developer’s machine and scaled to work with massive data in production. You can self-host them, host them on the cloud, or get managed services based on your needs.

It is very similar to my web stack. It’s called “Boring” not because it’s dull but because there are minimal unwanted surprises. So my current stack for data looks like this. This stack is both “Modern” and “Boring.”


Weekly Notes 50/2022

Since I write weekly notes and there is a week number in the title, I pay attention to how fast time moves. It’s been an excellent way to keep track of time. We are just two weeks away from 2023. Happy holidays, folks.


Embulk for extracting and loading data

Embulk is a bulk data loader. It helps transfer data between different types of databases, storage systems, file formats, cloud services, etc. It’s like a Unix tool. It’s simple, robust, and works well with other tools.

My followers

microblog.pub : Where do my Followers and Following come from?

I run microblog.pub as my ActivityPub server. Since I had reached almost 100 followers and following, I was curious about the diversity or spread of instances. So I wrote some SQL queries to explore the downloaded SQLite database.
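To give a flavour of that kind of query, here is a small sketch using Python's built-in `sqlite3` module. The `follower` table and its `ap_actor_id` column are assumptions made for this example, not necessarily the real microblog.pub schema, and the sample rows are invented. The query strips the `https://` prefix from each actor URL and groups by the host name that remains.

```python
import sqlite3

# Assumed schema for illustration only: one row per follower,
# with the follower's actor URL in ap_actor_id.
QUERY = """
SELECT substr(rest, 1, instr(rest, '/') - 1) AS instance,
       COUNT(*) AS n
FROM (
    SELECT substr(ap_actor_id, length('https://') + 1) AS rest
    FROM follower
)
GROUP BY instance
ORDER BY n DESC, instance
"""


def build_sample_db():
    """Create an in-memory database with made-up follower rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE follower (ap_actor_id TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO follower (ap_actor_id) VALUES (?)",
        [
            ("https://mastodon.social/users/alice",),
            ("https://mastodon.social/users/bob",),
            ("https://fosstodon.org/users/carol",),
        ],
    )
    return conn


def followers_by_instance(conn):
    """Return (instance, count) pairs, most common instance first."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for instance, n in followers_by_instance(build_sample_db()):
        print(instance, n)
```

Against the real database, `sqlite3.connect` would point at the downloaded file instead of `:memory:`, and the table and column names would need to match the actual schema.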