Thejesh GN

A Blog, A Website and A container for all my views with excerpts from technology, travel, films, india, photography, kannada, friends and other interests. I am Thejesh GN. Friends call me Thej

HappyTuesday Week 3 – CitizenMatters

Posted by Thejesh GN on July 15, 2014 (1 comment)

I have never written about CitizenMatters before, which seems very strange. CitizenMatters, as they define it, is a Bangalore-focused, citizen-oriented news magazine covering city public affairs, community and culture. They do a great job of it. It’s run by Oorvani Foundation, which is also behind IndiaTogether. They have been great supporters of the Open Data movement and have helped me and DataMeet in organizing OpenDataCamp – Bangalore.

Okay, coming back to my HappyTuesdays. It’s been one of my least productive Tuesdays. I started with a scraper for BBMP financial data. This request came from CitizenMatters. I am half done and should be able to complete it this week. This week I also completed the actual scraping of Coimbatore Property Tax and uploaded it to the Internet Archive.

It looks like I am going to do a lot of web scraping in the future too. I have a pattern for all my scraping projects. It’s a waste of time to create that project structure every time, so I created a bare-bones project structure called scraping that I can fork for every new scraping project.

On a side note, community funding is the only way to build a truly public media that produces consistent, quality journalism. I can’t think of a better candidate than CitizenMatters. So please do contribute to Oorvani Foundation. No contribution is too small.

Update 18/July/2014:
CitizenMatters is running this quote on their website. Thanks for thinking my views are valuable. You can click on it to visit their support page.
[Image: quotes_thejesh]


Indian CA – NIC issues fake Google SSL certificates

Posted by Thejesh GN on July 11, 2014 (4 comments)

I was listening to the latest episode of Security Now this morning and came to know that an Indian CA was issuing fake SSL certificates for Google subdomains.

Later I got to know, from Google’s blog post, that it was NIC:

The certificates were issued by the National Informatics Centre (NIC) of India, which holds several intermediate CA certificates trusted by the Indian Controller of Certifying Authorities (India CCA).
The India CCA certificates are included in the Microsoft Root Store and thus are trusted by the vast majority of programs running on Windows, including Internet Explorer and Chrome. Firefox is not affected because it uses its own root store that doesn’t include these certificates.

It’s scary because NIC is owned by the Govt. of India. As far as I know, the main use of fake certificates is to mount man-in-the-middle attacks: basically, the attacker convinces end users that it is Google and reads their content. The Google blog post doesn’t say which subdomains were faked, but Google’s subdomains include Gmail, Drive etc. (mail.google.com is Gmail), which makes it very scary.

At this time, India CCA is still investigating this incident. This event also highlights, again, that our Certificate Transparency project is critical for protecting the security of certificates in the future.

Update Jul 9: India CCA informed us of the results of their investigation on July 8. They reported that NIC’s issuance process was compromised and that only four certificates were misissued; the first on June 25. The four certificates provided included three for Google domains (one of which we were previously aware of) and one for Yahoo domains. However, we are also aware of misissued certificates not included in that set of four and can only conclude that the scope of the breach is unknown.

The intermediate CA certificates held by NIC were revoked on July 3, as noted above. But a root CA is responsible for all certificates issued under its authority. In light of this, in a future Chrome release, we will limit the India CCA root certificate to the following domains and subdomains thereof in order to protect users:
gov.in
nic.in
ac.in
rbi.org.in
bankofindia.co.in
ncode.in
tcs.co.in
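
To make the "domains and subdomains thereof" rule concrete, here is a minimal sketch of how such a restriction could be checked for a hostname. This is only an illustration, not Chrome's actual code; the function name `is_within_constraints` and the constant `ALLOWED_SUFFIXES` are my own assumptions.

```python
# Hypothetical illustration only: NOT Chrome's implementation.
# The list is copied from the quoted Google post above.
ALLOWED_SUFFIXES = (
    "gov.in",
    "nic.in",
    "ac.in",
    "rbi.org.in",
    "bankofindia.co.in",
    "ncode.in",
    "tcs.co.in",
)


def is_within_constraints(hostname, allowed=ALLOWED_SUFFIXES):
    """Return True if hostname is an allowed domain or a subdomain of one."""
    host = hostname.lower().rstrip(".")  # normalise case and a trailing dot
    # Match on a label boundary so "fakenic.in" does not pass as "nic.in".
    return any(host == s or host.endswith("." + s) for s in allowed)


print(is_within_constraints("www.nic.in"))
print(is_within_constraints("mail.google.com"))
```

Under a rule like this, a certificate for mail.google.com chaining to the India CCA root would be rejected, while one for a gov.in site would still be accepted.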

Chrome has been updated, and I am sure Microsoft has also taken measures. Firefox is unaffected, as it maintains its own root store, which doesn’t include these certificates.

Not sure what else users can do as of now. Try to use Firefox as much as possible.

Also, it’s not a bad idea to access the internet through OpenVPN via a different country. Make sure DNS queries also go through the VPN.

Good work Google Security team.


HappyTuesday Week 2 – Coimbatore City Data

Posted by Thejesh GN on July 9, 2014

Most of my Tuesday went into babysitting my nephew and niece. I must tell you, it’s a lot of work. Babies and computers don’t go together that well. I got an hour or so to work on the project I started last week. I am done with the code required to scrape the Coimbatore city property tax data. This data request came from the good folks at ATREE. I would like to see what they do with it :)
As of now I am running the script slowly to scrape the data ward by ward, keeping in mind the load on the servers. As it stands now, I have scraped about 93 lakh rows for 12 wards. Coimbatore has a hundred wards. Once I have scraped all the wards, I will upload the data to archive.org. GitHub is not suited to storing huge datasets.
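
The idea of going ward by ward with a pause between requests can be sketched roughly as below. This is not the actual scraper; `scrape_wards`, `fetch_ward` and the five-second delay are assumptions for illustration, and the real fetch function would do the form submission for each ward.

```python
import time


def scrape_wards(ward_numbers, fetch_ward, delay_seconds=5.0, sleep=time.sleep):
    """Fetch each ward in turn, pausing between requests to limit server load.

    fetch_ward is a caller-supplied function (hypothetical here) that takes a
    ward number and returns that ward's rows.
    """
    results = {}
    for index, ward in enumerate(ward_numbers):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        results[ward] = fetch_ward(ward)
    return results


# Stand-in fetcher so the sketch runs without touching any real server.
def fake_fetch(ward):
    return ["ward-%d-row-%d" % (ward, n) for n in range(2)]


data = scrape_wards(range(1, 4), fake_fetch, delay_seconds=0)
print(sorted(data))
```

Passing the sleep function in as a parameter keeps the loop easy to try out without actually waiting.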

Once property tax is done, and now that I have started, I will try to gather more data about Coimbatore. In the long term it will help someone who wants to work on Coimbatore. If you want to be part of HappyTuesday or if you have any requests, email me.


I am very happy that I came across this project and was able to contribute to it. Given the amount of data scraping I do, you can imagine how close this story is to my heart. Watch it on Vimeo On Demand if possible. You can download the CC-licensed version using BitTorrent from the Internet Archive, or just stream it.

Let me know what you think.


HappyTuesday Week 1 – Scraping the Web continues

Posted by Thejesh GN on July 1, 2014 (1 comment)

Notes from my first ever HappyTuesday. I couldn’t sit at HappyBelly Cafe, as it seems they open only at 10am. Instead I went to French Loaf. I sat outside under the greenery and really liked it; the weather was awesome.

I continued my work on earlier scraping projects:

  • Fixed bugs and updated the data for the reservoir project. Now it’s in good shape to deploy for auto-updating and publishing.
  • I had not updated the e-procurement data in a long time. I ran the scraper today and scheduled it on one of my servers.
  • Also started coding a new scraping project to scrape Coimbatore city municipal data. Half of it is done; I am struggling with a weird form submission error. But it shouldn’t take more than a couple of hours to get it running.

Well, that took three hours. I spent about 30 mins on ordering, paying etc. Maybe next time I will end up getting more things done.

