Thejesh GN

A Blog, A Website and A container for all my views with excerpts from technology, travel, films, india, photography, kannada, friends and other interests. I am Thejesh GN. Friends call me Thej

Archive for the ‘Technology’ Category

CIS has a detailed blog post explaining why all of us who care about freedom of expression on the Internet should support the annulment motion by MP P. Rajeeve.

You should actually read that blog post, but the following are the most important points:

No chance to defend.
There is no need to inform users before this content is removed. So, even material put up by a political party can be removed based on anyone’s complaint, without telling that party. This was done against a site called “CartoonsAgainstCorruption.com”. This goes against Article 19(1)(a).

Government censorship, not ‘self-regulation’.
The government says these are industry best-practices in existing terms of service agreements. But the Rules require all intermediaries to include the government-prescribed terms in an agreement, no matter what services they provide. It is one thing for a company to choose the terms of its terms of service agreement, and completely another for the government to dictate those terms of service.


Video: why the IT Rules are a threat to your Internet as you know it

The government, the police or an angry mob can force your blog (freedom of expression) or your business (web-based) to go offline without notice. Which is kind of insane, isn’t it?

Thankfully there are some smart MPs in our parliament who are trying to pass a motion to annul these rules. We can’t just sit and watch. We need to support MPs who are supporting this motion. Call other MPs and ministers to urge them to support this motion.

To call your MP or ministers and to send them an email, use this online form provided by CIS. It’s easy and doesn’t take more than a minute. If you want to do more, use the postal address to send hard copies of the letter or use the given official phone number to call them.

To support Member of Parliament P. Rajeeve, go to change.org and sign the petition. It doesn’t stop there. Share this information with your friends and family. Urge them to do the same.

How to stay anonymous online

Posted by Thejesh GN On April - 10 - 2012

I like to use my real name everywhere. But I can afford to do that because I am in a reasonably safe place. There are people who rationally fear retaliation from employers, bad police, bullies, or a rogue state. For them it makes sense to stay anonymous.

It is very difficult to stay anonymous online. I believe there is no way you can remain anonymous forever. Finding you depends on how much time, intelligence, computing power and motivation your opponents have. Identification can only be delayed for a practically long time, depending on the steps you take.

This how-to is probably not complete. My knowledge is quite limited. But if I ever wanted to remain anonymous, I would follow all of these steps. In fact I follow most of them just to be safe online.

I have tried to make it as simple as possible for non-techies. But there are pointers for the geeks to follow into the rabbit hole. They are usually marked with [g] for geek alert.

Trustable Fake Identity

There are two ways to remain anonymous online. Say you are a blogger: your username could be so obviously fake that your adversaries will know you are trying to hide, or you could use a very realistic fake identity which will confuse them. I would go the second way. But I would give some thought to the country, city, religion, sex and other identifiable characteristics of my identity before I decide upon one. It’s very important to create a trustable fake identity. It’s not an easy job. So work on it.

Computer OS – Linux

The most important thing to remain safe online is to keep your computer safe. You need to be able to trust your computer to do the right thing. I use Linux (Debian/Ubuntu). I trust it because it’s open source and most of the vulnerabilities are available in the public domain. So I know what’s going on in the system.

So use Linux as much as possible. If you can’t avoid using Windows or Mac, try dual booting or a live CD. Unlike before, Linux is not very difficult to use these days. Distros like Ubuntu are very user-friendly. Try it.

Tails is a live CD or live USB that aims at preserving your privacy and anonymity. It helps you to use the Internet anonymously almost anywhere you go and on any computer. All connections to the Internet are forced to go through the Tor network. The OS leaves no trace on the computer you’re using unless you ask it to explicitly. It uses state-of-the-art cryptographic tools to encrypt your files, email and instant messaging. It’s an all-in-one package which is very useful for non-geeks. Install it on your USB and use it.

For the highly paranoid there are distros like TinFoilHat [g] available. It even protects you against electromagnetic radiation eavesdropping [g].

Personal Notes from Open DataCamp Bangalore

Posted by Thejesh GN On April - 6 - 2012

When we planned the Open DataCamp, we never expected to attract such a big gathering of interesting people. The venue we picked could accommodate a hundred participants. We didn’t expect more than that on a long weekend anyway.
But on the last day we had around 200 participants on the list. I was hoping for the worst :) I woke up at six and was at Google by eight. Mostly because I couldn’t sleep.

As usual we started with arranging tables and setting up the projector. I haven’t seen one conference where projectors weren’t a PITA. It took more time than I expected. Surprisingly enough, my Ubuntu was the easiest to get working with the projector, and the next best was Mac.

By 9.45 we had more people than I expected. The main hall was full and people were standing. We had around 140+ participants already. They were very comfortable having conversations in the corridor or making a place for themselves to sit. That’s the best thing that could happen to us. So the number didn’t worry me after that.

We started at 10 am sharp with me introducing the concept of BarCamp and the plan for the day, followed by a panel discussion.

From then on it was a smooth ride. I didn’t have to do anything other than time management.

I liked all the morning session talks. I could not attend most of the afternoon sessions. Among the ones I attended, Anand’s talk on Pictures through Numbers and Shekhar’s Open Data & Free Maps are my favorites.

I spent most of my afternoon either tweeting or scheduling or in conversations. I couldn’t do my session; maybe at the next DataCamp.

This camp was marginally different from the regular BarCamps. Morning sessions were done in a single hall, and they were curated to keep the audience interested. Afternoon sessions were in three halls + corridor. This time, due to lack of time, I had to do the curation of talks all by myself. But I would settle for the standard BarCamp way next time. Nevertheless, most participants liked my curation. So I am happy.

By six we were done. But the conversation in the corridor and bar continued till eight. I reached home by 10 dead tired. A day well spent.

A big thanks to my co-organizer Nisha and all the volunteers from DataMeet. It would not have been possible without them. Thanks to Meera for all the pictures. They are on Flickr. All videos were shot and edited by our friends at HasGeek. And lastly, thanks to all the sponsors (Google, MSR, IWP, Gramener, Akshara, CIS and HasGeek) for working with us and trusting us. Now I am waiting for the next DataCamp.

First Open DataCamp is here

Posted by Thejesh GN On February - 29 - 2012

Open DataCamp is a one-day unconference for people working with data from various sectors to come together and share their projects and ideas. The first one is scheduled for March 24th, in Bangalore. Google has very graciously agreed to host us in their cafeteria. We are still working on the details of the sessions. I am sure there will be enough sessions for both technical and non-technical attendees. It will most probably be structured like a workshop in the morning half and like a barcamp in the second half. Hence the event page is still under construction. Thanks to VSR for the design. The site code and other things are in the public domain, and are on Bitbucket. Any help is welcome.

Yesterday, I sent my first invitation mail to the DataMeet group.

We are writing to invite you to the 1st Open Data Camp in India on Saturday, March 24, 2012, at Bangalore.

This event is dedicated to all aspects of open data, from working with data, to getting it, and of course how to use it to create impact. This event is being organized by DataMeet, an online group of data enthusiasts who hope to use data to create an impact in the lives of people living and working in India.

This Open Data Camp would bring together all the main development sector actors working with/on open data. Some of them include Nonprofit Organizations like India Water Portal, Akshara Foundation, Accountability Initiative, Azim Premji University and PRS Legislative;
Also, the Indian Government has just passed the National Data Sharing and Accessibility Policy, essentially the Open Data Policy for India. The policy itself is rotten and has nothing to do with ‘open’ and ‘sharing’; but considering that before this policy all data in India was part of the ‘Official Secrets Act’, this is no small gain.

Policy and Advocacy Groups like Centre for Internet and Society, Tactical Tech Collective, and of course the Government of India;

and the broader interest group that includes Technologists and Technology Companies (from Google and Gramener), Designers and Design Schools, Journalists and Media Groups.

I hope you can attend the event. Registration is compulsory to attend the event. Please register at doattend.com

You can find more about the event at http://odc.datameet.org. The site is still in progress, and I will keep updating it.

If you have any questions feel free to contact me at any time.

Thanks,
Thej, Nisha and Team.

Please do register.

Octopoda – MapReduce for Human Beings in Python

Posted by Thejesh GN On February - 21 - 2012

I have been wanting to learn MapReduce for a long time, but I never had a requirement where I could use it. For the last few weeks I have been dabbling with huge datasets. It was time, and as usual I started with Wikipedia.

There are huge systems and frameworks built on the concept of MapReduce. They use distributed filesystems, have fault tolerance and can process petabytes of data. But I wanted something simple: something minimalistic that does everything a MapReduce framework should do and is written in Python.

“Map” : The master node takes the input, partitions it up into smaller sub-problems, and distributes them to worker nodes.

“Reduce” : The master node then collects the answers to all the sub-problems and combines them in some way to form the output.
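To make the two steps concrete, here is a minimal single-process sketch. It is not how Octopoda or any real framework is implemented: there are no worker nodes and no network, and the helper name `run_mapreduce` and the even/odd sample data are made up purely for illustration.

```python
from collections import defaultdict

def run_mapreduce(source, mapfn, reducefn):
    """Run map, grouping and reduce in one process (hypothetical helper)."""
    grouped = defaultdict(list)
    # Map: every (key, value) input pair yields intermediate (key, value) pairs.
    for key, value in source.items():
        for out_key, out_value in mapfn(key, value):
            # Group all values emitted under the same intermediate key.
            grouped[out_key].append(out_value)
    # Reduce: combine each key's list of values into a single result.
    return {key: reducefn(key, values) for key, values in grouped.items()}

def mapfn(key, value):
    # Label every number in the chunk as "even" or "odd".
    for n in value:
        yield ("even" if n % 2 == 0 else "odd"), n

def reducefn(key, values):
    # Add up all numbers that share a label.
    return sum(values)

source = {1: [1, 2, 3], 2: [4, 5, 6]}  # two input chunks
result = run_mapreduce(source, mapfn, reducefn)
```

In a distributed setup the calls to `mapfn` and `reducefn` would be handed out to worker nodes instead of running in a local loop, but the flow of data is the same.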

I found MinceMeatPy and Octo.py. Both are single-file Python MapReduce frameworks. MinceMeatPy is actively developed, whereas the last check-in to Octo.py was probably in 2008.

I thought the best way to learn the concept was to write a framework that implements it. But then reinventing the wheel is a waste of everybody’s time. So I chose the middle ground: I forked Octo.py and called it Octopoda.

I removed a lot of code and in turn made it simple and inflexible. I added simple auth and some examples, created a wiki and a road map, and, how could I forget, ASCII art :)

============================================================
        _____                                  _
       / ___ \       _                        | |
      | |   | | ____| |_  ___  ____   ___   _ | | ____
      | |   | |/ ___)  _)/ _ \|  _ \ / _ \ / || |/ _  |
      | |___| ( (___| |_| |_| | | | | |_| ( (_| ( ( | |
       \_____/ \____)\___)___/| ||_/ \___/ \____|\_||_|
                  MapReduce for HumanBeings
          Repo: http://code.thejeshgn.com/octopoda
============================================================

I am now working on channel encryption. I need help. The project is hosted on Bitbucket. Go ahead and fork it, and send me a pull request with your changes.

A standard MapReduce example is counting words.

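#wordCount.py
# input data: line number -> line of text
source = {
    1: "Humpty Dumpty sat on a wall",
    2: "Humpty Dumpty had a great fall",
    3: "All the King's horses and all the King's men",
    4: "Couldn't put Humpty together again",
}

# print each word with its final count (this form works on Python 2 and 3)
def final(key, value):
    print("%s %s" % (key, value))

# client
# emit (word, 1) for every word in a line
def mapfn(key, value):
    for w in value.split():
        yield w, 1

# add up the counts collected for one word
def reducefn(key, value):
    return sum(value)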

On server:
$ python octopoda.py server ./examples/wordCount.py

On client or nodes:
$ python octopoda.py client localhost_or_server_ip

You can start as many clients as you want. The server will handle task distribution and aggregation. I know this is an overly simplistic example. With a little modification the same example can be made to calculate the word count from all the files in a directory. I will write about that in my next post. Until then, have fun.
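One possible shape for that modification, offered only as a sketch and not as what Octopoda itself provides: build the `source` dict from the files in a directory, with one entry per file. The helper name `build_source` is hypothetical, and it assumes the files are plain text and small enough to read into memory.

```python
import os

def build_source(directory):
    """Map each regular file's path to its text (hypothetical helper)."""
    source = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip sub-directories; only plain files become map inputs.
        if os.path.isfile(path):
            with open(path) as handle:
                source[path] = handle.read()
    return source
```

In wordCount.py the dict literal would then be replaced by something like `source = build_source("./texts")`, where `./texts` is a made-up directory name. The `mapfn` and `reducefn` from the example stay unchanged, since `value.split()` works just as well on a whole file as on a single line.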
