Building HOPCOMS Fruit/Veg Rate API using CouchDB

I love HOPCOMS stores. There are quite a few stores across Bangalore. They are popular for fresh vegetables and fruits.

The Horticultural Producers’ Co-operative Marketing and Processing Society Ltd. or HOPCOMS was established with the principal objective of establishing a proper system for the marketing of fruits and vegetables; one that benefits both the farming community and the consumers. Prior to the establishment of HOPCOMS, no proper system existed in Karnataka for the marketing of horticultural produce. Farmers were in the clutches of the middlemen and the system benefited neither the farmers nor the consumers.

On their site they publish rates of vegetables every weekday. I used to visit the website to check the rates about once a week. But I always wondered: if HOPCOMS had a proper API, what would I use it for? I could set up an alert on the onion rate based on historical data, say for a 10% increase in the onion rate over the last month, or simply ask Alexa for the tomato rate.

HOPCOMS API

Since HOPCOMS doesn’t have an API, I built one. I have a simple script that pulls data once a day from HOPCOMS and stores it as a JSON document in CouchDB. Any access to the API pulls the data from CouchDB, so there is no extra load on the HOPCOMS server. It has been running for months now, so I can confidently write about it. It’s a simple API that gives the list of item codes and rates for a given date. It also has a metadata API that maps item codes to item names.

The code, along with some historical data, is on GitHub. If you update daily.py with your CouchDB details, it should be able to pull the data. It is best to set up a cron job to pull the data every day.

Since it’s built on CouchDB, the API is web accessible by default. For example, the data for Jan 16, 2018 will be available at https://mycouchdb.ext/hopcoms_daily/20180116, where 20180116 is the date in yyyymmdd format and mycouchdb.ext is where your CouchDB is running. The data is just key-value pairs of item codes and rates.

{"_id":"20180116","_rev":"1-6177e5919a7f6a22dae47b44e4ff5a18","1":110.0,"4":198.0,"7":189.0,"9":240.0,"11":29.0,"12":62.0,"13":80.0,"14":54.0,"19":58.0,
"20":75.0,"21":86.0,"22":40.0,"27":100.0,"28":118.0,"29":142.0,"31":75.0,"32":425.0,"36":148.0,"37":32.0,"39":77.0,"40":76.0,"42":34.0,"43":48.0,
"44":100.0,"46":80.0,"48":60.0,"49":148.0,"51":53.0,"52":57.0,"54":96.0,"55":68.0,"56":65.0,"57":95.0,"58":70.0,"59":160.0,"63":120.0,"66":21.0,
"67":27.0,"69":39.0,"70":42.0,"78":20.0,"79":20.0,"81":80.0,"84":52.0,"87":88.0,"88":250.0,"101":60.0,"102":19.0,"108":42.0,"109":26.0,"110":34.0,
"112":24.0,"113":42.0,"114":40.0,"115":22.0,"116":16.0,"117":17.0,"121":12.0,"122":42.0,"123":37.0,"125":82.0,"126":50.0,"127":30.0,"131":60.0,
"132":45.0,"135":37.0,"136":36.0,"137":28.0,"139":14.0,"140":80.0,"142":26.0,"145":34.0,"147":148.0,"149":70.0,"150":80.0,"153":24.0,"155":54.0,
"156":29.0,"157":30.0,"158":60.0,"160":200.0,"162":188.0,"163":56.0,"167":140.0,"168":80.0,"171":13.0,"173":16.0,"174":23.0,"179":18.0,"180":80.0,
"181":26.0,"182":28.0,"183":18.0,"184":74.0,"185":56.0,"186":59.0,"187":46.0,"190":10.0,"191":40.0,"193":65.0,"196":50.0,"201":25.0,"202":66.0,
"203":38.0,"204":58.0,"205":40.0,"206":40.0,"208":34.0,"213":38.0,"215":56.0,"218":16.0,"220":4.6,"221":33.0,"226":32.0,"227":145.0,"250":45.0,"251":25.0}

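As a minimal sketch (Python 3, not part of the original script), the URL for a given day can be built from a date object, and the CouchDB bookkeeping fields can be stripped to leave only the rates. The host `mycouchdb.ext` is the placeholder used above, and the helper names `daily_doc_url` and `extract_rates` are made up for illustration. In practice the document would come from something like `requests.get(url).json()`; a shortened copy of the sample document is used here so the sketch runs offline.

```python
import datetime

# Placeholder host from the example above; replace with your own CouchDB.
BASE_URL = "https://mycouchdb.ext"


def daily_doc_url(day, base_url=BASE_URL):
    """Build the URL of the hopcoms_daily document for a given date."""
    return "{}/hopcoms_daily/{:%Y%m%d}".format(base_url, day)


def extract_rates(doc):
    """Drop CouchDB fields such as _id and _rev, keeping item code -> rate."""
    return {code: rate for code, rate in doc.items() if not code.startswith("_")}


# A shortened copy of the document shown above.
sample_doc = {
    "_id": "20180116",
    "_rev": "1-6177e5919a7f6a22dae47b44e4ff5a18",
    "1": 110.0,
    "4": 198.0,
    "7": 189.0,
}

url = daily_doc_url(datetime.date(2018, 1, 16))
rates = extract_rates(sample_doc)
print(url)
print(rates)
```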
You can get the item code to item name mapping from hopcoms_meta, accessible at https://mycouchdb.ext/hopcoms_meta/item_details. The data again is simple JSON:

{
  "_id": "item_details",
  "_rev": "3-b3808e206cd2263b5ae67e7fd543f245",
  "1": {
    "name_en": "Apple Delicious",
    "name_kn": "ಸೇಬು ಡಲೀಷಿಯಸ್"
  },
  "2": {
    "name_en": "Apple Simla",
    "name_kn": "ಸೇಬು ಶಿಮ್ಲ"
  },
...
...
...

}
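Once both documents are in hand, rates can be labelled with item names by joining them on the item code. The following is only a sketch in Python 3: the function name `rates_by_name` is hypothetical, and the two dictionaries are trimmed copies of the documents shown above.

```python
def rates_by_name(daily_doc, item_details, lang="name_en"):
    """Map item names to rates by joining the two documents on item code."""
    named = {}
    for code, rate in daily_doc.items():
        if code.startswith("_"):
            continue  # skip CouchDB fields such as _id and _rev
        details = item_details.get(code)
        # Fall back to the raw code when the meta document has no entry.
        name = details[lang] if details else code
        named[name] = rate
    return named


# Trimmed copies of the documents shown above.
daily_doc = {"_id": "20180116", "1": 110.0, "4": 198.0}
item_details = {
    "_id": "item_details",
    "1": {"name_en": "Apple Delicious", "name_kn": "ಸೇಬು ಡಲೀಷಿಯಸ್"},
    "2": {"name_en": "Apple Simla", "name_kn": "ಸೇಬು ಶಿಮ್ಲ"},
}

named = rates_by_name(daily_doc, item_details)
print(named)
```

Passing `lang="name_kn"` would return the Kannada names instead.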

So using both, you can figure out that on 20180116 the rate of Apple Delicious was INR 110. It’s that easy. You could cache item_details. You could also run other CouchDB key-based queries to get a month's or a year's worth of data, or data between specific dates. I have written the initial API documentation. Please send or add your comments. As of now you have to self-host the API. The Python update script below shows how simple it is to pull the data and update CouchDB.
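One way to run a date-range query is CouchDB's built-in `_all_docs` endpoint, which accepts JSON-encoded `startkey` and `endkey` parameters along with `include_docs=true`. Because the document ids are yyyymmdd strings, they sort chronologically. The sketch below (Python 3) only builds the URL and unpacks a response; the host is the placeholder from earlier, the helper names are hypothetical, and the response body is a hand-written stand-in with the shape `_all_docs` returns.

```python
import datetime
import json
from urllib.parse import urlencode

BASE_URL = "https://mycouchdb.ext"  # placeholder host, as in the text above


def range_query_url(start, end, base_url=BASE_URL):
    """Build an _all_docs URL covering start..end (both inclusive)."""
    params = urlencode({
        # CouchDB expects keys as JSON, so the ids are wrapped in quotes.
        "startkey": json.dumps("{:%Y%m%d}".format(start)),
        "endkey": json.dumps("{:%Y%m%d}".format(end)),
        "include_docs": "true",
    })
    return "{}/hopcoms_daily/_all_docs?{}".format(base_url, params)


def docs_from_response(body):
    """Pull the full documents out of a parsed _all_docs response."""
    return [row["doc"] for row in body["rows"]]


url = range_query_url(datetime.date(2018, 1, 1), datetime.date(2018, 1, 31))
print(url)

# Hand-written stand-in for a parsed response; not real server output.
fake_body = {
    "total_rows": 1,
    "offset": 0,
    "rows": [
        {"id": "20180116", "key": "20180116", "value": {"rev": "1-abc"},
         "doc": {"_id": "20180116", "_rev": "1-abc", "1": 110.0}},
    ],
}
docs = docs_from_response(fake_body)
print(docs)
```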

Here is the code that powers it:

import requests
import json
import csv
import datetime
import couchdb
from BeautifulSoup import BeautifulSoup

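# Set to True to load item codes from item_list.csv into CouchDB;
# otherwise they are read from the existing item_codes document
all_items_load = False

all_item_list = {}
# Full URL of your CouchDB server (left blank here)
db_full_url= ""
couch = couchdb.Server(db_full_url)
hopcoms_meta 	= couch["hopcoms_meta"]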

if all_items_load:
	with open('item_list.csv', "r") as csv_file:
		reader = csv.reader(csv_file)
		header = True
		for row in reader:
			if header:
				header = False
				continue
			label = (row[1]).strip()
			label =	label.replace(" ","")
			label = label.lower()
			all_item_list[label]=int(row[0])

	try:
		# Reuse the current revision so save() updates the existing document
		all_item_list["_rev"] = hopcoms_meta["item_codes"]["_rev"]
	except couchdb.http.ResourceNotFound:
		print "add"
	all_item_list["_id"]="item_codes"

	print str(all_item_list)
	hopcoms_meta.save(all_item_list)
else:
	all_item_list = hopcoms_meta["item_codes"]
	#print str(all_item_list)



hopcoms_daily 	= couch["hopcoms_daily"]



web_data_url = "http://www.hopcoms.kar.nic.in/(S(vks0rmawn5a2uf55i2gpl3zo))/RateList.aspx"
table_x_path ="""//*[@id="ctl00_LC_grid1"]"""
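total_data = {}
final_date = None  # stays None if the request fails, so the date check below cannot raise a NameError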
r = requests.get(web_data_url)
if r.status_code == 200:
	html = r.text
	soup = BeautifulSoup(html)
	date_span = soup.find(id='ctl00_LC_DateText')
	date_span_text = date_span.text
	date_span_text = date_span_text.strip()
	date_span_text = date_span_text.replace("Last Updated Date: ","")
	date_span_text_array = date_span_text.split("/")
	final_date = date_span_text_array[2]+date_span_text_array[1]+date_span_text_array[0]
	print str("Updating for ="+final_date)


	table = soup.find(id='ctl00_LC_grid1')
	#print str(table)
	for tr in table:
		if str(tr).strip() == "":
			continue
		if len(tr.findChildren('th')) > 0:
			continue
		#six elements	
		row = 0
		row_data = {}
		label1 =""
		data1=0
		label2 =""
		data2=0

		for th in tr.findChildren('td'):			
			for span in th.findChildren('span'):
				content = str(span.text).strip()

				row = row + 1
				if row == 1 or row == 4:
					continue
				if row == 2 and content != "":
					label1 = str(content)				
				if row == 3 and content != "":
					data1 = float(content)				
				
				if row == 5 and content != "":
					label2 = str(content)				
				if row == 6 and content != "":
					data2 = float(content)

		if label1.strip() != "":
			label1 = label1.replace(" ","")
			label1 = label1.lower()
			item_id = all_item_list[label1]
			total_data[item_id]=data1				
			

		if label2.strip() != "":
			label2 = label2.replace(" ","")
			label2 = label2.lower()
			item_id = all_item_list[label2]
			total_data[item_id]=data2				


dt = datetime.date.today()
_id = '{:%Y%m%d}'.format(dt)
print str(_id)
if str(final_date) == str(_id):
	print "MATCHES final_date and _id"
	total_data["_id"]=_id
	print str("----------------------------------------------------------")
	try:
		if hopcoms_daily[_id]:
			print "exists"
			x = hopcoms_daily[_id]
			print str(x["_id"])
			print str(x["_rev"])
			total_data["_rev"] = x["_rev"]						
	except couchdb.http.ResourceNotFound:
			print "add"
			total_data["_id"]=_id

	print str(total_data)
	hopcoms_daily.save(total_data)
else:
	print "DOESN'T MATCH"
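The script builds the document id by splitting the page's date on "/" and reversing the parts, which appears to assume a dd/mm/yyyy date with zero-padded day and month. A slightly more defensive variant, sketched here in Python 3 under that same assumption, parses the text with `datetime.strptime` so that an unpadded date such as 6/1/2018 still yields an eight-digit id. The function name `page_date_to_id` is hypothetical.

```python
import datetime

# Prefix that the script above strips from the date span.
PREFIX = "Last Updated Date: "


def page_date_to_id(span_text):
    """Turn 'Last Updated Date: dd/mm/yyyy' into a yyyymmdd document id."""
    text = span_text.strip().replace(PREFIX, "")
    # strptime accepts both '06/01/2018' and '6/1/2018' for %d/%m/%Y.
    day = datetime.datetime.strptime(text, "%d/%m/%Y").date()
    return "{:%Y%m%d}".format(day)


print(page_date_to_id("Last Updated Date: 16/01/2018"))
print(page_date_to_id("Last Updated Date: 6/1/2018"))
```

A malformed date raises `ValueError` here instead of silently producing an id that never matches today's date.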

I can also give some of you access to my API server. Since I run a small server, it will be very few people. Send me an email and I will send you the details. No promises. In the meantime, tell me: how will you use the API?
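As one possible use, the onion alert mentioned at the start could be sketched roughly as follows (Python 3). Everything specific here is an assumption for illustration: the item code `"999"` is a placeholder rather than the real onion code, the rates are made-up numbers, and the function names are hypothetical. The history is a list of daily documents, oldest first, such as a date-range query against hopcoms_daily might return.

```python
def percent_change(old_rate, new_rate):
    """Percentage change from old_rate to new_rate."""
    return (new_rate - old_rate) * 100.0 / old_rate


def rate_alert(history, item_code, threshold_pct=10.0):
    """True if the item's rate rose by at least threshold_pct over history.

    history is a list of daily documents ordered oldest first; days on
    which the item was not listed are skipped.
    """
    rates = [doc[item_code] for doc in history if item_code in doc]
    if len(rates) < 2:
        return False  # not enough data points to compare
    return percent_change(rates[0], rates[-1]) >= threshold_pct


ONION = "999"  # placeholder item code, not the real one
history = [  # made-up rates, for illustration only
    {"_id": "20180101", ONION: 40.0},
    {"_id": "20180115", ONION: 42.0},
    {"_id": "20180131", ONION: 46.0},
]

print(rate_alert(history, ONION))
```

Comparing only the first and last rates is the simplest reading of "10% increase in the last month"; comparing against a monthly average would be another reasonable choice.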

1 Response

  1. Prakash Hebballi says:

    Very interesting. Thank you for details.
