Simple analysis of place names in Kerala



I noticed something unique when I walked around Kakkanad: some of the local place names end with the word “മുഗൾ”.

Nilampathinjamugal

I hadn’t heard of any other place in Kerala with this suffix, so it was quite interesting to me. Curiosity instantly brewed in my mind: is this really specific to Kakkanad? If so, why?

I never got the time to explore this idea further until last week, at a Wikimedia event at my alma mater.

Goal

To list all the places in Kerala with a particular suffix and show them as dots on an interactive map. This will help show how common each suffix is and where it clusters.

Gathering data

Where can I get the list of all place names in Kerala and their coordinates?

It is only when you try to do data analysis that you realize how much data is missing and how many errors there are in the data that is available.

OpenStreetMap

  • Get the list of all place names

https://overpass-api.de/api/interpreter?data=[out:json];area[name=%22Kerala%22];node(area)[place];out;

  • Get the list of all bus stops

https://overpass-api.de/api/interpreter?data=[out:json];area[name=%22Kerala%22]-%3E.searchArea;node[%22highway%22=%22bus_stop%22](area.searchArea);out;

Bus stop names are usually the place names themselves. I noticed in OSM that some places have no place node on the map, but their bus stops are mapped. That is why I collected both. There will be duplicates because of this, but that’s fine: this is a simple human analysis.
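A minimal sketch of how the two Overpass responses could be turned into name and coordinate records. The helper names `overpass_url` and `named_nodes` are hypothetical, and the sample JSON is hand-written in the shape Overpass returns rather than real data. Downloading the response is left out so that the sketch runs offline.

```ruby
require 'json'
require 'uri'

OVERPASS_ENDPOINT = 'https://overpass-api.de/api/interpreter'

# Build a request URL for an Overpass QL query, form-encoding the query text.
def overpass_url(query)
  "#{OVERPASS_ENDPOINT}?#{URI.encode_www_form(data: query)}"
end

# Turn an Overpass JSON response body into [{name:, lat:, lon:}, ...],
# skipping nodes that carry no usable name tag.
def named_nodes(json_text)
  JSON.parse(json_text).fetch('elements', []).filter_map do |el|
    name = el.dig('tags', 'name')
    next if name.nil? || name.strip.empty?

    { name: name.strip, lat: el['lat'], lon: el['lon'] }
  end
end

# Hand-written sample (not real data): one named bus stop, one unnamed.
sample = <<~JSON
  {"elements": [
    {"type": "node", "id": 1, "lat": 10.0, "lon": 76.3,
     "tags": {"name": "Example Junction", "highway": "bus_stop"}},
    {"type": "node", "id": 2, "lat": 10.1, "lon": 76.4,
     "tags": {"highway": "bus_stop"}}
  ]}
JSON

p named_nodes(sample)
```

Unnamed nodes are dropped early because a record without a name is of no use for suffix matching.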

Wikidata

I used the Wikidata Query Service to run these SPARQL queries.

  • Get the list of human settlements in Kerala:
SELECT DISTINCT ?item ?len ?lml ?coord
WHERE
{
  ?item wdt:P31 wd:Q486972 .
  ?item wdt:P131/wdt:P131* wd:Q1186.
  ?item wdt:P625 ?coord.
  OPTIONAL { ?item rdfs:label ?len. FILTER(LANG(?len)="en") }
  OPTIONAL { ?item rdfs:label ?lml. FILTER(LANG(?lml)="ml") }
}
LIMIT 100

The good folks at OpenDataKerala have made ward data openly accessible on Wikidata and OpenStreetMap. Some of these wards have coordinates assigned.

  • Get the list of local body wards that have coordinates:
SELECT DISTINCT ?item ?len ?lml ?coord
WHERE
{
  ?item wdt:P31 wd:Q1195098 .
  ?item wdt:P131* wd:Q1186.
  ?item wdt:P625 ?coord.
  OPTIONAL { ?item rdfs:label ?len. FILTER(LANG(?len)="en") }
  OPTIONAL { ?item rdfs:label ?lml. FILTER(LANG(?lml)="ml") }
}
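The query service returns `?coord` as a WKT literal such as `Point(76.35 10.02)`, with longitude first, so it has to be split and swapped before it can sit next to the OSM data. Below is a minimal sketch of that step, assuming the results were saved in the standard SPARQL JSON format. The function names and the sample item are hypothetical, and emitting one record per label (English and Malayalam) is a guess at one reasonable layout, not necessarily what the original script does.

```ruby
require 'json'

# Parse a WKT literal such as "Point(76.35 10.02)" into [lat, lon].
# WKT puts longitude first, so the pair is swapped on the way out.
def parse_wkt_point(wkt)
  match = wkt.to_s.match(/\APoint\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\z/i)
  return nil unless match

  lon = match[1].to_f
  lat = match[2].to_f
  [lat, lon]
end

# Convert SPARQL JSON results into one record per available label.
def wikidata_records(json_text)
  bindings = JSON.parse(json_text).dig('results', 'bindings') || []
  bindings.flat_map do |row|
    lat, lon = parse_wkt_point(row.dig('coord', 'value'))
    next [] if lat.nil? # skip rows whose coordinate could not be parsed

    %w[len lml].filter_map do |var|
      label = row.dig(var, 'value')
      { name: label, lat: lat, lon: lon } if label
    end
  end
end

# Hand-written sample in the SPARQL JSON shape (not a real item).
sample = <<~JSON
  {"results": {"bindings": [
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q0"},
     "len": {"type": "literal", "xml:lang": "en", "value": "Exampleville"},
     "coord": {"type": "literal", "value": "Point(76.35 10.02)"}}
  ]}}
JSON

p wikidata_records(sample)
```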

Combining data

We have four sources of data in JSON. They need to be combined so that it’s easy to filter the data.

I figured putting them all into a SQLite DB would be the best way. For this I wrote a Ruby script. Rails’ ActiveRecord makes it pretty easy to manage the DB.

ActiveRecord without Ruby on Rails

One of the main features of Ruby on Rails is the ActiveRecord ORM. This can be used without Ruby on Rails as well.

require 'active_record'
require 'sqlite3'

ActiveRecord::Base.establish_connection(
  adapter: 'sqlite3',
  database: 'db.sqlite3'
)

# This is a model that corresponds to a table
class Place < ActiveRecord::Base
end

# Table name in SQLite will be plural: "places"
@first_run = !Place.table_exists?

# Create the table on the first run (migration-like setup)
if @first_run
  ActiveRecord::Schema.define do
    create_table :places do |t|
      t.string :name
      t.string :lat
      t.string :lon
      t.timestamps
    end
    add_index :places, :name, unique: true
  end
end

The above script will set up the DB. Sometimes you’ll want to reset and start over. To do that, simply delete the db.sqlite3 file (the good part of SQLite being simple).

Insert data

The next step is to insert the data. For this I created a separate function for each source.

if @first_run
  osm_bus_stops
  osm_place_nodes
  wikidata_places
  wikidata_wards

  Place.insert_all(@records)
end
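The loader functions themselves are not shown here. As a rough sketch of what could happen to their output before `insert_all`, the hypothetical `merge_records` helper below combines several lists of `{name:, lat:, lon:}` hashes and keeps only the first record for each name. Comparing names case-insensitively and storing coordinates as strings (to match the `t.string` columns) are assumptions of this sketch.

```ruby
# Merge several lists of {name:, lat:, lon:} hashes into one list,
# keeping the first record seen for each name (compared case-insensitively).
def merge_records(*sources)
  seen = {}
  sources.flatten.each do |record|
    name = record[:name].to_s.strip
    next if name.empty?

    key = name.downcase
    # Coordinates are stored as strings to match the string columns.
    seen[key] ||= { name: name, lat: record[:lat].to_s, lon: record[:lon].to_s }
  end
  seen.values
end

# Made-up records standing in for two of the four sources.
bus_stops = [{ name: 'Example Junction', lat: 10.0, lon: 76.3 }]
places    = [{ name: 'example junction', lat: 10.0001, lon: 76.3001 },
             { name: 'Samplekulam', lat: 9.9, lon: 76.5 }]

p merge_records(bus_stops, places)
```

Every hash that comes out has the same three keys, which matters because `insert_all` expects uniform rows.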

Now that we have the DB set up, we can do the analysis.

Ruby’s interactive console makes it easy to debug and query with ActiveRecord. For this I trigger the runtime developer console at the end of the file.

require 'pry'
...
...
binding.pry

Using the console to fetch all the places that have the word mugal in their name:

Place.where("name LIKE '%mugal%'").pluck(:name, :lat, :lon)

binding.pry runtime developer console
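Note that `'%mugal%'` matches the word anywhere in the name. To keep only names that end with it, the pattern `'%mugal'` (no trailing `%`) should work in SQLite, where `LIKE` ignores case for ASCII letters by default. The same filter can also be applied in plain Ruby to the rows that `pluck` returns. The `with_suffix` helper and the place names below are made up for illustration.

```ruby
# Keep only rows whose name ends with the given suffix, ignoring case.
# Each row is [name, lat, lon], the shape returned by pluck(:name, :lat, :lon).
def with_suffix(rows, suffix)
  needle = suffix.downcase
  rows.select { |name, _lat, _lon| name.to_s.downcase.end_with?(needle) }
end

rows = [
  ['Examplemugal', '10.0', '76.3'], # ends with the suffix
  ['Mugalpuram',   '10.1', '76.4'], # contains it, but not at the end
  ['Sample Kulam', '9.9',  '76.5']
]

p with_suffix(rows, 'mugal')
```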

Analysis

The most effective way to show the result is a map with the points marked on it. I figured Leaflet would be the easiest option because I can control it programmatically and I have seen it used everywhere on the web.
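One way to hand the query results to Leaflet is to export them as a GeoJSON FeatureCollection, which `L.geoJSON` can load directly. This is a sketch under that assumption, not necessarily how the maps in this post were produced. The `to_geojson` helper and the sample row are hypothetical. GeoJSON wants `[longitude, latitude]`, the reverse of the order the rows are stored in.

```ruby
require 'json'

# Convert [name, lat, lon] rows into a GeoJSON FeatureCollection.
# GeoJSON coordinates are ordered [longitude, latitude].
def to_geojson(rows)
  features = rows.map do |name, lat, lon|
    {
      type: 'Feature',
      properties: { name: name },
      # Float() raises on malformed input instead of silently returning 0.0
      geometry: { type: 'Point', coordinates: [Float(lon), Float(lat)] }
    }
  end
  { type: 'FeatureCollection', features: features }
end

# Made-up row in the shape returned by pluck(:name, :lat, :lon).
puts JSON.pretty_generate(to_geojson([['Examplemugal', '10.0', '76.3']]))
```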

Showing just Kerala region

Leaflet showed the full map of the Earth, but I wanted to show just Kerala. This was difficult to achieve. I needed to get the exact boundaries of the Kerala region. My first attempt was to use the geometry data from Wikipedia.

Kerala region colored in red

But I wanted to distinguish the districts better. I explored many approaches before settling on the one that worked best: write a query in Overpass, download the GeoJSON, and load it in Leaflet.

The GeoJSON can be obtained by running this query on https://overpass-turbo.eu/

[out:xml][timeout:500];
{{geocodeArea:Kerala}}->.searchArea;
(
  nwr["boundary"="administrative"]["admin_level"="5"](area.searchArea);
);
// print results
out meta;
>;
out meta qt;

Notice that the query fetches administrative regions with level 5. These are the districts of Kerala.

Coloring districts uniquely

Since the GeoJSON is made up of district boundaries, the fill color can be changed based on the name of the region. This is how that looks:

L.geoJSON(json, {
  style: function(feature) {
    return {
      fillColor: colors[feature.properties.name], // colors["Thrissur"]
      fillOpacity: 0.5,
      color: "#000",
      weight: 0.2,
    }
  }
})

Kerala districts colored separately
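The `colors` lookup used in the style function is not shown. A minimal sketch of one way to generate it is below: spread the hues evenly around the color wheel and dump the result as JSON for the page to load. The `district_colors` helper is hypothetical, the three names are only a sample, and the keys would have to match whatever spelling `feature.properties.name` has in the downloaded GeoJSON. It also assumes the browser accepts `hsl()` strings for the fill color.

```ruby
require 'json'

# Assign each name an evenly spaced hue and return a name => CSS color hash.
def district_colors(names, saturation: 70, lightness: 55)
  step = 360.0 / names.length
  names.each_with_index.to_h do |name, index|
    [name, "hsl(#{(index * step).round}, #{saturation}%, #{lightness}%)"]
  end
end

# Only a sample of district names, for illustration.
puts JSON.pretty_generate(district_colors(%w[Thrissur Ernakulam Kannur]))
```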

Result

I analyzed the suffixes “kari”, “mugal”, “ssery”, and “kulam”, plotted each of them on the map, and then used Firefox’s screenshot tool to grab the bounding box. This is how the Leaflet webpage looks:

Leaflet page in Firefox

Firefox’s screenshot tool is pretty nice for grabbing just the rectangle along its border:

Leaflet page in Firefox

Conclusion

mugal

Place names ending with the “mugal” suffix are indeed a specialty of Kakkanad. The best explanation I have so far is the nature of the place: since these are hilly areas, using the name mugal/മുകൾ (top) makes sense.

Mugal suffixed places in Kakkanad

kari

There is a reason why Kuttanad has a lot of place names ending with kari/കരി. From Wikipedia:

Kuttanad was once believed to be a wild forest with dense tree growth which was destroyed subsequently by a wild fire. Chuttanad (place of the burnt forest), was eventually called Kuttanad. Until the recent past burned black wooden logs were mined from paddy fields called as “Karinilam” (Black paddy fields). This fact substantiates the theory of Chuttanad evolving to Kuttanad. Ramankary, Puthukkary, Amichakary, Oorukkary, Mithrakary, Mampuzhakary, Kainakary, Chathurthiakary, Thakazhy, Edathua, Chambakkulam, Mankombu and Chennamkary are some familiar place names in Kuttanad

kari suffixed places in Kerala

Interestingly, the mountain ranges of Kannur also have a lot of such places! I don’t have a good answer for this, but I think it must be because the people who settled in these mountain regions came from south Kerala. When people migrate, they tend to name new places after the places they came from or are familiar with.

kulam

Kulam means pond, and Kerala has a lot of them. So it makes sense that the suffix is present everywhere in Kerala.

kulam suffixed places in Kerala

ssery

Interestingly, this suffix is not present in Kasaragod or Thiruvananthapuram but is present everywhere else.

ssery suffixed places in Kerala

You can see more imagery here.

Credits

Thanks to the OpenStreetMap and Wikidata contributors. Data analysis is fancy and all, but no analysis is possible without data. So first and foremost, thanks to all the people who contribute!

Thanks to Jinoy, Manoj K & Ranjith Siji for answering my queries at the event. This made things faster to build.
