Understanding the spread of extinct and endangered languages across India.

Data Description

The dataset of extinct and endangered languages around the world was compiled by The Guardian and is available here.

Variables: the name of the language, longitude, latitude, degree of endangerment, and number of speakers.

Data Visualisation

The languages are plotted in two ways: by latitude/longitude and by number of speakers.

The interactive visualisation of languages by latitude/longitude shows where the extinct and endangered languages are located across India.

Similarly, the languages are plotted by number of speakers.

#collapse-hide
import numpy as np
import pandas as pd
pd.options.mode.chained_assignment = None

from IPython.display import HTML

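# plotly.plotly moved to the separate chart_studio package in plotly 4;
# on plotly 3.x, use `import plotly.plotly as py` instead.
import chart_studio.plotly as py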
import plotly.graph_objs as go
from plotly import tools
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode()

language_data = pd.read_csv('../data/data.csv', usecols=[0, 1, 5, 7, 10, 12, 13])
language_data = language_data.rename(
    columns={'Name in English':'language', 'Country codes alpha 3':'locations',
             'Degree of endangerment':'risk', 'Number of speakers':'population'})
language_data.columns = language_data.columns.str.lower()
language_data['risk'] = language_data['risk'].str.title()
# -1 marks languages with no recorded number of speakers
language_data['population'] = language_data['population'].fillna(-1)

# endangered or extinct languages in India
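# na=False treats rows with a missing country code as "not India";
# .copy() gives an independent frame that is safe to modify below
language_ind = language_data[language_data['locations'].str.contains('IND', na=False)].copy()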

language_ind['risk'] = language_ind['risk'].replace(
    ['Vulnerable', 'Definitely Endangered', 'Severely Endangered',
     'Critically Endangered', 'Extinct'], [1, 2, 3, 4, 5])

language_ind = language_ind[['language', 'risk', 'population', 'latitude', 'longitude']]
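
To make the filtering and risk encoding above easier to follow, here is a minimal sketch on a small in-memory table. The rows in `toy` are hypothetical and not taken from the dataset, and the dictionary-based `.map` is an alternative used for illustration, not the `replace` call from the cell above.

```python
import pandas as pd

# Hypothetical rows that mimic the cleaned columns; not real data.
toy = pd.DataFrame({
    'language': ['Lang A', 'Lang B', 'Lang C', 'Lang D'],
    'locations': ['IND', 'IND, NPL', 'BRA', None],
    'risk': ['Vulnerable', 'Extinct', 'Vulnerable', 'Critically Endangered'],
    'population': [1200.0, -1.0, 50.0, 10.0],
})

# Risk labels ordered from least to most severe, numbered 1 to 5.
risk_levels = ['Vulnerable', 'Definitely Endangered', 'Severely Endangered',
               'Critically Endangered', 'Extinct']
risk_codes = {level: code for code, level in enumerate(risk_levels, start=1)}

# Keep rows whose country codes mention India; a missing code counts as no match.
in_india = toy['locations'].str.contains('IND', na=False)
toy_ind = toy[in_india].copy()

# Turn the text label into its ordinal code.
toy_ind['risk'] = toy_ind['risk'].map(risk_codes)
```

Unlike `replace`, `.map` turns any label missing from the dictionary into `NaN`, which makes unexpected categories easier to spot.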