Google’s AI just got better at understanding Indian languages: here’s how it works

Google is known for research and development aimed at providing contextual information to users across countries and languages. To further this effort in India, the tech giant announced Multilingual Representations for Indian Languages, or simply MuRIL, at its Google for India event. MuRIL is being touted as a modern multilingual language model that aims to provide context- and sentiment-aware search results across several Indian regional languages. India is a diverse country with dozens of mainstream languages, which makes reaching users more complex for an American company like Google. But with MuRIL, the task might just become slightly easier. Let’s get into the details of how Google is looking to transform Search in India with this announcement.

What is Google’s MuRIL?

Multilingual Representations for Indian Languages is built on BERT, short for Bidirectional Encoder Representations from Transformers. Now, what’s that? BERT is a natural language processing technique that reads the words on both sides of a term to work out the nuances of what people are searching for.

For instance, the word “bank” can mean different things in search terms such as “bank branch” or “river bank”. To assist with such queries, Google incorporated BERT into its search product last year in an attempt to make results more relevant.
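To make the “bank” example concrete, here is a deliberately simplified sketch in Python. It is not how BERT works internally: BERT learns context from large amounts of text, while this toy uses a hand-written list of cue words (the `SENSE_CUES` table and the `guess_sense` function are hypothetical names invented for this illustration). It only shows the underlying idea that the surrounding words decide which meaning is intended.

```python
# Toy illustration only: real BERT models learn context from data,
# they do not rely on hand-written cue lists like this one.
SENSE_CUES = {
    "financial institution": {"branch", "account", "loan", "atm", "deposit"},
    "edge of a river": {"river", "water", "fishing", "shore", "flood"},
}


def guess_sense(query: str) -> str:
    """Pick the sense of 'bank' whose cue words overlap most with the query."""
    words = set(query.lower().split())
    # Count how many cue words of each sense appear in the query.
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    # With no cue word present, the query alone cannot settle the meaning.
    return best if scores[best] > 0 else "unknown"


if __name__ == "__main__":
    for q in ("bank branch", "river bank", "bank"):
        print(q, "->", guess_sense(q))
```

The same word gets a different reading depending on its neighbours, and a bare “bank” with no context stays ambiguous.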

The newly launched MuRIL aims to further fix the issue of relevancy for Indian regional language users on Google Search. It has been pre-trained on a total of 17 languages, 16 Indian languages plus English: Assamese, Bengali, English, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Sindhi, Tamil, Telugu, and Urdu (in alphabetical order). Together, these languages cover a vast majority of the Indian population.

Why has it been put in place?

Google says that many users in India use Google Search in English. For instance, many Oriya or Telugu speakers type their queries in English but might not be proficient enough in the language to interpret the results. This becomes easier to understand when you realise that typing an Indian language in its native script is typically more difficult and, according to Google, can often take three times as long as typing in English.

For such users, Search will show relevant content in the supported Indian languages where it seems appropriate. While Google does not clarify what factors go into these automatic recommendations, we are guessing that geographic location and search history have a lot to do with it. This feature will first roll out in five Indian languages: Hindi, Bangla, Marathi, Tamil, and Telugu.

Google’s new MuRIL model can also help interpret transliterated text, such as Hindi words written in the Roman script. For instance, “Achha hua account bandh nahi hua” (roughly, “Good thing the account was not closed”) is a phrase that will now be read as having a positive connotation instead of a negative one, as was the case previously. Google also says that search queries such as “Shirdi ke sai baba” will now return accurate results for the personality instead of a location.
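A small Python sketch can show why a phrase like this is easy to misread. This is not MuRIL’s method; it is a toy with a hypothetical two-word lexicon (`POLARITY`, `NEGATORS`, and both scoring functions are made up for illustration). The first function counts negative-looking words in isolation; the second lets a negator flip the word just before it, which is what happens in “bandh nahi hua”.

```python
# Hypothetical toy lexicon for romanised Hindi; a real model learns this from data.
POLARITY = {"achha": 1, "bandh": -1}  # "good" and "closed"
NEGATORS = {"nahi", "nahin"}          # "not"
PHRASE = "Achha hua account bandh nahi hua"


def naive_score(text: str) -> int:
    """Score each word in isolation, treating negators as plain negative words."""
    score = 0
    for word in text.lower().split():
        if word in NEGATORS:
            score -= 1
        else:
            score += POLARITY.get(word, 0)
    return score


def context_score(text: str) -> int:
    """Let a negator flip the polarity of the word immediately before it."""
    words = text.lower().split()
    score = 0
    for i, word in enumerate(words):
        polarity = POLARITY.get(word, 0)
        # In this phrase the negator follows the word it negates ("bandh nahi").
        if polarity and i + 1 < len(words) and words[i + 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


if __name__ == "__main__":
    print("naive:", naive_score(PHRASE))
    print("context-aware:", context_score(PHRASE))
```

Word-by-word counting lands on a negative score for the phrase, while accounting for what “nahi” negates turns it positive, which matches the reading described above.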

Google is also using MuRIL to extend support for the Indian language picker to apps like Google Assistant, Discover, and Google Maps. Furthermore, you can now toggle between English and four additional Indian languages (Tamil, Telugu, Bangla, and Marathi), apart from Hindi, on Google Search pages.

The possibilities are endless

Google has announced that it is making MuRIL free and open source. This means the thousands of app developers in its ecosystem can use this sophisticated language model to develop better products for Indian regional language users. For example, developers of apps with built-in search, including e-commerce apps like Amazon, social apps like Facebook, and more, might be able to take advantage of this technology and show more relevant content to Indian users.

To fulfil Google’s dream of reaching the next billion users, MuRIL seems like a step in the right direction.