Category: Voice Based Query Interface for Database

Posts

After a long delay I finally mustered up the courage to build the query ranking module. Some scary stuff. Here are the problems I’ve been facing whenever I start building this. The ranking requires the tables’ foreign key structure. Once the query generation is done, the recursive calls combined with JavaScript’s callbacks are a nightmare. Callbacks keep arriving even after the output has been sent. Some relations between the tables are discovered only after ranking.
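One approach I’m looking at for the callback mess is wrapping the recursive foreign-key lookups in Promises, so every branch is awaited before the output is sent. A minimal sketch, assuming a MySQL-style `db.query(sql, params, callback)` connection; `getForeignKeys` and `collectRelations` are hypothetical names for illustration, not the module’s actual code:

```javascript
// Hypothetical async lookup that resolves to the foreign keys of a table
// (MySQL information_schema query, wrapped in a Promise).
function getForeignKeys(db, table) {
  return new Promise((resolve, reject) => {
    db.query(
      'SELECT referenced_table_name AS ref FROM information_schema.key_column_usage ' +
      'WHERE table_name = ? AND referenced_table_name IS NOT NULL',
      [table],
      (err, rows) => (err ? reject(err) : resolve(rows.map(r => r.ref)))
    );
  });
}

// Recursively collect every table reachable through foreign keys.
// Because every branch is awaited, nothing fires after the response is sent.
async function collectRelations(db, table, seen = new Set()) {
  if (seen.has(table)) return seen;      // avoid cycles in the FK graph
  seen.add(table);
  const refs = await getForeignKeys(db, table);
  await Promise.all(refs.map(ref => collectRelations(db, ref, seen)));
  return seen;
}
```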
For now, I have chosen Packet for the trial server even though the cost is high. The specs are good (even for entry-level containers) and I got some credits to work with initially. So I set up the SSH keys for login and got to work. The GitHub repository for Parsey McParseface and SyntaxNet is available here, and it also lists out the steps required to set up SyntaxNet and get it working.
Before I start building up the server for the application, I wanted to pen down the structure for the app. The structure is represented below. Abbreviations:

WSA: Web Speech API (Speech to Text)
NLP: Natural Language Processing tool (SyntaxNet or Google Cloud NLP API)
DIS: Database Indexed Search
DDI: Database Dynamic Indexing
DES: Database Exhaustive Search
NES: NoSQL Database Elastic Search
SFT: SQL Database Full Text Search
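To make the flow concrete, here is a rough sketch of how a query could move through these components. The ordering (DIS tried first, DES as the fallback) and all the function names are placeholders standing in for WSA, NLP, DIS and DES, assumed for illustration rather than taken from the actual diagram:

```javascript
// Rough pipeline sketch: speech -> parsed query -> database search.
// The stubs below return canned values so the flow can be read end to end;
// they are not real implementations of the components they stand in for.
const recogniseSpeech  = async ()       => 'what medicine treats a headache';
const parseQuery       = async (text)   => ({ keywords: text.split(' ') });
const indexedSearch    = async (parsed) => [];                                // DIS
const exhaustiveSearch = async (parsed) => [{ medicine: 'placeholder row' }]; // DES

async function handleVoiceQuery() {
  const text   = await recogniseSpeech();   // WSA: speech to text
  const parsed = await parseQuery(text);    // NLP: SyntaxNet / Cloud NLP API
  let rows = await indexedSearch(parsed);   // try the indexed search first (assumption)
  if (rows.length === 0) {
    rows = await exhaustiveSearch(parsed);  // fall back to the exhaustive search (assumption)
  }
  return rows;
}

handleVoiceQuery().then(rows => console.log(rows));
```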
Converting the Natural Language to Database Query project over to JavaScript requires the implementation of the Web Speech API in a single JavaScript file. I will discuss the steps taken for the integration in this blog post. Kinda unorthodox, but here’s a reference before the content, if you want to experiment with the Web Speech API using the official guide. First we check if the API is available in the browser using the following code.
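A minimal availability check along those lines, using the prefixed `webkitSpeechRecognition` constructor that Chrome still ships; the handler wiring here is illustrative rather than the exact snippet from the full post:

```javascript
// Check whether the browser exposes the Web Speech API
// (Chrome still provides it under the webkit prefix).
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;

if (!SpeechRecognition) {
  console.warn('Web Speech API is not supported in this browser.');
} else {
  const recognition = new SpeechRecognition();
  recognition.lang = 'en-US';
  recognition.onresult = (event) => {
    // The recognised transcript becomes the natural-language query.
    const transcript = event.results[0][0].transcript;
    console.log('Heard:', transcript);
  };
  recognition.start();
}
```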
Speech recognition and synthesis technology provides a natural interaction method for many computing tasks. It allows users to communicate with computers naturally using spoken language, requiring very little training. Although this technology has existed for several decades, it has recently become widely usable because of the increasing capabilities of consumer devices. Benefits of speech interaction (in general): improved interaction for people with disabilities, and hands-free interaction with virtual reality and augmented reality applications.
We have collected the data for the application area we are working on, which is the medical domain: questions about what medicine a person should take, what its symptoms are, and in what quantity it should be taken. Such queries can arise in the mind of a patient, so we have collected basic questions that a person might ask in day-to-day language, gathering about 150-200 such queries.
A natural language parser is a program that works out the grammatical structure of sentences, for instance, which groups of words go together (as “phrases”) and which words are the subject or object of a verb. Dan Klein wrote the original version of this parser, and Christopher Manning contributed support code and linguistic grammar development. A part-of-speech tagger (POS tagger) is a piece of software that reads text in some language and assigns a part of speech to each word (and other tokens), such as noun, verb, adjective, etc.
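To give an idea of what such a tagger returns, here is a minimal sketch of requesting part-of-speech tags from the Google Cloud Natural Language API (one of the NLP options listed in the architecture post). The `API_KEY` placeholder and the surrounding code are assumptions for illustration, not the project’s actual integration:

```javascript
// Ask Google Cloud Natural Language to analyse the syntax of a sentence
// and print the part-of-speech tag for each token.
const API_KEY = 'YOUR_API_KEY'; // placeholder

async function tagSentence(sentence) {
  const response = await fetch(
    `https://language.googleapis.com/v1/documents:analyzeSyntax?key=${API_KEY}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        document: { type: 'PLAIN_TEXT', content: sentence },
        encodingType: 'UTF8',
      }),
    }
  );
  const data = await response.json();
  // Each token carries its text and a part-of-speech tag (NOUN, VERB, ...).
  return data.tokens.map(t => `${t.text.content}/${t.partOfSpeech.tag}`);
}

tagSentence('What medicine should I take for a headache?')
  .then(tags => console.log(tags.join(' ')));
```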
As a part of our 2nd Minor Project, we decided to take up the topic: “Voice based query interface for database”. The aim of the project is to develop a system that takes a natural language statement as a voice (audio) input from the user and then converts it into a relevant query for a specific database. What we are trying to solve: in this project, we are trying to tackle the problem of having to learn and understand a query language before being able to start working with a database.