What sort of problems do we solve?
Any problem related to analysing textual data. Specifically, we look at what text mining can extract from your textual data sources and how it does so, and then at where the value lies and how to develop a working solution around it.
Typical problems we hear from our clients include finding the critical indicators or information in abundant text data (a search problem), condensing a large set of text data into something smaller that enables quicker analysis (a summarisation problem, which includes finding opinions and sentiments in reviews and comments), and developing a richer understanding from multiple data sources (for example, events and relations between extracted entities).
Areas of Development
Information Extraction Demo
If Text Mining is the art of extracting hidden facts from unstructured pieces of text, Information Extraction (IE) provides the groundwork needed for this more challenging task of identifying unknown things. The aim of IE is to identify pre-specified types of events, entities and relationships. It is the act of locating information that is already present in the unstructured text but still needs proper labelling.
Whether you need to identify names of people, locations, organizations, job titles, phone numbers, credit card numbers, dates, product names, feature names, measurements, drug names, allergies or other specific information, we can develop, or help you develop, customised applications that identify such information across various domains and a variety of unstructured sources such as online full-length articles, blogs, newswires, social media texts and reviews.
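As a rough illustration of locating pre-specified entity types, the sketch below labels dates and phone numbers with regular expressions. The two patterns, the labels and the sample sentence are assumptions made for this example only; they cover one date format and one phone number style, and a production system would need far broader rules or a trained model.

```python
import re

# Hypothetical patterns for illustration only: ISO-style dates and
# international phone numbers written with spaces or hyphens.
PATTERNS = {
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "PHONE": re.compile(r"\+\d{1,3}(?:[ -]\d{2,4}){2,4}"),
}

def extract_entities(text):
    """Return (label, matched_text, start_offset) tuples sorted by position."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group(), match.start()))
    return sorted(found, key=lambda item: item[2])

sample = "Call +44 20 7946 0958 before 2024-03-15 to confirm."
entities = extract_entities(sample)
for label, value, start in entities:
    print(label, value, start)
```

Each result carries its character offset, so the label can be attached back to the exact span of the source text.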
Language Practice is a fast, easy and innovative tool for developing grammar exercises for your students. We have just started to develop Language Practice and are constantly improving it with feedback from selected teachers. It already covers a range of grammar topics and we are constantly adding more! Use your own text or select from our listed topics to generate exercises instantly. Language Practice is available for English and will soon support other languages. We are recruiting 50 teachers and schools worldwide.
If you are a teacher of English, French, German or Arabic, please write to us to get an invitation to use Language Practice. If you are already using it, send us your suggestions, feedback and questions.
Semantic Annotation & Search
Information Extraction can help with recognising entities, events and relationships, but assigning true identities to these individual entities is semantic annotation. Recognising that two mentions of a person refer to the same real-world entity is semantic annotation. If IE annotates a mention of Paris as City, identifying whether this is the city in France or one in the United States, and recording this fact, is semantic annotation.
Allowing users to search over semantically annotated data is semantic search. When a user searches for "outdoor sport", returning documents that mention tennis, cricket, hockey, rugby etc. is an example of semantic search. If an environmental scientist searches for "dense locations in UK affected by flood in year 2010" and finds this information, it is thanks to semantic search. In a nutshell, it is all about understanding users' intent and returning the most suitable results. We are here to provide you with similar search solutions for your needs.
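The "outdoor sport" example can be sketched as a simple query expansion over a concept taxonomy. The taxonomy, the documents and the function name below are hypothetical, and plain substring matching stands in for the semantic annotations a real system would search over.

```python
# Hypothetical hand-written taxonomy: concept -> terms that are instances of it.
TAXONOMY = {
    "outdoor sport": {"tennis", "cricket", "hockey", "rugby"},
}

def semantic_search(query, documents):
    """Return documents mentioning the query or any term listed under it."""
    key = query.lower()
    terms = {key} | TAXONOMY.get(key, set())
    results = []
    for doc in documents:
        lowered = doc.lower()
        # Crude substring test; annotated data would allow exact concept matches.
        if any(term in lowered for term in terms):
            results.append(doc)
    return results

docs = [
    "The rugby final was postponed because of rain.",
    "New chess club opens in the town library.",
    "She took up tennis last summer.",
]
hits = semantic_search("outdoor sport", docs)
print(hits)
```

Neither matching document contains the words "outdoor sport"; they are returned because the taxonomy records rugby and tennis as outdoor sports.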
Event Detection & Relation Extraction
A curfew declared in your area, an important announcement made by a politician, a DJ party in your town, a scheduled water cut, a movie declared a blockbuster or a murder in your city: identifying such 'events of interest' has been an active area of research in recent times. We have the expertise to build applications that extract such events from articles to suit your needs.
We can also help you identify relations between entities, for example that Mr X is the CEO of company Z, or that Mr X and Mr Y are siblings. Using these relations, we can help you develop a search interface that supports reasoning over them and answers questions such as "Where does Mr Y's brother work?"
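A minimal sketch of reasoning over extracted relations, assuming they are stored as (subject, predicate, object) triples. The predicate names and helper functions are hypothetical, and the names are placeholders, with the company called Company Z to keep it distinct from the person Mr Y. The sketch treats "brother" as any sibling, since the relation given carries no gender.

```python
# Hypothetical extracted relations as (subject, predicate, object) triples.
TRIPLES = [
    ("Mr X", "ceo_of", "Company Z"),
    ("Mr X", "sibling_of", "Mr Y"),
]

def objects(subject, predicate):
    """All objects linked to the subject by the given predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

def siblings(person):
    """The sibling relation is symmetric, so look in both directions."""
    forward = [o for s, p, o in TRIPLES if p == "sibling_of" and s == person]
    backward = [s for s, p, o in TRIPLES if p == "sibling_of" and o == person]
    return forward + backward

def workplaces_of_siblings(person):
    """Chain two relations: person -> sibling -> organisation."""
    return [org for sib in siblings(person) for org in objects(sib, "ceo_of")]

answer = workplaces_of_siblings("Mr Y")
print(answer)
```

The question is answered by chaining two stored facts, neither of which mentions Mr Y's sibling's workplace directly.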
People post their thoughts, opinions and experiences of various events and products on social media websites. One recent study has shown stock market trends following the mood of users on Twitter. Many news channels nowadays consider the opinions of users on social media websites before publishing pre-election poll results. The study of what others think is called sentiment analysis or opinion mining.
If you are a company interested in finding out what consumers think about your products online, or which specific features of these products are liked or disliked and what is said about them, we are here to help you. We specialise in producing entity- and feature-specific opinions from blog posts as well as other short and noisy social media content.
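To make feature-specific opinions concrete, here is a minimal lexicon-based sketch. The word lists, the feature names and the clause-splitting rule are all assumptions for illustration; it is not a description of any particular production method, which would also have to handle negation, sarcasm and noisy spelling.

```python
import re

# Tiny hypothetical lexicons; real systems use much larger resources or models.
POSITIVE = {"great", "love", "excellent", "sharp"}
NEGATIVE = {"poor", "hate", "terrible", "weak"}
FEATURES = {"battery", "screen", "camera"}

def feature_opinions(review):
    """Score each product feature by the opinion words in the same clause."""
    scores = {}
    # Split into clauses at punctuation and at the contrastive word "but".
    for clause in re.split(r"[.,;!?]|\bbut\b", review.lower()):
        words = set(re.findall(r"[a-z]+", clause))
        polarity = len(words & POSITIVE) - len(words & NEGATIVE)
        for feature in words & FEATURES:
            scores[feature] = scores.get(feature, 0) + polarity
    return scores

result = feature_opinions("The screen is sharp, but the battery is terrible.")
print(result)
```

Scoring per clause rather than per review is what lets a single sentence yield a positive opinion for one feature and a negative one for another.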
With billions of pages available at the touch of a finger, copying someone else's content is tempting, but it is illegal when done without proper attribution to the original author. Be it website content or a student assignment, copying is an unwanted yet persistent problem. Using advanced natural language processing algorithms that can identify content theft of different kinds, from a simple copy-paste to changes of tense or morphology and even shuffled paraphrases, we can provide a solution that helps you identify copied content. If you are an academic organisation, a journal publisher, a patent validator or anyone else looking for a way to identify copied content, approach us to find the right technology for you.
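For the simplest case, copy-paste, overlap between two texts can be estimated by comparing their word n-grams ("shingles") with Jaccard similarity. This is a generic, well-known technique shown as a sketch, not the algorithm referred to above; the function names and sample texts are hypothetical. Catching changes of tense or morphology would additionally need stemming or lemmatisation, and paraphrase detection needs more than surface overlap.

```python
import re

def shingles(text, n=3):
    """Set of word n-grams, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b, n=3):
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

original = "Text mining extracts hidden facts from unstructured text."
copied = "Text mining extracts hidden facts from unstructured text!"
partial = "Text mining extracts hidden facts from noisy documents."
unrelated = "The weather in the mountains changes quickly in spring."

print(jaccard(original, copied))
print(jaccard(original, partial))
print(jaccard(original, unrelated))
```

A score near 1 suggests verbatim copying, a score near 0 suggests unrelated texts, and anything in between would need a threshold tuned on real data.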