This course is about semantic annotation and semantic search using Linked Open Data (LOD) resources such as DBpedia, GeoNames and YAGO. Our focus is on analysing full-length articles as well as short, noisy data collected from social networking websites such as Twitter.

We start with some real-life problems to understand why semantic annotation and search are useful and how LOD resources can help solve these problems. We look at existing tools and techniques that use LOD resources for specific practical tasks such as identity resolution, relation extraction, ontology population and sentiment analysis. We analyse unstructured data sources and extract useful information from them. We show how such information can be extracted and indexed, and how semantics-enabled search interfaces can be developed on top of the indexed data. We use GATE for development and demonstration purposes.

Training Agenda

Day 1: Introduction to Text Mining

  • What is Text Mining?
  • Semantic annotation and search, including a discussion of search interfaces
  • Linked Open Data repositories and their use in semantic applications

Day 2: Social Media Analysis

  • How to process tweets
  • Information Extraction (IE) for tweets
  • Sentiment Analysis and Opinion Mining

Day 3: Introduction to GATE

  • Brief introduction to GATE
  • Example applications and hands-on experiments


No background in natural language processing is required, but you should have reasonable programming ability and be able to write programs in Java. Just come with a keen interest in learning and applying semantic technology.

Trainer Biography - Dr. Niraj Aswani

Dr. Aswani is a semantic technology expert and a recognized GATE GURU. His hands-on expertise in developing semantic technology applications cuts across several domains, including environmental science, digital forensic analysis, political tweets, patent retrieval, health records and financial markets. He has developed NLP solutions for English, French, Arabic and Hindi. He has advised on challenging assignments in Information Extraction, Semantic Annotation & Search, Text Alignment and Machine Translation.