Saturday, June 28, 2014

Tool to convert Relational Database to RDF




The D2RQ Platform is a system for accessing relational databases as virtual, read-only RDF graphs. It offers RDF-based access to the content of relational databases without having to replicate it into an RDF store. D2RQ is open source software licensed under the Apache License, Version 2.0. It supports databases such as Oracle, SQL Server, PostgreSQL, MySQL, and HSQLDB.

Using D2RQ, one can:
  • query a non-RDF database using SPARQL
  • access the content of the database as Linked Data over the Web
  • create custom dumps of the database in RDF formats for loading into an RDF store
  • access information in a non-RDF database using the Apache Jena API

D2R Server, the platform's server component, can run either as a stand-alone web server or inside an existing servlet container.
To map the relational database to RDF, D2RQ uses a mapping file. This file can be written manually or produced with the generate-mapping tool, which automatically generates a skeleton “default mapping” from the database schema. Generating the mapping file with generate-mapping is much faster than writing it by hand.

The mapping defines a virtual RDF graph that contains information from the database. This is similar to the concept of views in SQL, except that the virtual data structure is an RDF graph instead of a virtual relational table. The virtual RDF graph can be accessed in various ways, depending on what's offered by the implementation. The D2RQ Platform provides SPARQL access, a Linked Data server, an RDF dump generator, a simple HTML interface, and Jena API access to D2RQ-mapped databases.
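
The view analogy can be made concrete with a small sketch. The Python below is not D2RQ and does not use its mapping language. It assumes a hypothetical `employee` table in an in-memory SQLite database and a made-up base URI `http://example.org/`, and it computes triples from rows each time they are requested instead of storing them anywhere.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI, not taken from D2RQ


def build_demo_db():
    """Create an in-memory database with a hypothetical employee table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
    )
    conn.executemany(
        "INSERT INTO employee (id, name, dept) VALUES (?, ?, ?)",
        [(1, "Alice", "Sales"), (2, "Bob", "Finance")],
    )
    return conn


def literal(value):
    """Format a value as a quoted literal, escaping backslashes and quotes."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def virtual_triples(conn):
    """Yield (subject, predicate, object) triples computed from rows on demand.

    Nothing is copied into a separate store: every call re-reads the table,
    much like selecting from an SQL view.
    """
    cursor = conn.execute("SELECT id, name, dept FROM employee ORDER BY id")
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        subject = f"<{BASE}employee/{row[0]}>"
        for column, value in zip(columns[1:], row[1:]):
            predicate = f"<{BASE}vocab/employee_{column}>"
            yield subject, predicate, literal(value)


if __name__ == "__main__":
    conn = build_demo_db()
    for s, p, o in virtual_triples(conn):
        print(s, p, o, ".")
```

Because the triples are derived at read time, a row inserted into `employee` later shows up in the next call to `virtual_triples` without any re-export step. This is the property that the "virtual graph" wording describes.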

Friday, June 13, 2014

Methodology

In this project we are going to develop business intelligence software using Semantic Web
technologies and Natural Language Processing. A business intelligence system can help people
in an organization make decisions on the fly. We are going to use an Agile methodology to
develop the software. We also plan to maintain a blog about the work carried out by each
group member. First of all, we are going to study the technologies we can use in this
project (RDF, OWL, SPARQL, etc.).

Next, we are going to use a dummy relational database as the input data for the project. This
database will be converted to Resource Description Framework (RDF). To do this, we will have to
find a library or tool that converts a relational database to RDF. If none is suitable, we will
have to develop a library ourselves. Once we are able to convert the relational database to RDF,
we can extend this process to convert data from other sources, such as web resources and email,
to RDF.

After creating the RDF file, we are going to use SPARQL queries to get data from it. To do
that, we will have to study SPARQL and find libraries for querying an RDF file with SPARQL
in Java.
However, business users generally do not know SPARQL, so we are going to use Natural Language
Processing (NLP) to convert natural language inputs into the relevant SPARQL queries and
return the output.
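
As a rough illustration of that last step, here is a minimal template-based sketch in Python. It is only one simple way to turn a question into SPARQL, not the approach chosen for the project, and it involves no real NLP library. The `ex:` prefix, the property names `ex:employee_name` and `ex:employee_dept`, and the two question patterns are all hypothetical.

```python
import re

# Hypothetical vocabulary prefix; a real project would take it from its mapping.
PREFIX = "PREFIX ex: <http://example.org/vocab/>"

# Each entry pairs a question pattern with a SPARQL template (both made up).
# Captured values are limited to word characters and spaces, so a question
# cannot inject quotes or braces into the generated query.
TEMPLATES = [
    (
        re.compile(r"^who works in (?P<dept>[\w ]+?)\??$", re.IGNORECASE),
        'SELECT ?name WHERE {{ ?e ex:employee_dept "{dept}" ; '
        "ex:employee_name ?name . }}",
    ),
    (
        re.compile(r"^which department is (?P<name>[\w ]+?) in\??$", re.IGNORECASE),
        'SELECT ?dept WHERE {{ ?e ex:employee_name "{name}" ; '
        "ex:employee_dept ?dept . }}",
    ),
]


def question_to_sparql(question):
    """Return a SPARQL query string for a recognised question, else None."""
    text = question.strip()
    for pattern, template in TEMPLATES:
        match = pattern.match(text)
        if match:
            return PREFIX + "\n" + template.format(**match.groupdict())
    return None


if __name__ == "__main__":
    print(question_to_sparql("Who works in Sales?"))
```

Fixed templates only cover questions that were anticipated in advance. Handling freer phrasing is where actual NLP techniques would have to come in.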

Project Objective

The main objective of our project is to carry out a comprehensive analysis of Semantic Web and
data technologies for making structured data more meaningful to an organization, and to
identify tools and their capabilities for delivering these requirements. We will then use NLP
to generate queries over the semantic data space in order to create working software.