Saturday, June 28, 2014

Tool to convert Relational Database to RDF




The D2RQ Platform is a system for accessing relational databases as virtual, read-only RDF graphs. It offers RDF-based access to the content of relational databases without having to replicate it into an RDF store. D2RQ is open source software released under the Apache license. It supports databases such as Oracle, SQL Server, PostgreSQL, MySQL, and HSQLDB.

Using D2RQ, one can:
  • query a non-RDF database using SPARQL
  • access the content of the database as Linked Data over the Web
  • create custom dumps of the database in RDF formats for loading into an RDF store
  • access information in a non-RDF database using the Apache Jena API

D2R Server can be run either as a stand-alone web server or inside an existing servlet container.
To map the relational database to RDF, a mapping file is used. This mapping file can be written manually or produced with the generate-mapping tool, which automatically generates a skeleton “default mapping” from the database schema. Generating the mapping file with generate-mapping is much faster than writing it by hand.

The mapping defines a virtual RDF graph that contains information from the database. This is similar to the concept of views in SQL, except that the virtual data structure is an RDF graph instead of a virtual relational table. The virtual RDF graph can be accessed in various ways, depending on what's offered by the implementation. The D2RQ Platform provides SPARQL access, a Linked Data server, an RDF dump generator, a simple HTML interface, and Jena API access to D2RQ-mapped databases.
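The view-like idea can be illustrated with a minimal sketch. This is not how D2RQ is implemented and it does not use the D2RQ mapping language; it only shows triples being computed from table rows on demand instead of being copied into an RDF store. The base URI, the table and column names, and the function `table_to_triples` are all assumptions made up for the illustration.

```python
import sqlite3
from typing import Iterator, Tuple

Triple = Tuple[str, str, str]

BASE = "http://example.org/"  # hypothetical base URI, not a D2RQ default


def table_to_triples(conn: sqlite3.Connection, table: str, key: str) -> Iterator[Triple]:
    """Lazily yield one triple per non-key, non-NULL column value of each row.

    Nothing is materialised: triples are derived from the rows while the
    caller iterates, which is the sense in which the graph is "virtual".
    Literal values are not escaped, so this is for illustration only.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        # The primary key value identifies the row, so it becomes the subject URI.
        subject = f"<{BASE}{table}/{record[key]}>"
        for column, value in record.items():
            if column == key or value is None:
                continue
            predicate = f"<{BASE}{table}#{column}>"
            yield subject, predicate, f'"{value}"'
```

With an `employee` table holding the row `(1, 'Alice', 'Sales')`, iterating over `table_to_triples(conn, "employee", "id")` would produce one triple for the `name` column and one for the `dept` column, both with the subject `<http://example.org/employee/1>`.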

Friday, June 13, 2014

Methodology

In this project we are going to develop business intelligence software using the Semantic Web and
Natural Language Processing. Having a business intelligence system in a business
organization can help people in the organization make decisions on the fly. In this project we
are going to use the Agile methodology for developing the software. We also plan to
maintain a blog about the work carried out by each group member. First of all, we are going to
study the technologies we can use in this project (RDF, OWL, SPARQL, etc.).

Then we are going to use a dummy relational database as the input data for the project. This
database is then converted to the Resource Description Framework (RDF). In order to do this we will have to
find a library or tool that converts a relational database to RDF. If none is available, we will have to develop a library
ourselves. Once we are able to convert the relational database to RDF, we can extend this
process to convert data from other sources, such as web resources and email, to RDF.

After creating the RDF file we are going to use SPARQL queries to get data from it. In
order to do that we will have to study SPARQL and find libraries for querying an RDF file
with SPARQL in Java.
However, business users generally do not know SPARQL. So we are going to use Natural Language
Processing (NLP) to convert natural language inputs into the relevant SPARQL queries and return the
output.
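As a rough illustration of the last step, the sketch below turns a question into a SPARQL query by matching it against a few fixed templates. This is only an assumption about one simple way to start; a real NLP pipeline would need parsing and mapping of words to the vocabulary of the RDF data. The `ex:` vocabulary, the question patterns, and the function `question_to_sparql` are all hypothetical.

```python
import re
from typing import Optional

PREFIX = "PREFIX ex: <http://example.org/>"  # hypothetical vocabulary

# Each entry pairs a question pattern with a SPARQL template.
# Both sides are invented for this sketch.
TEMPLATES = [
    (
        re.compile(r"^who works in (?P<dept>[\w ]+?)\??$", re.IGNORECASE),
        'SELECT ?name WHERE {{ ?e ex:dept "{dept}" ; ex:name ?name . }}',
    ),
    (
        re.compile(r"^how many employees are in (?P<dept>[\w ]+?)\??$", re.IGNORECASE),
        'SELECT (COUNT(?e) AS ?count) WHERE {{ ?e ex:dept "{dept}" . }}',
    ),
]


def question_to_sparql(question: str) -> Optional[str]:
    """Return a SPARQL query for a recognised question, or None otherwise."""
    text = question.strip()
    for pattern, template in TEMPLATES:
        match = pattern.match(text)
        if match:
            # The captured group only allows word characters and spaces,
            # so it cannot break out of the quoted literal.
            return PREFIX + "\n" + template.format(**match.groupdict())
    return None
```

For example, `question_to_sparql("Who works in Sales?")` yields a `SELECT ?name` query filtered on the literal `"Sales"`, while a question that fits no template returns `None` so the caller can ask the user to rephrase.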

Project Objective

The main objective of our project is to carry out a comprehensive analysis of semantic web and data
technologies for making structured data more meaningful to an organization, and to identify tools
and their capabilities for delivering these requirements. We will then use NLP to generate queries
over the semantic data space in order to create working software.

Saturday, May 31, 2014

Semantic Web

The Semantic Web is the extension of the World Wide Web that enables people to share content beyond the boundaries of applications and websites. It has been described in rather different ways: as a utopian vision, as a web of data, or merely as a natural paradigm shift in our daily use of the Web. Most of all, the Semantic Web has inspired and engaged many people to create innovative semantic technologies and applications.

The word semantic itself implies meaning or understanding. As such, the fundamental difference between Semantic Web technologies and other technologies related to data (such as relational databases or the World Wide Web itself) is that the Semantic Web is concerned with the meaning and not the structure of data. This fundamental difference engenders a completely different outlook on how storing, querying, and displaying information might be approached. Some applications, such as those that refer to a large amount of data from many different sources, benefit enormously from this feature. Others, such as the storage of high volumes of highly structured transactional data, do not.


The Semantic Web consists primarily of three technical standards:


  • RDF (Resource Description Framework): The data modeling language for the Semantic Web. All Semantic Web information is stored and represented in RDF.
  • SPARQL (SPARQL Protocol and RDF Query Language): The query language of the Semantic Web. It is specifically designed to query data across various systems.
  • OWL (Web Ontology Language): The schema language, or knowledge representation (KR) language, of the Semantic Web. OWL enables you to define concepts composably so that these concepts can be reused as much and as often as possible. Composability means that each concept is carefully defined so that it can be selected and assembled in various combinations with other concepts as needed for many different applications and purposes.
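To make the first two standards more concrete, here is a minimal sketch that treats RDF data as a list of (subject, predicate, object) triples and matches a single triple pattern against it, which is the basic building block of a SPARQL query. It is not a SPARQL engine, and the `ex:` names and the function `match_pattern` are assumptions made up for the illustration.

```python
from typing import Dict, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str, str]  # terms starting with "?" are variables


def match_pattern(pattern: Pattern, triples: Iterable[Triple]) -> Iterator[Dict[str, str]]:
    """Yield one variable binding for every triple that matches the pattern."""
    for triple in triples:
        binding: Dict[str, str] = {}
        for pattern_term, triple_term in zip(pattern, triple):
            if pattern_term.startswith("?"):
                # A variable matches anything, but a repeated variable
                # must be bound to the same term each time.
                if binding.setdefault(pattern_term, triple_term) != triple_term:
                    break
            elif pattern_term != triple_term:
                break
        else:
            yield binding
```

Given the triples `("ex:alice", "ex:worksFor", "ex:acme")` and `("ex:bob", "ex:worksFor", "ex:acme")`, the pattern `("?who", "ex:worksFor", "ex:acme")` binds `?who` first to `ex:alice` and then to `ex:bob`. A full SPARQL query combines several such patterns and joins their bindings.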



Wednesday, May 28, 2014

Introduction

Business Intelligence is the future of efficient management and improves profitability in today’s industry. Semantic web technologies provide a much more flexible platform for gathering and processing knowledge from data. Data can be gathered and processed using various available technologies to find relationships based on reasoning. This knowledge can be stored and processed using RDF/OWL technologies and can be queried with languages like SPARQL using available technology frameworks. Processing large amounts of unstructured data can be a quite computation-intensive process. This challenge can be overcome through parallelism, using techniques like Google's MapReduce.
Further, by including Natural Language Processing, information analysis can be made much easier for the end user of the system. This will help end users make quick decisions, which can lead to profitability in an organization.
The semantic web (SW) has been conceived as a means to build semantic spaces over web-published content so that web information can be effectively retrieved and processed by both humans and machines in a great variety of tasks.
SW Formats: RDF(S) and OWL
In RDF there are three kinds of elements: resources, literals, and properties. Resources are web objects (entities) that are identified through a URI; literals are atomic values such as strings, dates, and numbers; and properties are binary relations that link a resource to another resource or to a literal.
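A small sketch of these element kinds, written out in N-Triples syntax, may help. The class names `Resource` and `Literal`, the function `triple_to_nt`, and the example URIs are assumptions for illustration; a property is represented here simply as a resource used in the predicate position.

```python
from dataclasses import dataclass
from typing import Union

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Resource:
    """An entity (or a property) identified by a URI."""
    uri: str

    def to_nt(self) -> str:
        return f"<{self.uri}>"


@dataclass(frozen=True)
class Literal:
    """An atomic value such as a string, date, or number."""
    value: str
    datatype: str = XSD + "string"

    def to_nt(self) -> str:
        # Escape the characters that N-Triples requires to be escaped.
        escaped = (
            self.value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"^^<{self.datatype}>'


def triple_to_nt(subject: Resource, prop: Resource, obj: Union[Resource, Literal]) -> str:
    """Serialise one statement as a single N-Triples line."""
    return f"{subject.to_nt()} {prop.to_nt()} {obj.to_nt()} ."
```

Here a statement such as "Alice was hired on 2014-06-13" uses a resource as the subject, a property as the predicate, and a date-typed literal as the object.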
The Web Ontology Language (OWL) mainly differs from RDFS in its underlying semantic formalism, which is founded on description logics (DL). Indeed, the OWL languages provide RDF/XML serializations of different DL languages.