EnggRoom

Full Version: Combine Tag and Value Similarity for Data Extraction and Alignment
Project name: Combine Tag and Value Similarity for Data Extraction and Alignment

I want to develop this project in Java.
The graphical interface and implementation details are shown on YouTube under the project name given above.

In this project we have the following steps to implement:
1. We are going to extract the HTML code from the web page with all its tags and values, which are given to images of the web page.

2. After extracting the code, we are going to construct a tree that contains all the tags and values as its nodes, starting from the root node.

3. Construction of the tree is followed by data region identification, in which we are going to identify each separate record.

4. We are going to implement three algorithms, which are:
a. Pairwise QRR (query result record) Alignment
b. Holistic Alignment
c. Nested Structure Processing

5. After implementing these algorithms, all the sorted content is put into a table format.
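
To make the steps above more concrete, here is a rough Java sketch of how steps 1, 2, 3 and 5 could fit together. It is only a starting point under several assumptions, not the method from the paper: the class name TagTreeSketch and all of its method names are made up for illustration, the input is assumed to be well-formed XHTML so that the standard javax.xml DOM parser can read it (real-world HTML would probably need a dedicated HTML parser), the data region is guessed as the node with the most children sharing one tag name, and values are placed into columns simply by position. That positional placement is a stand-in for the pairwise, holistic and nested-structure alignment algorithms of step 4, which are not implemented here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch class; all names here are illustrative assumptions.
class TagTreeSketch {

    // Steps 1-2: parse well-formed XHTML into a DOM tree and return its root node.
    static Element parse(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    // Returns only the element (tag) children of a node, skipping text nodes.
    static List<Element> childElements(Element parent) {
        List<Element> out = new ArrayList<>();
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i).getNodeType() == Node.ELEMENT_NODE) {
                out.add((Element) kids.item(i));
            }
        }
        return out;
    }

    // Largest number of direct children that share one tag name.
    static int repeatedChildren(Element e) {
        Map<String, Integer> counts = new HashMap<>();
        int best = 0;
        for (Element child : childElements(e)) {
            best = Math.max(best, counts.merge(child.getTagName(), 1, Integer::sum));
        }
        return best;
    }

    // Step 3 (simplified assumption): the data region is the node whose
    // children repeat the same tag most often; each child is one record.
    static Element findDataRegion(Element root) {
        Element best = root;
        for (Element child : childElements(root)) {
            Element candidate = findDataRegion(child);
            if (repeatedChildren(candidate) > repeatedChildren(best)) {
                best = candidate;
            }
        }
        return best;
    }

    // Collects the non-blank text values below a node in document order.
    static void collectValues(Node node, List<String> out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.add(text);
            }
            return;
        }
        NodeList kids = node.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            collectValues(kids.item(i), out);
        }
    }

    // Step 5: one row per record; the i-th value goes to column i
    // (positional placement only, not the alignment algorithms of step 4).
    static List<List<String>> toTable(Element region) {
        List<List<String>> rows = new ArrayList<>();
        for (Element record : childElements(region)) {
            List<String> row = new ArrayList<>();
            collectValues(record, row);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Results</h1><ul>"
                + "<li><b>Java Basics</b><i>10.50</i></li>"
                + "<li><b>Web Mining</b><i>25.00</i></li>"
                + "</ul></body></html>";
        Element region = findDataRegion(parse(page));
        for (List<String> row : toTable(region)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Running main on the small sample page should pick the ul element as the data region and print one line per li record. Records with missing or extra values will end up in the wrong columns with this positional approach, which is exactly the problem the alignment algorithms in step 4 are meant to solve.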

All the information is in the paper below.

Please send me your replies about this project.