
A Report On

SamPra Social Networking Website (Distributed Database Approach)

Database Management And 4 GL

Table of Contents

1 Introduction
2 Problem Statement
3 ER Diagram And Tables
4 Application
5 Snapshot of the application
6 Contribution of individual members of the group in the project

Introduction
Social networking is the grouping of individuals into specific groups, like small rural communities or a neighborhood subdivision. Although social networking is possible in person, especially in workplaces, universities, and high schools, it is most popular online. Unlike most high schools, colleges, or workplaces, the internet is filled with millions of individuals who are looking to meet other people and to gather and share first-hand information and experiences about golfing, gardening, aesthetics and cosmetic surgery, developing friendships or professional alliances, finding employment, and business-to-business marketing. There are even groups sharing information about the end of the Mayan calendar and the Great Shift. The topics and interests are as varied and rich as our society and human history.

Online social networking is commonly done through websites known as social sites. Social networking websites function like an online community of internet users. Depending on the website in question, many of these community members share common interests in hobbies, religion, or politics. Once you are granted access to a social networking website you can begin to socialize. This socialization may include reading the profile pages of other members and possibly even contacting them.

Just as a coin has two sides, social networking has its dangers, including data theft and viruses, which are on the rise. The most prevalent danger, though, often involves online predators or individuals who claim to be someone they are not. Although danger does exist with networking online, it also exists with networking in the real world.

Our project is an online social networking website that uses a distributed database at the back end. Because social networking stores data in huge quantities, using a single server to host the database is not practical. Hence, we illustrate how a distributed database is maintained and how data is fetched as required from multiple servers.

The approach used in this project applies to any system that must maintain a large database (such as social networking sites, employee databases, or citizen databases on government sites).

Background
Hardware / Equipment used: 4 PCs (1 master and 3 slaves, located in the MSCLIS 1st-year lab)

Framework / Architecture used: Apache Hadoop

Apache Hadoop is an open-source software framework that supports data-intensive distributed applications. The Hadoop framework transparently provides both reliability and data motion to applications. Hadoop implements a computational paradigm named MapReduce, in which the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both MapReduce and the distributed file system are designed so that node failures are handled automatically by the framework.
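The MapReduce idea described above can be sketched in a few lines of plain Python. This is only a single-process illustration and does not use Hadoop itself; the function names (map_phase, shuffle, reduce_phase, run_job) and the sample records are assumptions made for this example. It counts status updates per author, with each fragment standing in for a piece of work that could run on a different node.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit (author, 1) for one status-update record."""
    author, _text = record
    yield author, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def run_job(fragments):
    """Run map over every fragment independently, then shuffle and reduce."""
    emitted = []
    for fragment in fragments:  # each fragment could run on a different node
        for record in fragment:
            emitted.extend(map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())

# Hypothetical sample data: two fragments of (author, status text) records.
fragments = [
    [("pranshu", "hello"), ("sameer", "hi")],
    [("pranshu", "second post")],
]
post_counts = run_job(fragments)
print(post_counts)
```

Because each fragment is mapped independently, a failed fragment can simply be mapped again on another node without affecting the others, which is the property the framework relies on.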

A Hadoop instance typically has a single NameNode; a cluster of DataNodes forms the HDFS cluster. The file system uses the TCP/IP layer for communication. Clients use remote procedure calls (RPC) to communicate with each other. A small Hadoop cluster includes a single master and multiple worker nodes. The master node runs a JobTracker, TaskTracker, NameNode and DataNode. A slave or worker node acts as both a DataNode and a TaskTracker, though it is possible to have data-only worker nodes and compute-only worker nodes.
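To see why storing copies of data on several DataNodes lets the cluster survive a node failure, consider the toy model below. It is a sketch under stated assumptions and not the placement policy HDFS actually uses: blocks are assigned round-robin, the replication factor of 2 is chosen only for illustration, and the function names are hypothetical. The placement map plays the role of the metadata kept by the NameNode.

```python
def place_blocks(num_blocks, datanodes, replication=2):
    """Assign each block to `replication` distinct DataNodes, round-robin."""
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            datanodes[(block + i) % len(datanodes)] for i in range(replication)
        ]
    return placement

def readable_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    return [
        block
        for block, nodes in placement.items()
        if any(node not in failed for node in nodes)
    ]

# Three slaves as in this project; four blocks is an arbitrary example size.
datanodes = ["slave1", "slave2", "slave3"]
placement = place_blocks(4, datanodes, replication=2)
survivors = readable_blocks(placement, failed={"slave2"})
print(placement)
print(survivors)
```

With two replicas per block, losing any one slave leaves every block readable in this model; losing two slaves at once can make some blocks unavailable.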

Problem Statement
Data-intensive applications such as social networking sites and government databases come under tremendous load and often fail to satisfy requests, e.g. IRCTC and MySpace. They quickly run out of resources such as disk space, memory and CPU cycles when requests arrive on a global scale.

Our Approach during the Project: We wished to illustrate how distributed database frameworks can serve such data-intensive needs. We used the Hadoop framework, which is deployed by the likes of Facebook and Yahoo.

Design of our Project: We already had the Hadoop framework to work with. We designed database schemas for it and then designed a web interface for our SamPra networking website. We configured the master and slaves to serve data requests and brought it all together.

Slave 1 <--> Master <--> Slave 2 <--> Slave 3

Configure Hadoop Master / Slaves
Design Database Schema for SamPra
Populate the Database in the Hadoop Storage
SamPra Website Front-end

Tables
Table Name: Profile

Column Family: ID

Attribute Name    Type     Size  Description
First_Name        Varchar  20    The First Name of Person
Last_Name         Varchar  20    The Last Name of Person
Email             Varchar  25    Email of Person
Password          Varchar  20    Hashes of Password Selected
Phone_No          Int      10    The Mobile Number
Gender            Char     2     The Gender of Person
Birthday          Date     -     The Person's Birthday
Address           Varchar  40    The Person's Residence
About_Me          Varchar  80    Person in his own words
IDP               Int      6     Unique Id of Person

Column Family: Work

Attribute         Type     Size  Description
Total_Companies   Int      3     Companies Involved
Total_Experience  Int      2     Years of Experience
CName             Varchar  20    Name of Company
Address           Varchar  80    Address of Company
Designation       Varchar  40    Designation of Person
Work_Months       Int      2     No. of Work Months

Column Family: Education

Attribute         Type     Size  Description
School_Name       Varchar  20    Name of School Attended
Percentage_10     Int      2     Percentage in 10th
JCollege_Name     Varchar  20    Name of Junior College Attended
Percentage_12     Int      2     Percentage in 12th
College_Name      Varchar  20    Name of the College
Percentage_Grad   Int      2     Percentage in Grad
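The Profile table groups its attributes into column families, and each stored value is addressed by a row key plus a family:qualifier column name. The sketch below models that layout with an in-memory Python class so the addressing scheme is easy to see. It is not the HBase client API: the class ColumnFamilyTable and its methods are hypothetical, and the company name and percentage are made-up sample values.

```python
class ColumnFamilyTable:
    """Toy in-memory table: row key -> {"family:qualifier": value}."""

    def __init__(self, name, families):
        self.name = name
        self.families = set(families)
        self.rows = {}

    def put(self, row_key, column, value):
        """Store one cell; the column must be written as family:qualifier."""
        family, _, qualifier = column.partition(":")
        if family not in self.families or not qualifier:
            raise KeyError(f"unknown column {column!r}")
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, family=None):
        """Return all cells of a row, optionally limited to one column family."""
        cells = self.rows.get(row_key, {})
        if family is None:
            return dict(cells)
        return {c: v for c, v in cells.items() if c.startswith(family + ":")}

# Column families follow the Profile table: id, work, education.
profile = ColumnFamilyTable("profile", ["id", "work", "education"])
profile.put("pranshu", "id:First_Name", "Pranshu")
profile.put("pranshu", "work:CName", "ExampleCorp")     # hypothetical value
profile.put("pranshu", "education:Percentage_10", 90)   # hypothetical value
work_cells = profile.get("pranshu", "work")
print(work_cells)
```

Rows only hold the cells that were actually written, so a profile without any work history simply has no cells in the work family.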

Table Name: Friends

Column Family: ID

Attribute         Type     Size  Description
Friends_List      Varchar  30    List of Friends
Sent_Request      Int      3     No. of Requests
Pending_Requests  Int      3     Requests Pending
Blocked_List      Varchar  30    UID of Blocked People

Table Name: Updates

Column Family: Posts

Attribute         Type     Size  Description
Total_Status      Int      2     Total Posts made
Status            Int      2     Status of Post
From              Varchar  20    Fname of Person
Likes             Int      3     No. of Likes on Post

Column Family: Comments

Attribute Name    Type     Size  Description
Comment_ID        Int      6     Unique Id of Comment
From              Int      6     Id of Sender
Receiver          Int      6     Id of Receiver
Comment           Varchar  100   The Actual Comment
Date              Date     -     The Date of Comment
Likes             Int      3     The no. of Likes on Comment

Column Family: Messages

Attribute         Type     Size  Description
Total_Messages    Int      3     The total count of Messages
Message           Varchar  100   The Actual Message
From              Int      6     The UID of Sender
MStatus           Int      1     Read or Not

Application Of SamPra Social Networking:


The SamPra website, built over a distributed framework, has many applications. A social networking service is a platform to build social networks or social relations among people who, for example, share interests, activities, backgrounds, or real-life connections. A social network service consists of a representation of each user (often a profile), his/her social links, and a variety of additional services. Most social network services are web-based and provide means for users to interact over the Internet, such as e-mail and instant messaging. Online community services are sometimes considered social network services, though in a broader sense a social network service usually means an individual-centered service, whereas online community services are group-centered. Social networking sites allow users to share ideas, pictures, posts, activities, events, and interests with people in their network.

The distributed framework we illustrated through this project offers many advantages for databases that are huge and need to be spread over different networks. These are:

- Management of distributed data with different levels of transparency, such as network transparency, fragmentation transparency and replication transparency.
- Increased reliability and availability.
- Easier expansion.
- Reflects organizational structure: database fragments are located in the departments they relate to.
- Protection of valuable data: if there were ever a catastrophic event such as a fire, all of the data would not be in one place, but distributed in multiple locations.
- Improved performance: data is located near the site of greatest demand, and the database systems themselves are parallelized, allowing load on the databases to be balanced among servers. (A high load on one module of the database won't affect other modules of the database in a distributed database.)
- Modularity: systems can be modified, added and removed from the distributed database without affecting other modules (systems).
- Reliable transactions, due to replication of the database.
- Distributed query processing can improve performance.
- Distributed transaction management.
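Fragmentation transparency and load balancing from the list above can be illustrated with a small sketch. It assumes a hash of the key decides which server holds a record, which is a simplification chosen for brevity; HBase itself splits tables into regions by row-key range. The class FragmentedStore, the server names and the sample records are all hypothetical.

```python
import hashlib

class FragmentedStore:
    """Toy horizontally fragmented key-value store spread over named servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.fragments = {server: {} for server in self.servers}

    def _server_for(self, key):
        """Pick a server deterministically from a hash of the key."""
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]

    def put(self, key, value):
        """Write the record to whichever fragment owns this key."""
        self.fragments[self._server_for(key)][key] = value

    def get(self, key):
        """Callers never name a server: the lookup is location-transparent."""
        return self.fragments[self._server_for(key)].get(key)

store = FragmentedStore(["slave1", "slave2", "slave3"])
for uid in range(30):
    store.put(f"user{uid}", {"IDP": uid})

# Every record comes back without the caller knowing where it lives.
print(store.get("user7"))
# Show how many records each server ended up holding.
print({server: len(rows) for server, rows in store.fragments.items()})
```

The caller only ever uses put and get with a key, so fragments could be moved or servers added behind this interface without changing application code, apart from the rehashing that this simple scheme would require.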

Snapshots of the Developed Application

The Login Page (You can Login here or follow link to Register)

The Register Page (where first time users make a new profile)

The Main Home Page


The Profile Wall Page


What Individual Members did in the Project


Member 1: Pranshu Bajpai

Role 1: Designed the database schema and handled the Apache HBase (Hadoop) database across multiple PCs (1 master + 3 slaves).

Sample HBase commands used:
create 'profile', 'username', 'email', 'pswd', 'phno', 'gender', 'bday', 'b_month', 'b_yr', 'relation_status', 'languages', 'about_you', 'address'
create 'profile', 'id', 'work', 'education'
put 'profile', 'pranshu', 'id:fname', 'pranshu'
get 'profile', 'pranshu'

Role 2: Populated the database tables with initial entries.

Role 3: Designed and developed the website front-end using HTML, CSS and JSP.

Member 2: Sameer Patil

Distributed Computing Framework Deployment

Role 1: Configured the Apache Hadoop framework. The distributed computing framework Hadoop was deployed on 1 master and 3 slaves.

The master had the following services running:
HQuorumPeer
HMaster
NameNode
JobTracker

The slaves had the following services running:
RegionServer
DataNode
TaskTracker

Role 2: Coding Modules
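The service lists under Role 1 above lend themselves to a simple deployment check. The sketch below is an assumption about how such a check could be written and was not part of the project: the helper missing_services is hypothetical, and how the names of running services are collected on each machine is deliberately left out.

```python
# Services expected on each kind of node, as listed in this report.
EXPECTED_SERVICES = {
    "master": {"HQuorumPeer", "HMaster", "NameNode", "JobTracker"},
    "slave": {"RegionServer", "DataNode", "TaskTracker"},
}

def missing_services(role, running):
    """Return the expected services for `role` that are not in `running`, sorted."""
    return sorted(EXPECTED_SERVICES[role] - set(running))

# Hypothetical example: the master is up but one service has not started.
master_gaps = missing_services("master", ["NameNode", "JobTracker", "HMaster"])
slave_gaps = missing_services("slave", ["RegionServer", "DataNode", "TaskTracker"])
print(master_gaps)
print(slave_gaps)
```

An empty list means the node is running everything this report expects for its role.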
