
ITS Management & IT Institute

A Study on: Effective Management of Data in Social Networking Sites (SNS)


Gourav Kumar Dubey
Department of PGDM (2011-13)
I.T.S - Management & IT Institute, Mohan Nagar, Ghaziabad
gouravkumardubey@its.edu.in

Abstract
In today's fast-growing commercial world, social networking sites (SNS) such as Facebook and Twitter are a major channel for maintaining social communication and doing e-business. Dynamic data discovery is now part of the activity of many organizations, including banks, content-providing sites and e-commerce websites. In this paper we discuss new methods for collecting social network structure, touch on probabilistic databases as an area of data management, and describe how social networking sites actually maintain their databases.

Keywords: Social networking sites (SNS), social communication, query database, cluster-based analysis, SLINK, MST, NoSQL, Sharding, PHP.

1. Introduction to Social Networking Sites


Social networking sites (SNSs) such as MySpace, Facebook, Cyworld and Bebo have attracted millions of users, many of whom have integrated these sites into their daily practices. We define social network sites as web-based services that allow individuals to (1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system.

1.1 Facebook is a social networking service where users create personal profiles, add other users as friends and exchange messages. Users may also join common-interest user groups organised by common characteristics (e.g. workplace). Founded by Mark Zuckerberg, Eduardo Saverin, Andrew McCollum, Dustin Moskovitz and Chris Hughes (roommates) and launched on February 4, 2004, it had 955 million active users as of June 2012.

Page 1 of 6 Gourav Kumar Dubey PGDM-(2011-13)


1.2 Twitter is an online social networking and microblogging service that enables its users to send and read text-based messages of up to 140 characters, known as "tweets". Founded by Jack Dorsey and launched on July 15, 2006, it had over 500 million registered users as of 2012.

1.3 LinkedIn is a business-related social networking site mainly used for professional networking. A user's list of connections can be used to build up a contact network, follow different companies and find jobs, people and business opportunities. Founded by Reid Hoffman, Allen Blue, Konstantin Guericke, Eric Ly and Jean-Luc Vaillant and launched on May 5, 2003, it has more than 175 million registered users in more than 200 countries and territories.

1.4 MySpace is an online community of users' personal profiles. These typically include photographs, information about personal interests and blogs. Founded by Chris DeWolfe and Tom Anderson and launched in August 2003, it had 25 million users as of June 2012.

1.5 YouTube is a video-sharing website, created by three former PayPal employees in February 2005, on which users can upload, view and share videos. Founded by Chad Hurley, Steve Chen and Jawed Karim and launched on February 14, 2005, it is available in 54 language versions through its user interface.
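The three-part definition at the start of this section (a profile, a list of connections, and the ability to traverse connections) can be illustrated with a very small in-memory graph. This is only a sketch: the class name `SocialGraph`, its methods and the sample users are hypothetical and do not describe how any of the sites above store their data.

```python
from collections import defaultdict


class SocialGraph:
    """Toy in-memory model of the three-part SNS definition."""

    def __init__(self):
        self.profiles = {}                    # (1) user -> profile fields
        self.connections = defaultdict(set)   # (2) user -> connected users

    def create_profile(self, user, **fields):
        self.profiles[user] = dict(fields)

    def connect(self, a, b):
        # Connections are symmetric here, as with Facebook friendships.
        if a not in self.profiles or b not in self.profiles:
            raise KeyError("both users need a profile first")
        self.connections[a].add(b)
        self.connections[b].add(a)

    def friends(self, user):
        return sorted(self.connections[user])

    def friends_of_friends(self, user):
        # (3) traverse the connections made by a user's own connections.
        direct = self.connections[user]
        reachable = set()
        for friend in direct:
            reachable |= self.connections[friend]
        return sorted(reachable - direct - {user})


graph = SocialGraph()
for name in ("alice", "bob", "carol", "dave"):
    graph.create_profile(name, visibility="public")
graph.connect("alice", "bob")
graph.connect("bob", "carol")
graph.connect("carol", "dave")

print(graph.friends("bob"))               # ['alice', 'carol']
print(graph.friends_of_friends("alice"))  # ['carol']
```

A production system would keep this graph in a database rather than in memory, which is the subject of the following sections.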

2. Different Methods Used For The Management Of Facebook Data:


Facebook has been developed from the ground up using open source software. Developers building on the Facebook Platform scale their own applications using many of the same infrastructure technologies that power Facebook. Facebook stores up to 800 pages of personal data per user account. With 955 million active users (as of June 2012), how does it actually work?


2.1. Platform. Facebook's Platform engineering team has released and maintains open source SDKs for Android, iOS, JavaScript and PHP.

2.1.1. PHP is a very popular scripting language which makes up the majority of Facebook's code base. Its simple syntax allows engineers to move fast and iterate rapidly on products. PHP can be used to enhance web pages, and it is commonly used to build applications such as blogs, wikis and content management systems, so even a high-level understanding of PHP can be very useful.

2.1.1.1. Content Management Software allows users to focus on the content rather than on the mechanics of presenting the information. Web sites built this way tend to be very standards compliant, which means that the content will be displayed correctly across browsers and types of devices (including often overlooked smartphones). Popular CMSs include XOOPS, Joomla, PHP-Nuke, MODx, phpwcms, Drupal, e107 and Mambo.

2.1.3. Linux & Apache. Linux is a Unix-like operating system kernel. It is open source, very customizable, and good for security. Facebook runs Apache HTTP Server on the Linux operating system. Apache is also free and is the most popular open source web server in use.

2.1.4. MySQL. For the database, Facebook uses MySQL because of its speed and reliability. MySQL is used primarily as a key-value store, with data distributed randomly amongst a large set of logical instances. These logical instances are spread out across physical nodes, and load balancing is done at the physical node level. As for customizations, Facebook has developed a custom partitioning scheme in which a global ID is assigned to all data. It also has a custom archiving scheme that is based on how frequently and how recently data is accessed on a per-user basis.
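The two-level layout described for MySQL (a global ID mapped to a logical instance, and logical instances mapped to physical nodes) can be sketched as follows. This is a minimal illustration under stated assumptions, not Facebook's actual partitioning scheme: the shard count, the host names, the use of SHA-1 and the round-robin placement are all hypothetical choices made for the example.

```python
import hashlib

NUM_LOGICAL_SHARDS = 4096                   # hypothetical number of logical instances
PHYSICAL_NODES = ["db-a", "db-b", "db-c"]   # hypothetical host names


def logical_shard(global_id: int) -> int:
    """Hash the global ID so that rows land on logical shards pseudo-randomly."""
    digest = hashlib.sha1(str(global_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LOGICAL_SHARDS


def build_shard_map(nodes):
    """Place logical shards on physical nodes round-robin (an assumed policy)."""
    return {shard: nodes[shard % len(nodes)] for shard in range(NUM_LOGICAL_SHARDS)}


def locate(global_id, shard_map):
    """Return the (logical shard, physical node) pair that holds a global ID."""
    shard = logical_shard(global_id)
    return shard, shard_map[shard]


shard_map = build_shard_map(PHYSICAL_NODES)
for gid in (1001, 1002, 1003):
    shard, node = locate(gid, shard_map)
    print(f"global id {gid} -> logical shard {shard} on {node}")
```

Because the application only ever computes the logical shard, load can be rebalanced at the physical level by editing `shard_map` (moving a logical shard to another node) without changing how IDs are hashed.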


3. Developer tools

3.1. Codemod assists with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention.

3.2. Facebook Animation is a JavaScript library for creating customizable animations using DOM and CSS manipulation.

3.3. Online Schema Change for MySQL lets you alter large database tables without taking your cluster offline.

3.4. Phabricator is a collection of web applications which make it easier to write, review, and share source code. It is currently available as an early release.

3.5. phpsh provides an interactive shell for PHP that features readline history, tab completion, and quick access to documentation. It is, ironically, written mostly in Python.

3.6. Three20 is an Objective-C library for iPhone developers which provides many UI elements and data helpers that were used in older versions of Facebook's iPhone applications.

3.7. XHP is a PHP extension which augments the syntax of the language such that XML document fragments become valid expressions.

3.8. XHProf is a function-level hierarchical profiler for PHP with a simple HTML-based navigational interface.

4. NoSQL Database Technologies

NoSQL databases, each eschewing the relational data model, are often a better match for the needs of modern interactive software systems.

4.1. No schema required. Data can be inserted in a NoSQL database without first defining a rigid database schema. This provides immense application flexibility, which ultimately delivers substantial business flexibility.

4.2. Auto-sharding (sometimes called elasticity). A NoSQL database automatically spreads data across servers, without requiring applications to participate. A properly managed NoSQL database system should not need to be taken offline, supporting 24x7x365 continuous operation of applications.
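Both properties can be seen in a toy document store. The sketch below is an illustration only and does not model any particular NoSQL product: `TinyDocumentStore`, the three in-memory "servers" and the CRC32-modulo placement are assumptions made for the example.

```python
import json
import zlib


class TinyDocumentStore:
    """Toy schemaless store that spreads documents over in-memory 'servers'."""

    def __init__(self, server_count=3):
        self.servers = [dict() for _ in range(server_count)]

    def _server_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        return self.servers[zlib.crc32(key.encode("utf-8")) % len(self.servers)]

    def put(self, key, document):
        # No schema is declared anywhere: any JSON-serialisable shape is accepted.
        self._server_for(key)[key] = json.dumps(document)

    def get(self, key):
        raw = self._server_for(key).get(key)
        return None if raw is None else json.loads(raw)


store = TinyDocumentStore()
# Two documents with different fields, inserted without defining a schema.
store.put("user:1", {"name": "Asha", "city": "Ghaziabad"})
store.put("user:2", {"name": "Ravi", "interests": ["cricket", "php"], "age": 27})

print(store.get("user:2")["interests"])              # ['cricket', 'php']
print([len(server) for server in store.servers])     # documents held per server
```

The caller only uses `put` and `get` and never chooses a server, which is the sense in which the application does not participate in sharding. A real system would also move data when servers are added or removed; simple modulo placement, as used here, does not handle that.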

5. Infrastructure

5.1. Apache Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure.

5.2. Apache Hive is a data warehouse infrastructure built on top of Hadoop that provides tools to enable easy data summarization, ad hoc querying and analysis of large datasets.

5.2.1. Apache Hadoop provides reliable, scalable, distributed computing infrastructure which Facebook uses for data analysis.

5.2.2. Apache HBase is a distributed, versioned, column-oriented data store built on top of the Hadoop Distributed Filesystem.

5.3. FlashCache is a general purpose write-back block cache for Linux. It was developed as a loadable Linux kernel module using the Device Mapper, and sits below the filesystem.

5.4. HipHop for PHP transforms PHP source code into highly optimized C++. HipHop offers large performance gains and was developed over the past two years.

5.5. Scribe is a scalable service for aggregating log data streamed in real time from a large number of servers.

5.6. Thrift provides a framework for scalable cross-language services development in C++, Java, Python, PHP, and Ruby.

5.7. Tornado is a relatively simple, non-blocking web server framework written in Python. It is designed to handle thousands of simultaneous connections, making it ideal for real-time Web services.

5.8. Varnish serves billions of requests every day to Facebook users around the world. Whenever you load photos and profile pictures of your friends, there is a very good chance that Varnish is involved.
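The write-back policy mentioned for FlashCache can be illustrated in a few lines. This is a user-space toy, not FlashCache itself (which is a kernel module operating on block devices): the class `WriteBackCache`, the dictionary standing in for the slow disk, and the least-recently-used eviction rule are all assumptions made for the sketch.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy write-back cache in front of a slow backing store (a dict here)."""

    def __init__(self, backing_store, capacity=2):
        # capacity is assumed to be at least 1
        self.backing = backing_store
        self.capacity = capacity
        self.entries = OrderedDict()   # block -> (data, dirty), oldest first

    def write(self, block, data):
        # Writes are acknowledged from the cache; the backing store is updated later.
        self.entries[block] = (data, True)
        self.entries.move_to_end(block)
        self._evict_if_needed()

    def read(self, block):
        if block not in self.entries:
            self.entries[block] = (self.backing[block], False)
            self._evict_if_needed()
        self.entries.move_to_end(block)
        return self.entries[block][0]

    def _evict_if_needed(self):
        # Evict least recently used blocks, writing dirty ones back first.
        while len(self.entries) > self.capacity:
            block, (data, dirty) = self.entries.popitem(last=False)
            if dirty:
                self.backing[block] = data

    def flush(self):
        for block, (data, dirty) in list(self.entries.items()):
            if dirty:
                self.backing[block] = data
                self.entries[block] = (data, False)


disk = {}
cache = WriteBackCache(disk, capacity=2)
cache.write("b1", "x")
cache.write("b2", "y")
print(disk)   # {}  (nothing has reached the backing store yet)
cache.write("b3", "z")
print(disk)   # {'b1': 'x'}  (b1 was evicted and written back)
cache.flush()
print(disk)   # {'b1': 'x', 'b2': 'y', 'b3': 'z'}
```

The trade-off of write-back caching is visible here: writes are fast because they stop at the cache, but dirty blocks exist only in the cache until they are evicted or flushed.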

6. Documentation and Code Licensing

Documentation and source code examples within the Facebook Platform Documentation are licensed under the terms of the Creative Commons Attribution-ShareAlike 3.0 license. Source code examples within the Facebook Platform Documentation are further licensed under the terms of the Apache License, Version 2.0.

7. Conclusion

In a nutshell, that is how Facebook and other social networking sites (SNS) maintain and manage their large databases. If you look past all of the features and innovations, the main idea behind Facebook is really very basic: keeping people connected. Facebook realizes the power of social networking and is constantly innovating to keep its service the best in the business.

8. Key References

8.1. Salina Adinaryana, G. Samuel Vara Prasada Raju and Allam Mohan (2012), Detecting Identification Anomalies in Social Networking with Cluster based re-ranking and Slink Algorithms, International Journal of Modern Engineering Research (IJMER), Vol. 2, Issue 4, July-Aug. 2012, pp. 2839-2842.

8.2. Fiona Redmond (2010), Social Networking Sites: Evaluating and Investigating their use in Academic Research, http://creativecommons.org/licenses/by-nc-sa/1.0/.

8.3. Eytan Adar and Christopher Re (2007), Managing Uncertainty in Social Networks, journal of the IEEE Computer Society Technical Committee on Data Engineering.

8.4. Andrea Broughton, Tom Higgins, Ben Hicks and Annette Cox (2010), Workplaces and Social Networking, Acas, Advice leaflet - Internet and e-mail policies: www.employmentstudies.co.uk.

8.5. www.facebook.com

8.5.1. http://developers.facebook.com/licensing/

8.5.2. http://mirror.facebook.net/apache/

8.5.3. http://developers.facebook.com/docs/
