
Vision of Data Processing Environment at Social Research Institute.

Background
Social research institutes play a vital role in coordination between government and the public, since they take part in formulating and monitoring development plans. They also help in understanding the nature and status of the society within their area of study. Because of this specific role, they hold data of a different nature from that of industry and the government statistical system. Industry places more emphasis on managerial data (for decisions on day-to-day activity), which is therefore mostly online in nature; social research institutes use data for policy formulation and monitoring, which is therefore mostly offline in nature. Unlike government statistical organizations, they often rely on qualitative surveys and experiment with new methodologies, indicators and subjective data. Unlike both industry and the government statistical system, they depend heavily on data from many sources, which are not always aligned in time period, geographical coverage or purpose. Because of these specific data requirements, social research organizations need a particular type of data processing environment.

The vast computational power made available by Information Technology (IT) over the last two decades or so has significantly affected the techniques for designing and implementing social research (qualitative and quantitative). Parallel to the developments in hardware, there have been significant improvements in the quality and user friendliness of software for statistical data processing, analysis and dissemination. This has made it possible for many processing tasks to move from computer experts to subject matter specialists. A number of software packages for processing statistical surveys have emerged over the years. The relative strengths of these products differ across the steps of data processing.
The use of suitable software for the different steps of data processing, together with training, therefore has a significant role in any plan for realizing the vision of a modern data processing system.

Vision
The vision of a data processing environment for a social research institute may be expressed through the following capacities and behaviors:
1. The institute is capable of large-scale qualitative and quantitative data analysis.
2. Data from any qualitative or quantitative research can be released for analysis within four months of field work.
3. Data processing helps in monitoring field work (problems of probing etc.) through patterns in the incoming data.
4. The institute has sufficient computational and analytical skill to use the full strength of computer-based analysis (see Annexure-1).
5. It can easily adopt any new methodological change in data capture, analysis, presentation and dissemination, as well as in computerized content and knowledge management systems.
6. It has a rich data bank comprising all relevant data and documents, whether owned by the institute or collected from others (possibly panel data). The data bank is integrated with a broader network, with various levels of access for users of the data.
7. It has good links with other institutes and with individual users of its studies, for sharing data and ideas through a social network.

Organization of Data Processing


To proceed in the direction of the above vision, a modular approach will provide more adaptability and flexibility in implementing the plan. The total data processing environment may be divided into centers that perform different steps of data processing. These centers are defined according to the nature of the work, the software required (and the training for it) and the skills needed to perform the task. It is not essential to develop all centers to perfection simultaneously; they can be developed in phases. Although the data bank is the central part of data processing, the system can be developed from the periphery. Centers may be given priority as follows:

(1) Data preparation center


[Figure: the five centers of the data processing environment: Data Preparation Center, Center for Analysis, Data Bank, Dissemination Center, Center of Social Network]

Although data may come in different forms (text, numbers, audio, video etc.), at the initial stage we can concentrate on numeric (quantitative) and textual data obtained from quantitative and qualitative surveys. Data preparation for quantitative and for qualitative surveys is entirely different (and hence requires different skills and software), so a separate wing may be created for each. The requirements of the wings are as follows:

Quantitative wing:
Hardware: PCs (of moderate strength). The number may vary with the work load.
Software: CSPro (free to download).
Responsibility: Data entry, data validation, codification, basic predefined tabulation, generation of field monitoring reports.
Skill: The in-charge of the center should understand (1) the logic of the questionnaire, (2) the steps of data preparation, (3) program development in CSPro and (4) the basics of databases, spreadsheets and data archiving. The rest of the staff will work as data entry operators, for whom basic knowledge of computers (the file system) is required.
Link: Questionnaire preparation team, data bank.

Qualitative wing:
Hardware: PCs (of moderate strength). The number may vary with the work load.
Software: ATLAS.ti, Anthropac, AnSWR (free), EZ-Text (free).
Responsibility: Entry of field reports (or their summaries) in the format required by the software, and creation of codes.
Skill: Understanding of the subject and the ability to create suitable quotations and codes from text. All faculty and research scholars involved in qualitative research should be able to run such software.
Link: Qualitative research team, data bank.
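The validation and field monitoring responsibilities of the quantitative wing can be sketched in a few lines. This is only an illustrative sketch in Python, not CSPro code: the field names, the range rules, the consistency check and the interviewer labels are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical questionnaire rules: field name -> (min, max) allowed values.
RANGE_RULES = {"age": (0, 110), "household_size": (1, 30)}


def validate_record(record):
    """Return a list of problems found in one keyed questionnaire record."""
    problems = []
    for field, (low, high) in RANGE_RULES.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not low <= value <= high:
            problems.append(f"{field}: {value} outside {low}-{high}")
    # Hypothetical consistency check between two answers of the same record.
    age = record.get("age")
    if age is not None and age < 10 and record.get("married"):
        problems.append("married: inconsistent with age under 10")
    return problems


def field_monitoring_report(records):
    """Count checked and flagged records per interviewer."""
    report = defaultdict(lambda: {"records": 0, "flagged": 0})
    for record in records:
        row = report[record["interviewer"]]
        row["records"] += 1
        if validate_record(record):
            row["flagged"] += 1
    return dict(report)


# Made-up incoming records from two interviewers.
records = [
    {"interviewer": "A", "age": 34, "household_size": 5, "married": True},
    {"interviewer": "A", "age": 7, "household_size": 4, "married": True},
    {"interviewer": "B", "age": 130, "household_size": 3, "married": False},
    {"interviewer": "B", "age": 41, "household_size": None, "married": True},
]
report = field_monitoring_report(records)
for interviewer, row in sorted(report.items()):
    print(interviewer, row)
```

An interviewer whose share of flagged records is unusually high is a candidate for follow-up on probing or recording problems while field work is still in progress.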

(2) Center for analysis


All faculty and research scholars should be attached to the analysis center.
Hardware: PCs with sufficient RAM and CPU strength for all faculty, and a good lab for research scholars.
Responsibility: Exploratory and confirmatory data analysis, report writing, preparation of presentations.
Skill: Knowledge of word processors, spreadsheets, slide preparation tools, statistical software, and GIS-based modeling and simulation.
Software: MS Office, Open Office (free), Epi Info (free) for presentation through maps (other open source GIS software may be selected according to the level of requirement, see for other sources), Stata (more suitable for analysis of large complex surveys).
Link: Data bank.

(3) Dissemination center


Hardware: PCs of sufficient strength.
Software: An HTML editor and content management tools; basic knowledge of HTML and CSS is needed. Many tools are available that reduce the programming load for the user. Drupal is one of them and is freely available. Many free HTML editors are also available, and most content management tools have their own HTML editor.
Responsibility: The center will receive raw documents as soft copy from the faculty and convert them into a suitable format for publishing (in hard copy as well as on the web). Until the data bank is developed, all parts of content management (creation, editing, publishing and archiving) will be the responsibility of this center.
Skill: Aesthetic sense in word processing, and skill in using content management tools.
Link: Analysis center, data bank.

(4) Center of social network


No social research institute can work in isolation. Recent developments in IT and the web have made it possible to use social networks for learning and research. Social networking has many benefits at the individual as well as the organizational level. The benefits at the organizational level are:
1. Make sure knowledge gets to people who can act on it in time.
2. Connect people and organizations to build relationships across boundaries of geography or discipline.
3. Provide an ongoing context for knowledge exchange that can be far more effective than memoranda.
4. Attune everyone in the institute to each other's needs: more people will know who knows who knows what, and will know it faster.
5. Multiply intellectual capital by the power of social capital, reducing social friction and encouraging social cohesion.
6. Create an ongoing, shared social space for people who are geographically dispersed.
7. Amplify innovation: when groups get turned on by what they can do online, they go beyond problem-solving and start inventing together.
8. Create a community memory for group deliberation and brainstorming that stimulates the capture of ideas and facilitates finding information when it is needed.
9. Improve the way individuals think collectively, moving from knowledge sharing to collective knowing.
10. Turn training into a continuous process, not divorced from normal business processes.

Hardware: PC with sufficient bandwidth.
Software: Most social software is available as web services and is free.
Responsibility: The in-charge of the center will analyze, expand and maintain the social network of the institute.
Skills: All faculty and staff will be members of this center, but the in-charge will maintain communication on behalf of the institute on social network platforms.
Link: Faculty and staff, all centers, external people and organizations.

(5) Data bank


The data bank is the central part of the data processing system. It is the center through which the other centers will be coordinated. Apart from holding the institute's own data and reports, the center will work as a consortium with different academic and research institutes as well as external socio-economic data banks such as the Inter-university Consortium for Political and Social Research, the United Nations Statistics Division, the Minnesota Population Center, the IQSS Dataverse Network etc.

Hardware: Server and PCs with sufficient bandwidth. The institute can hire web hosting services for maintaining its external link.
Software: Tools for the webmaster (to be selected by the webmaster according to his or her confidence; many open source tools are available).
Skill: The role of the data center is very challenging. Its in-charge should be capable of configuring a server, installing applications at the host site and integrating web services, and should know server and client scripting languages (like PHP and JavaScript) and database management tools.
Responsibility: The responsibilities of the data bank center are:
1. Create a catalog of data and reports.
2. Apply uniform codes for geographical areas (across different data sets) so that the data sets may be linked.
3. Create different aggregation levels of data as needed.
4. Provide data in the required format.
5. Create metadata for data collected by the institute. This will help in sharing data.
6. Prepare time series micro-economic data banks.
7. Perform the role of webmaster.
Links: All centers and the external network.
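Responsibilities 2 and 3 above (uniform geographical codes and aggregation levels) can be illustrated with a minimal sketch. The source names, the local codes, the "district-village" form of the uniform code and all figures are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical lookup from each source's local area code to one uniform code.
# The uniform code is assumed to have the form "<district>-<village>".
UNIFORM_CODE = {
    ("census", "0101"): "D01-V01",
    ("census", "0102"): "D01-V02",
    ("census", "0201"): "D02-V01",
    ("survey", "A/1"): "D01-V01",
    ("survey", "A/2"): "D01-V02",
    ("survey", "B/1"): "D02-V01",
}


def recode(source, rows):
    """Re-key the rows of one data set by the uniform area code."""
    return {UNIFORM_CODE[(source, local)]: value for local, value in rows.items()}


def link(population, literate):
    """Join two recoded data sets on the areas they have in common."""
    shared = sorted(population.keys() & literate.keys())
    return {
        code: {"population": population[code], "literate": literate[code]}
        for code in shared
    }


def aggregate_by_district(linked):
    """Roll village-level rows up to district level."""
    totals = defaultdict(lambda: {"population": 0, "literate": 0})
    for code, row in linked.items():
        district = code.split("-")[0]
        totals[district]["population"] += row["population"]
        totals[district]["literate"] += row["literate"]
    return dict(totals)


# Made-up village-level figures from two sources with different local codes.
census = recode("census", {"0101": 1200, "0102": 800, "0201": 1500})
survey = recode("survey", {"A/1": 600, "A/2": 500, "B/1": 900})
villages = link(census, survey)
districts = aggregate_by_district(villages)
for district, row in sorted(districts.items()):
    rate = row["literate"] / row["population"]
    print(district, row, f"literacy rate = {rate:.2f}")
```

Once every data set carries the same area code, linking and aggregation reduce to simple lookups; without it, each new data set needs its own manual matching.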

Challenges in realizing the vision


1. It is difficult to identify a role model. A lot of experimentation is going on at the international level. The institute needs to choose its own path cautiously, learning from the ongoing experimentation.
2. IT people are trained for the needs of business and industry. It may be difficult to identify suitable people (or trainers) for the needs of the institute.
3. There may be resistance to change in the roles of faculty and staff.
4. Old habits may resist change. The chances of resistance increase because the gain (from the data processing system) can be perceived only after a certain level of perfection is reached.
5. Training is crucial for the vision. For training to succeed, targets of achievement must be fixed at the organizational and individual level (in terms of work) after each training. This is difficult to implement. Trainers also may not be ready for it (it will require many follow-ups).
6. The hierarchy may object to assigning a higher role to an efficient person.
7. Participants may have weak motivation for training.

Conclusion
From the above discussion, it is clear that developing a good data processing system requires, apart from investment in hardware, little monetary investment in software. The real issue in developing a good data processing environment is training. Training in most areas is also available on the net (even free). Sufficient will and motivation can lead a social research institute towards developing a modern data processing environment.

Annexure-1

Role of Computational Skill in Statistical Analysis


Hurdles in statistical analysis
1. Vague vision regarding statistics: whether it is numbers, a methodology or a way of thinking;
2. Less importance is given to variation than to the center of the data. The main cause seems to be lack of computational capability. For this reason, statistical scales could not be developed properly;
3. Simulation as a tool of analysis could not get the importance it deserves, again due to lack of computational skill;
4. Data-based statistical weights are not used to convert an unknown phenomenon into a number (use of latent variables), which creates unresolved disputes;
5. Lack of a proper sampling design restricts the generalization of results in the right manner;
6. Statistical results are generally interpreted as causal relationships.
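Hurdle 2 can be made concrete with a small sketch using Python's standard `statistics` module. The two sets of figures are made up for illustration: they share the same mean and median, so a report based only on the center of the data would treat them as identical.

```python
import statistics

# Illustrative (made-up) incomes for two villages with the same mean and median.
village_a = [95, 100, 105, 100, 100]
village_b = [20, 40, 100, 160, 180]


def summarize(values):
    """Report the center of the data together with its variation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "range": max(values) - min(values),
    }


summary_a = summarize(village_a)
summary_b = summarize(village_b)
print("A:", summary_a)
print("B:", summary_b)
```

Only the measures of variation (standard deviation and range) reveal that the second village is far more unequal than the first.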

Common view on computer based computational capability


1. It is required because it works fast;
2. It is useful because it hides the mathematical complexity of statistical tools;
3. The results obtained are more accurate;
4. There is little computational burden;
5. The only investment is a computer, though some feel that a statistical package and the skill to run it are also required.

What is reality
1. It works fast only if the data is organized in a proper format;
2. It hides mathematical complexity, but it requires a clear understanding of the assumptions and interpretation behind statistical tools. Applying tools without a feel for the data may lead to misleading results;
3. It may give inaccurate, sometimes disastrous, results if proper steps are not followed;
4. Yes, it eases the burden of computation, if logical complexities are few and the data set is large;
5. Apart from investment in a computer and the skill to run statistical software, the skill to organize data is required.

In fact, most analysts did not change their orientation towards data analysis in spite of the fast improvement in computational capabilities. How the new framework of analysis should differ from the old one, which capabilities it requires and which new concepts emerge from the power of computational tools can be understood by comparing the old framework of analysis with the new one (as follows):

Old framework of analysis compared with new framework of analysis

1. Old: Analytical work starts by following precedent in the area of study.
   New: Analysis starts with an attempt to know and feel the data (exploratory data analysis).
2. Old: The format of analysis is fixed before data collection is planned.
   New: A mixed strategy is followed, with more emphasis on learning from the data.
3. Old: Computational skill and the skill of analysis and interpretation are treated as different entities.
   New: Computational and analytical skill are needed in the same person.
4. Old: Descriptive analysis is based only on measures of central tendency such as mean, median, mode etc.
   New: Apart from studying the central tendency of the data, more emphasis is given to variation in the data.
5. Old: Testing the assumptions behind statistical tools is almost neglected.
   New: Testing the assumptions of tools, and transforming the data to meet these assumptions, is given importance.
6. Old: Anything computed is worth reporting.
   New: A major part of computation is meant for understanding and feeling the data.
7. Old: Computational work cannot be reused.
   New: Reusability is a significant part of skill.
8. Old: Analysis is believed to start after obtaining the data.
   New: Analysis is believed to start with the planning of the survey.
9. Old: Missing values and non-response are not given due weight because of computational problems.
   New: Missing values and non-response can be handled easily.
10. Old: Sampling design is not important for developing a model.
    New: Sampling design is important for applying a model.
11. Old: Only those statistical models should be used which have clear mathematical solutions.
    New: Simulation may be used where an analytical solution is not possible.
12. Old: Understanding the behavior of data in terms of probability is not very important.
    New: Understanding the probabilistic interpretation of the behavior of data is important.
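The point that simulation may be used where an analytical solution is not possible can be sketched with a percentile bootstrap for the median, a statistic whose sampling error has no simple closed-form formula. The sample values, the number of replicates and the 90% level are assumptions chosen for illustration, and the sketch treats the data as a simple random sample (a complex sampling design would need its weights and strata taken into account).

```python
import random
import statistics


def bootstrap_median_interval(values, replicates=2000, level=0.90, seed=0):
    """Approximate percentile-bootstrap interval for the median.

    Resamples the data with replacement, recomputes the median each time,
    and reads the interval off the sorted simulated medians.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(replicates)
    )
    tail = (1 - level) / 2
    lower = medians[int(tail * replicates)]
    upper = medians[int((1 - tail) * replicates) - 1]
    return lower, upper


# Made-up household expenditure figures.
sample = [12, 15, 9, 22, 30, 18, 14, 41, 11, 16, 19, 25, 13, 17, 28]
lower, upper = bootstrap_median_interval(sample)
print("sample median:", statistics.median(sample))
print("approximate 90% interval for the median:", (lower, upper))
```

Because the seed is fixed, the same script gives the same interval on every run, which also illustrates the reusability of computational work stressed in the new framework.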
