Salon 3.0 System
FULL TEXT SEARCH - SOLUTIONS EVALUATION
Submitted by
www.Patni.com
Version: 1.0
Date: 07 November 2008
TABLE OF CONTENTS
1 INTRODUCTION
1.1 Salon 3.0 Requirements
2 POSSIBLE SOLUTIONS
2.1 File System Based Search - Using Windows Indexing Service
2.1.1 Steps for configuring this service
2.1.2 Security Points for Windows Indexing Service
2.1.3 Pros and Cons
2.2 Database Based Search Using SQL Server 2005 Full Text Search
2.2.1 Pros and Cons
3 POC DETAILS
3.1 POC Implementation Details
3.1.1 Using Windows Indexing Service
3.1.2 Using SQL Server 2005 FullText Search
4 PATNI RECOMMENDATION
5 APPENDIX
5.1 Reference
DOCUMENT CONTROL:
Security Classification: Patni Confidential
Issue Date: 07 November 2008
Author(s):
Name             Title
Sameer M         Technical Designer
Archana Kamat    Technical Architect
Reviewer(s):
Document History:
Date           Revision    Change
17 Sep 2008    0.01D       Initial Draft
07 Nov 2008    1.0
1 INTRODUCTION
This document describes the Full Text Search requirement of the Salon 3.0 system and explores
possible solutions for implementing it.
2 POSSIBLE SOLUTIONS
The Full Text Search requirements can be met in the following ways:
2.1 File System Based Search - Using Windows Indexing Service
2.1.1 Steps for configuring this service:
4) Right-click 'Indexing Service' and select 'New' > 'Catalog' from the menu that appears.
This opens the catalog dialogue box.
5) Enter a name for the catalog, such as Search, and specify the location where the catalog
will be stored.
6) Press 'OK' to continue.
7) Under the newly created catalog, select the Directories folder. Right-click it and select
the 'New' > 'Directory' menu option. In the window that is displayed, enter the path of the
directory that needs to be included in the search operation.
2.1.2 Security Points for Windows Indexing Service
The Indexing Service runs under the local System account. It cannot be configured to run
in any other context.
If the System account does not have access to documents or directories, the Indexing
Service will not be able to index those documents.
Any authenticated local or remote user can issue Indexing Service queries.
2.1.3 Pros and Cons
Pros
Cons
2.2 Database Based Search Using SQL Server 2005 Full Text Search
Steps for using this service:
1) Open the Microsoft SQL Server Management Studio and connect to the SQL Server 2005
database instance where the full text catalog setup needs to be created.
2) Create a table for storing files. For example:
CREATE TABLE [dbo].[Documents](
    [documentid] [int] IDENTITY(1,1) NOT NULL,
    [FileName] [nvarchar](50) NULL,
    [FileSize] [int] NULL,
    [ContentType] [nvarchar](50) NULL,
    [full_Text_bin] [varbinary](max) NULL,
    [Extention] [nchar](10) NULL,
    CONSTRAINT [pk_documents] PRIMARY KEY CLUSTERED
    (
        [documentid] ASC
    ))
3) Ensure that full-text search is enabled on the selected database. Open the database
properties screen and select the Files page. This page has a "Use full-text indexing"
checkbox for enabling or disabling full-text search on the database. If the option is
disabled, enable it by ticking the checkbox.
4) On the selected database, go to the Storage -> Full Text Catalogs folder. Right-click it and
select the New Full Text Catalog option. This opens a new window.
5) In the window, enter the catalog name (for example Search), its location and other details,
and click the OK button. This creates the catalog for the database.
6) Select the catalog, right-click it and select the Properties option from the menu. A
new window is displayed.
7) In the displayed window, select the Tables/Views option.
8) Assign the Documents table from the displayed list to the catalog. In the Selected
object properties section, under Available Columns, tick the check box for the
full_Text_bin column. Under Data Type Column for the full_Text_bin column, select
the Extention column (the file extension column of the Documents table) from the dropdown.
Click the OK button to save the changes.
9) This completes the full-text catalog setup on SQL Server 2005.
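The same setup can also be scripted instead of being clicked through in Management Studio. The following is a minimal sketch, assuming the Documents table from step 2 and a catalog named Search as in step 5. The class and method names are hypothetical, and the catalog location and other optional settings from step 5 are left at their defaults.

```csharp
using System.Data.SqlClient;

// Hypothetical helper that scripts steps 3 to 8 with T-SQL.
public static class FullTextSetup
{
    // The statements are executed one by one, outside any user transaction.
    public static readonly string[] Statements =
    {
        // Step 3: enable full-text indexing on the current database.
        "EXEC sp_fulltext_database 'enable'",
        // Steps 4-5: create the catalog.
        "CREATE FULLTEXT CATALOG [Search]",
        // Steps 6-8: index full_Text_bin; the Extention column tells the
        // engine which filter to use for each stored document.
        "CREATE FULLTEXT INDEX ON dbo.Documents " +
        "(full_Text_bin TYPE COLUMN Extention) " +
        "KEY INDEX pk_documents ON [Search]"
    };

    public static void Run(string connectionString)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            connection.Open();
            foreach (string statement in Statements)
            {
                using (SqlCommand command = new SqlCommand(statement, connection))
                {
                    command.ExecuteNonQuery();
                }
            }
        }
    }
}
```

Scripting the setup makes it repeatable across environments, which may matter if the database is hosted on a shared server where GUI access is restricted.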
2.2.1 Pros and Cons
Pros
Cons
From a programming point of view, storing files in the database and retrieving them from it
is more tedious than working with the file system.
SQL Server 2005 imposes a restriction on the size of a file that can be stored in the DB. The
maximum file size allowed is 2 MB.
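To give a feel for the first point, the sketch below shows roughly what storing one file in the Documents table from section 2.2 involves, where a file-system based solution needs only a file copy. It is an illustration, not the POC code; the class and method names are hypothetical. The extension is stored with its leading dot, which is the form Path.GetExtension returns.

```csharp
using System.Data;
using System.Data.SqlClient;
using System.IO;

// Hypothetical helper for storing a file in dbo.Documents.
public static class DocumentStore
{
    // Builds the parameterised INSERT for one file; no connection is needed yet.
    public static SqlCommand BuildInsert(string fileName, byte[] content, string contentType)
    {
        SqlCommand command = new SqlCommand(
            "INSERT INTO dbo.Documents " +
            "(FileName, FileSize, ContentType, full_Text_bin, Extention) " +
            "VALUES (@name, @size, @type, @content, @ext); " +
            "SELECT CAST(SCOPE_IDENTITY() AS int);");
        command.Parameters.Add("@name", SqlDbType.NVarChar, 50).Value = Path.GetFileName(fileName);
        command.Parameters.Add("@size", SqlDbType.Int).Value = content.Length;
        command.Parameters.Add("@type", SqlDbType.NVarChar, 50).Value = contentType;
        // Size -1 maps the parameter to varbinary(max).
        command.Parameters.Add("@content", SqlDbType.VarBinary, -1).Value = content;
        command.Parameters.Add("@ext", SqlDbType.NChar, 10).Value = Path.GetExtension(fileName);
        return command;
    }

    // Reads the file from disk, inserts it and returns the new documentid.
    public static int Save(string connectionString, string path, string contentType)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = BuildInsert(path, File.ReadAllBytes(path), contentType))
        {
            command.Connection = connection;
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}
```

Note that file names longer than the 50 characters allowed by the FileName column would be truncated by the parameter size, so a real implementation would need to validate the name first.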
3 POC DETAILS
A POC was carried out to test both Full Text Search solutions.
The details of the POC are as follows.
The following files were used for the full text search:
POC for Salon.xls (20 KB)
Analysis Report Screen.htm (3 KB)
License.txt (3 KB)
proposal-expectations.doc (86 KB)
SalonSystem3.0_Request_Module_UCS.doc (170 KB)
abcpdf.pdf (336 KB)
NET Memory Profiler.pdf (1.23 MB)
A web application was created with a text box for entering the search criteria and two submit
buttons, one for File System Full Text Search and the other for Database Full Text Search.
The user enters the search string in the text box and clicks one of the submit buttons.
On clicking the Search In File System button, the search is performed on the catalog
created on the IIS machine. The search results and the turn-around time of the
search operation are recorded and displayed on the screen.
On clicking the Search In Database button, the search for the entered string is performed on the
catalog created on the database. The search results and the turn-around time of the
search operation are recorded and displayed on the screen.
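The document does not show how the turn-around time was measured. One straightforward way to do it, given here as a sketch with hypothetical names, is to wrap either search call in a Stopwatch so that both solutions are timed in exactly the same way.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: times any search call and hands back its result.
public static class SearchTimer
{
    public static T Measure<T>(Func<T> search, out TimeSpan elapsed)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        T result = search();          // run the file-system or database search
        stopwatch.Stop();
        elapsed = stopwatch.Elapsed;  // turn-around time of this one call
        return result;
    }
}
```

A button handler could then call, for example, `SearchTimer.Measure(() => RunSearch(text), out elapsed)`, where RunSearch stands for whichever search method the button triggers, and display `elapsed.TotalMilliseconds` next to the results.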
3.1 POC Implementation Details
3.1.1 Using Windows Indexing Service
Code snippet for searching the file system using the OleDb provider of ADO.NET:
DataSet ds = new DataSet();
string query = "";
try
{
    string strconnection = ConfigurationManager.AppSettings["IndexService"].ToString();
    OleDbConnection connection = new OleDbConnection(strconnection);
    if (strsearchstring.IndexOf('*') > 0)
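The snippet above breaks off at the wildcard check in this copy of the document. As a hedged sketch of how such a search could be completed (not the POC's actual code), the example below uses the SQL dialect of the Indexing Service OLE DB provider and mirrors the CONTAINS/FREETEXT split used in the database snippet in section 3.1.2. The catalog name Search comes from section 2.1.1; the connection string value, the class name and the selected columns are assumptions.

```csharp
using System.Data;
using System.Data.OleDb;

// Hypothetical completion of the file-system search.
public static class IndexServiceSearch
{
    // Assumed value of the "IndexService" app setting.
    public const string SampleConnectionString = "Provider=MSIDXS;Data Source=Search";

    // Builds the query text for the Indexing Service provider.
    public static string BuildQuery(string searchText)
    {
        // Quote characters are dropped rather than escaped, so the term
        // cannot terminate the string literal it is placed in.
        string term = searchText.Replace("'", "").Replace("\"", "").Trim();
        // Terms with a '*' are treated as prefix searches and need CONTAINS.
        string predicate = term.IndexOf('*') > 0
            ? "CONTAINS(Contents, '\"" + term + "\"')"
            : "FREETEXT(Contents, '" + term + "')";
        return "SELECT Filename, Path FROM SCOPE() WHERE " + predicate;
    }

    // Runs the query against the catalog and returns the matching files.
    public static DataSet Search(string connectionString, string searchText)
    {
        DataSet results = new DataSet();
        using (OleDbConnection connection = new OleDbConnection(connectionString))
        using (OleDbDataAdapter adapter = new OleDbDataAdapter(BuildQuery(searchText), connection))
        {
            adapter.Fill(results);   // Fill opens and closes the connection itself
        }
        return results;
    }
}
```

Keeping the query construction in its own method makes it possible to check the generated text without an Indexing Service catalog being available.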
Full-text search on the file system was tested for local and remote users who are successfully
authenticated but do not have admin access to the file system of the application
server.
3.1.2 Using SQL Server 2005 FullText Search
The search is implemented like any other record search on database tables. The ADO.NET
SqlClient provider is used for database access.
Code snippet for searching the database using the SqlClient provider of ADO.NET:
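DataSet ds = new DataSet();
string query = "";
try
{
    string strconnection =
        ConfigurationManager.ConnectionStrings["Databaseconnection"].ToString();
    // Terms containing '*' are prefix searches and need CONTAINS; all others use FREETEXT.
    if (strsearchstring.IndexOf('*') > 0)
        query = "SELECT FileName FROM Documents WHERE CONTAINS(full_Text_bin, @searchTerm)";
    else
        query = "SELECT FileName FROM Documents WHERE FREETEXT(full_Text_bin, @searchTerm)";
    using (SqlConnection dbConn = new SqlConnection(strconnection))
    using (SqlDataAdapter adapter = new SqlDataAdapter(query, dbConn))
    {
        // The quoted term is passed as a parameter instead of being concatenated into the SQL text.
        adapter.SelectCommand.Parameters.AddWithValue("@searchTerm", "\"" + strsearchstring + "\"");
        dbConn.Open();
        adapter.Fill(ds);
    }
}
catch (SqlException)
{
    // The error handling of the original snippet is not shown; rethrow unchanged.
    throw;
}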
4 PATNI RECOMMENDATION
On analyzing the possible solutions and the POC results, we find that both solutions are suitable
for the Salon 3.0 system requirements. That is, either of the following can be used:
File-system based document storage and full-text search, OR
Database based document storage and full-text search.
Patni recommends file-system based document storage and full-text search. The reasons are as
follows:
1. All pros of file-system based search detailed in the prior section are in favor of the Salon 3.0
system requirements.
2. None of the cons of file-system based search detailed in the prior section has a major
impact on the Salon 3.0 system requirements.
For example, the cons related to security and data backup can be taken care of with appropriate
operational and administrative solutions.
Regarding the lack of support for XML files, we assume XML is not a file type that the
Salon 3.0 system is required to support. The file types supported by file-system based full-text
search are MS Office files, MIME messages, HTML, text files and PDFs.
3. As per the proposed infrastructure requirements for the Salon 3.0 system, the application
server will be a dedicated one, and as per the P&G infrastructure standards no clustered
environment is available for a dedicated application server. Hence a server affinity issue is not
anticipated with file-system based storage.
Even if a clustered environment is introduced in future, a solution can be devised by having
a central server for file storage.
4. For file-system based search to support PDF files, the Adobe IFilter (a free download) has to
be installed on the application server. Since the Salon 3.0 application server is a dedicated one,
this installation should not be a problem.
5. DB based file storage and search may call for more tables and stored procedures, and
also for some special settings on the database. As per the proposed infrastructure
requirements for the Salon 3.0 system, the database server will be in a shared environment. As
per P&G infrastructure standards, there are restrictions on the DB size, the number of tables,
stored procedures, etc. for a DB hosted on a shared server.
6. SQL Server 2005 has a 2 MB file size restriction, whereas no minimum or maximum size has
been defined for the files/documents to be stored in the Salon 3.0 system.
5 APPENDIX
5.1 Reference
http://msdn.microsoft.com/en-us/library/ms142571(SQL.90).aspx
http://msdn.microsoft.com/en-us/library/aa163263.aspx
POC Source Code - In Salon 3.0 VSS