Documentum comes with its own ranking formula, and the administrator has no way to fine-tune its configuration to integrate with other open source search engines. Rank normalization can instead be achieved through user feedback: the users themselves vote, and so assign priority to the individual search engines. How is the vote determined? Every time a user selects a search result, the vote count of the search engine that returned it increases by one. On the next search query, the search engine with more votes gets to display more results than the remaining search engines. However, there is always a fixed threshold to ensure that each search engine gets a chance to display at least some results, for example 5 articles.
Simplest approach - set a fixed number of results for each search engine, so that equal priority is assigned to every engine. Total result = Set( top 5 elements from SE01 ... SE0X )
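The vote-based allocation described above could be sketched roughly as follows. This is only an illustration in Python, not part of Documentum or any search engine: the function names `allocate_slots` and `record_click`, the engine labels and the total of 23 result slots are all assumptions made for the example. Each engine first receives the guaranteed minimum of 5, and the remaining slots are shared in proportion to the votes.

```python
from collections import Counter

MIN_RESULTS = 5  # fixed threshold from the text: every engine shows at least 5


def record_click(votes, engine):
    """The user selected a result from `engine`, so its vote goes up by one."""
    votes[engine] += 1


def allocate_slots(votes, total_slots, min_results=MIN_RESULTS):
    """Give every engine `min_results` slots, then share the rest by votes."""
    engines = list(votes)
    if total_slots < min_results * len(engines):
        raise ValueError("total_slots is too small for the guaranteed minimum")
    slots = {e: min_results for e in engines}
    spare = total_slots - min_results * len(engines)
    total_votes = sum(votes.values())
    if total_votes == 0:
        # No feedback yet: hand out the spare slots round-robin (equal priority).
        for i in range(spare):
            slots[engines[i % len(engines)]] += 1
        return slots
    # Proportional shares, rounded with the largest-remainder method so the
    # slots always add up to exactly total_slots.
    shares = {e: spare * votes[e] / total_votes for e in engines}
    for e in engines:
        slots[e] += int(shares[e])
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(engines, key=lambda e: shares[e] - int(shares[e]), reverse=True)
    for e in by_remainder[:leftover]:
        slots[e] += 1
    return slots


votes = Counter({"SE01": 0, "SE02": 0, "SE03": 0})
for clicked in ["SE01", "SE01", "SE01", "SE02"]:
    record_click(votes, clicked)

print(allocate_slots(votes, total_slots=23))
# {'SE01': 11, 'SE02': 7, 'SE03': 5}
```

With no votes recorded, the function falls back to equal shares, which matches the simplest approach of a fixed number of results per engine.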
Thursday, June 5, 2008
Captiva InputAccel Server
The server is configured as a Windows service on Microsoft operating systems, which makes it easy for an administrator to manage the production service. Error and reporting logs can be bound to the NT event log or to an external text file.
User management is mapped to NT user accounts, so the user accounts for the IA service are managed indirectly through the NT user manager. However, the security rights and roles are configured in the IA server itself.
Maintenance and housekeeping tasks can be automated via the timer setup, for example removing all non-active batches after 2 weeks.
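As a rough illustration of the housekeeping rule above, here is a small Python sketch. It does not use InputAccel's own timer or API. The batch records, their field names (`active`, `last_activity`) and the function `purge_inactive` are hypothetical stand-ins for whatever the server stores internally.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(weeks=2)  # "after 2 weeks" from the text


def purge_inactive(batches, now, retention=RETENTION):
    """Split batches into (kept, removed).

    A batch is removed only if it is non-active AND its last activity is
    older than the retention period.
    """
    cutoff = now - retention
    kept, removed = [], []
    for batch in batches:
        expired = not batch["active"] and batch["last_activity"] < cutoff
        (removed if expired else kept).append(batch)
    return kept, removed


now = datetime(2008, 6, 5)
batches = [
    {"name": "invoices-01", "active": True, "last_activity": datetime(2008, 5, 1)},
    {"name": "invoices-02", "active": False, "last_activity": datetime(2008, 5, 1)},
    {"name": "invoices-03", "active": False, "last_activity": datetime(2008, 6, 1)},
]
kept, removed = purge_inactive(batches, now)
print([b["name"] for b in removed])
# ['invoices-02']
```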
Monday, June 2, 2008
EMC Captiva InputAccel
Trying to archive a huge amount of hardcopy documents?
Besides the basic scanning process, which converts hardcopy to a digital format to ease backup of information, InputAccel provides additional OCR, form validation, data extraction and export features.
Validation - requires an operator to validate the required fields and provide the correct inputs. It is possible to integrate a validation DLL file that communicates with the server, or to validate via a txt file. For example, the field values can be auto-populated via an ODBC connection, with the record selected by specifying the WHERE clause of the SQL query. The operator can compare the populated values with the image and perform a final verification. The setup is quite flexible: if the record is not found in the database it is inserted, otherwise the existing record is updated.
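The lookup and insert-or-update behaviour described for validation could look something like the sketch below. It is an assumption-laden illustration: it uses Python's built-in `sqlite3` in place of an ODBC connection, and the `invoices` table, its columns and the helper names `lookup` and `save_validated` are invented for the example, not taken from InputAccel.

```python
import sqlite3


def lookup(conn, invoice_no):
    """Auto-populate: fetch the field values for one record via a WHERE clause."""
    return conn.execute(
        "SELECT customer, amount FROM invoices WHERE invoice_no = ?",
        (invoice_no,),
    ).fetchone()


def save_validated(conn, invoice_no, customer, amount):
    """Update the record if it exists, otherwise insert it."""
    cur = conn.execute(
        "UPDATE invoices SET customer = ?, amount = ? WHERE invoice_no = ?",
        (customer, amount, invoice_no),
    )
    if cur.rowcount == 0:
        # No matching record was found, so insert a new one.
        conn.execute(
            "INSERT INTO invoices (invoice_no, customer, amount) VALUES (?, ?, ?)",
            (invoice_no, customer, amount),
        )
        action = "inserted"
    else:
        action = "updated"
    conn.commit()
    return action


conn = sqlite3.connect(":memory:")  # stand-in for the ODBC data source
conn.execute(
    "CREATE TABLE invoices (invoice_no TEXT PRIMARY KEY, customer TEXT, amount REAL)"
)

print(save_validated(conn, "INV-001", "Acme", 120.0))   # inserted
print(lookup(conn, "INV-001"))                          # ('Acme', 120.0)
print(save_validated(conn, "INV-001", "Acme", 125.5))   # updated
```

In a real setup the operator's confirmation against the scanned image would sit between `lookup` and `save_validated`.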
Data extraction - allows the user to pre-define the areas that hold the required data. It can also extract data from barcodes.
Image Enhancement - similar to a Photoshop-style tool that improves the quality of the image. IE is pretty impressive: it consists of a full suite of features to fine-tune the scanned images, with tools such as border removal, deskewing and noise removal. With such filters, the processed images are in good condition for OCR or export to PDF. The filters are categorized into two main sets: black/white and color.
Export - Documentum, commonly used databases, XML, etc.
Sunday, June 1, 2008
Documentum Search
Documentum comes with an index search component. The search engine consists of 2 parts: the index server and the indexing agent. When a new record is inserted into the Documentum content server, the file is automatically appended to the indexing queue. Depending on the file size, 5-10 minutes are required to index the new file and make it ready to be searched. The supported file types that can be indexed include txt, (S)pdf, doc, etc. A Documentum search can include both structured and unstructured data, and advanced searches such as phrase and partial searching are supported.
EMC Documentum
Data is not simple anymore. It consists of text, pictures and multimedia files, and technically it is divided into structured and unstructured data. Existing databases handle structured data well. Unstructured data, however, is more complex and usually larger in file size, which makes it inefficient to store in a traditional database. Documentum separates the storage of structured and unstructured data. Unstructured data is stored in its original state and resides in a file system repository, while structured data is placed in a popular RDBMS. Documentum provides an intermediate interface that helps to store and retrieve both kinds of data efficiently, so the complexity is reduced. Developers can use the Documentum API to retrieve data, and the management of structured and unstructured data is handled completely by Documentum itself.
Documentum describes a file as an object, following the OOP concept, which is much like an object mapped to a relational database. Documentum also has its own query language, which is similar to SQL statements.
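To make the split between structured and unstructured storage more concrete, here is a toy Python sketch. It is not the Documentum API and does not use Documentum's query language. The class `TinyRepository`, its `documents` table and its column names are assumptions for illustration: metadata goes into a relational table (SQLite here), while the content is kept untouched as a file on disk.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path


class TinyRepository:
    """Toy model: object metadata in an RDBMS, content in a file store."""

    def __init__(self, root):
        self.root = Path(root)
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE documents ("
            "object_id TEXT PRIMARY KEY, object_name TEXT, content_path TEXT)"
        )

    def store(self, object_name, content):
        """Save content bytes in their original state and record the metadata."""
        object_id = uuid.uuid4().hex
        path = self.root / object_id
        path.write_bytes(content)
        self.db.execute(
            "INSERT INTO documents VALUES (?, ?, ?)",
            (object_id, object_name, str(path)),
        )
        return object_id

    def fetch(self, object_id):
        """Return (object_name, content) so callers never touch the two stores."""
        row = self.db.execute(
            "SELECT object_name, content_path FROM documents WHERE object_id = ?",
            (object_id,),
        ).fetchone()
        if row is None:
            raise KeyError(object_id)
        name, path = row
        return name, Path(path).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    repo = TinyRepository(tmp)
    oid = repo.store("report.pdf", b"%PDF-1.4 example bytes")
    print(repo.fetch(oid))
    # ('report.pdf', b'%PDF-1.4 example bytes')
```

The caller only sees `store` and `fetch`, which is the role the text gives to the intermediate interface.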
ExtJS web uploading tool
The ExtJS framework with Ajax allows remote file management just like Windows Explorer. A loading progress bar and the transfer rate are shown to give the user effective feedback. ExtJS also provides a batch uploading feature.
With the Tree Panel structure, navigation of the file directory is fast, and a look-ahead search feature can be added.
One of the advantages of Ajax is that its asynchronous nature allows the agent to perform other tasks while the upload is still in progress. Once the upload completes successfully, a proper alert message is triggered to notify the agent of the upload status. The web application itself can therefore be more efficient and allow multi-tasking.
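The asynchronous pattern described above can be shown with a small analogy. The sketch below is Python with `asyncio`, not ExtJS or browser code, and the names `upload` and `main`, the file name and the chunk sizes are made up for the example. The upload runs as a background task, progress is reported per chunk, and a status message arrives at the end while the caller carries on with other work.

```python
import asyncio


async def upload(name, chunks, on_progress):
    """Pretend to send a file chunk by chunk, reporting progress each time."""
    sent = 0
    for chunk in chunks:
        await asyncio.sleep(0)  # stand-in for sending one chunk over the network
        sent += len(chunk)
        on_progress(name, sent)
    return f"{name}: upload complete ({sent} bytes)"


async def main():
    log = []
    # Start the upload in the background; control returns immediately.
    task = asyncio.create_task(
        upload("scan.pdf", [b"a" * 4, b"b" * 4], lambda n, s: log.append((n, s)))
    )
    log.append("agent keeps working")  # other work happens while uploading
    status = await task                # the alert once the upload has finished
    log.append(status)
    return log


log = asyncio.run(main())
for entry in log:
    print(entry)
# agent keeps working
# ('scan.pdf', 4)
# ('scan.pdf', 8)
# scan.pdf: upload complete (8 bytes)
```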