From willh@cac Wed Aug 6 13:28:30 1997
Date: Fri, 5 May 1995 15:31:08 -0700 (PDT)
From: Will Hall
To: webscan@cac
Subject: DRAFT Imaging Pilot Project Proposal

DRAFT Imaging Pilot Project Proposal

Table of Contents (DRAFT)
-------------------------

   1 Objectives
   2 Introduction
   3 Architecture
   4 Project Outline
   5 Project Organization
   6 Timeline
   7 Cost
   8 Risk Statement
   9 Design Issues

1 Objectives (DRAFT)
----------------------

A  To develop a pilot imaging system for the administrative offices of
   the University of Washington which meets their basic document
   storage and retrieval needs in a manner consistent with the
   University's technical direction.

B  To implement a pilot system for the Student Loan Files.

2 Introduction (DRAFT)
------------------------

Over the past five years, several departments have considered imaging
applications. Some have launched substantial projects to define
requirements, calculate costs and benefits, and meet with vendors. None
has installed a system yet, but interest continues to increase.

Computing & Communications conducted a survey to determine which
administrative offices would benefit from an imaging system, and what
the key requirements of that system would be. The survey found that the
following 16 offices would significantly benefit from imaging:

   Student Loan Office
   Receivables Collection Office
   Grant and Contract Accounting
   Grant and Contract Services
   Payables and Accounting Operations
   Admissions Office
   Financial Aid Office
   Payroll Records
   Staff Personnel Office
   Benefits Office
   News and Information
   Graduate Admissions Office
   Graduate School
   Student Accounts
   Capital Projects Office
   Purchasing Office

Instead of allowing these offices to acquire independent and probably
incompatible systems, Computing & Communications proposes to work with
the users to develop a common, centrally administered system which
meets their needs within the University's technical direction.
Key elements of that direction include:

   Information access from X terminals and PCs
   Distribution of information over the campus network
   Consistency with campus and higher education standards
   Supportability of systems by Computing & Communications

The results of the survey and the initial conceptual design were
presented in a draft report. Subsequent discussions have led to the
refinements in this proposal, which supersedes the information
contained in the draft report.

3 Architecture (DRAFT)
------------------------

A  Approach

The approach to the architecture is to use standard, widely available
tools for the major components of the system and to develop the modules
required to make the components function together as a user-friendly
system. The user interface will be primarily through browsers such as
Mosaic and Netscape. The application will run on a Web server. The
images will be stored on magnetic disk on a dedicated image server, and
the index to the images will be stored separately in a database.

The system will be as generic as possible so that it can be applied to
the different offices with minimal customization. The only high-level
user requirements are to scan, index, store, search for, display and
print business documents.

The architecture will be modular, so that any piece of it may be
replaced without impacting the overall system. If a new Web browser
becomes available, it can be used without affecting the storage,
indexing, or searching modules. If the indexing database is changed,
the software used for scanning and displaying images will not be
affected. And if the volume of data for a given site justifies an
optical jukebox, only the image server would need to be changed.

B  Scanning

Each department will do its own scanning using an appropriately sized
scanner. Off-the-shelf scanning software will be used to capture the
images and store them in an industry standard format on the image
server.
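
The modular approach in section A, together with the scan-and-store
step in section B, can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the proposed design: the
names ImageStore, DocumentIndex, InMemoryStore, InMemoryIndex and
ImagingSystem are hypothetical, and the in-memory classes stand in for
the real image server and index database. The point is that the
application depends only on two small interfaces, so either side can be
replaced without touching the other.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class ImageStore(Protocol):
    """Hypothetical interface for the image server (disk, jukebox, ...)."""

    def put(self, doc_id: str, image: bytes) -> None: ...
    def get(self, doc_id: str) -> bytes: ...


class DocumentIndex(Protocol):
    """Hypothetical interface for the separate index database."""

    def add(self, doc_id: str, keys: dict[str, str]) -> None: ...
    def search(self, **criteria: str) -> list[str]: ...


@dataclass
class InMemoryStore:
    """Stand-in image server that keeps images in a dictionary."""

    _images: dict[str, bytes] = field(default_factory=dict)

    def put(self, doc_id: str, image: bytes) -> None:
        self._images[doc_id] = image

    def get(self, doc_id: str) -> bytes:
        return self._images[doc_id]


@dataclass
class InMemoryIndex:
    """Stand-in index that matches documents on exact key values."""

    _entries: dict[str, dict[str, str]] = field(default_factory=dict)

    def add(self, doc_id: str, keys: dict[str, str]) -> None:
        self._entries[doc_id] = dict(keys)

    def search(self, **criteria: str) -> list[str]:
        # A document matches when every requested key has the given value.
        return sorted(
            doc_id
            for doc_id, keys in self._entries.items()
            if all(keys.get(k) == v for k, v in criteria.items())
        )


class ImagingSystem:
    """Application layer: knows only the two interfaces above."""

    def __init__(self, store: ImageStore, index: DocumentIndex) -> None:
        self.store = store
        self.index = index

    def file_document(self, doc_id: str, image: bytes, **keys: str) -> None:
        # Store the scanned image, then record its identifying keys.
        self.store.put(doc_id, image)
        self.index.add(doc_id, keys)

    def retrieve(self, **criteria: str) -> list[bytes]:
        # Look up matching document ids, then fetch their images.
        return [self.store.get(d) for d in self.index.search(**criteria)]
```

Swapping InMemoryStore for a class that writes to a jukebox, or
InMemoryIndex for one backed by a database, would leave ImagingSystem
unchanged as long as the replacement offers the same methods.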

C  Indexing

Online forms will be developed to allow users to enter the key
identifying information for each document. When possible, that key
information will be validated against existing databases. For example,
in the Student Loan pilot site, if either the student name or Social
Security Number is entered, it can be validated in the Student Data
Base so that the document could be retrieved either way. Indexing can
be done for each document as it is scanned, or it can be done for a
batch at a later time.

D  Retrieval

Users can retrieve documents by filling in an online search form. If a
search returns a single document, it is displayed on the screen. If it
returns more than one, the user is given a list from which to choose.

Since the application will follow Web standards, users can access
documents from any Web browser. No custom client software is needed.

4 Project Outline (DRAFT)
---------------------------

The project is broken down into several major tasks as defined below.
The milestone tasks should include a major review by all parties with
an interest in the project.

A  Design   ** Milestone

Define the way the users will interact with the system. Design the
forms that will be used, create a prototype, and review with the users.
Users must define their revised procedures for using the new system.
Select the hardware, system software, scanner, image format, database
manager and other components of the system. Define how these components
will interact. Major milestone review when design is complete.

B  Development

Program and test each of the modules of the system, including the
screens, scripts, database routines, scanning and display components.

C  Hardware and Software Installation

Acquire, install and configure the hardware and software to support the
pilot system in production. Migrate the modules to the production
system.

D  System Testing   ** Milestone

Using scenarios that mimic the production use of the system, test the
integration of all components.
Continue to test and adjust the system until all scenarios execute as
defined in the Design step. Users will be deeply involved in this task,
and they will need to spend considerable time developing the business
scenarios and verifying the system. Major milestone review when system
testing is complete.

E  Training

Develop training for the new system and deliver the training to all of
the pilot site users. Users will be involved in developing the training
on the system, and users will do all of the training on new processes.

F  Conversion

Work with the users to adjust from their current processes to the
revised processes developed in the Design step. Begin scanning
documents into the system and begin retrieving documents from the
system. The scanning will be completed by the users over a considerable
period of time. The scanning effort is not included within this
workplan.

G  Support   ** Milestone

Assist the users in adjusting to the new system and provide technical
support. Evaluate the system and identify any significant weaknesses.
Major milestone review three to four months after system is in use.

5 Project Organization (DRAFT)
--------------------------------

Will Hall will be the part-time project manager and design analyst.

****TBD**** will be the full-time programmer/analyst.

Frank Fujimoto will be the part-time system administrator focused on
the hardware and system software for the servers.

Ping-Yun Lo will be the part-time database administrator.

The Student Loan Office and Receivables Collection Office will provide
part-time user representatives. The involvement of these
representatives is essential in order for the project to meet their
needs. Specific individuals will need to commit time to work on project
activities.

Other in-house experts will be used for occasional consultation.

6 Timeline (DRAFT)
--------------------

This is a prototype project.
The design of a scalable, quality approach to imaging systems for the
business environment is paramount, so it is difficult to accurately
estimate a completion date. The target date for completing the design
activities is the end of June.

7 Cost (DRAFT)
----------------

A  Personnel Costs

The part-time resources are already funded and will not be charged to
this project. The cost of the full-time programmer/analyst is quoted at
$52 per hour, which is a market rate for outside or self-sustaining
resources. Development time is estimated at three months, but the cost
is based on six months to provide for support, enhancements, and
ramp-up.

   Six months at $52 per hour (960 hours, or 160 hours per month)
   = $49,920

B  Hardware/Software Costs

If the system were only being designed for the pilot site, it could be
done using the least expensive hardware that meets their needs. In
order to ensure that the system is scalable for multiple hosts and
other sites, we propose to invest more initially by installing two
hosts for the image servers. This builds in extra capacity and
confidence that will make expansion to additional sites easier. The
expansion to a second site 140% the size of the pilot would only
require increasing disk space and purchasing another scanner. The pilot
site already has the required desktop devices.

   First host with 10 Gigabytes      $23,000
   Second host with 10 Gigabytes     $23,000
   Backup devices                    $12,000
   Scanner for pilot site            $10,000
                                      ------
                                     $68,000

   Expansion to 48 Gigabytes         $24,000
   Scanner for second site           $10,000
                                      ------
                                     $34,000

8 Risk Statement (DRAFT)
--------------------------

This is a leading edge pilot project, with significant risks and
unknowns. Nothing like this project has been done at the University
before, and it is not clear whether this approach has been taken
anywhere else. The proposed team of users and technical resources will
make every effort to meet the objectives, but this is an experiment
which needs to be critically reviewed at several points.
Appropriate action must be taken to address any significant flaws,
including possibly changing the approach or terminating the project.

The hardware which is being purchased for this project could be applied
directly and effectively in other areas, so the investment risk of this
project is limited to personnel costs.

9 Design Issues (DRAFT)
------------------

There are a number of design issues which must be resolved in the early
stages of the project. The major ones are listed here.

1  If the index entries are validated against the administrative
   systems databases on the A16, then what happens years later when the
   database record is purged and images are added to the file for that
   record?

2  How do we automate the storage of the scanned image into the image
   server from a PC scanning station?

3  What image format will be used (GIF vs. TIFF vs. JPEG)?

4  What resolution is required (150 vs. 200 vs. 300 dpi)?

5  Will the images be displayed inline by the browser or using an
   external viewing tool? What external tools could be used on the PC?
   What external tools could be used on the X terminal?

6  Will the users be able to zoom in on parts of the image? What tool
   will they use to do so on a PC? On an X terminal?
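
For issue 4, a rough upper bound on per-page storage may help frame the
choice of resolution. The sketch below is illustrative only and rests
on assumptions that are not in this proposal: letter-size pages (8.5 by
11 inches), bitonal scanning at 1 bit per pixel, no compression, and a
"10 Gigabyte" disk taken as 10 * 10**9 bytes. Compressed formats would
normally need much less space, so these figures are ceilings rather
than estimates. The function names are hypothetical.

```python
def page_bytes(dpi: int, width_in: float = 8.5, height_in: float = 11.0,
               bits_per_pixel: int = 1) -> int:
    """Uncompressed size in bytes of one scanned page (assumed page size)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


def pages_per_disk(disk_bytes: int, dpi: int) -> int:
    """Whole uncompressed pages that fit on a disk of the given size."""
    return disk_bytes // page_bytes(dpi)


if __name__ == "__main__":
    disk = 10 * 10**9  # assumption: decimal gigabytes
    for dpi in (150, 200, 300):
        print(dpi, "dpi:", page_bytes(dpi), "bytes/page,",
              pages_per_disk(disk, dpi), "pages per 10 GB")
```

Because pixel count grows with the square of the resolution, doubling
from 150 to 300 dpi roughly quadruples the uncompressed size of each
page.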