N.J. Division of Revenue improves tax processing
With 3.8 million personal income tax returns to deal with, officials in the New Jersey Division of Revenue decided to automate their tax collection processing.
The state received four bids, the lowest of which was from Northrop Grumman (Los Angeles), proposing a LAN-based system with recognition from Symbus Technology, now FormWare (Park City, UT). The open-architecture solution is based on FormWare's Inscript recognition software, called DPS. Now 90% of the tax returns--1040 individual tax forms and HR (Homestead Rebate) 1040 forms--are processed automatically using the software.
Joe Roose, assistant director at the Division of Revenue, said, "We learned as we went along and made a lot of changes to improve our processes. We redesigned the form to improve recognition. We changed the size of the envelopes so that the return could be folded only once, which resulted in faster prep and better scanning."
After the envelopes are opened, the returns are sorted into three categories: machine-printed, handprinted and rebate applications. Checks are removed and processed as quickly as possible. A barcoded batch control sheet is placed in front of each tax return.
Automated processing starts with eight Kodak (Rochester, NY) IL923 high-speed paper scanners, each controlled by a Sun (Palo Alto, CA) workstation. W2s are a problem because they vary in size and weight, so the scanners run in portrait mode, in which W2s go through more easily than in landscape. The barcoded batch control sheets are read automatically to define the start and stop of each return, while the barcodes on each page are read to check for completeness and to confirm that the paper is the right way up.
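The batching logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say what the barcodes encode, so the "BATCH" prefix, the page codes "P1" to "P4" and all names below are hypothetical, and attachments such as W2s are assumed to carry no barcode.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class ScannedSheet:
    # Decoded barcode text, or None when no barcode could be read
    # (assumed here for attachments such as W2s or for a page fed upside down).
    barcode: Optional[str]


BATCH_PREFIX = "BATCH"  # hypothetical marker for a batch control sheet


def split_into_returns(sheets: Iterable[ScannedSheet]) -> List[List[ScannedSheet]]:
    """Group a scanned stream into returns; each control sheet starts a new one."""
    returns: List[List[ScannedSheet]] = []
    current: Optional[List[ScannedSheet]] = None
    for sheet in sheets:
        if sheet.barcode is not None and sheet.barcode.startswith(BATCH_PREFIX):
            current = []
            returns.append(current)
        elif current is not None:
            current.append(sheet)
        # Sheets seen before the first control sheet are ignored in this sketch.
    return returns


def missing_pages(tax_return: List[ScannedSheet], expected: Set[str]) -> List[str]:
    """Return the expected page codes whose barcodes were not read."""
    seen = {s.barcode for s in tax_return if s.barcode is not None}
    return sorted(expected - seen)


# Hypothetical page codes for a four-sided handprinted form.
EXPECTED = {"P1", "P2", "P3", "P4"}

stream = [
    ScannedSheet("BATCH-001"), ScannedSheet("P1"), ScannedSheet("P2"),
    ScannedSheet(None),  # e.g. a W2 attachment
    ScannedSheet("BATCH-002"), ScannedSheet("P1"), ScannedSheet("P2"),
    ScannedSheet("P3"), ScannedSheet("P4"),
]

returns = split_into_returns(stream)
incomplete = [missing_pages(r, EXPECTED) for r in returns]
```

In this sketch a return with missing page codes would be flagged for manual review, since a page whose barcode cannot be read may be absent or may have been fed the wrong way up.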
Each handprinted submission consists of four sides with an average of four or five attachments including W2s. Machine-printed submissions consist of 10 or 11 pages each. The system is set up to process 38,000 returns per day.
After the forms are scanned, the images are routed in batches of 50 to the recognition process, which is controlled by a Unix-based Sun E4000 server. Recognition of the data is handled by 14 Pentium 90- and 100-based PCs connected over Ethernet, which run 24 hours a day during peak times. At an average of 350 to 400 characters per return, that represents a capacity of about 1 million characters recognized per day per PC.
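The per-PC figure can be checked with simple arithmetic. The sketch below assumes the figure is derived from the stated design capacity of 38,000 returns per day spread evenly over the 14 recognition PCs; the article does not spell out the calculation, so this is one reading that reproduces the number, and the function and constant names are illustrative.

```python
RETURNS_PER_DAY = 38_000          # stated system capacity
RECOGNITION_PCS = 14              # stated number of recognition PCs
CHARS_PER_RETURN = (350, 400)     # stated average range per return


def chars_per_pc_per_day(returns_per_day: int, chars_per_return: int, pcs: int) -> float:
    """Characters each PC must recognize per day if the load is spread evenly."""
    return returns_per_day * chars_per_return / pcs


low = chars_per_pc_per_day(RETURNS_PER_DAY, CHARS_PER_RETURN[0], RECOGNITION_PCS)
high = chars_per_pc_per_day(RETURNS_PER_DAY, CHARS_PER_RETURN[1], RECOGNITION_PCS)

print(f"{low:,.0f} to {high:,.0f} characters per PC per day")
```

Under these assumptions the range works out to roughly 950,000 to 1.09 million characters per PC per day, which agrees with the "about 1 million" quoted.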
Nick Manocchio, chief of technology services at the Division of Revenue, said that field accuracy rates on the handprinted forms are 90% to 92%, and 97% to 98.7% on the machine-printed forms.
Verification of the data is managed on 30 workstations, and the data is then reformatted into the same large record format that has always been output from the manual key entry stations. It is merged with exception items that have been keyed on ASCII-based screens and sent to the mainframe for processing. After capture, the images are reformatted and passed on to a FileNet (Costa Mesa, CA) imaging system for storage and retrieval. After the paper is scanned, it can be boxed, shrink-wrapped and sent off site for storage.
Having signed the contract in 1995, New Jersey is in its third year of using the system. Last year it processed 2.7 million returns using imaging and recognition, and it expects to do more than 3 million this year. The new system has reduced the number of key entry stations from more than 300 to 110.
The department is now being asked to process forms for other agencies under the "one-stop shopping" system announced by New Jersey's governor, which is designed to provide smaller government.