NIST Special Database 8
NIST Machine-Print Database of Gray Scale and Binary Images (MPDB)
A sample of the data contained in this database is available via anonymous FTP at sequoyah.ncsl.nist.gov in the files sd8-README.txt and sd8.tar.Z [699K].
The NIST machine-print database contains gray scale and binary images of machine-printed pages. There are 360 digitized pages on three CD-ROM discs. The set contains a total of 3,063,168 characters, an average of about 8,509 characters per page. A reference file is included for each page. These reference files are the ASCII text pages that were used to generate the original hardcopy that was digitized. This database is being distributed for use in the development and testing of Optical Character Recognition (OCR) systems on a common set of images, which allows vendors to report results with respect to the same image set.
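The per-page average quoted above follows directly from the two totals. A minimal sketch of that arithmetic, using only the page and character counts stated in the description (the constant and function names are illustrative, not part of the database):

```python
# Totals as stated in the database description.
TOTAL_CHARACTERS = 3_063_168
TOTAL_PAGES = 360


def average_per_page(total_characters: int, pages: int) -> float:
    """Return the mean number of characters per digitized page."""
    return total_characters / pages


average = average_per_page(TOTAL_CHARACTERS, TOTAL_PAGES)
print(f"exact average:   {average:.1f} characters per page")
print(f"rounded average: {round(average)} characters per page")
```

The exact quotient is 8508.8, which rounds to the 8,509 characters per page given in the description.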
You may browse the Users' Guide
to see how this database works.
Each disc in this three-disc set holds approximately 593 megabytes of data when the images are compressed. Uncompressed, each disc holds 1.1 gigabytes of data (1.85:1 average compression ratio using JPEG and CCITT Group 4 compression schemes).
The database has the following features:
Suitable for automated machine-print research, development, and evaluation, the data set can be used for:
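As a rough consistency check on the per-disc storage figures quoted earlier, the average compression ratio can be recomputed from the compressed and uncompressed sizes. This sketch assumes decimal units (1 gigabyte = 1000 megabytes), which the source does not state, and both sizes are themselves approximate; the names are illustrative only.

```python
# Per-disc sizes as stated in the description (both are approximate).
COMPRESSED_MB = 593
UNCOMPRESSED_GB = 1.1

# Assumption: decimal units. The source does not say whether it means
# 1000 or 1024 megabytes per gigabyte.
MB_PER_GB = 1000


def compression_ratio(uncompressed_mb: float, compressed_mb: float) -> float:
    """Return the uncompressed-to-compressed size ratio."""
    return uncompressed_mb / compressed_mb


ratio = compression_ratio(UNCOMPRESSED_GB * MB_PER_GB, COMPRESSED_MB)
print(f"average compression ratio: {ratio:.2f}:1")
```

Under the decimal-unit assumption the result agrees with the stated 1.85:1 figure; with binary units (1024 megabytes per gigabyte) it would come out slightly higher.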
The database is a valuable tool for measuring and comparing system performance on machine-print pages.
System Requirements: CD-ROM drive with software to read ISO-9660 format.
Price: $90.00. Special pricing is available for multiple copies; call for details.
For more information on Special Database 8, please contact:
Keywords: ASCII Reference, automated character recognition, automated data capture, binary, character recognition, font size, full page, Grayscale Image Database, machine print, NIST, OCR, optical character recognition, software recognition, style.
Create Date: 6/02
Last Update: Thursday, 06-Mar-03 15:42:04