United States Patent and Trademark Office
 

FTP Weekly Patent Bibliographic Raw Data



 

NOTICE: Effective 2002 January 1, Grant Red Book V2.5 (XML) bibliographic data will be available on the FTP server instead of Grant Red Book V2.4 (SGML). An XML to SGML conversion utility (V25xml2V24sgml.pl) is available free for download. See below for details.
Effective 2002 January 3, Application Red Book V1.6 (XML) bibliographic data will be available on the FTP server instead of Application Red Book V1.5 (XML). The data is located in the "pgpub" sub-directory of the FTP server 2002 folder.

NOTICE: Effective 2001 March 15, Application Red Book V1.5 (XML) bibliographic data is available for download in the "pgpub" sub-directory of the FTP server 2001 folder.

NOTICE: Effective 2001 January 2, Patent Full-Text/APS (Green Book) bibliographic data will no longer be provided. Instead, bibliographic data will be provided in Grant Red Book format. A Red to Green Book conversion utility is also available free for download. See below for details.

Grant Red Book Files (XML), 2002--

Effective 2002 January 1, Grant Red Book XML is available and compliant with the V2.5 Grant Red Book DTD. See the Red Book Information page for documentation, DTDs, entity files, sample data, and a description of the changes between V2.4 and V2.5.

To assist customers in migrating to Grant Red Book XML, two conversion utilities are provided.

The utility V25xml2V24sgml.pl (updated on 2002-02-05) converts Grant Red Book V2.5 (XML) back to Grant Red Book V2.4 (SGML). It is available free for download on the Grant Red Book Conversion Tools and Sample Data page. The software is provided as-is, with no support. Limited documentation is included.

The utility RBxml2GB.pl converts Grant Red Book V2.5 (XML) to Green Book. It is available free for download in the 2002 FTP directory. The conversion utility is a Perl script which can be modified (with some effort) to convert Grant Red Book to formats other than Green Book. The software is provided as-is, with no support. Limited documentation is included.

Grant Red Book Files (SGML), 2001--

Grant Red Book files are SGML files based on the Grant Red Book DTD V2.4, which appears on the Red Book Information page. Grant Red Book, although it is SGML, avoids all constructs forbidden in XML. The files available for download from the FTP site consist of a single zip file for each week's issue. The file contains the concatenated *.SGM files for each patent in the issue, except that the following elements have been removed: BRFSUM, SDOCL (except for Design patents), DETDESC, RELAPP, DRWDESC, SDODR, SDOCR. The resulting file contains the so-called "front page" information only. Although there are references to external entities in the DOCTYPE declaration at the start of each document in the file, none of those entities are available via FTP. To obtain the complete Grant Red Book file, including all external entities, you must subscribe to Patent Data/SGML. Standard character entities referenced in the files are available from public sites on the Internet, from ISO, and, in the future, on the Red Book Information page.
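Because each weekly file is a concatenation of per-patent documents, each starting with its own DOCTYPE declaration, a consumer usually has to split the file before processing it. The following is a minimal Python sketch of that step, not a USPTO-supplied tool. The function name split_documents is hypothetical, and the sketch assumes only that the literal text "<!DOCTYPE" appears at the start of every document and nowhere else.

```python
import re

# Assumption: every document begins with a "<!DOCTYPE" declaration and the
# string does not occur anywhere else in the file.
_DOCTYPE_START = re.compile(r"<!DOCTYPE\b", re.IGNORECASE)


def split_documents(text: str) -> list[str]:
    """Split a concatenated weekly file into one string per document."""
    starts = [m.start() for m in _DOCTYPE_START.finditer(text)]
    # Each document runs from its DOCTYPE up to the next one (or end of file).
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]
```

Each returned string still carries its DOCTYPE declaration, so the references to external entities remain unresolved; a parser that tries to fetch them will need the entity files from elsewhere.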

To assist customers in migrating to Grant Red Book, a Red Book to Green Book conversion utility, RB2GB, is available free for download in the 2001 FTP directory. The conversion utility is a Perl script which can be modified (with some effort) to convert Grant Red Book to formats other than Green Book. The software is provided as-is, with no support. Written documentation is included.

Green Book Files, 1996--2000

The data content of Green Book is identical to the patent bibliographic magnetic tapes sold by USPTO, in a format known as the "Patent Full-Text/APS File" format, or "USPTO Green Book." The data is available as one zipped file for each weekly issue, beginning with week 36 of 1996 and ending with the last week of 2000. Within each zip file, the data appears in USPTO Green Book format as either fixed-length (blank-padded) or variable-length ASCII records, terminated by a linefeed or by a carriage return/linefeed. Each file is approximately 2 to 4 MB zipped, and unzips to a single 20 to 50 MB ASCII file.

WARNING


Patent Data FTP Directory:

This directory contains raw patent data for each weekly issue in the
current calendar year.

The data types are as follows ["nn" is a two-digit, fixed-length number
(i.e., with leading zero), which represents the sequentially-numbered
week of issue]:

99weeknn.rpt     -- ASCII text file listing unused sequential patent
                    numbers and summarizing weekly contents by patent
                    type.

99weeknn.txt     -- ASCII text file containing a list of all patent
                    numbers in the issue, one per line. (A UNIX "wc -l"
                    of this file should yield a line count which equals
                    the total number of patents in the .rpt file.)

99weeknn.zip     -- ASCII text file, zipped, USPTO Green Book tagged data format
                    (1996--2000), containing variable-length, linefeed-terminated
                    records.  A UNIX grep for "^WKU" piped to "wc -l" (grep "^WKU" | wc -l)
                    should yield a line count which equals the total number of patents
                    in the .rpt file.

01weeknn.zip     -- ASCII text file, zipped, USPTO Grant Red Book SGML data format
                    (2001-- ), containing a single *.SGML file. Within the file, each
                    document consists of a DOCTYPE declaration followed by the start
                    tag <PATDOC> followed by additional markup and content
                    followed by the end tag </PATDOC> which terminates the
                    document.  The number of occurrences of PATDOC indicates the
                    number of documents in the file.
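
The consistency checks described in the list above can be sketched in Python for systems without grep and wc. This is an illustrative sketch only: the three function names are hypothetical, and each function takes the already-unzipped file contents as a string.

```python
import re


def count_green_book_patents(text: str) -> int:
    """Count lines that start with "WKU", like grep "^WKU" | wc -l."""
    return sum(1 for line in text.splitlines() if line.startswith("WKU"))


def count_red_book_documents(text: str) -> int:
    """Count <PATDOC> start tags; end tags (</PATDOC>) are not matched."""
    return len(re.findall(r"<PATDOC[\s>]", text))


def count_listed_patents(text: str) -> int:
    """Count non-blank lines of the .txt list (same as wc -l when the
    file has no blank lines and ends with a newline)."""
    return sum(1 for line in text.splitlines() if line.strip())
```

For a given week, the count from the data file should equal the count from the .txt list, and both should match the total given in the .rpt file. The layout of the .rpt file is not described here, so that last comparison is left to a manual check.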

 

NOTICE: The provided data formats changed as follows, on 19 October 1999:
  1. The primary data file format changed from fixed-length, linefeed-terminated records to variable-length, carriage return/linefeed or linefeed-terminated records (i.e., trailing white space was eliminated).

NOTICE: The provided data formats changed as follows, on 1 September 1997:
  1. From Week 35 (issue date 2 September 1997) on, USPTO provides three weekly files: the primary data file as 97weeknn.zip; a weekly report file, 97weeknn.rpt; and a weekly list of patent numbers, 97weeknn.txt.

  2. Prior to 1 September 1997, the primary data file consisted of a stream of concatenated 80-character, fixed-length records, without any terminating character. After that date, the primary data file (97weeknn.zip) format became a series of concatenated 81-character records, trailing-blank-padded with the 81st character being a newline (hex 0A) character.

  3. The old, BBS-format file, 97weeknn.fms.zip, is no longer provided. For anyone who desires to make continued use of this format, DOS executable and C-language source code for the software previously used to generate the fms file from the old and new primary data file formats are available in the fms directory. Adaptation and use of this conversion code is strictly up to the user; no support whatsoever will be provided by USPTO.
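
The notices above describe three layouts for the primary data file: an unterminated stream of 80-character records, 81-character blank-padded records ending in a newline, and variable-length records ending in a linefeed or carriage return/linefeed. A reader that accepts all three could look like the following Python sketch. The function name split_records is hypothetical, and the rule used to recognize the oldest layout (no line terminator anywhere in the data) is an assumption, not something the notices state.

```python
def split_records(data: bytes) -> list[str]:
    """Return records with terminators and trailing blank padding removed."""
    if b"\n" not in data and b"\r" not in data:
        # Assumed pre-September-1997 layout: unterminated 80-character records.
        if len(data) % 80 != 0:
            raise ValueError("stream length is not a multiple of 80")
        chunks = [data[i:i + 80] for i in range(0, len(data), 80)]
    else:
        # Later layouts: LF- or CRLF-terminated, blank-padded or variable length.
        chunks = data.splitlines()
    # Stripping trailing blanks makes all three layouts yield the same records.
    return [chunk.decode("ascii").rstrip() for chunk in chunks]
```

Normalizing to unpadded records means downstream code, such as a count of lines beginning with WKU, does not need to know which week's layout it was given.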

