(303-497-6854) Web Homepage: http://www-fd.fsl.noaa.gov/
Mark D. Anderson, Senior Database Analyst, 303-497-6518
(The above roster, current when document is published, includes
Address: NOAA Forecast Systems Laboratory Mail Code: FST
Objectives

The group designs, develops, upgrades, administers, operates, and maintains the FSL Central Computer Facility. For the past 21 years, the facility has undergone continual enhancements and upgrades in response to changing and expanding FSL project requirements and new advances in computer and communications technology. In addition, ITS lends technical support and expertise to other federal agencies and research laboratories in meteorological data acquisition, processing, storage, distribution, and telecommunications. The Central Facility acquires and stores a large variety of conventional (operational) and advanced (experimental) meteorological observations in real time. The ingested data encompass almost all available meteorological observations in the Front Range of Colorado and much of the available data in the entire United States. Data are also received from Canada and Mexico, along with some observations from around the world. The richness of this meteorological database is illustrated by such diverse datasets as advanced automated aircraft, wind and temperature profiler, satellite, Global Positioning System (GPS) moisture, Doppler radar measurements, and hourly surface observations. The Central Facility computer systems are used to analyze and process these data into meteorological products in real time, store the results, and make the data and products available to researchers, systems developers, and forecasters. The resultant meteorological products cover a broad range of complexity, from simple plots of surface observations to meteorological analyses and model prognoses generated by sophisticated mesoscale computer models.

Accomplishments

Computer Facility

ITS systems administrators have spent considerable time on routine maintenance of Central Facility systems and implementation of necessary security measures.
With the broad spectrum of operating systems currently being utilized by ITS, no practical system to manage the configuration files of these systems exists. ITS and Aviation Division administrators began working on a client/server-based system, RoboAdmin, aimed at addressing this issue. The goal was to provide an easy method for users to change their passwords on all ITS systems. Initial results were positive during the required quarterly password changes, and this system will be expanded to manage additional configuration files. Another software package, called sysdoc, was installed in all ITS systems to generate system configuration information daily on each host, and e-mail the results to a central repository. This system has proven very useful when rebuilding failed systems. The primary FSL Hardware Assets Management System (HAMS) Oracle server received both hardware and software upgrades. The external multipack disk chassis were eliminated by adding the internal interfaces and hardware so that the disks could be installed directly into the Sun E450 host. The system memory was expanded and the Operating System was upgraded to Solaris 8, prior to loading Oracle 8i. HAMS enhancements included the addition of network management, support contracts, and vendors, and other upgrades continue to provide added functionality and cost savings to FSL. Use of the Linux operating platform for desktop systems is prevalent in ITS. All of the X-terminals formerly used by the operators were replaced with PCs acquired from the Bureau of Census. Additional Census PC systems running Linux were installed for system administrators and developers. The trend toward the Linux operating system has also extended into the server domain. The metadata project that was originally developed on a Sun server was rebuilt and re-hosted on a Linux-based server. Additionally, some of the ftp services were re-hosted on PCs running Linux. 
Two Linux systems have been configured for use in migrating Networked Information Management client-Based User Service (NIMBUS) processes away from SGI-based systems. A new project involves completely changing the architecture of the data ingest and distribution systems within ITS, again using Linux-based systems. The overall effect is lower hardware costs and greater flexibility due to the prevalence of open-source tools available for Linux. IT security, of course, received much attention over the year, and the FSL Computer Policy and Procedures were approved by FSL management. All systems are routinely patched each quarter with vendor-provided operating system software, and specific vulnerabilities are patched immediately. The FSL security plan and risk assessment contingency/disaster recovery plan were approved and accredited for the two FSL facilities, and a full-time IT security officer was hired. After close examination of the FSL computer room infrastructure, work began on modification and expansion of resources, including the addition of a larger secondary computer room. A major accomplishment was the installation of an extensive FM-200-based fire suppression and air-sampling system in the main computer room. The VESDA air-sampling system measures the particle level in a continuous flow of air that is collected by piping above the computing equipment and under the floor. Different alert levels are reached depending on the amount of particles in the samples. When the highest alert level is reached, the FM-200 gas will be released in the room to extinguish any fire. Also, FE-36 clean-agent fire extinguishers were installed in the hallways near all FSL computer rooms, along with carbon dioxide fire extinguishers inside all computer rooms. Another safety measure implemented was installation of a SCADA 3000 environmental monitoring system in all FSL computer spaces.
The system monitors temperature and will call appropriate personnel when temperatures reach unacceptable levels. The system also monitors for water under the floor of the main computer room.

FSL Network

FSL Network Administration developed and presented a five-year network growth plan to FSL management and the user community. This report focused on FSL’s ongoing need for next-generation networking technology to support research and technology transfer in the laboratory and the user community. Current and emerging technologies that would help FSL meet the goals were discussed, including 10-Gigabit Ethernet, quality of service, voice and video over IP, wave division multiplexing, storage area networks, and wireless networking. The Network Administration staff was restored to full strength with the hire of a Network Engineer, bringing the staff to three: one Network Manager and two Network Engineers to support the laboratory. To provide better after-hours support, a cell phone was acquired for immediate response during outages. During 2001, network port, device, and link capacities were available to support 202 FSL employees. The network utilized 518 total links, comprising 430 user links and 88 network device links. Port capacity available for network growth was 23%, or 158 free ports, distributed throughout the laboratory in appropriate proportions for very high- and moderate-speed connections. All network routers and switches were running at less than or equal to 20% average CPU utilization, with the exception of the PowerHub routers, which occasionally reached 100% CPU utilization for short periods during the year. For this reason, and because they are designated end-of-life, the PowerHub routers are slated for replacement soon. As for network link utilization, no backbone link exceeded an average utilization of 7.3%, or a maximum of 15.5%, of the total 622 Mbps available.
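The port and backbone figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that total port capacity equals links in use plus free ports (the report does not state the total directly); the function names are hypothetical and chosen only for illustration.

```python
def free_port_fraction(links_in_use: int, free_ports: int) -> float:
    """Fraction of ports still free, ASSUMING total ports = links in use + free ports."""
    return free_ports / (links_in_use + free_ports)


def utilized_mbps(percent: float, capacity_mbps: float) -> float:
    """Convert a utilization percentage of a link into megabits per second."""
    return capacity_mbps * percent / 100.0


if __name__ == "__main__":
    # Figures quoted in the report: 518 links in use, 158 free ports.
    print(f"free port capacity: {free_port_fraction(518, 158):.1%}")
    # Backbone link: 7.3% average and 15.5% maximum of 622 Mbps.
    print(f"backbone average:   {utilized_mbps(7.3, 622):.1f} Mbps")
    print(f"backbone maximum:   {utilized_mbps(15.5, 622):.1f} Mbps")
```

Under that assumption the free-port share rounds to the 23% quoted in the text, and the same percentage conversion applies to the router and WAN link figures.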
Router link utilization did not exceed an average of 12.8%, or a maximum of 58.6%, of the 155 Mbps available. In combination with all other NOAA Boulder network traffic, the wide area network (WAN) utilization to commodity Internet and Abilene (Internet2) via the Front Range Gigabit Point-Of-Presence (GigaPOP) link averaged 6.0%, with a maximum of 46.2%, of the 155 Mbps available. WAN traffic over the secondary commodity Internet link via MCI/UUnet averaged 34.0%, with a maximum of 86.0%, of the 12 Mbps available. FSL comprised 61.2% of the total NOAA Boulder WAN traffic, with the nearest agency, the National Geophysical Data Center (NGDC), at 18.8%. Of the total NOAA Boulder WAN traffic, 39.7% was FTP, 18.7% was LDM, 10.9% was HTTP, and the remainder a variety of other protocols. The cable plant for the FSL network is connected via five wiring closets, or intermediate distribution facilities (IDFs). After an audit of the power consumption by network devices in these IDFs, it was determined that the power usage exceeded that provided by the uninterruptible power supply (UPS) battery backup systems. To correct this, upgraded UPS systems were installed and several network devices were relocated to properly distribute the power load. The UPS systems are necessary to maintain power to the FSL network between the onset of a power failure and the time the building motor-generator backup power comes up to capacity.

Data Acquisition, Processing, and Distribution

NIMBUS receives data from sources such as National Weather Service (NWS), NOAAPORT, National Centers for Environmental Prediction (NCEP), WSR-88D Doppler radar, Aeronautical Radio Inc. (ARINC), Weather Services International Corporation (WSI), the FSL Demonstration Division, and the Geostationary Operational Environmental Satellites (GOES)-8 and GOES-10.
Real-time NIMBUS datasets are also distributed to several organizations external to FSL using the Unidata Local Data Manager (LDM) protocol: GOES imagery to the NOAA Environmental Technology Laboratory (ETL), wind profiler data to the University Corporation for Atmospheric Research (UCAR) Unidata program, quality controlled ACARS data to the National Center for Atmospheric Research (NCAR), and to government agencies and universities. Support continued for the WSR-88D Radar Wideband Ingest Subsystem, which acquires and processes WSR-88D Doppler radar data within the FSL Central Facility. These data are used for various FSL meteorological analysis and modeling applications. Support also continued for the data acquisition interface to the WSI High-Capacity Satellite Network (HCSN) Data-Acquisition System that supplies WSI NOWrad and NEXRAD products to FSL. The FSL GOES groundstation hardware and software components were replaced with modern Intel-based equipment running the Linux operating system. This system generates a suite of imager and sounder products in netCDF (Network Common Data Form) format. Satellite data were briefly acquired during the GOES-11 postlaunch science test period. A decoder and storage capability for dropsonde data was developed in support of FSL modeling projects. Also, software was updated to handle additional ARINC Communications Addressing and Reporting System (ACARS) formats. Datasets were acquired from the NCEP Aviation Weather Center (AWC) via DBNet (Distributed Brokered Networking) software, a network file transfer protocol (FTP). New gridded datasets, specifically the 20-km RUC and 22-km Eta model grids, were made available to FSL users. A Linux-based PC was acquired as an FSL server for the CONDUIT (Cooperative Opportunity for NCEP Data Using IDD Technology) high-resolution model datasets available from the U.S. Weather Research Program. 
Staff began analyzing these data and their impact on FSL, which has the capability to share the CONDUIT data feed with other NOAA laboratories when required. Improvement to the NIMBUS Information Transport (IT) included implementation of software to streamline the acquisition and processing of Gridded Binary (GRIB) formatted data. The new method allows generic handling of datasets, meaning that data are stored in dynamically created locations. Collection of data yet to be identified within FSL (for example, those data produced by the European Centre for Medium-Range Weather Forecasts) is now possible. This generic and dynamic collection of data will be extended to allow an archive of all GRIB data in receipt format. The handling of other data types (point, radar, and satellite) will be changed to conform to the system used for GRIB, which in turn will support further development of the FSL Data Repository (FDR). To further streamline processing, the role of NIMBUS Information Transport is being redirected toward job control and logging. A redesign of the data handling using Object-Oriented methods will reduce the software maintenance required to translate the original form and generate products. This Object Data System (ODS, Figure 9) is used to fully integrate the LDM software in support of a weather research community standard for the distribution of data. ODS provided the generic translation of GRIB to netCDF and is used as a model for handling other data types. Generic handling of original and product data formats fostered a better design of the FDR. Once real-time data processing is linked to metadata, the Central Facility "customers" can generate required product specifications for direct use by ODS with no additional software development. Support for the FDR included continuing development of the centralized metadata database, which will improve metadata reliability and expedite FSL user access to metadata. 
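The generic GRIB handling described above, in which data are stored in dynamically created locations, might be sketched as follows. This is not the NIMBUS or ODS implementation; the function name `storage_dir`, the directory naming scheme, and the choice of center, model, and grid identifiers as the key are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def storage_dir(root: Path, center: int, model: int, grid: int) -> Path:
    """Return (creating it if needed) a directory derived from message identifiers.

    Hypothetical layout: no per-dataset configuration table is consulted, so a
    dataset that has never been seen before still gets a place to land.
    """
    target = root / f"center_{center:03d}" / f"model_{model:03d}" / f"grid_{grid:03d}"
    target.mkdir(parents=True, exist_ok=True)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Illustrative identifiers only; nothing about them is registered ahead of time.
        first = storage_dir(Path(tmp), center=98, model=121, grid=255)
        print(first.relative_to(tmp))
```

Because the location is computed from the data rather than looked up, previously unidentified datasets can be collected without code changes, which is the property the text attributes to the new method.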
The capability for viewing gridded metadata and METAR station tables was made available through Web and Java interfaces. Completion of work on NOAAPORT data storage and ODS sets the stage for implementation of a full retrospective processing capability. When linked to work on the metadata database, users can make very general queries to produce ad hoc datasets without the cost of additional software development beyond that being used for real-time processing. Changes to the Facility Information and Control System (FICS) monitor were implemented to accommodate the arrival of a variety of new datasets. Scripts were extended or developed to monitor operation of the new Mass Store System (MSS), installed as part of the FSL High-Performance Computing System (HPCS). The FICS monitor system was also ported to a new server host, and new products were added. To support real-time AWIPS data processing, AWIPS review cases were loaded from the MSS to the Network File System (NFS) fileserver. Assistance was provided to COMET collaborators when they needed data from the MSS to build review cases. The FSL localization AWIPS data server was upgraded from build 4.3 to build 5.0 with minimal interruption of service. Build 5.0 LDAD (Local Data Acquisition and Dissemination) data processing was added to this system. To accommodate LDAD data users, several LDAD data providers were added, and a semiautomatic process for routinely updating LDAD metadata files was implemented along with new data parameters, as necessary. In support of the FX-Net project, the build 4.3 GYX (Portland, Maine) localization was upgraded to K-Series HP hardware and upgraded to build 5.0 AWIPS software. A new build 5.1 Aviation Weather Center (AWC) localization was implemented to support FX-Net National. The GPS Integrated Precipitable Water Vapor (IPWV) data were temporarily made displayable by FX-Net for demonstrations at the American Meteorological Society (AMS) annual meeting. 
General support provided for all Central Facility AWIPS data processing included implementation of an integrated build 5.0 FSL localization, setup of FICS monitoring capabilities, and an ability to process NOAAPORT radar data from radar sites outside the default localization. The Data Systems Group led the work to replace a UniTree-based Mass Store System with a HPCS Mass Store System, and all users have transferred to the new system. The Real-Time NIMBUS Data Saving (RTNDS) was converted from the UniTree-based MSS system to the HPCS MSS. RTNDS improvements and added functionality give FSL users greater reliability and increased throughput. Difficulties associated with the new HPCS MSS were countered with tools written to assist users in storing and retrieving data from the system. Meetings were held with the MSS vendor to explore solutions to problems with the system. For efficiency reasons and to work around problems when accessing the HPCS MSS via NFS, staff devised an alternative access method using the earlier-developed Data Storage and Retrieval System Support (DSRS).

Laboratory Project, Research, and External Support

ITS continued to distribute real-time and retrospective data and products to all internal FSL projects and numerous outside groups and users. External recipients include:
Other data and product sets were provided to outside groups, including Doppler radar, ACARS, upper-air soundings, Meteorological Aviation Reports (METARs), profiler, satellite imagery and soundings, and MAPS and LAPS grids. As liaison for outside users, the Systems Support Group provided information on system status, modifications, and upgrades. The Systems Support Group implemented a logging system, the SSG Log (utilizing the FSLHelp system), that allows better intercommunication among staff in all areas of the facility and other divisions, resulting in a higher level of service in dealing with the numerous, varied issues responded to on a daily basis. This log also provides a means for recording a history of events and tracking procedures used to correct problems. During four months of 2001, about 400 SSG Log tickets were initiated and resolved, and about 150 customer FSLHelp requests were handled for data compilations, file restoration, account management, and video conferencing. The Web database that documents the procedures for maintaining the Central Facility real-time datasets has grown to approximately 115 documents. Ongoing refinements and updates to documents are necessary because of new procedures and outdated information. Operators use this documentation to efficiently troubleshoot and resolve issues involving real-time data, resulting in shorter down times of that monitored data. The Systems Support Group was reorganized with the departure of the Lead Operator, appointment (promotion) of a new Lead, and the hiring of a new full-time Operator. This provided an opportunity to restructure schedules and spread out coverage of the full-time staff to allow for improved coverage during absences and reducing overtime. Staff performed the daily laboratorywide computer system backups with ~300 GB of information written each night for ~260 clients. Quarterly offsite backups were completed on time. 
Division System Administrators implemented and installed three new Network Appliance Systems, which included two new backup robots. The fundamental set ups (create tape pools, add tapes, etc.) were performed to ensure proper daily backups of these systems. A renewed emphasis was placed on securing proper procedures for notifying data users. Refinements were made to the SCADA Temperature Monitoring System to more reliably report problems and transfer data from the system. Additional Web-based tools utilizing the data reports from the SCADA system were implemented to allow staff another means of monitoring real-time temperatures and alarms of critical events. All SSG staff received in-depth training on the computer room VESDA Smoke Detection System and FM-200 Fire Suppression System. Documentation for these systems was created and is regularly maintained. In support of general computer security issues and initiatives for NOAA and FSL, all SSG staff took the NOAA IT online security training. Division staff provided technical advice to FSL management on the optimal use of laboratory computing and network resources, and participated in cross-cutting activities that extended beyond FSL, as follows:
Projections

Computer Facility

Significant changes expected for the HPCS include installation of a 48-processor testbed cluster, based upon Intel’s latest version of the Pentium processor, for evaluation purposes. The 12-TB storage system obtained from the Bureau of Census will be integrated into the HPCS, with most of the space used as bulk scratch space and for experimental file system software. The final upgrade to the HPCS, in July 2002, should include a very significant upgrade to the compute platform, an additional 1.5 TB of high-speed storage, and an additional 150 TB of hierarchical storage. In addition to the hardware upgrade, more robust software will be developed to enhance the stability of the HPCS and ensure that real-time jobs have both availability and redundancy of resources. Enhancements of the Hardware Assets Management System (HAMS) are scheduled for development during 2002, including wireless integration, report wizard, credit card reconciliation, and automated SF120 processing (excess property). Training sessions on the use of HAMS will be conducted for System and Network Administrators and property and procurement staff. The original UniTree-based Mass Store System will be rehosted to a Sun platform from an aging SGI server to save maintenance costs on the server. The StorageTek robot control software will be upgraded and rehosted on a Sun SPARC 5 server acquired as an excess from the Demonstration Division. The UniTree software will be rehosted on an existing Sun Ultra10 server, and the database will be reconfigured using a Sun multipack disk chassis acquired from the HAMS Oracle server upgrade. Selected datasets will be moved from the original MSS to the new system. RoboAdmin will be integrated throughout the laboratory. Classes will be held to train the system administration staff on proper troubleshooting techniques of system administration and network issues.
FSL Network and IT Security

Another technology that has shown a quick acceptance of standards in the computing industry is Gigabit Ethernet (GigE). FSL has clusters of GigE and GigE-capable servers, such as Jet, and is finding a way to integrate this technology into our multigigabit ATM core. Merging ATM and GigE will help FSL optimize the high performance and familiarity of Ethernet, while protecting our investment in ATM and the resiliency that the fully meshed topology provides. IT security also will be incorporated into the design phase of FSL network upgrades. While security in depth is emphasized to address perimeter, local access, and host-based security, the Network Administration Group is taking steps to restructure the backbone architecture connecting FSL to the NOAA Boulder network backbone. The new topology will allow FSL to more effectively implement perimeter security controls. This is important for maintaining FSL data and research integrity and network accessibility, which are critical to our computing infrastructure and mission. Figure 10 shows the planned network upgrade topology, which includes integrated routing, Gigabit Ethernet and ATM, and the new IT security perimeter design. The addition of three networking projects will improve user access to FSL networks. Remote (dial-in) access to FSL will be migrated to a new modem server configuration that fully supports 56-Kbps modems. This new remote access server (RAS) will boost access speeds, and is also capable of supporting upgrades to digital modems if and when FSL needs digital remote access. The RAS server will utilize Radius authentication for encrypting user log-ons and passwords. The FSL virtual private network (VPN) will also be modified to take advantage of Radius for remote dial-in and trusted Internet access. A second planned project is to further test wireless networking capabilities. New standards are forthcoming that will result in better (five-fold) performance and security.
The new wireless standards will benefit FSL staff using mobile laptops and workstations to interact with visitors and make presentations. A third planned project takes a new approach for monitoring and certifying the FSL network. A small business type digital subscriber line (DSL) service will provide IP address and domain name space that is "outside" of FSL space. This will provide an Internet "view" of FSL to test how the rest of the world sees the laboratory. The DSL service has many valuable security monitoring benefits, and also provides non-FSL space for other collaborative efforts such as for WRF-model.org development. Although a low-speed (~384 Kbps) connection, it is the domain name and IP address space separation that makes DSL a unique and valuable service to FSL. With the hiring of a full-time Information Technology Security Officer (ITSO) and heightened awareness of security issues within NOAA, IT security will become a significant strategic emphasis within the laboratory. FSL plans to install a combined firewall and Intrusion Detection System (IDS) solution as a first step toward securing the network perimeter. The firewall and IDS will be an integral part of the new Gigabit Ethernet network backbone, which will allow FSL a modular, off-the-shelf upgrade path to accommodate configuration changes and future bandwidth growth. In order to achieve an economy of scale, the IDS will be implemented as an integral part of the Boulder NOC building-wide system. In addition to hardware security solutions, FSL will augment IT security with administrator and user training, and a comprehensive, up-to-date set of security-related policies and procedures.

Data Acquisition, Processing, and Distribution

Continued support will be provided for AWIPS review cases, including loading of cases and providing collaborative support for COMET staff.
A system with AWIPS Build 5.x software will be set up for building review cases, and the process for building review cases will be further automated. Support will continue for the FSL localization AWIPS data server. Upgrades and enhancements will be performed as required, including the processing of additional NOAAPORT radar data, GRIB image processing, and the implementation of a Linux AWIPS data server. Additional AWIPS data servers will be set up in support of FX-Net National. Load testing will be performed to determine the necessity of additional servers. A supportable long-term solution for making GPS IPWV data displayable by FX-Net National will be implemented. This will necessitate installing and running LDAD data processing on these AWIPS data servers. General production support of the CF AWIPS data servers will continue, including upgrades as necessary. Work toward implementation of a scalable long-term solution for FICS monitoring for these systems will continue. Implementation of metadata with GRIB datasets is planned for FSL real-time data processing. An automated system for acquiring and incorporating digital metadata is part of this plan. Work will continue on the interactive interface that allows for easy query and management of the metadata content. Program interfaces will be added to allow for secure controlled data access. Retrospective data processing and metadata management will be incorporated. The data acquisition and dissemination systems will be redesigned to support the installation of an FSL firewall. WSR-88D Radar data processing will be redesigned to support the acquisition of data from all National Weather Service WSR-88D radars.

Laboratory Project, Research, and External Support

ITS staff will participate in the OAR A76 study by serving on the Performance Work Statement and Most Efficient Organization teams.
The Systems Support Group will continue to service data requests, including more large data retrievals from the Jet Mass Store System. They will provide assistance to System Administrators when feasible in the areas of user account maintenance and other special projects as assigned. Because of 24/7 onsite support and augmentation of staff during emergency coverage needs, an emergency Operator coverage plan will be implemented. The plan will outline the course of action to be taken when emergency coverage of the Operator shifts is required. The tape rotation for quarterly offsite backups will be increased to allow individual machine backups to be on hand for one year. A renewed emphasis will be placed on identifying regularly failing client backups, tracking down the reasons for the failures, and implementing proper corrective measures to reduce the total number of client backups failing on a daily basis. In so doing, system and network resources will be more effectively used, and a higher level of service will be provided for all FSL users. Numerous new products will be added to the FICS monitor. To support these additions, several critical support documents will be updated, and Help documentation will be generated so that the basic functions of the Systems Support Group (monitoring, troubleshooting, and communicating about real-time data issues) will be maintained. Updated documentation, procedures, and other assistance tools such as flow diagrams for notifying data end-users will be implemented to assist and ensure consistency in this important area of customer service. Detailed Web-based documentation for the SCADA temperature monitoring system (setup, maintenance, and usage) will be created. Additional aids for quickly resolving fire-safety issues will be developed. Staff will receive VESDA/FM-200 smoke detection/fire suppression refresher training.
Staff will take the updated version of the NOAA online security training and the in-depth NOAA SANS Security online training. The Systems Support Group will be the first point of contact (via email) for any attack reports. To better support the FSL RUC backup for NCEP, online documentation will be updated and other assistance materials and tools will be implemented.