Hydrologic Frequency Analysis Work Group

Introduction to the Bulletin 17-B Flood Frequency Guidelines: Frequently Asked Questions

The Hydrologic Frequency Analysis Work Group is a work group of the Hydrology Subcommittee of the Advisory Committee on Water Information (ACWI). The Terms of Reference of this work group were approved by the Hydrology Subcommittee on October 12, 1999, and are available on the ACWI web page. The work group was formed to provide guidance on issues related to hydrologic frequency analysis; it replaced the Bulletin 17B Work Group that had existed since 1989. The Hydrologic Frequency Analysis Work Group is open to individuals from public and private organizations, and the current members of the work group are also given on the ACWI web page. The initial objectives of the work group are to
In response to the first objective above, the work group has prepared a list of frequently asked questions and answers that provide additional information relative to the implementation of Bulletin 17B, "Guidelines For Determining Flood Flow Frequency", dated March 1982 and developed by the Hydrology Subcommittee of the Advisory Committee on Water Data. These questions and answers supplement the guidelines given in Bulletin 17B, and it is envisioned that they will be modified or extended in the future as better information becomes available. Any comments on these frequently asked questions and answers, or any new questions and/or answers, should be provided by email to Bill Kirby, a member of the Hydrologic Frequency Analysis Work Group, at wkirby@usgs.gov for review by the Work Group.

BULLETIN 17-B
F(x) = P{annual flood X < x}
     = P{X < x | X is hurricane flood} * P{X is hurricane flood}
       + P{X < x | X is non-hurricane flood} * P{X is non-hurricane flood}

wherein

P{X is hurricane flood} = NH / LHHP
                        = (number of hurricane floods) / (length of historical period analyzed for hurricanes)

and

P{X is non-hurricane flood} = 1 - NH / LHHP.
Note that ALL hurricane-caused annual floods during the period LHHP must be included in the analysis to properly define the conditional distribution of hurricane floods.
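The mixed-population relation above can be sketched numerically. The following is a minimal illustration, not a Bulletin 17-B procedure: the two conditional flood distributions are assumed to be lognormal for demonstration only, and the counts NH and LHHP and all distribution parameters are hypothetical.

```python
# Sketch of the mixed-population computation above (hypothetical numbers).
# Assumptions: the two conditional distributions would in practice come
# from separate frequency analyses; illustrative lognormals are used here.
import math

def lognormal_cdf(x, mean_log, std_log):
    """P{X < x} for a lognormal with the given log10 mean and std."""
    z = (math.log10(x) - mean_log) / std_log
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

NH = 12        # hurricane-caused annual floods observed (hypothetical)
LHHP = 80      # length of historical period analyzed for hurricanes, years
p_hur = NH / LHHP           # P{X is hurricane flood}
p_non = 1.0 - p_hur         # P{X is non-hurricane flood}

def mixed_cdf(x):
    # F(x) = P{X<x | hurricane}*P{hurricane} + P{X<x | non-hurricane}*P{non-hurricane}
    return (lognormal_cdf(x, mean_log=4.2, std_log=0.35) * p_hur +
            lognormal_cdf(x, mean_log=3.6, std_log=0.25) * p_non)

# Exceedance probability of a 50,000 cfs flood under the mixture:
print(1.0 - mixed_cdf(50_000))
```

The hurricane component dominates the upper tail because its conditional distribution extends to larger magnitudes, even though hurricane floods occur in only a fraction of years.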
Question: I have to determine the 1.1-year flood for use in stream restoration analysis. The record contains several low outliers, and the computed frequency curve is not defined for the high exceedance probabilities that correspond to the 1.1-year flood. How do I proceed?
Answer: Bulletin 17 methodology is not designed for and should not be used
to determine high-frequency low-recurrence-interval flood magnitudes or to determine
risks due to occurrence of low-magnitude floods. This is the case whether or
not there are low outliers, even if the computation does yield a value for the
1.1-year flood. Bulletin 17 is based on the annual-flood probability model,
in which it is assumed that exactly one flood event occurs per year. This probability
model usually is adequate as an approximation for risks due to large-magnitude
low-frequency floods (which are unlikely to occur at all during any given year,
and extremely unlikely to occur more than once during the year). The annual-flood
model, however, does not adequately represent the occurrence of low-magnitude
floods. In most streams, several low-magnitude flood events occur in most years.
In most cases, if floods of this magnitude cause damages or other effects of
concern, such as channel-forming activity, then each occurrence of such a flood
will contribute to the cumulative effect for the year. Proper accounting of
the risk requires consideration not only of the distribution of individual flood
magnitudes but also of the distribution of the number of events that occur in
a year. The annual-flood analysis does not furnish the necessary information
about the likelihood of multiple flood occurrences per year. Generally speaking,
the annual-flood model understates the total risk or total effect for the year
because of undercounting of the number of minor floods in the year. Conversely,
it overstates the recurrence interval of minor floods of a given magnitude,
again because of failure to recognize the occurrence of multiple events per
year. A different method of frequency analysis, called the "partial duration"
method (because the "duration" of time associated with each flood
event is only a "partial" year), or "peaks over threshold (POT)"
method, which explicitly considers multiple events per year, is required. This
methodology is described, for example, in Hydrology for Engineers (1982, by
Linsley, Kohler, and Paulhus, McGraw-Hill, pages 359, 373-347) and consists
of selecting all distinct well-separated flood peaks exceeding a given threshold
magnitude, ranking them, estimating the recurrence intervals by the formula
T = (N+1)/m (where N is the record length, in years, and m is the rank of the
peak), and plotting magnitude versus recurrence interval. The threshold is commonly
set so that a long-run average of about 3 peaks per year will be recorded; thus
recurrence intervals as low as about 1/3 year can be defined. Relating the recurrence-interval
curve to probabilities of occurrence requires consideration of the frequency
distribution of the number of above-threshold peaks per year, as summarized,
for example, in the Handbook of Hydrology (1993, D.R. Maidment, editor, McGraw-Hill,
page 18.37). (That having been said, if administrative or regulatory requirements
necessitate use of the 1.1-year annual flood, and if that value is not computed
because of low outliers or zero flows, a value can be determined by graphical
plotting of the low end of the frequency curve or by manual calculation using
the "synthetic" statistics described in appendix 5 and printed by
the computer as "Bulletin-17-B estimates". If more than 9 percent
of the annual peaks equal zero, then the 1.1-year flood equals zero; if more
than 1/3 of the peaks equal zero, then the 1.5-year flood equals zero.)
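The partial-duration tabulation described above can be sketched in a few lines. This is only an illustration of the ranking formula T = (N+1)/m; the peak list, record length, and threshold are hypothetical, and the (nontrivial) step of identifying distinct, well-separated peaks above the threshold is assumed to have been done already.

```python
# Sketch of the partial-duration (peaks-over-threshold) tabulation.
# 'peaks' is assumed to contain all distinct, well-separated peaks that
# exceeded the chosen threshold; values are hypothetical.
peaks = [4200, 1800, 2500, 900, 3100, 1500, 5200, 2100, 1200,
         2800, 950, 4700, 1600, 2300, 3600]
N = 5  # record length in years; threshold set to yield ~3 peaks per year

# Rank peaks from largest (m = 1) to smallest and apply T = (N + 1) / m.
ranked = sorted(peaks, reverse=True)
table = [(m, q, (N + 1) / m) for m, q in enumerate(ranked, start=1)]

for m, q, T in table:
    print(f"rank {m:2d}  peak {q:5d}  recurrence interval {T:.2f} yr")
```

With about 3 peaks recorded per year, the lowest-ranked peak has m close to 3N, giving a recurrence interval near (N+1)/(3N), i.e. roughly 1/3 year, consistent with the text above.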
Question: How important is data quality in the validity of Bulletin-17-B frequency results? What issues need to be checked?
Answer: Data quality is obviously important to validity of the Bulletin-17-B frequency analysis, since the frequency analysis is basically nothing other than a standardized summary of the underlying flood data set. In a critical review of the flood data set, two broad sets of issues need to be considered: 1) relevance of the flood data set (and frequency analysis results) to estimation of future flood risk, and 2) accuracy of the data set as a representation of the flood events that actually occurred in the past. In the first set of issues, factors such as flow regulation by dams, dam failures, stormwater management, effects of development (or reversion to undeveloped conditions) in the flood plain, stream channel improvement or restoration, and the effects of mining, forestry, agriculture, or reclamation from those activities, all have the potential to make all or part of the record unrepresentative of future flood risk. The significance of these factors and the nature of any adjustments that might be applied for estimation of future flood risk cannot be predicted in general and depends on the specific situation at each site; no simple guidelines can be given that could safely be followed blindly or dogmatically. The application of records of past floods (including frequency analysis results) to decision-making about the future is outside the scope of frequency analysis and belongs to the realm of engineering-economic decision making.
Regarding the accuracy of the data, it is helpful to consider the process for computing the flood record, the potential sources of error, and the steps taken to detect and correct errors. Most annual-peak flows are determined by sensing the stage or water level at the gage and reading the flow (discharge) from a stage-discharge rating curve. The rating curve is made by correlating direct measurements of discharge, made by current meters or similar devices, with concurrent measurements of stage. The accuracy of the annual peak flow value then depends on the accuracy of the stage reading and the accuracy of the stage-discharge relation. The accuracy of the stage-discharge relation, in turn, depends on the accuracy, number, and flow magnitudes of the direct discharge measurements used to establish the relation. An important factor in promoting accuracy of records is a long-term organizational commitment and focus on production of records, along with a regularized process for checking and reviewing the data collection and computations, cross-checking the results against records at nearby streams, and annual publication of the records for public examination and use.
Issues of data accuracy are most likely to affect the top-magnitude floods in the data set. These events occur rarely, so there are fewer opportunities to define the stage-discharge rating for events in this range. In addition, these are the most destructive events, and are more likely to destroy or damage the gage, or impair the operation of the instrumentation. The uncertainty associated with historical peak discharges is usually greater than that associated with peaks that are part of the systematic record. The analyst should evaluate if the historical peak discharges are reliable enough to be used in the analysis. Issues that may be of concern include whether the historical sources provide sufficient substantive information to associate a stage or discharge with the historical event, whether the historical stage is referenced to the same gage datum as the stages used to develop the stage-discharge rating used to compute the discharge, and whether the stage-discharge rating adequately reflects the hydraulic conditions that existed in the channel and flood plain at the time of the historical event.
For historical data, attention must be paid to what is not in the data set as well as to the accuracy of the recorded historical peaks. As explained in more detail below, the Bulletin-17-B procedure for historical data involves defining a historical threshold discharge that separates the record into two classes of peaks which are given different weights in the computation. Bulletin 17-B requires that the threshold be set at a level high enough to ensure that it was not exceeded by any peaks that are not in the record. Any non-systematic peaks that are below the threshold are unusable statistically. Although the precise numerical value of the threshold is of little consequence, since it is not used for computation, setting the threshold to correctly identify the number and magnitudes of the peaks to be adjusted is critical to the accuracy of the historical adjustment. There is a tendency to casually assume that any peak that is outside of a period of systematic record is a true historical peak in the sense of Bulletin 17-B, and a tendency to improperly set the threshold at the level of the lowest such peak. Occasionally, records contain non-systematic peaks that are lower than many of the systematic peaks and contain few or no higher non-systematic peaks. In these cases, it is likely that higher peaks actually did occur outside the systematic record period but were not included in the non-systematic record, thus violating the assumptions underlying the historical adjustment. Setting the threshold too low results in improper discounting of the high-magnitude peaks relative to the below-threshold peaks. The analyst should check that the number of peaks exceeding the threshold during the systematic record period is consistent with the number during the historical period, and should check that the threshold level is not so low that it could have been exceeded without anyone's taking note of it. 
Accuracy of the length of the historical period also is important because the value is used to compute the amount by which the above-threshold peaks are discounted; knowledge of local history is critical.
Question: What is the relationship of the Federal Data Quality Act to flood data and flood-frequency analysis?
Answer: The "Federal Data Quality Act" (officially known as Section 515 of Public Law 106-554, the Treasury and General Government Appropriations Act for Fiscal Year 2001) requires the Office of Management and Budget (OMB) and, through it, all Federal agencies to issue guidelines to ensure the "quality, objectivity, utility, and integrity" of information issued by the government. The agencies are required to develop procedures for reviewing and substantiating the quality (including objectivity, utility, and integrity) of information before it is released. The agencies also are required to establish administrative procedures by which persons affected by government-disseminated information can seek and obtain correction of information that does not conform to the quality guidelines. The general intent of the guidelines is that agencies should make their data-collection, data-analysis, and data-interpretation methods "transparent" by providing documentation of the methods; should ensure data and information quality by reviewing the methods used (including, as appropriate, consultation with experts and users); and should keep users informed about corrections and revisions. These guidelines and procedures apply to government-disseminated information in general, and thus apply to flood data and to the results of statistical flood frequency analysis. The guidelines and procedures apply not only to information produced internally by agencies themselves, but also to information supplied by outside sources.

Question: What is a low outlier? How is it different from a zero flow? Why do we drop low outliers and zero flows? Aren't we overstating flood risk if we ignore flood peaks that are zero or near zero?
Answer: Outliers are observations that lie far out from the trend of the rest of the data when plotted on a magnitude versus frequency graph. A smooth trend line, such as a statistical frequency function, does not fit data sets with outliers, and the fitted curve usually fails to fit the bulk of the data as well as the outlier. Low outliers are outliers at the low end of the data set, near zero, at least in comparison with the rest of the data. On a log-probability plot, the low outliers impart a strong downward curvature and a downward-drooping lower tail to the frequency curve. In comparison with the lower tail, the upper tail of the low-outlier-affected curve may appear relatively flat. In the Bulletin-17-B context, low outliers differ from zero values in that computations with the logarithm of zero are impossible, whereas computations with logarithms of low outliers may be mathematically possible, but may overwhelm the computations of logarithmic moments (means, standard deviations, and skews) or distort the fit of the frequency curve to the data in the upper part of the data set, which are the data that represent significant flood or near-flood events. Since the zero and near-zero values in a flood data set are not the ones that convey valid or meaningful information about the magnitude of flooding, their numerical values are not used in the computation of the moments. However, in contrast to classical statistical treatments, where outliers are considered utterly spurious and are simply dropped from the data set, the Bulletin-17-B procedure recognizes that zero values and low outliers do convey valid and meaningful information about the frequency of flooding, and this information is used in Bulletin 17-B. 
Thus, the Bulletin 17-B procedure first uses the non-zero non-low-outlier data to define a conditional-probability curve which applies only to the non-zero non-low-outlier events; then the number of zeroes and low outliers is determined and used in the conditional-probability or "n-over-N" adjustment (equation 5-2) to adjust the probabilities from the conditional curve to properly reflect the frequency of occurrence of zeroes and low outliers.
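The conditional-probability ("n-over-N") adjustment described above is a simple rescaling of exceedance probabilities. The sketch below shows only that rescaling step; the counts are hypothetical, and in a real analysis p_conditional would be read from the frequency curve fitted to the non-zero, non-low-outlier peaks.

```python
# Minimal sketch of the conditional-probability ("n-over-N") adjustment.
# Hypothetical counts; p_conditional would come from the frequency curve
# fitted to the non-zero, non-low-outlier peaks.
N_total = 60        # total annual peaks in the record
n_used = 51         # peaks above the truncation level (no zeroes/low outliers)

def adjusted_exceedance(p_conditional):
    # Unconditional exceedance = P{peak is above truncation} * conditional prob.
    return (n_used / N_total) * p_conditional

# A flood with a 10-percent conditional exceedance probability:
print(adjusted_exceedance(0.10))
```

The adjustment lowers every exceedance probability by the factor n/N, so the frequency of zeroes and low outliers is reflected in the final curve even though their numerical values never enter the moment computations.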
Question: When should low flows that are not identified as low outliers by the 17B default procedure be censored under the paragraph in Bulletin 17B on page 18 that reads, "If multiple values that have not been identified as outliers ..."?
Answer: Bulletin-17-B detects low outliers by means of a statistical criterion (the Grubbs-Beck test) rather than by consideration of the influence of low-lying data points on the fit of the frequency curve. The test is based on the standardized distances, (x.i - x.bar)/stdv, between the lowest observations and the mean of the data set. The test is easily defeated by occurrence of multiple low outliers, which exert a large distorting influence on the fitted frequency curve, but also increase the standard deviation, stdv, thereby making the standardized distance too small to trigger the Grubbs-Beck test. Therefore, Bulletin 17-B (pg. 18) permits manually overriding the statistical criterion. Obviously, the intention is to allow as many low outliers to be designated as necessary to achieve a good fit to the part of the data set that contains the significant flood and near-flood events. Equally obviously, the intention is that the Grubbs-Beck result be used unless the resulting poor fit gives compelling justification for not doing so. There is no universal method that can be followed blindly to achieve a good fit. The sensitivity analysis alluded to in Bulletin 17-B is based on the engineering-hydrologic-common-sense proposition that the smallest observations in the data set do not convey meaningful or valid information about the magnitude of significant flooding, although they do convey valid information about the frequency of significant flooding. Therefore, if the upper tail of the frequency curve is sensitive to the numerical values of the smallest observations, then that sensitivity is a spurious artifact based on the mathematical form of the assumed but in fact unknown flood distribution, and has no hydrologic validity. 
The sensitivity analysis determines whether the upper tail of the frequency curve is sensitive to the magnitude of the lowest values by iteratively treating them, one by one, as low outliers and plotting the estimated value of, say, the 100-year flood (or other percentage point or points characteristic of the upper part of the frequency curve) as a function of the number of low outliers. Frequently, the estimated 100-year flood will change noticeably and consistently, either increasing or decreasing, as the first few low outliers are identified, but then remain relatively constant, perhaps changing erratically, as additional data points are treated as low outliers. In such cases the identity of the spuriously influential data points -- the low outliers -- is clear, and the low-outlier threshold is set just above the magnitude of the highest spuriously influential data point. Note that the magnitude of the change resulting from low outlier treatment is not the deciding factor, but rather the change in the magnitude of change as additional points are treated as low outliers. In more complex cases, there may not be a clear demarcation of the low outliers, and the entire low end of the data set may be inconsistent with the fitting of the log-Pearson Type III distribution to the upper (hydrologically significant) part of the data. In such cases it may be necessary to rely on visual assessment of the fit of the upper part of the frequency curve, and Bulletin 17-B allows for this necessity.
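The sensitivity analysis described above can be sketched as a short loop. This is an illustration only: the peak data are hypothetical, the log-Pearson Type III quantile is evaluated with the Wilson-Hilferty approximation to the frequency factor rather than the Bulletin 17-B tables, and the conditional-probability adjustment for the dropped values is omitted for brevity.

```python
# Sketch of the low-outlier sensitivity analysis: drop the lowest k values
# one at a time and track the estimated 100-year flood. Hypothetical data;
# Wilson-Hilferty approximation used for the Pearson III frequency factor.
import math, statistics

def skew(x):
    # Adjusted sample skew coefficient, as used for log-Pearson III fitting.
    n, m = len(x), statistics.mean(x)
    s = statistics.stdev(x)
    return (n / ((n - 1) * (n - 2))) * sum(((v - m) / s) ** 3 for v in x)

def lp3_q100(peaks):
    logs = [math.log10(q) for q in peaks]
    m, s, g = statistics.mean(logs), statistics.stdev(logs), skew(logs)
    z = 2.32635  # standard normal deviate for 1-percent exceedance
    if abs(g) < 1e-6:
        k = z
    else:
        # Wilson-Hilferty approximation to the Pearson III frequency factor
        k = (2.0 / g) * ((1.0 + g * z / 6.0 - g * g / 36.0) ** 3 - 1.0)
    return 10.0 ** (m + k * s)

peaks = sorted([120, 150, 900, 1100, 1300, 1500, 1700, 2000,
                2400, 2900, 3500, 4200, 5100, 6300, 8000])
for k_drop in range(0, 4):
    q100 = lp3_q100(peaks[k_drop:])
    print(f"{k_drop} low outliers removed: Q100 ~ {q100:,.0f}")
```

In a pattern like the one the text describes, the estimate shifts noticeably as the first one or two low-lying points are removed and then settles; the threshold would be set just above the highest spuriously influential point.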
Question: Does dropping multiple low outliers improve the estimate of the 100-year flood at the expense of distorting the estimates of the lower-recurrence-interval (10, 20, 50-year) floods?
Answer: No. The intent and the result of the low outlier adjustment are to improve the fit of the entire frequency curve above the low-outlier threshold.
Question: What is the difference between a high outlier and a historical flood?
Answer: High outliers and most historical floods both are exceptionally large floods. High outliers are exceptionally large floods that are contained in the systematic record, whereas historical floods were observed outside the period of systematic record. Systematic records are collected during periods of systematic stream gaging, usually continuous series of years, in which flood data are observed and recorded annually, regardless of the magnitudes of the floods. A nonsystematic record is collected and recorded sporadically, without definite criteria, usually in response to actual, perceived, or anticipated major flooding. The systematic record can be used directly in flood frequency analysis. The non-systematic record cannot be used unless additional information can be supplied to relate it to the population of all flood peaks. Bulletin 17-B requires that the non-systematic record be a complete record of all flood peaks that exceeded some threshold level during a definite historical time period.
A high outlier is an extraordinary flood that occurred during the period of systematic streamgaging. It is part of the systematic record and is treated just like the other systematic peaks in the preliminary steps of the Bulletin-17-B analysis. On a magnitude-vs-probability plot, the high outlier lies well above the fitted frequency curve, and the fitted frequency curve is steeper than the trend of the other plotted data points. Often historical information is available that indicates that the high outlier was the largest in a period longer than the period of systematic streamgaging. This historical information is used to adjust the frequency curve to take proper account of the extended time period associated with the high outlier. If no usable historical information is available, the high outlier is retained in the systematic record and used without adjustment.
A historical flood is a major flood that occurred outside of the period of systematic streamgaging. The stage or elevation of the historical flood is usually determined by high-water marks left by the flood and recorded by local residents, state departments of transportation (DOTs), railroad companies, local, state or Federal agencies. Stages of historical floods often are reported in local newspapers, diaries or Bibles of local residents, unpublished documents of state DOTs or railroad companies, and/or published reports of local, state or Federal agencies. Because the historical event was not observed in accordance with definite statistical sampling criteria, and is not part of the systematic record, its relation to the underlying process of flood occurrence is uncertain. This is so regardless of the accuracy with which the stage and discharge might have been determined. For example, a historical flood that washed out a bridge might have been recorded although a larger flood that caused no damage might have gone unremarked. The historical flood cannot be used in flood frequency analysis unless additional information (historical threshold and historical period) is available to relate it to flood occurrence over a historical time period.
The computational procedures in Bulletin 17-B Appendix 6 are applied to both historical floods and high outliers.
Question: Why do we bother with historical floods and high outliers? Why don't we just use the systematic gage record?
Answer: We bother with historical floods and high outliers because systematic streamflow records usually are short and may be inconsistent with the longer-term flood history experienced and recorded non-systematically by the local community. Sometimes a short systematic record contains an extraordinary flood peak that stands head and shoulders above everything else in the systematic record and everything else experienced in the history of the local community. Conversely, the long-term community history may record one or more outstanding floods that are much larger than anything in the systematic record. In either case, the systematic record disagrees with long-term community experience. Such discrepancies must be resolved if the frequency analysis is to be a sound basis for planning. The Bulletin-17-B historical adjustment procedure provides a basis for reconciling these discrepancies.
Question: What is the high outlier threshold?
Answer: There are two different quantities that sometimes are called high outlier thresholds. One is the statistical high-outlier test criterion or threshold computed by the Grubbs-Beck test and described on page 17 of Bulletin 17-B. The other threshold, which Bulletin 17-B does not clearly describe or distinguish from the statistical threshold, is the threshold that is used in (or implied by) the Bulletin-17-B historical adjustment procedure that actually is applied to the flood record. This threshold may be called the historical-adjustment threshold; it does not necessarily equal the statistical high-outlier test criterion, and may be either higher or lower than it.
The statistical high-outlier test criterion is based on the standardized distance, (x.i - x.bar)/stdv, based on logarithms of peak flows, between the extreme top observation and the mean of the data set. The criterion or threshold value is the value that is unlikely (10 percent chance) to be exceeded by the LARGEST observation in a sample. If the largest observation actually is greater than the high-outlier threshold, that is an indication that the observations above the threshold are larger than would be expected for the given period of record, that they may be associated with a longer time period than the period of systematic record and that they may be distorting the fit of the frequency curve. The test itself is only a warning, not a definitive indication of anything wrong, and additional historical information must be supplied before any adjustment can be made. If appropriate historical information is available, the historical adjustment can be made even if the statistical test does not detect a high outlier.
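The statistical criterion above is straightforward to compute once the detection deviate K_N is in hand. The sketch below takes K_N as an input to be read from the Bulletin 17-B table rather than recomputing it; the peak data are hypothetical, and the tabulated value shown should be checked against the Bulletin before use.

```python
# Sketch of the statistical high-outlier criterion: a peak is flagged when
# its log10 value exceeds x_bar + K_N * stdv of the logarithms. K_N is the
# one-sided 10-percent detection deviate tabulated in Bulletin 17-B and is
# supplied as an input here. Data are hypothetical.
import math, statistics

def high_outlier_threshold(peaks, k_n):
    logs = [math.log10(q) for q in peaks]
    return 10.0 ** (statistics.mean(logs) + k_n * statistics.stdev(logs))

peaks = [800, 950, 1100, 1300, 1600, 2000, 2600, 3400, 4500, 30_000]
k_n = 2.036  # nominal tabulated K_N for N = 10; verify against the Bulletin
threshold = high_outlier_threshold(peaks, k_n)
flagged = [q for q in peaks if q > threshold]
print(threshold, flagged)
```

As the text emphasizes, a flagged peak is only a warning that historical information should be sought; no adjustment follows from the test by itself.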
The historical-adjustment threshold is specified either explicitly or implicitly in the course of defining the non-systematic historical record period and applying the Bulletin-17-B historical data adjustment. Bulletin 17-B does not describe or discuss this threshold clearly, but instead simply assumes that the Z highest peak discharges in the combined systematic and historical (non-systematic) record are known also to be the Z largest in a historical period longer than the systematic record period. If the threshold is stated explicitly, then Z is the number of peaks that exceed the threshold; if Z is given instead, then the threshold is implied to be somewhere between the Z and Z+1 ranked peaks. Although the number Z and the historic period length H are sufficient for computing the historical-data adjustment, an actual discharge threshold (or range of values) is needed to properly document the historical record, and should be reported along with the results of the historical-data adjustment. An actual threshold value or range is needed for two reasons. First, the number of events Z will become outdated, and will have to be updated, whenever a new peak occurs that exceeds the previous Z-th ranked peak. Second, and more important, one cannot legitimately claim, without any support, that the Z largest peaks in the record at hand are the largest in a longer period; it is necessary to demonstrate, or at least provide plausible support, that there actually were no other peaks that occurred but were not included in the record. In most cases, this support has to be based on the idea that if any peaks greater than some magnitude had occurred, people would have noticed and recorded them; that magnitude, the magnitude that would get almost everyone's attention and ensure that a record would be made, is the magnitude that should be determined and documented as the historical-adjustment threshold to support the historical-data adjustment.
It should be noted that the numerical value of the historical-adjustment threshold is not used in the computation of the historical data adjustment. The threshold value is used only to separate the Z largest values from the remaining data points; the results of the adjustment are the same for threshold values anywhere between the Z and Z+1 ranked data points. This is important because the threshold cannot always be determined with much certainty or precision from the available historical information.
If there is not sufficient historical information available to determine a historical-adjustment threshold and length of historical period, then any historical (non-systematic) peaks are not usable for statistical analysis, because their relation to the underlying process of flood occurrence is unknown. Similarly, if the historical information is inadequate to adjust for the high outliers, then they should be retained as part of the systematic record and all peak discharges given equal weight in computing the moments (mean, standard deviation and skew).
Question: How is the threshold determined in the historical adjustment procedure?
Answer: The historical-adjustment threshold discharge is chosen high enough such that all high outliers and historical floods included in the adjustment procedure are the only floods known to exceed the threshold in the historical period of H years. In other words, the record is known to be complete for all events exceeding the historical-adjustment threshold. There is no single procedure that can be followed blindly to determine the threshold from the historical information usually available. Determination of the threshold usually will be based on consideration of channel-bank and floodplain elevations, elevations of important structures, and the history of the neighboring community. Although the determination may involve elements of subjectivity and judgement, the choice of the historic threshold should be defensible given the available historical information. The historical-adjustment threshold often is less than the computed high-outlier threshold.
Question: Must a peak discharge exceed the high outlier threshold to be included in the historical adjustment procedure?
Answer: No, the high outlier threshold (the statistical test criterion given by equation 7) is just used as guidance in determining whether a peak discharge is so large that it might require use of the Bulletin-17-B historical adjustment procedure. If the peak discharge exceeds the high outlier threshold, then the analyst should determine whether historical information is available that indicates the high outlier is the largest flood in a period longer than the systematic record. If there is useful historical information available, the high outlier may be adjusted even though it does not exceed the computed threshold. Therefore, the historic threshold used in adjusting for high outliers (and historical floods) can be less than the computed high outlier threshold.
Question: What is the difference in the historical adjustment procedure for high outliers and historical floods?
Answer: There is no difference in the computation; the historical adjustment computation does not distinguish between historical peaks and high outliers. The same threshold is used for adjusting for both types of floods. The number of peaks above the threshold is denoted as Z, and no distinction is made between high outliers and historic peaks in determining the value of Z. Historically adjusted moments are computed using a weight of 1.0 for the Z above-threshold peaks and a weight W = (H-Z)/(N+L) (H = total length of historical and systematic record; N+L = number of below-threshold systematic peaks, including low outliers and zeroes) for all systematic peaks below the threshold.
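The weighting just described can be sketched for the historically adjusted mean. This is an illustration with hypothetical data: the adjusted standard deviation and skew are weighted in the same fashion (per Bulletin 17-B Appendix 6) but are not shown here.

```python
# Sketch of the historically weighted log10 mean (Bulletin 17-B Appendix 6
# weighting scheme). Weight W = (H - Z) / (N + L) applies to below-threshold
# systematic peaks; above-threshold peaks get weight 1. Hypothetical data.
import math

H = 100   # total length of historical plus systematic record, years
Z = 3     # peaks above the historical-adjustment threshold
N = 37    # below-threshold systematic peaks
L = 0     # low outliers and zeroes excluded

W = (H - Z) / (N + L)

logs_above = [math.log10(q) for q in (40_000, 55_000, 72_000)]
logs_below = [math.log10(1000 + 150 * i) for i in range(N)]  # stand-in record

weighted_mean = ((W * sum(logs_below) + sum(logs_above)) /
                 (W * len(logs_below) + len(logs_above)))
print(f"W = {W:.3f}, weighted log10 mean = {weighted_mean:.3f}")
```

Note that with L = 0 the effective number of weighted observations, W*N + Z, equals H: the below-threshold systematic record is, in effect, replicated to fill out the full historical period.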
Question: Please clarify the definition of the variable "Z" in the historical adjustment procedure, appendix 6, and clarify the intended application of the procedure. Bulletin 17 seems to say that historical data should be used if possible, but appendix 6 seems to indicate that the historical peaks need to be the largest in the whole record.
Answer: Yes, Bulletin 17 does say that historical data should be used if possible. And yes, Appendix 6 does indicate that the historical adjustment is applied to the largest peaks in the whole record. The key words are "if possible"; two conditions must be met.

First, historic peaks and high outliers, by their nature, are expected to be a biased (unrepresentative) sample of the population of all peaks. The Bulletin 17-B procedure, however, assumes that the high outliers and historic peaks are an unbiased sample of the population of flood events that exceed the historical-adjustment threshold magnitude. If that assumption is questionable, then the historical information is wholly or partly unusable. Any historical peaks that do not exceed the threshold cannot be used, because their relation to the flood population is undefined. Potential high outliers that do not exceed the threshold are used in the same way as any other ordinary systematic peaks.

Second, the Bulletin-17-B historical adjustment procedure postulates the existence of exceptionally large floods (historic peaks and high outliers) in the data set. When such peaks are present, the systematic streamgaging record, especially if short, may be inconsistent with the neighboring community's long-term knowledge of flood occurrence, and some reconciliation of the gage record with the community experience is required. The historical-adjustment procedure accomplishes this reconciliation by using the above-threshold (historic and high-outlier) peaks with unit weight and the below-threshold systematic peaks with the historical weight factor (H-Z)/(N+L), in effect filling in the rest of the extended historical period with multiple copies of the below-threshold systematic record. Thus, Z represents the total number of peaks, systematic and historic, that exceed the threshold.
It is quite possible and acceptable for Z to consist of, for example, 3 historical peaks that exceed the threshold, plus 3 systematic peaks that not only exceed the threshold but also exceed the 3 historical peaks; if there were several additional historical peaks that did not exceed the threshold, they would simply be ignored because their relation to the flood population would be indeterminate. It is also perfectly acceptable for Z to consist of one or more systematic peaks (high outliers) and no historical (non-systematic) peaks at all. However, if Z = 0 (no large peaks at all, only the knowledge that no peaks exceeded some threshold), the Bulletin 17-B historical adjustment has no effect -- the computation can be performed, but the frequency curve is unchanged.
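Determining Z reduces to a simple count, sketched below with an invented helper name, under the assumption stated above that historic peaks at or below the threshold are simply set aside.

```python
def count_Z(systematic, historic, threshold):
    """Z = total number of peaks, systematic or historic, that exceed
    the historical threshold. Historic peaks at or below the threshold
    are dropped from the analysis (their relation to the flood
    population is indeterminate); systematic peaks below it keep their
    ordinary below-threshold role."""
    big_systematic = [q for q in systematic if q > threshold]
    usable_historic = [q for q in historic if q > threshold]
    return len(big_systematic) + len(usable_historic)
```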
Question: If all peaks that exceed the historical threshold are treated the same, why do we have all of this gobbledygook about "systematic peaks above the threshold," "high outliers," etc? Why don't we just call them "historical peaks" and be done with it?
Answer: All peaks that exceed the historical threshold are indeed treated the same IN THE HISTORICAL ADJUSTMENT COMPUTATION. However, in the preliminary analysis of the systematic record, the historical peaks are ignored whereas the high outliers are treated exactly like the other systematic peaks. Improper treatment of high outliers and historic peaks in the systematic-record analysis can adversely affect the final Bulletin-17-B frequency curve, primarily through incorrect skew coefficients and mis-identification of high and low outliers.
The systematic record is that portion of the record in which the annual peak is determined and documented for each year, regardless of the magnitude of the peak; if the peak is too small to measure, it nonetheless is recorded, but with a qualification code indicating that fact. Thus, the systematic record can be regarded as an unbiased random sample of the population of all floods, and the statistics of the sample can be taken as estimates of the corresponding characteristics of the population. The historical record consists of flood peaks that were observed outside of a systematic period of flood record. That is, the historical record is non-systematic. One might assume that such peaks were observed and recorded because they were unusually large and noticeable (or were expected to be so), but there is no real guarantee of this unless additional historical evidence is available. Since one does not know how the historical sample is related to the flood population, one cannot use it for flood estimation unless additional historical information is provided. Bulletin 17's historical flood threshold and historical period together provide the information needed to make use of the historical peaks. If the threshold and historical period cannot be defined, or if the historical peaks are less than the threshold, then the historical peaks cannot be used.
QUESTION: Why is there so much emphasis on low and high outliers?
RESPONSE:
Bulletin 17B does not explain this very well. It lumps a number of distinct problems and phenomena under the label "outlier," but does not give much explanation of how the Bulletin-17-B conception of outliers differs from the classical concepts developed in the literature on statistics and analysis of measurement data.
Bulletin 17-B defines outliers as data points that depart significantly from the trend of the remaining data when plotted as a frequency curve on magnitude-probability coordinates. By implication, outliers are data points that interfere with the fitting of simple trend curves to the data and, unless properly accounted for, are likely to cause simple fitted trend curves to grossly misrepresent the data. This definition is quite nebulous, furnishes little concrete guidance, and may be confusing to those unfamiliar with flood frequency analysis.

Nevertheless, flood data sets often do not conform to common statistical probability distributions and often contain observations that distort the fit of simple frequency curves. Most flood data points fall within some range of moderate extent, but some values extend above that range by factors of 10 or more; for this reason, statistical analysis usually is based on the logarithms of the flows. In arid environments, streams sometimes may be dry all year long, so that the annual maximum "flood" flow may be zero or, perhaps worse, a factor of 10 or more below the range of ordinary flows, making computations with logarithms impossible or error-prone. Bulletin 17-B rather loosely gathers all of these issues, which arise from a range of causes but typically manifest themselves as poor fit of the frequency curve, under the single and somewhat misleading term "outlier."
Classical statistical concepts of outliers involve ideas of rejection of spurious observations, such as surveying measurements of the azimuth of the North Church steeple rather than of the Blue Hill triangulation beacon. These ideas are not generally very relevant to frequency analysis of properly quality-assured published flood flow data, and Bulletin 17-B does not recommend or provide procedures for rejection of outliers.
The related and more modern notion of contamination of one measurement distribution by another is a special case of the concept of mixed populations mentioned briefly in another section of Bulletin 17-B. Different flood-generating hydrologic processes in one basin give rise to mixed populations of floods. The mixture may manifest itself as outliers, especially if the populations are quite different and one of them occurs relatively infrequently. In this case, the outlying observations could be called hydrologic outliers and could be treated either by the Bulletin-17-B outlier adjustments or by mixed-population analyses, as discussed in another one of these FAQs, with substantially similar results.
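The total-probability combination used in the hurricane example elsewhere in this document can be sketched as follows. Function names are invented; F_hur and F_non stand for the two separately fitted conditional frequency curves.

```python
def mixed_population_cdf(F_hur, F_non, n_hurricane_floods, record_years):
    """Combine two flood populations by the total probability rule:
        F(x) = F_hur(x) * p + F_non(x) * (1 - p),
    where p = NH / LHHP is the estimated probability that the annual
    flood is a hurricane flood. F_hur and F_non are callables giving
    the conditional non-exceedance probabilities P{X < x | type}."""
    p = n_hurricane_floods / record_years
    return lambda x: F_hur(x) * p + F_non(x) * (1.0 - p)
```

The combined curve can then be compared with a single-population Bulletin-17-B fit to judge whether the mixture materially changes the answer.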
Another source of outliers is simply the purely chance occurrence of extraordinarily large observations in some samples. This kind of outlier is more common in flood distributions where one tail or the other generally is somewhat stretched out relative to the normal distribution, and is especially common in so-called "heavy-tailed" distributions such as the Pareto. These may be called "statistical" outliers, and are exemplified by the Bulletin-17-B concept of high outlier, in which it is postulated that the record length associated with the outlier is governed by historical (or paleoflood) evidence rather than simply by the systematic streamgaging record period.
Bulletin-17-B procedures do not involve "dropping" of outliers. High outliers are retained in the analysis as ordinary systematic peaks if usable historical information cannot be found; if historical information is available, the high outlier is properly discounted over a longer, more appropriate time window. Low outliers are counted, not "dropped": their frequency of occurrence, which is the valid flood-risk information they contain, enters the analysis through the conditional probability adjustment applied to a preliminary frequency curve fitted to the non-outlying observations.
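The core of the conditional probability adjustment can be sketched as follows. This is a simplified illustration of the Appendix 5 idea with invented names; the full procedure goes on to compute synthetic statistics for the adjusted curve.

```python
def adjusted_exceedance(cond_exceed_prob, n_above, n_total):
    """Conditional probability adjustment (after Bulletin 17-B,
    Appendix 5): a preliminary frequency curve is fitted only to the
    n_above peaks exceeding the truncation level; an exceedance
    probability read from that conditional curve is then scaled by the
    estimated probability that an annual peak exceeds the truncation
    level at all, i.e. n_above / n_total."""
    p_exceed_truncation = n_above / n_total
    return cond_exceed_prob * p_exceed_truncation
```

For example, with 45 usable peaks in a 50-year record, a conditional exceedance probability of 0.10 becomes an annual exceedance probability of 0.09.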
The extensive discussion of outliers in Bulletin 17-B is necessary because of the prevalence of outlier-like effects in frequency analysis of flood data. Despite whatever faults might be found, the discussion in Bulletin 17-B is a generally worthy effort to provide the theoretical concepts and operational procedures needed to enable different analysts to produce reasonable and consistent fitted frequency curves in the variety of problems encountered in practice.
----------------------------------------------------------------
Question: What are the limitations on flood frequency curve extrapolation? We recently had a terrible flood in our town. I own a trailer park that was above the maximum flood level. I downloaded flood data and used the Bulletin-17-B flood frequency methodology to prove that the flood was a 5,320-year flood and that my trailer park was above the 6,000-year flood level. I wanted the Flood Agency to certify my trailer park, but they would say only that the flood was more than twice as big as the 100-year flood and that my park was outside the 100-year flood plain. Why doesn't the agency acknowledge the true rarity of this flood and the true safety of my property?
Answer: Extrapolation of flood frequency curves is limited primarily by the user's tolerance for uncertainty in the extrapolated results. The user of flood-frequency data needs to understand that these data carry substantial uncertainties with them, even if no extrapolation is involved, and must accept responsibility for using the results in such a way that errors do not lead to catastrophic consequences. Because of the vagaries of flood occurrence in time and space, any observed flood record is likely to give a more or less inaccurate representation of the true magnitude and frequency of flooding. This so-called random-sampling uncertainty is smallest near the middle of the flood distribution (the 2-year flood) and increases for larger, less frequent flood magnitudes. It is represented by the confidence limits in Bulletin 17-B; the limits are farther apart, representing greater uncertainty, in the tail of the distribution than near the center. Random-sampling uncertainty exists, and is greater in the tail of the distribution, even if extrapolation is not an issue and even if the mathematical form of the distribution is known.

In practice, the record length or sample size usually is small (20-60 years) in relation to the annual exceedance probabilities or recurrence intervals of interest (100-500 years), so extrapolation is necessary for obtaining the needed information. Moreover, the mathematical formula that should be used for the extrapolation is not known with any confidence, and there is no agreed-upon procedure to assess or quantify the uncertainty in the extrapolation formula. As a result, the following rules generally are followed: 1) don't extrapolate if you don't have to; 2) if you do have to extrapolate, do so, but only as far as necessary; 3) seek additional information to provide independent corroboration of the extrapolated values (see Bulletin 17-B, pages 19-22); and 4) don't give too much credibility to, or place too much reliance on, the extrapolated values.

For many types of engineering design and planning, there are authoritative design criteria that specify recurrence intervals or exceedance probabilities that must be used; in such situations, extrapolation to those levels is required, like it or not. Commonly used design recurrence intervals include 100 years, 500 years for design of scour protection for major bridges, and shorter intervals for less important works. Bulletin 17-B shows recurrence intervals up to 500 years (annual exceedance probabilities down to 0.002) in the example problems; it may be assumed that there is a consensus that extrapolation out to that level, if necessary, is acceptable, even if not necessarily accurate or reliable.

In other cases, however, there may be no essential need to extrapolate. Estimation of the recurrence interval of an observed flood by long extrapolation of a frequency curve, for example, generally serves no useful purpose in terms of flood control or flood-plain planning and management. (Think of it: What difference does it make to the winner of a raffle, or to the losers, whether he had the one winning ticket in 100 or the one winning ticket in 1,000?)

If extrapolation is necessary, and, for that matter, even if it is not, prudence dictates that corroboration be sought, and that more corroboration be sought the longer the extrapolation. Thus, it is always prudent to compare at-site Bulletin-17-B frequency curves with regional flood frequency relations and, if the extrapolation is longer, with flood records at comparable nearby sites and with regional rainfall and runoff relations (Bulletin 17-B, pages 19-22). If long extrapolation is required, it probably is required because of a concern that exceedance of the design flow would cause catastrophic damage that must be avoided by setting an extremely high design flow. In such cases, if the extrapolated design flow is very uncertain, and if the uncertainty cannot be reduced by comparison with other regional flood information, it might be prudent to consider whether an alternative system design, with less catastrophic failure modes, would be preferable.
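As a rough illustration of what extrapolation means in practice (reading a fitted log-Pearson Type III curve at a small annual exceedance probability), the sketch below uses the Wilson-Hilferty approximation to the Pearson Type III frequency factor. Bulletin 17-B itself uses the tables in its Appendix 3, and the function names here are invented; the point is that the arithmetic extends smoothly to any probability, while the reliability of the answer does not.

```python
from statistics import NormalDist

def frequency_factor(skew, z):
    """Wilson-Hilferty approximation to the Pearson Type III frequency
    factor K for a given skew and standard-normal quantile z; exact
    values are tabulated in Bulletin 17-B, Appendix 3."""
    if skew == 0.0:
        return z                       # zero skew: reduces to normal
    k = skew / 6.0
    return (2.0 / skew) * ((1.0 + k * z - k * k) ** 3 - 1.0)

def lp3_quantile(log_mean, log_std, skew, exceed_prob):
    """Flow with the given annual exceedance probability from a fitted
    log-Pearson Type III curve (base-10 logarithms, as in Bulletin
    17-B): log10(Q) = mean + K * std."""
    z = NormalDist().inv_cdf(1.0 - exceed_prob)
    return 10 ** (log_mean + frequency_factor(skew, z) * log_std)
```

Nothing in the code stops one from requesting the 6,000-year flood; the confidence limits, not the formula, are what tell the trailer-park owner how little such a number means.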