An Algorithm for Correcting Levels of Useful Signals on Interpretation of Eddy-Current Defectograms

To ensure rail traffic safety, rails are regularly inspected by various approaches and methods of nondestructive testing, including eddy-current flaw detection methods. In this regard, automatic analysis of large datasets (defectograms) output by the corresponding equipment is a relevant problem. This analysis constitutes a process of identifying damaged or defective sections along with identifying structural elements of rails within defectograms. The article continues a series of studies on the problem of automatic recognition of images of flaws and structural elements of rails in eddy-current defectograms. These images are formed exclusively from useful signals, which are determined from amplitude threshold levels calculated automatically from the eddy-current data. The previous algorithm for identifying threshold levels was geared primarily towards cases in which the vast majority of the signals output by the flaw detector are rail noise. A signal is considered useful and is subject to further analysis if its amplitude is at least twice the relevant noise threshold. This article is concerned with the problem of adjusting threshold levels in order to make it possible to identify extensive rail surface flaws. An algorithm for calculating threshold levels of rail noise amplitudes with their subsequent correction in the case of a large number of useful signals from extensive flaws is proposed. Examples of the algorithm's application to real eddy-current data are provided.


INTRODUCTION
To ensure rail traffic safety, rails are regularly inspected by various approaches and methods of nondestructive testing, including eddy-current flaw detection methods. In this regard, automatic analysis [1][2][3] of large datasets (defectograms) output by the corresponding equipment is a relevant problem. This analysis constitutes a process of identifying damaged or defective sections along with identifying structural elements of rails based on defectograms. Given the significant volumes of input data, fast and efficient data processing algorithms are of particular interest.
The article continues a series of studies [4][5][6][7] on the problem of automatic recognition of images of flaws and structural elements of rails in defectograms of eddy-current flaw detectors. The defectograms are divided into fragments (analysis units), each of which is processed separately. Using the algorithm from articles [4,5], useful signals are isolated (against the background of rail noise) in each analysis unit and then grouped into marks. The identified marks are subject to further classification with the use of neural networks. Article [6] solved the problem of recognizing records of small structural elements (up to 157 mm in length) of the following three types: (1) straight or beveled bolted joints, (2) flash butt welds, and (3) aluminothermic welds. Article [7] presented the recognition of records of long (420 to 3220 mm) structural elements of rails of two classes: (1) rolling stock axle counters and (2) rail crossings. Marks that have not been attributed to a particular type of structural element are classified as conditional flaws.
We note that the described algorithm for determining the threshold level of useful signal amplitudes [4,5] is designed for fragments of eddy-current defectograms in which the overwhelming majority of signals are rail noise. Practical application of this algorithm as part of a hardware-software railway flaw detection complex has shown that when the analyzed rails have extended surface flaws (usually resulting from contact fatigue of the metal), which can reach tens or hundreds of meters in length, the calculated thresholds of useful signal amplitudes are too high (in magnitude) for the system to form marks for either records of structural elements or records of damaged or defective rail sections.
This article is dedicated to the problem of adjusting the threshold levels of useful signal amplitudes during automatic interpretation of defectograms of eddy-current flaw detectors in order to make it possible to identify extensive rail surface flaws.
The article considers a generalized device in the form of a 12-bit eddy-current flaw detector with 15 data channels (per rail). The data channels correspond to physical sensors that are sequentially located on the surface of the rail perpendicular to the direction of the flaw detector's movement. The flaw detector records signal amplitude values of each channel as integers in the range from -2048 to 2047.
Only the amplitudes of useful signals are of interest. A signal is considered useful (and is subject to further analysis) if its deviation from zero is at least twice the corresponding threshold level of rail noise.
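The criterion above can be illustrated with a short Python 3 sketch. The function name is_useful and its parameters are hypothetical and do not come from the original software. The sketch assumes that both noise thresholds are passed as positive magnitudes and that "at least two times" means a non-strict comparison.

```python
def is_useful(amplitude: int, noise_pos: int, noise_neg: int, factor: int = 2) -> bool:
    """Return True if the amplitude deviates from zero by at least
    `factor` times the rail-noise threshold of the same sign.

    noise_pos and noise_neg are the magnitudes of the positive and
    negative noise thresholds of one data channel (assumed > 0).
    """
    if noise_pos <= 0 or noise_neg <= 0:
        raise ValueError("noise thresholds must be positive magnitudes")
    if amplitude >= 0:
        # Positive deviations are compared with the positive threshold.
        return amplitude >= factor * noise_pos
    # Negative deviations are compared with the negative threshold.
    return -amplitude >= factor * noise_neg
```

For example, with noise thresholds of 40 (positive) and 45 (negative), an amplitude of 100 is useful, whereas an amplitude of -80 is not, because the negative useful-signal level is 90 in magnitude.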
The article proposes an algorithm for calculating positive and negative values of threshold levels of rail noise amplitudes with their subsequent adjustment towards zero if a large number of useful signals is detected (for example, due to extensive rail surface flaws) on the analyzed eddy-current defectogram fragment. Examples of the algorithm's application to real data are provided.
CALCULATING THRESHOLD LEVELS OF USEFUL SIGNALS
Automatic analysis usually involves dividing defectograms into fragments, for example, fragments corresponding to 50-meter rail sections. For readings from a 15-channel flaw detector taken at each millimeter of the path, the analysis unit is then a matrix of 15 rows by 50000 columns whose elements are the signal amplitude values from the corresponding data channels.
An example of a graphical representation of a 50-meter analysis unit is given in Fig. 1 (15-channel data from a 50-meter section of one rail; the data channels are numbered from top to bottom). In this defectogram fragment, in addition to records of structural elements (welded rail joints) that merge into solid vertical lines in the figure, strong useful signals from an extensive surface flaw can be seen starting from about the 18th meter mark and up to the end of the considered fragment. We note that the first three channels of eddy-current data do not reflect this flaw in any way, registering mainly rail noise. The bulk of the signals from the flaw is found in the central channels. As an example, Fig. 2 separately presents the data of the seventh and eighth channels, which contain the strongest signals from the flaw (i.e., the strongest amplitude deviations from zero).
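As an illustration of this division, the following Python 3 sketch cuts multichannel data into analysis units. The name split_into_units and the representation of a defectogram as a list of per-channel lists are assumptions made for the example. Keeping a shorter final unit is also an assumption, since the handling of the last fragment is not specified here.

```python
from typing import Iterator, List, Sequence

UNIT_LENGTH_MM = 50_000  # 50 m at one reading per millimeter


def split_into_units(
    channels: Sequence[Sequence[int]], unit_len: int = UNIT_LENGTH_MM
) -> Iterator[List[List[int]]]:
    """Yield analysis units: matrices with one row per data channel
    and up to `unit_len` columns (the last unit may be shorter)."""
    if unit_len <= 0:
        raise ValueError("unit_len must be positive")
    total = len(channels[0]) if channels else 0
    if any(len(ch) != total for ch in channels):
        raise ValueError("all channels must have the same length")
    for start in range(0, total, unit_len):
        # Take the same column range from every channel.
        yield [list(ch[start:start + unit_len]) for ch in channels]
```

With 15 channels and the default unit length, each full unit is a 15 by 50000 matrix, matching the analysis unit described above.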
We have purposefully chosen an example with a clearly visible difference between the sections that correspond to rail noise and the sections that correspond to an extensive surface flaw. A flaw can cover the entire surface of a rail over several hundred meters. In such a case, it may be difficult to visually distinguish whether a record in a defectogram represents a flaw or rail noise registered by sensors with incorrect sensitivity settings.
Within one analysis unit, positive and negative threshold values of the amplitudes of rail noise signals are calculated separately for each data channel. When calculating a positive threshold, only nonnegative signal amplitudes are considered. Similarly, calculating a negative threshold involves only amplitudes less than or equal to zero.
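The separation by sign can be sketched in Python 3 as follows. The function amplitude_frequencies is a hypothetical helper, not part of the original software. It builds two frequency arrays indexed by amplitude magnitude. The array size of 2049 is an assumption chosen so that the magnitude of -2048 also fits. Zero amplitudes are counted in both arrays, in line with the "nonnegative" and "less than or equal to zero" conditions above.

```python
from typing import Iterable, List, Tuple

MAX_MAGNITUDE = 2048  # largest magnitude of a 12-bit signed amplitude


def amplitude_frequencies(samples: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Return (pos, neg), where pos[k] is the number of samples with
    amplitude +k and neg[k] is the number of samples with amplitude -k."""
    pos = [0] * (MAX_MAGNITUDE + 1)
    neg = [0] * (MAX_MAGNITUDE + 1)
    for a in samples:
        if abs(a) > MAX_MAGNITUDE:
            raise ValueError(f"amplitude out of range: {a}")
        if a >= 0:
            pos[a] += 1   # nonnegative amplitudes feed the positive threshold
        if a <= 0:
            neg[-a] += 1  # nonpositive amplitudes feed the negative threshold
    return pos, neg
```

Each of the two arrays can then be passed, one at a time, to a threshold calculation for amplitudes of one direction.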
The following Python 3 function returns the absolute value of the desired rail noise threshold for one data channel. In this function, parameter A[0:2047] is an auxiliary array that stores the frequencies corresponding to signal amplitudes of the same direction.
    [...]
            break
        return int(3 * sig + 0.5)
This function implements the algorithm proposed and substantiated in [4,5]. The concept of the algorithm is based on the fact that the probability of occurrence of a signal with a certain amplitude in flaw-free rails on a section without structural elements follows (in approximation) the normal distribution. It is also assumed that such signals, which are essentially rail noise, make up the vast majority of the total number of signals recorded by an eddy-current flaw detector during nondestructive testing of rails. In this case, after strong signals from flaws and structural elements that do not follow the normal distribution have been excluded from the analyzed sample, the three-sigma rule applies.
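For orientation, the three-sigma idea can be illustrated with a minimal Python 3 sketch. This is not the function from [4,5]: the name noise_threshold_sketch, the iteration scheme, and the stopping rule are assumptions. The sketch repeatedly estimates the standard deviation about a zero mean from amplitudes not exceeding the current limit, then lowers the limit to three standard deviations until it stops decreasing.

```python
import math
from typing import Sequence


def noise_threshold_sketch(A: Sequence[int]) -> int:
    """Illustrative three-sigma noise threshold for one direction.

    A[k] is the number of signals with amplitude magnitude k.
    Assumption: strong signals are excluded by iteratively shrinking
    the limit to 3 * sigma, with sigma computed about a zero mean.
    """
    limit = len(A) - 1
    while True:
        n = sum(A[:limit + 1])
        if n == 0:
            return 0
        # Standard deviation about zero for amplitudes 0..limit.
        sig = math.sqrt(sum(k * k * A[k] for k in range(limit + 1)) / n)
        new_limit = min(int(3 * sig + 0.5), len(A) - 1)
        if new_limit >= limit:
            break  # the limit no longer decreases
        limit = new_limit
    return limit
```

Because the limit is a nonnegative integer that strictly decreases on every pass that does not exit, the loop terminates. When a large share of the samples comes from an extensive flaw, the standard deviation estimated in this way stays large, which is consistent with the overly high thresholds discussed in this section.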
Practical experience has shown that when defectograms include records of extensive surface flaws, the amplitude thresholds of rail noise calculated by the algorithm are too high (in magnitude). That leads to overly high threshold levels of useful signals as well, since they are calculated as double the selected threshold value of rail noise.
For example, in Fig. 2 the positive and negative threshold levels of useful signals calculated using the discussed algorithm for the seventh and eighth data channels are represented as solid cut-off lines. In this case, almost all signals from structural elements and flaws are within the selected threshold values, i.e., these signals would not be included in subsequent algorithms during the analysis of the current fragment of defectogram. Figures 3 and 4 present, for the seventh and eighth data channels, the sections of the defectogram that contain the records of an aluminothermic welded rail joint (top) and the signals from a surface flaw (bottom). In these figures, the calculated threshold levels of useful signals are also represented by solid cut-off lines.
We note that the first two channels of the flaw detector do not register the presence of any surface flaws on the analyzed track segment (see Fig. 1). In these two channels, the only useful signals are the signals from welded rail joints. The rest of the data is rail noise. As shown in Fig. 5, in this case the solid lines of the calculated threshold levels quite accurately cut off the useful signals that correspond to the aluminothermic weld. In other words, for these channels the algorithm for calculating threshold values of rail noise performed successfully (in accordance with its intended function).
Thus, if the data channel of the current analysis unit contains a record of an extensive surface flaw, then the calculated threshold values of rail noise need to be adjusted towards zero. The problem of adjusting these threshold levels is discussed in the next section.

ADJUSTING THRESHOLD LEVELS OF USEFUL SIGNALS
Practical experience of analyzing eddy-current data has revealed a characteristic sign of the presence of signals from an extensive surface flaw in a recording.
For a channel of the analysis unit that contains signals from a flaw of that kind, the frequency distribution graph of amplitudes of the same direction on the segment from zero to the threshold value of rail noise has local minimum points. Their number depends on the severity of the surface flaw, i.e., on the number of useful signals and their amplitudes. A graph of the same segment for a "flaw-free" data channel would be much smoother, ideally not containing a single local minimum. In any case, the shape of the graph in the specified segment approximately corresponds to the probability density graph of the normal distribution in the segment from 0 to 3σ, where σ is the standard deviation from a zero mathematical expectation.
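The sign described above can be checked with a short Python 3 sketch. The function count_local_minima is a hypothetical helper. It assumes that a local minimum is an interior point of the segment whose frequency is strictly lower than both neighboring frequencies. The exact definition used in practice may differ.

```python
from typing import Sequence


def count_local_minima(A: Sequence[int], level: int) -> int:
    """Count strict local minima of the frequency graph A on the
    segment of amplitude magnitudes from 0 to `level` inclusive."""
    level = min(level, len(A) - 1)
    count = 0
    for k in range(1, level):
        # Interior point strictly below both neighbors.
        if A[k] < A[k - 1] and A[k] < A[k + 1]:
            count += 1
    return count
```

A monotonically decreasing graph such as [100, 90, 70, 40, 20, 10, 5] yields zero, while a jagged graph such as [100, 60, 90, 30, 70, 10, 5] yields two local minima on the same segment.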
Local minima are of interest as a resource that can be used to reduce the magnitude of the threshold level of rail noise. Iterative removal of local minimum points on the considered segment of the graph followed by shifting the remaining data towards zero also achieves a shift in the threshold noise level. The procedure stops as soon as there are no local minima left on the graph section from zero to the current threshold value. The new threshold level value obtained after these shifts is the desired result, i.e., the adjusted threshold of rail noise.
The following Python 3 function implements the described concept, returning an adjusted threshold level. Here the parameter A[0:2047] is, as before, an array (graph) of frequencies of signal amplitudes of the same direction, i.e., A[k] is the number of signals with amplitude k in magnitude. The level parameter is the threshold noise level calculated by the previous algorithm (for the amplitudes of one direction) that is subject to adjustment.
Figures 6-9 present probability density graphs of the frequency distributions of positive (right) and negative (left) signal amplitudes of the first, second, seventh, and eighth data channels before (top) and after (bottom) adjustment of the threshold level of rail noise, which is represented by a vertical cut-off line indicating its magnitude. In these figures, for the data channel of the current analysis unit, the signal amplitude values of the same direction are plotted along the X axis and the total number of registered signals of the same amplitude is marked along the Y axis.
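The iterative removal of local minima described in this section can be illustrated with a minimal Python 3 sketch. It is one possible reading of the concept, not the authors' listing. The name adjust_threshold_sketch is hypothetical. The sketch assumes strict local minima, interior points only, and removal of all minima found in one pass before the next pass begins.

```python
from typing import Sequence


def adjust_threshold_sketch(A: Sequence[int], level: int) -> int:
    """Illustrative adjustment of a noise threshold towards zero.

    A[k] is the number of signals with amplitude magnitude k and
    `level` is the previously calculated noise threshold.
    """
    if level < 0:
        raise ValueError("level must be nonnegative")
    # Work on a copy of the segment from zero to the threshold.
    seg = list(A[:min(level, len(A) - 1) + 1])
    while True:
        keep = [True] * len(seg)
        found = False
        for k in range(1, len(seg) - 1):
            if seg[k] < seg[k - 1] and seg[k] < seg[k + 1]:
                keep[k] = False  # mark a strict local minimum for removal
                found = True
        if not found:
            break  # no local minima left on the segment
        # Dropping the marked points shifts the rest towards zero.
        seg = [v for v, keep_it in zip(seg, keep) if keep_it]
    return len(seg) - 1  # index of the last remaining point
```

Each pass that finds a minimum shortens the segment, so the loop terminates. On a smooth, monotonically decreasing segment, the function returns the input level unchanged.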
The frequency distribution graphs of signal amplitudes in Figs. 6 and 7 are "smooth" and have almost no local minimum points. Accordingly, the adjustment of threshold levels for these channels was slight.
The starting graphs of amplitude frequencies in Figs. 8 and 9 contain large numbers of local minima and undergo significant compression, with the initial shape on the segment from zero to the threshold level being approximately preserved after smoothing.
The results of the application of the adjustment algorithm are shown in Figs. 2-4. The new positive/negative threshold levels of useful signal amplitudes, calculated as double the positive/negative adjusted noise threshold, are shown with dotted cut-off lines. In these figures, the values of useful signals from welded joints, in particular from the aluminothermic weld, and from the extensive surface flaw are found beyond the boundaries indicated by dotted lines. That means that these signals will subsequently be used for the formation of the corresponding marks both for the structural elements and for the section with the extensive surface flaw.
It should be noted that the adjusted threshold levels are also used for assessing the severity of detected surface flaws. The issues of assessing the severity of flaws detected in defectograms are outside the scope of this article.
In the case of the first and second data channels, the changes to the threshold levels of useful signals were not significant. In Fig. 5, the dotted line of the new threshold level is right next to the solid line of the corresponding initial threshold level, still cutting off the same useful signals from the aluminothermic welded rail joint.
Thus, the proposed algorithm for adjusting threshold levels of noise can be applied to any eddy-current channel without the need to perform any preliminary data analysis to determine the presence of signals from an extensive surface flaw. Thresholds are adjusted as necessary depending on the number of local minimum points on probability density graphs of amplitude frequency distributions.
The algorithm is guaranteed to stop: each iteration removes at least one point from the finite segment between zero and the current threshold value. The resulting threshold level retains the same function, marking the boundary of rail noise amplitudes on a compressed (smoothed) density graph of the frequency distribution.
The compressed graph in the segment from zero to the threshold value approximately corresponds to the shape of the probability density graph of the normal distribution. Running the algorithm for determining the threshold noise level, discussed in the previous section, on this compressed graph fragment returns comparable results. For example, for the seventh channel both the desired levels (positive and negative) have the same magnitude of 47 (the threshold levels calculated by the adjustment algorithm were 44).
The concept of the algorithm is based on the results of observations from practical experience of processing large volumes of eddy-current data, and the adjustment algorithm itself is presented without a strict mathematical substantiation. In this regard, it is of interest to construct a mathematical model of the influence of extensive surface flaws on the density of frequency distributions of signal amplitudes recorded in defectograms.

CONCLUSIONS
The proposed algorithm for adjusting threshold levels of useful signal amplitudes has been shown to be effective in practical application as part of nondestructive testing of rails. As part of a hardware-software complex of eddy-current flaw detection, the algorithm can be successfully used to isolate signals received from extensive surface flaws.