Filtering Results from RI/QI Smoothing
November 18, 2009
Hello all,
The script is (finally) working relatively quickly so I have some results from the smoothing algorithm. To refresh the memory, I’m smoothing the fields of reliability index (RI) and quality index (QI) because the refractivity algorithm as it now stands has to smooth the received phase and refractivity output. The relative resolutions are ~4 km for refractivity but only 250 m for RI and QI (calculated for each gate). In order to do an accurate comparison we must smooth RI & QI on the order of ~4 km.
Below is an example of the response of the Gaussian curve to an input value of 1 at the center of the map. As you can see, the effects are limited to a radius of ~5 km. In fact the response drops below 1% (3 N-units) beyond ~6 km and below 0.1% (0.3 N-units) beyond ~7 km. (For those who are curious: I set the variance equal to that of refractivity N, which is ~4 km.)
The initial pre-smoothing data, as can be seen below, has high resolution, especially compared to a map of refractivity. The smoothing alluded to above is needed to bring the effective resolutions closer together for comparison purposes.
After smoothing the resolution reduces accordingly, more in line with that expected from refractivity data.
Now we can make the comparison between the variance of refractivity/scan-to-scan and fields of RI and QI. Time series of all quantities at one location, scatter plots of various data fields, and comparing maps of the fields together will help to qualitatively determine any correlation between them. Again, the goal is to find a link between RI and/or QI and fields of refractivity/scan-to-scan variance, which can be used as a an additional clutter filter when refractivity is used for assimilation into the ARPS computer model.
Results of this comparison will follow…
Lessons Learned in the World of Looping Variables
November 17, 2009
Hello everyone,
As you may have gathered from my last post, I have had a bear of a time trying to get the Gaussian smoothing script working in reasonable time frames. In order to solve what was going wrong I had to resort to breaking the code up into small pieces and evaluate each one separately.
This sounds like a long and boring procedure, but it was very necessary to find what was slowing my code down so badly. I thought it was running smoothly yesterday afternoon, set it to run, and left for the night, thinking it would finish in a few dozen minutes and I’d see the results when I came in this morning. Instead I found that the script had only muscled its way through about 15% of the grid points. That’s right, my 8-processor, 8 Gb RAM Mac tower, in all of its computational glory, couldn’t make it through more than 15% of the domain in about 14 hours. Pathetic.
Enter tic and toc. Tic and toc you ask? They are little functions in Matlab which essentially act as a stopwatch. Tic starts the timer and toc displays the time passed since tic was invoked. This turns out to be very handy for finding the slow spots in a piece of code.
Example:
>> tic; weights = zeros(last_range,360); toc
>> Elapsed time is 0.00095 seconds.
This shows how long this particular process is going to take to execute within the main program. In my case, this piece of code will be executed for each grid point in a 300*360 polar coordinate system. Taking this execution time and multiplying it by 300*360, I get ~103 seconds.
If you then go ahead and do this for each part of your script you can really see where computational time is going. It turned out that I had some ill-placed “for” loops which added far too much computational complexity. For instance, a simple task of rounding off an array of (300,360) grid points was going to end up taking 972 hours. That’s right. 42 days. Obviously that won’t fly, and is a massive waste of resources. Now that I have rearranged the looping structures, that same rounding procedure will take a grand total of 259 seconds. Much better, right?
I would recommend everyone looking at their code and looking at its efficiency, especially if it takes too long to execute. It might seem like a waste of time, but would I have been wasting time if I waited 42 days for my code to run? It’s much more efficient to spend the time and find the slow spots before having to run the code over and over. You also become aware of what function combinations speed the code up and what slows it down. For example, combining simple multiplication and addition within a rounding function actually slows the code down compared to splitting them up in two lines. This kind of knowledge will help you structure your code better in the future.
P.S. – For anyone who doesn’t use Matlab, there may be functions out there that do the same thing. In Python one can do:
import time
start = time.time()
do_snippet()
print “It took”, time.time() – start, “seconds to execute”
Another module in Python you can use is “timeit”. Other languages, as in the example above, can use the system or CPU clock to do similar tasks.
Help on Gaussian smoothing in polar coordinates…?
November 16, 2009
Does anyone know how to make an efficient Gaussian smoother in polar coordinates? I have to recalculate the range (and thus the Gaussian curve) for every point in my domain…which turns out to be 300*360. Lots of points, and lots of computational time. Even truncating the 2-D curve @ 6 km…I end up with ~8700 points which need the Gaussian curve calculated for every point I’m iterating through. Anyone?
Filtering RI, QI with a 2-D Gaussian curve
November 16, 2009
Hello all,
Today’s major project is filtering the fields of RI and QI with a Gaussian filter in two dimensions (see last post for more info). We essentially need to do this because refractivity measurements are smoothed (due to phase noise on small spatial scales, e.g. gate-to-gate nosiness…smoothing reduces noise and makes retrieval possible) with an approximate smoothing radius of 4 km. Boonleng’s algorithm essentially smoothes to a radius of

, where σ is the standard deviation of a Gaussian curve. This means that the variance, σ², equals 4 km². We can then assume that σ is the same in both the x and y dimensions, meaning the 2-D Gaussian curve will be circular symmetric.
We also set the maximum range to be smoothed as 3σ, or 6 km. This is done to reduce computational complexity significantly. It turns out that doing this leaves out only 0.2707% of the outer edges of the 2-D Gaussian curve. This was determined through:
, where the integrals are 2-D integrals of the Gaussian curve (eliminating the redundant 2π coefficient in each term). This leaves us with approx. 575 points contributing their values to the central point.
The results of this so far include matrices of range from a point (x0,y0) to every other point in the domain, as well as the 2-D Gaussian function from (x0,y0) to every point with a range less than 6 km. Here are sample output plots:

Range of each Pixel from element (180,50)

2-D Gaussian Function [zoomed on Pixel (180,50)
The next step is to take this Gaussian function, iterate it over the entire refractivity domain so that we have a similar smoothing function for every pixel, sum the product of the function with the data to be smoothed (i.e. RI, QI) and divide by the sum of the function. This will give us a Gaussian-smoothed map of RI and QI, which will allow for direct comparison with fields of N and dN/dt in both time and space (since spatial scales will be roughly the same).
NSF Meeting (cont’d)
November 13, 2009
Continuing the thoughts from yesterday’s post…
Since the comparison is not the greatest (point vs. smoothed measurement), it’s unfair at this point to make any solid conclusions about the correlation of variance statistics and the reliability index (RI). However, the preliminary data did show little to no correlation between the variance of N, dN/dt and RI. This, again, may be purely an artifact of the area that each pixel represents. I will be performing a 2-D Gaussian filter on the RI field and seeing what kind of comparison we get from that. One thing to note is that even visually, the two fields look completely different as far as the area each pixel represents.

As you can see from this image, variance of the refractivity field is smoothed to approx. 4 km. This is obvious in the pixels near the edges of the domain, where outlying points have a (roughly) spherical, 4-km radius shape. The variance, as has been shown in the spring, is much higher SE of the radar (KTLX) and near the edges of the domain. The large region SE of KTLX with high variance is an area with complex terrain, filled with hills and river beds and trees. This kind of terrain is largely absent from the rest of the domain. The high variance near the edges of the domain is likely due to clutter quality which is just good enough to pass though our filtering, but the phase varies so much that steady refractivity measurements in time are not possible.

This image shows RI for the same time period. It is fairly obvious now why the variance is likely so high SE of the radar (KTLX). RI in this region is very low, meaning the phase of the returned radar signal is very unstable as a function of time. Since refractivity is derived from the signal phase, unsteady phase directly leads to unsteady refractivity measurements, leading to high variance of N over this area (see Fig. 1 above). The suburban and urban landscape near I-35 (N-S highway west of KTLX), I-240 (W-E highway just south of Oklahoma City) and Norman (just east of I-35 and west of KTLX) has very high RI and very low variance statistics, which is due to the large number of power poles, buildings, radio towers, etc. present.
Also easily seen are the high-voltage power lines. The clearest one runs SE from KTLX, shown by a relative spike in RI. This leads to reduced variance even though the terrain is still complex here, likely because the returned power from the large supporting towers is much larger than power return off of trees or other clutter objects, and since the towers are stationary their returned phase will only be a function of N changes and not due to motion of the target. Other power lines are seen N and NE of KTLX.
One of the reasons we had wanted to do a comparison of RI with variance is that RI is a direct measure of phase. If phase was stable, then our derived N values should also be stable. During the NSF meeting yesterday we thought of using our quality index (QI) to do further correlation statistics. QI is a function of the observed power, radial velocity and spectrum width from a target. Correlating QI and RI with variance may shed more light on what exactly is causing the high variance, whether it can be explained straight-forward through phase stability, or whether we must use the derived radar moments (power, velocity and spectrum width). More on this later…
NSF Meeting
November 12, 2009
Today both myself and Nick presented our recent findings to Jidong and Pam, as everyone else was off at Kyoto University for the week.
My recent work over the last couple of days has centered around finding variance of absolute refractivity N and scan-to-scan refractivity (~dN/dt) as a function of time and space. The script which does the filtering and variance calculations was simplified tremendously by exploiting the fact that there is a lot of missing data. Some pixels in the field have no missing data, and these will be the best for correlation statistics. Since the filtering was done in a window, if any value (e.g., N = NaN) was missing along that time period then the filter wasn’t applied and variance not calculated. This reduced the calculation time to cover all of the 240*360 pixels*90 “windows” (derived from 100 radar scans on October 19th) from several hours to a much-improved tens of minutes.
At the NSF meeting today we discussed my findings as far as correlating our Reliability Index (RI) with variance statistics. Pam pointed out a major flaw in the comparisons, which is that N and dN/dt are smoothed values, with the smoothing approximated by a 4-km Gaussian window in space. However, RI is a point measurement which is not smoothed in space. Therefore the comparison currently is between a smoothed (~4 km) field (N, dN/dt) and a point (~250 m/range gate) field (RI). Since we’re comparing “apples to oranges” we need to level the playing field, as it were.
Future work will include filtering the RI field similar to how N and dN/dt are smoothed, and that will be by using a 2-D Gaussian filter with a 4-km standard deviation. This will allow a more direct comparison between RI and the variance statistics.
Continued in tomorrow’s post…
Update on Correlation
November 11, 2009
Hello all,
I have finished developing the algorithm which develops the short-term variance of absolute refractivity N and scan-to-scan refractivity ~dN/dt and “joins” them in time with calculations of reliability index RI (see last post for explanations as to why this was all necessary).
The toughest part of development turned out to be passing a high-pass filter through the data within a 10-scan window and calculating the variance over this window. The development of this filter was aided by some code David Bodine gave me several months ago, but that and other scripts had to be tweaked in order to work in concert.
The results are forthcoming…it takes a (ridiculously) long time to run, and they were not complete by the time I left work today. Initial plots show that the variance appears realistic in a qualitative sense, but further analysis tomorrow will reveal the sucess (or failure) of the real-time poor clutter quality filtering technique we are searching for. Analysis will include maps of N, dN/dt, RI and the variance of these fields at various times to get a feel as to the values and variations which exist within them. Plots showing a time series of variance of N and dN/dt alongisde variations in RI will (hopefully) show a good correlation.
The hypothesis is that points which have high RI values most of the time (such as radio towers or buildings) will have correspondingly low variance statistics from N and dN/dt; when RI decreases, variance should increase accordingly. Targets with typically low RI may not show a high correlation, and since these targets may be corrupted by phase noise (moving trees/power lines/traffic/etc) then variance of N and dN/dt would not expect to be high but rather more random and less correlated to RI than a solid clutter target.
P.S. – NSF Meeting tomorrow, 2:30, 5th floor as usual. Most everyone will be away in Kyoto, Japan, but whoever is left behind is welcome to come and join us. Nick and myself will be presenting our recent findings in brief ppt format.
Correlating Reliability Index (RI) with Variance in Scan-to-Scan and Absolute Refractivity
November 10, 2009
Hello all,
Today I am working on developing an algorithm which will process one day’s worth of radar data and output a time series of RI and the variance of scan-to-scan and absolute refractivity. Since RI is typically calculated over the course of 10 consecutive radar scans, it seems logical to compute variance statistics over the same window.
However, it can be seen that absolute refractivity (N) and scan-to-scan refractivity (~dN/dt) may not be zero-mean, nor are they stationary (mean value may change in time). Both zero-mean and stationary assumptions must hold in order to robustly calculate variance in time. These assumptions can fail when:
- Atmospheric moisture and/or temperature changes rapidly, which would lead to a sloping value of N and a non-stationary residual after subtracting the mean.
- Scan-to-scan variations change with respect to time (as in the case where drying is followed by moistening, warming followed by cooling, etc.), leading to a sloping value of scan-to-scan and consequently a non-stationary residual after subtracting the mean.
In order to correct for these issues, a high-pass filter must be implemented on both N and dN/dt. This filter will allow only short-term perturbations to pass through. After subtracting the mean, these perturbations will be zero-mean and stationary, allowing for robust variance estimates. Once completed, short-term variance statistics can be compiled for each point in the domain. Since RI is calculated from phase measurements from 10 scans, variance will also be calculated from 10 scans.
The goal is to find an empirical method by which the variance of scan-to-scan and/or N can be estimated from phase variations in time (RI). If a satisfactory fit is found, then we can use RI to estimate the stability of the targets’ refractivity measurements. Poor refractivity stability leads to large variations in N and dN/dt, which is undesirable for assimilation into ARPS, since perturbations in N or dN/dt may be unrealistic and can lead to unrealistic conditions in ARPS.
The usefulness of this is easily shown in the case where data is used for assimilation. A pixel on the periphery of the domain may have a very high variance in N compared to the rest of the field, leading to unrealistic perturbations in N and dN/dt. If these perturbations are fed into ARPS without any filtering they would likely lead to unrealistic phenomena within ARPS, which would then be propagated forward in time and space and may cause, e.g., incorrect timing and/or location of convection initiation.
First Post!
November 10, 2009
Hey everyone,
Welcome to my research blog, and thanks for checking it out!
This blog is intended to be the place where the scientific community can come and see the studies that I am currently involved in. I also have the goal of making this an open forum, where any discussion, comments and suggestions provided will be welcome and will help steer the research into the future. Please feel free to post a comment, reply to any or all discussions, and/or email me with any further questions or concerns at daniel.michaud at ou dot edu.
Check back for further updates…they’ll be coming soon.


