United States Department of Agriculture
Natural Resources Conservation Service
Natural Resources Inventory and Analysis Institute





Use of NRI Point Data in Image Processing

By Glenn Lawson, USDA/NRCS/NRIAI, glawson@ftw.nrcs.usda.gov

ABSTRACT

The use of models and geographic information systems in the decision-making process has increased the demand for current, detailed land cover maps. The use of satellite imagery has been limited by the need for ground observations or photographic interpretation to create training samples that identify land cover in a supervised or unsupervised classification process. Now that the NRI point locations have been digitized, interest in using this information for image classification has grown. Pilot tests by Glenn H. Lawson of the Natural Resources Inventory and Analysis Institute, USDA Natural Resources Conservation Service (NRCS), using NRI data with Landsat TM scenes in Iowa have been completed and are encouraging. Methods for using NRI point data to classify satellite imagery and to provide accuracy estimates were tested.

INTRODUCTION

A good understanding of the National Resources Inventory (NRI) program of the NRCS is essential when using the data for image processing and assessment. The NRI is a statistical collection of natural resource data over a specified length of time, and the collection process may span several months. Several types of resource data are collected at each point. Some data are recorded for a small area around the point, for the field where the point is located, or along a transect running through the point. Data were collected using field observation, aerial photographic interpretation, and local knowledge of the area. The NRI points were located using a stratified random sample. The sample was stratified by tabular data of natural resources, which included County Boundary, Major Land Resource, Hydrologic Unit, and Federal land. The points were located using a non-geo-referenced grid that did not include man-made boundaries such as roads, buildings, or fields. These boundaries were not needed because the data were collected to support a statistical view of resource data based on the areas mentioned above. However, such boundaries would be of great value when using the data for image processing. Likewise, the date on which the data were actually collected was not of statistical importance but would have been very helpful for image processing. The physical location of the points was determined by transposing the location from the original non-geo-referenced grid to a non-geo-referenced, non-ortho aerial photograph. To digitize the original points and maintain location accuracy, the points were transferred from the aerial photographs onto an appropriate geo-referenced base map and digitized. A complete history and description of the terms and data used in the NRI program can be found on the web at http://www.nrcs.usda.gov/technical/nri/.

Training and assessment samples collected specifically for land cover classification would be placed in areas of consistent cover, so that the values sensed by the satellite could be identified with that cover. One would not select a sample located close to other land covers if land cover were the objective of the image classification. NRI data, however, are collected at a specific point that is not moved for any reason. The location of the point was assigned to provide resource information statistically, not to provide ground truth for image classification and assessment. The point is not moved to give better information for image classification because location stability is needed to assess trends statistically and to measure change. Nevertheless, in the process of resource data collection, points are assigned certain land cover values. Land cover values that could be of use in land cover classification include broad cover/use, specific cover/use, Cowardin wetland type, Conservation Reserve Program cover type, and forest type. Data were also collected for wildlife use in terms of earth cover. This was done on a small area around the point (3 acres) in 1992 and along an X transect (the center of the X is the point) in 1997. These data are very helpful in image classification, where one needs to define an area, not a point, to develop training samples. Other resource information collected at or around the point can also be very useful when properly used.
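
To give a sense of scale for the 3-acre area around a point, the short sketch below converts it to a circle radius and to an equivalent number of 30 meter by 30 meter cells, the Landsat TM cell size used for Iowa. The function and constant names are illustrative only, and the circle is assumed to be centered on the point.

```python
import math

SQ_METERS_PER_ACRE = 4046.8564224  # international acre, in square meters
PIXEL_SIDE_M = 30.0                # Landsat TM cell size cited in the text


def circle_radius_m(acres: float) -> float:
    """Radius in meters of a circle whose area is the given number of acres."""
    return math.sqrt(acres * SQ_METERS_PER_ACRE / math.pi)


def pixel_equivalents(acres: float, pixel_side_m: float = PIXEL_SIDE_M) -> float:
    """Area expressed as a count of square cells with the given side length."""
    return acres * SQ_METERS_PER_ACRE / (pixel_side_m ** 2)


radius = circle_radius_m(3.0)
pixels = pixel_equivalents(3.0)
print(f"radius = {radius:.1f} m, area = {pixels:.1f} pixels")
```

A 3-acre circle therefore has a radius of roughly 62 meters and covers the equivalent of about 13 to 14 cells, although the cells it touches on the ground will only partly fall inside the circle.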

Using the cover values alone in a supervised classification would seem logical. In a supervised classification, one develops a training sample (signature file) that the image classification software uses to cluster (group) like combinations of sensor values into the user-defined classes (values). Image classification is generally done on a grid or raster digital layer, where each raster cell is evaluated against the training sample and given a meaningful value. For land cover, for example, 1 could represent artificial or bare cover, 2 trees, 3 grass, and so on. Supervised image classification is based on sample areas, not on points of data.

Figures 1 and 2 demonstrate some of the problems of using the NRI data as ground truth for image processing and assessment. Figure 1 is a sketch of a land cover.

Figure 1
sketch of a land cover

Figure 2 is the same sketch as figure 1 with two overlays: a hypothetical 30 meter by 30 meter grid such as a satellite scanner might use, and hypothetical point locations and land cover values for NRI points. Note how the scanner will sense multiple cover types within the same 30 by 30 meter area and assign one value for each sensor used. Therefore, the values sensed for a 30 by 30 meter area may result from corn alone, from a mixture of corn, grass, roads, etc., or from a mixture of corn in various conditions and stages.

Figure 2
alternate sketch of a land cover

Figure 2 shows the importance of aligning the satellite image with the NRI point locations. It also demonstrates the problem of NRI points being located by statistical random selection with no consideration given to geographic factors. Notice how many points are located close to the edge between different covers. Some are located in a 30 by 30 meter sensed area that has several different covers contributing to the sensed values.

One must remember that the NRI use/cover codes represent a point, not an area. The grid or raster has a specified cell size in any image, such as the thirty meters by thirty meters of the Landsat TM data used for Iowa. Each grid cell holds the values of all the bands of data sensed at the time of collection. The difference between point and raster data caused some problems in using the NRI data for signature development and accuracy assessment. To classify the image and provide the highest level of accuracy assessment, the land cover at each NRI point was evaluated against additional area information collected at the point, such as earth cover, and expanded over a three-acre circle around the point. A point that had a land use of corn or soybeans and an earth cover of over 90% row crop was placed in a group eligible for classification of the image (the pure group). A point that had a land use of grass pastureland or hayland and an earth cover of over 90% grass herbaceous was added to the pure group, as was a point that had a land use of forest and an earth cover of over 90% tree or shrub. All land uses were handled this way for all points in an image; one half of the resulting points was held out for possible classification signature areas and one half for accuracy assessment. The half used for classification was expanded to the three-acre area around each point, and these areas were used to develop a signature file. The signature file was evaluated as a matrix showing, for each signature area, the percentage of its sensed values (across all layers of the raw image used for classification) that matched each of the signature areas. An example of this for figure 2 is as follows:

Matrix of signature areas and signature image values, expressed as the percentage of values in each signature area.

Signature      Signature 1    Signature 2    Signature 3    Signature 4    Signature 5
Area           Image Values   Image Values   Image Values   Image Values   Image Values
Signature 1    100%
Signature 2                   30%                           10%            60%
Signature 3                                  100%
Signature 4                                                 100%
Signature 5                   15%            40%                           45%

According to this matrix, the area for point 1 could be used as a signature area for soybeans, the area for point 4 for grass, and the area for point 5 for corn. The area for point 2 contained 60% corn signature and only 40% grass signature and therefore could not be used in the classification signature file. The area for point 5 was not included in the classification signature file because it contained over 10% signature of other cover types. Small grain areas were not classified, partly for this reason but mainly because the satellite imagery was taken at a time when the sensed values would not successfully separate small grain in the classification process.
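
The 10% rule applied to the matrix can be sketched as follows. The percentages are copied from the example matrix; the cover label attached to each example point (soybeans, grass, small grain, grass, corn) is a reading of the surrounding text, and the function names are illustrative. Passing this purity check alone does not decide whether an area was finally used.

```python
# Rows are signature areas; each entry gives the share (%) of the area's
# sensed values that matched a given signature (values from the example matrix).
MATRIX = {
    1: {1: 100},
    2: {2: 30, 4: 10, 5: 60},
    3: {3: 100},
    4: {4: 100},
    5: {2: 15, 3: 40, 5: 45},
}

# Cover type of each example point, as read from the text (an assumption).
COVER = {1: "soybeans", 2: "grass", 3: "small grain", 4: "grass", 5: "corn"}

MAX_OTHER_COVER_PCT = 10  # areas above this share of other covers are rejected


def other_cover_pct(area_id):
    """Percentage of the area's values matching signatures of a different cover."""
    own_cover = COVER[area_id]
    return sum(pct for sig, pct in MATRIX[area_id].items() if COVER[sig] != own_cover)


def usable_signature_areas():
    """Signature areas whose share of other cover types is within the limit."""
    return [a for a in sorted(MATRIX) if other_cover_pct(a) <= MAX_OTHER_COVER_PCT]


for area in sorted(MATRIX):
    print(area, COVER[area], f"{other_cover_pct(area)}% other cover")
print("areas passing the purity check:", usable_signature_areas())
```

Under these assumptions the areas for points 2 and 5 are rejected (60% and 55% other cover), which agrees with the exclusions described above.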

Figure 3 shows the results of a classification of the area using the signature areas from points 1, 3, and 4. Notice that the area of small grain classified as grass, with one pixel of corn. The road area was placed in the image by creating a raster layer of roads from a vector layer.

Figure 3
results of a classification of the area

The yellow area classified as corn and the brown area as soybeans. The gray pixels represent the raster version of the vector layer of roads. The hypothetical NRI points were imposed on the image to demonstrate certain facts about using the NRI points for accuracy assessment. The pixels where points 1, 4, and 5 are located show the same classified land cover as the NRI point. The pixel where point 2 is located was classified as corn, but the NRI point shows grass. Compare figure 1 with figure 3 and you will agree that both are right. The pixel contained more corn signature and was classified properly. The NRI point was in the grass area of the pixel and was defined properly. Both the NRI point and the classified image were correct; however, the accuracy assessment would show an error. The pixel containing point 3 was classified as well as it could be, because there was no signature for small grain in the classification process. Point 3 would be dropped from the accuracy assessment, as there is no way to assess the classified image for a cover that could not exist in it. However, it is encouraging that the area of small grain mostly classified as grass and contained some corn pixels, especially in light of the signature matrix shown earlier.

PROCESS

Because of the problems discussed above, the digital NRI point data were separated into groups that could be used to classify satellite data and to assess the classification. Two basic groups were created. The first group included all points that could be used to assess the accuracy of the classified map. The only points excluded were those with values that could not be found in the digital map, as shown earlier in the small grain example. The classified digital map does not have a class for orchard land, small grain, etc.; therefore, all points for orchard land, small grain, etc. were dropped from all groups. The second group (the pure group) contained only points whose supporting data defined areas that should represent specific uses. The second group was randomly split in half. One half was used to classify the imagery and the other half was used to assess the classification. Groups were developed for broad and specific cover. A document explaining the group definitions and rules in detail can be found at classgroups.html.
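
The pure group selection and the random split can be sketched as below. This is a minimal illustration, not the rules in classgroups.html: the record layout, field names, and land use spellings are hypothetical, and only the three 90% earth cover rules stated in the introduction are encoded.

```python
import random
from dataclasses import dataclass


@dataclass
class NriPoint:
    point_id: int
    land_use: str           # e.g. "corn", "hayland" (hypothetical spellings)
    earth_cover: str        # dominant earth cover in the 3-acre circle
    earth_cover_pct: float  # share of the circle in that earth cover


# Land use paired with the earth cover that must dominate the 3-acre circle.
PURE_RULES = {
    "corn": "row crop",
    "soybeans": "row crop",
    "pastureland": "grass herbaceous",
    "hayland": "grass herbaceous",
    "forest": "tree or shrub",
}
PURE_THRESHOLD_PCT = 90.0


def is_pure(point: NriPoint) -> bool:
    """True when the land use has a rule and its earth cover exceeds 90%."""
    required = PURE_RULES.get(point.land_use)
    return required == point.earth_cover and point.earth_cover_pct > PURE_THRESHOLD_PCT


def split_pure_group(points, seed=0):
    """Return (classification half, assessment half) of the pure group."""
    pure = [p for p in points if is_pure(p)]
    random.Random(seed).shuffle(pure)  # seeded so the split is repeatable
    half = len(pure) // 2
    return pure[:half], pure[half:]


points = [
    NriPoint(1, "corn", "row crop", 95.0),
    NriPoint(2, "corn", "row crop", 80.0),          # mixed area, not pure
    NriPoint(3, "forest", "tree or shrub", 92.0),
    NriPoint(4, "hayland", "grass herbaceous", 99.0),
    NriPoint(5, "small grain", "row crop", 97.0),   # no class in the map
    NriPoint(6, "soybeans", "row crop", 91.0),
]
train, assess = split_pure_group(points)
print(len(train), "points for signatures,", len(assess), "held out for assessment")
```

Fixing the random seed keeps the two halves stable between runs, which matters when the signature file is revised repeatedly against the same held-out half.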

Much time and work could be saved if an accurate broad cover digital layer could be used instead of creating one by classification. A broad cover/use digital layer was obtained from the Iowa Department of Natural Resources (http://www.ag.iastate.edu/centers/cfwru/iowagap/), a participant in the Gap Analysis Program (GAP). A complete history and description of the terms and data used in the GAP program can be found on the web at http://www.gap.uidaho.edu. An accuracy assessment (Congalton 1991) was completed to decide whether this layer could be used instead of classifying the Landsat data for broad cover. The assessment was very satisfactory, and the results for the entire state were as follows:

CLASSIFICATION ACCURACY ASSESSMENT REPORT Using All Points

Using All points where similar to class in digital map. (Group 1)
Image File: GAP Broad Cover
Date: Sep 9 08:49:41 1999

ACCURACY TOTALS

Class Name   Reference Totals   Classified Totals   Number Correct   Producers Accuracy   Users Accuracy
Grass                    3377                4498             2975               88.10%           66.14%
Row Crop                12393               11376            11006               88.81%           96.75%
Woody                    1403                1346             1018               72.56%           75.63%
Totals                  17356               17356            14999
Overall Classification Accuracy = 86.42%

CLASSIFICATION ACCURACY ASSESSMENT REPORT using supporting data points

Using supporting data points where similar to class in digital map. (Group 2)
Image File: GAP Broad Cover
Date: Sep 9 11:19:12 1999

ACCURACY TOTALS

Class Name   Reference Totals   Classified Totals   Number Correct   Producers Accuracy   Users Accuracy
Grass                    2103                2704             1973               93.82%           72.97%
Row Crop                 9122                8751             8540               93.62%           97.59%
Woody                    1013                 835              756               74.63%           90.54%
Totals                  12368               12368            11269
Overall Classification Accuracy = 91.11%
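
The figures in these reports follow the usual definitions: producer's accuracy is the number correct divided by the reference total, user's accuracy is the number correct divided by the classified total, and overall accuracy is the total number correct divided by the total number of points. The sketch below recomputes the Group 2 report from its counts; the function name and data layout are illustrative.

```python
def accuracy_report(rows, total_points):
    """rows maps class name -> (reference total, classified total, number correct)."""
    per_class = {}
    for name, (reference, classified, correct) in rows.items():
        per_class[name] = {
            "producers": 100.0 * correct / reference,   # omission side
            "users": 100.0 * correct / classified,      # commission side
        }
    total_correct = sum(correct for _, _, correct in rows.values())
    overall = 100.0 * total_correct / total_points
    return per_class, overall


# Counts taken from the Group 2 report for the GAP broad cover layer.
GROUP2 = {
    "Grass": (2103, 2704, 1973),
    "Row Crop": (9122, 8751, 8540),
    "Woody": (1013, 835, 756),
}
report, overall = accuracy_report(GROUP2, total_points=12368)
for name, acc in report.items():
    print(f"{name:<10} producers {acc['producers']:.2f}%  users {acc['users']:.2f}%")
print(f"Overall Classification Accuracy = {overall:.2f}%")
```

The total number of points is passed separately because the report's Totals row can include points from classes that are not listed individually.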

Several Landsat scenes were classified for broad cover/use and compared to the GAP layer. The GAP layer was more accurate in all cases; therefore, the GAP layer was used to develop the more specific cover layer for the entire state of Iowa.

Where no substitute digital layer could be located, the process for classifying the broad cover map was as follows.

Landsat TM images (http://geo.arc.nasa.gov/sge/landsat/landsat.html) were imported using Erdas Imagine software (http://www.erdas.com). The digital roads from the Census Bureau Tiger files http://www.census.gov/ftp/www/tiger (http://www.census.gov/ftp/pub/www/tiger) were imported using ESRI ArcInfo (http://www.esri.com) software and format. The geographic location of each image was checked against the road digital layer, and the image was moved if needed to match. A 30 meter by 30 meter raster layer was developed from the road digital data for use as a masking layer in the broad cover classification and was embedded in the final land cover digital product. Training samples for water were developed by hand or by using an unsupervised classification process consisting of twenty-five classes. Training samples were also developed from a random one half of the digitized NRI points described above as the "Pure Group". The sample classes were assigned broad cover/use from the 1992 NRI data. A digital area of three acres around each point in the "Pure Group" was created to develop a signature file for classifying the image. Three-acre areas were used because a three-acre circle was imposed around each point in the 1992 inventory and assigned an earth cover value for wildlife. The signature file was used in a supervised maximum likelihood classification process in Erdas Imagine to generate a broad cover/use digital layer for each Landsat scene. The half of the pure group held out for accuracy assessment was used to develop an accuracy matrix. The half of the pure group used for classification and all of group 1 were each used to create a separate accuracy matrix. The process and signature file were evaluated and modified until a total accuracy of 80% or higher for the pure groups and 70% or higher for group 1 was achieved. The same process was repeated until all scenes were classified. However, only six scenes were done this way in Iowa, and they were used to evaluate the GAP broad cover/use layer.
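
The idea behind a maximum likelihood decision rule can be illustrated with a small sketch. It is not the Erdas Imagine implementation: it assumes each class is Gaussian, treats the bands as independent (a diagonal covariance, which is a simplification), and gives every class an equal prior. The band values are invented for illustration.

```python
import math
from statistics import fmean, variance


def fit_signatures(samples):
    """samples maps class name -> list of training pixels (tuples of band values).

    Returns, per class, a list of (mean, variance) pairs, one pair per band.
    """
    signatures = {}
    for name, pixels in samples.items():
        bands = list(zip(*pixels))  # regroup values band by band
        signatures[name] = [(fmean(band), variance(band)) for band in bands]
    return signatures


def log_likelihood(pixel, signature):
    """Gaussian log-likelihood of a pixel, summing over independent bands."""
    total = 0.0
    for value, (mean, var) in zip(pixel, signature):
        total += -0.5 * (math.log(2.0 * math.pi * var) + (value - mean) ** 2 / var)
    return total


def classify(pixel, signatures):
    """Assign the class whose signature makes the pixel most likely."""
    return max(signatures, key=lambda name: log_likelihood(pixel, signatures[name]))


# Hypothetical two-band training pixels drawn from pure signature areas.
TRAINING = {
    "corn": [(52, 118), (49, 121), (51, 123), (48, 119)],
    "soybeans": [(80, 61), (78, 58), (83, 60), (79, 63)],
    "grass": [(65, 90), (63, 93), (66, 88), (64, 91)],
}
signatures = fit_signatures(TRAINING)
for pixel in [(50, 120), (81, 60), (64, 90)]:
    print(pixel, "->", classify(pixel, signatures))
```

A mixed pixel, such as one covering both corn and grass, would fall between the class means and be assigned to whichever signature it happens to resemble most, which is the behavior described for point 2 in figure 3.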

After a broad cover/use layer was accepted (the GAP broad cover layer), the process of developing a more specific cover/use layer using the NRI data was initiated. The process was basically the same as described above, except that the specific cover/use values were used in the signature file and the classification was carried out separately within each class of the broad cover/use layer, with the other classes masked out. The resulting classified subsets were merged to produce an entire scene of specific cover/use.
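
The mask-and-merge step can be sketched with plain nested lists standing in for raster layers. The grid values, the class names, and the threshold rule used as a stand-in classifier are all hypothetical; in the actual process each broad class would be classified with its own signature file.

```python
def classify_within_broad_cover(broad, image, classifiers):
    """Classify each pixel only with the classifier for its broad cover class.

    broad:       2-D grid of broad cover labels
    image:       2-D grid of pixels (tuples of band values), same shape
    classifiers: maps a broad label -> function(pixel) returning a specific label
    Pixels whose broad label has no classifier keep their broad label.
    """
    merged = [list(row) for row in broad]  # start from a copy of the broad layer
    for label, classify_pixel in classifiers.items():
        for r, row in enumerate(broad):
            for c, cell in enumerate(row):
                if cell == label:  # everything outside this class is masked out
                    merged[r][c] = classify_pixel(image[r][c])
    return merged


broad = [
    ["row crop", "row crop", "road"],
    ["grass", "row crop", "woody"],
]
image = [
    [(50, 120), (81, 60), (0, 0)],
    [(64, 90), (52, 118), (30, 40)],
]
# Stand-in classifier for row crop pixels; the threshold is invented.
classifiers = {"row crop": lambda px: "corn" if px[0] < 65 else "soybeans"}

merged = classify_within_broad_cover(broad, image, classifiers)
for row in merged:
    print(row)
```

Restricting each classification to one broad class means a corn signature can never be assigned to a pixel that the broad layer already identifies as woody or road.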

RESULTS AND DISCUSSION

The results for one scene were as follows:

CLASSIFICATION ACCURACY ASSESSMENT REPORT

Using All points where similar to class in digital map. (Group 1)
Image File: Specific Cover Landsat Scene Path 26 Row 31
Date: Jul 29 09:01:50 1999

ACCURACY TOTALS

Class Name   Reference Totals   Classified Totals   Number Correct   Producers Accuracy   Users Accuracy
Tree                      292                 422              240               82.19%           56.87%
Grass                     956                1172              813               85.04%           69.37%
Water                      19                  40               16               84.21%           40.00%
Corn                     1447                1367             1126               77.82%           82.37%
Soybean                   940                 845              702               74.68%           83.08%
Totals                   3846                3846             2897
Overall Classification Accuracy = 75.33%

----- End of Accuracy Totals -----

CLASSIFICATION ACCURACY ASSESSMENT REPORT

Using supporting data points where similar to class in digital map. (Group 2)
Using one half of pure group (Group 2) for accuracy assessment
Image File: Specific Cover Landsat Scene Path 26 Row 31
Date: Jul 29 08:53:51 1999

ACCURACY TOTALS

Class Name   Reference Totals   Classified Totals   Number Correct   Producers Accuracy   Users Accuracy
Tree                      102                 111               86               84.31%           77.48%
Grass                     275                 293              249               90.55%           84.98%
Water                      18                   8                1                5.56%           12.50%
Corn                      532                 546              454               85.34%           83.15%
Soybean                   372                 359              296               79.57%           82.45%
Totals                   1317                1317             1086
Overall Classification Accuracy = 82.46%


----- End of Accuracy Totals -----

CLASSIFICATION ACCURACY ASSESSMENT REPORT

Using one half of pure group (Group 2) used for classification
Image File: Specific Cover Landsat Scene Path 26 Row 31
Date: Thu Jul 29 08:44:04 1999

ACCURACY TOTALS

Class Name   Reference Totals   Classified Totals   Number Correct   Producers Accuracy   Users Accuracy
Tree                      122                 128              101               82.79%           78.91%
Grass                     329                 372              306               93.01%           82.26%
Water                      20                   5                5               25.00%          100.00%
Corn                      497                 483              402               80.89%           83.23%
Soybean                   368                 356              291               79.08%           81.74%
Totals                   1344                1344             1105
Overall Classification Accuracy = 82.22%

----- End of Accuracy Totals -----

The process or the signature file was modified until a classified image had a group one assessment of 65 to 75% and both halves of group two (pure group) had an assessment of 75 to 85%. Each Landsat scene was processed this way and the resulting classified images were merged. The edges of each scene were trimmed prior to merging because of poor classification at the borders of a satellite scene.

The results for the merged map of Iowa were as follows:

CLASSIFICATION ACCURACY ASSESSMENT REPORT

Merged specific cover/use Iowa
Used group 1.
Date : Wed Sep 8 11:58:31 1999

ACCURACY TOTALS

Class Name   Reference Totals   Classified Totals   Number Correct   Producers Accuracy   Users Accuracy
Tree                     1403                1936             1113               79.33%           57.49%
Grass                    3269                3432             2525               77.24%           73.57%
Corn                     7683                6503             5513               71.76%           84.78%
Soybean                  4710                4815             3522               74.78%           73.15%
Totals                  17356               17356            12673
Overall Classification Accuracy = 73.02%

----- End of Accuracy Totals -----

CLASSIFICATION ACCURACY ASSESSMENT REPORT

Merged specific cover/use Iowa
Used all of group 2
Date : Wed Sep 8 12:34:31 1999

ACCURACY TOTALS

Class Name   Reference Totals   Classified Totals   Number Correct   Producers Accuracy   Users Accuracy
Tree                     1013                1150              852               84.11%           77.09%
Grass                    2017                2272             1809               89.69%           79.62%
Corn                     5534                4853             4213               76.13%           86.81%
Soybean                  3588                3846             2879               80.24%           74.86%
Totals                  12368               12368             9753
Overall Classification Accuracy = 78.86%

----- End of Accuracy Totals -----

A document containing additional accuracy assessments can be found at addassessment.html.

CONCLUSION

The broad cover/use and specific cover/use collected at the point do not provide enough area information to be used alone for developing a good training sample to classify an image. However, some of the data other than land use/cover can support a valid value for each point, which can then be used to develop training sample areas and accuracy assessment points for land cover classification in image processing.

No matter how one develops training samples or accuracy points, or how good they are, land cover classification cannot separate classes that the sensor data do not define. The image must be taken at the time or times when the satellite sensors collect information that will properly define the classes needed. The Landsat TM images used for the Iowa project contained seven bands of sensor data. The thermal band was not used in the classification process. Each raster cell is assigned a sensed value for each of the six remaining bands over a 30 meter by 30 meter area. Any thirty by thirty meter area containing two different crops, such as soybeans and corn, would have different sensed values for some or all of the bands than an area of pure corn. This would be the case not only where there are different crops but also where consistent cover types vary in growth stage or where other resource factors affect cover condition.

The resulting merged map of land cover provides an excellent source for those in need of a digital land cover map of Iowa. The covers are limited to the major cover types; however, the map has an accuracy value that can be used and is more current than most layers in use at present. Calculations of specific cover acres will be high for most covers because the classification was limited to the major cover types.

The NRI points can successfully be used as ground truth to produce a high quality land cover map by classifying satellite data.

A good understanding and educated use of the NRI data and point location is essential for satisfactory use in classification and accuracy assessment of satellite imagery.

Iowa 1992 Land Cover

REFERENCES

Congalton, R.G., 1991. A review of assessing the accuracy of classifications of remotely sensed data, Remote Sensing of Environment, 37: 35-46.  
