Soil Veg Profile Description Forms

Submitted by dylan on Thu, 2007-09-27 23:09.

An Example Record

77-CA-55-038x

 
OCR output

$
SECATA QVAR.) SILT LOAM
- - - 3 X
Location: Tuolumne County, California; NW/4, SE/4, 5.35, T.lN, R.15E, MMD.
9gSCYlbEd by: BFS. Date Sampled: 04/77. '
Classification: Fine, mixed, thermic Typic Rhodoxeralfs.
Vegetation: Grasses.
Climate: he precipitation is 34 inches per year. Mean annual temperature is 59 deg. F, with Jan. mean of
43 deg. F, and July mean of 76 deg. F. Parent Material: Sedimentary Schist. Geolo ic Formation; Paleozoic
marine. Topography: This soil occurred at an elevation of 1990 feet, on a lower third hi l slope of simple
convex moderately sloping relief (8% slope) with a southeast aspect. This soil has slow permeability, is
well drained with slow runoff and has slight water erosion.
Remarks: This variant differs in temperature regime, color, and clay distribution.
HORIZON QEQCRIPTION
A1 Brown (7.5yr 5/4) silt loam, dark reddish brown (5YR 3/3) moist; very weak fine and medium
0-2H subangular blocky; hard, friable, nonsticky and nonplastic; many microfine and very fine
roots; common very fine and fine tubular pores; pH 5.7; clear smooth boundary.
Bl Reddish brown (5YR 5/4) silt loam, dark reddish brown (5YR 3/4) moist; massive; hard, friable,
2-10* slightly sticky and slightly plastic; common very fine and fine roots; many fine and medium
tubular pores; pH 5.8; gradual smooth boundary.
B2lt Reddish brown (2.5YR 4/4) clay, dark reddish brown (2.5YR 3/4) moist; moderate medium and
10-23u coarse subangular blocky; very hard, firm, sticky and plastic; few very fine and fine roots;
few fine and medium tubular pores; common moderately thick clay films on ped faces; pH 6.1;
gradual smooth boundary. Temperature 680F at 20 inches.
)
B22t Reddish brown (2.5yr 4.4) clay loam, dark reddish brown (2.5YR 3/4) moist; very weak coarse
23-33n subangular blocky; very hard, firm, sticky and plastic; very few fine roots; very few fine and
medium tubular pores; common moderately thick clay films on ped faces; pH 6.0; gradual smooth
boundary.
B23 & R B23 represents small amount of soil (20%) like the above horizon between cracks in slate.
33-38+H
350

Data Extraction Ideas

(REGEX)

  • Horizon Colors

    A regular expression extracts Munsell colors from the horizon descriptions, based on the pattern commonly used in the Soil Veg profile description forms. The pattern also allows for common OCR mistakes, such as a lowercase hue (7.5yr) or a period in place of the slash (4.4).

    grep -o --color -E -e '\b[0-9A-Z]\.?[0-9A-Z]?[YyRr][Rr]? [0-9][/.][0-9](-[0-9]/[0-9])?' test.txt

    returns:

    7.5yr 5/4
    5YR 3/3
    5YR 5/4
    5YR 3/4
    2.5YR 4/4
    2.5YR 3/4
    2.5yr 4.4
    2.5YR 3/4
    

    Extract all colors from the volume 1 data:

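    # one line per record: record ID, then every color found, pipe-delimited
    for x in *.txt
    do
      [ "$x" = vol1_colors.txt ] && continue   # skip the output file itself
      echo "$(basename "$x" .txt)|$(grep -o -E -e '\b[0-9A-Z]\.?[0-9A-Z]?[YyRr][Rr]? [0-9][/.][0-9](-[0-9]/[0-9])?' "$x" \
      | sed 's/IO/10/g' | tr '\n' '|')"
    done > vol1_colors.txt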
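
The extracted colors still carry the OCR noise that the pattern tolerates (for example 2.5yr 4.4). As a rough sketch of one possible cleanup step, the function below uppercases the hue and restores the slash between value and chroma. The name normalize_munsell is hypothetical and not part of the original workflow; it assumes one raw color per line on standard input.

```shell
# normalize_munsell: hypothetical helper, assumes one raw color per line on stdin.
# Writes colors as "HUE VALUE/CHROMA", e.g. "2.5yr 4.4" becomes "2.5YR 4/4".
normalize_munsell() {
  sed -E \
    -e 's/^IO/10/' \
    -e 's#^([^ ]+) ([0-9])[/.]([0-9])#\1 \2/\3#' |
  tr '[:lower:]' '[:upper:]'   # hue letters are the only letters left to uppercase
}

# Example run on three raw values, two of them with OCR damage.
printf '%s\n' '7.5yr 5/4' '2.5yr 4.4' 'IOYR 3/3' | normalize_munsell
```

It could be placed directly after the grep in the pipeline above, before the newlines are turned into pipes.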
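
  • Legal Location (PLSS)

    This pattern could still use some work. It already tolerates a few OCR substitutions seen in the example record, such as 5 for S in the section and a letter in place of a digit in the township.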

    grep -o --color -E -e '[NS][EW]/[24](, [NS][EW]/[24])?, [S5]\.[0-9][0-9]?, T\.?[0-9a-z][0-9a-z]?[NS], R\.?[0-9a-z][0-9a-z]?[EW], ...' test.txt

    returns:

    NW/4, SE/4, 5.35, T.lN, R.15E, MMD
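
Once a location string is matched, it can be split into its parts. The sketch below is one possible approach, not the author's method: it assumes the comma-and-space layout shown above, counts fields from the end so that one or two quarter-section entries both work, and corrects only the lowercase l read for the digit 1. The name parse_plss and the output layout (quarters|section|township|range|meridian) are hypothetical.

```shell
# parse_plss: hypothetical helper, assumes one matched location per line on stdin.
# Prints quarters|section|township|range|meridian.
parse_plss() {
  awk -F', ' '{
    n = NF
    # Everything before the last four fields is a quarter-section entry.
    quarters = $1
    for (i = 2; i <= n - 4; i++) quarters = quarters " " $i
    section = $(n-3); township = $(n-2); range = $(n-1); meridian = $n
    sub(/^[S5]\./, "", section)    # "S.35" or OCR "5.35" becomes "35"
    sub(/^T\.?/, "", township)     # "T.lN" becomes "lN"
    sub(/^R\.?/, "", range)        # "R.15E" becomes "15E"
    gsub(/l/, "1", township)       # OCR: lowercase L read for the digit 1
    gsub(/l/, "1", range)
    print quarters "|" section "|" township "|" range "|" meridian
  }'
}

printf '%s\n' 'NW/4, SE/4, 5.35, T.lN, R.15E, MMD' | parse_plss
# prints: NW/4 SE/4|35|1N|15E|MMD
```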