Analysis of SSURGO Data in PostGIS: An Overview

 
Note
Several of the pages documenting soil survey related queries have been taken down for a couple of days while I update the syntax.

 
News
It looks like the NCSS has constructed a web-based, SQL interface to their main database. This new tool was recently highlighted in issue 40 of the NCSS Newsletter and looks like a promising new tool with good documentation. A listing "web services" offered by the NCSS is listed on this page.

 
Overview
The analysis of SSURGO data is complicated by a series of hierarchical, one-to-many tabular relationships between spatial and attribute data. Understanding the structure of the SSURGO database is critical for correct interpretation and aggregation of soil properties. Before undertaking any SSURGO-based analysis please take some time to become familliar with the details on the product, including intended uses, minimum mapping units sizes, and other important details. In addition, becoming familliar with the SSURGO metadata is a must. The metadata page includes detailed descriptions of table structure, column names and units, as well as important information on the sources of much of the tabular data included in a SSURGO database.

SSURGO data can be downloaded by survey area from The Soil Datamart, with spatial data delivered in shapefile format and attribute data delivered as plain text. Unfortunately, an M.S. Access database template is required to utilize SSURGO attribute data as delivered from the Soil Datamart. For assistance with this procedure please see the NRCS SSURGO page. Using this approach, most analysis of SSURGO must be done survey-by-survey within a GIS environment. For general instructions see this document.

Several online applications allow for simple interaction with the SSURGO database without the need for a GIS or RDBMS knowledge. Some examples include:

 
An Open Source Approach to SSURGO
We have developed an alternate approach to working with SSURGO data using PostGIS, a spatially-enabled version of the popular and open source relational database management system Postgresql, to store all spatial and tabular data for 138 survey areas. This approach facillitates rapid access, analysis, and aggregation of over half a million soil polygons. SQL (structured query language) is used to directly interact with both soil spatial and attribute data. If other forms of spatial data are imported into PostGIS (such as landcover, climatic data, etc.) nearly all spatial and attribute analysis can be done entirely from PostGIS. A series of examples illustrating common tasks will be presented in the following pages.

 
General Approach to Working with SSURGO (also outlined in this document) (more ideas on map unit composition)

  1. Identify the soil properties that are to be included in some analysis
  2. Decide on the appropriate form of aggregation to be used to summarize horizons
    • depth weighted (i.e. to calculate average percent clay through horizons)
    • top 1m (i.e. to summarize shrink-swell capacity of the topsoil)
    • top horizon (i.e. to summarize surface organic carbon)
    • profile sum (i.e. to calculate total water holding capacity)
    • most limiting (i.e. to calculate the most limiting hydraulic conductivity within the soil profile)
  3. Aggregate horizon data
  4. Decide on the appropriate form of aggregation to be used to summarize components
    • component percent weighted (i.e. this will include information from each component, weighted by their estimated percentage of the entire map unit)
    • largest component (i.e. this usually result in the selection of the 'dominant' component, however when there are multuple components with the same estimated percentage ties will occur)
    • major component flag (i.e. components flagged as a 'major component' by the NRCS represent dominant soil types within a map unit. note that there are sometimes multiple components flagged as 'major components'.)
    • dominant condition (i.e. this is usually used for categorical data like hydric conditions. aggregation is performed by selecting the most frequent condition within a map unit)
  5. Aggregation of component data
  6. join the above aggregated data (2x aggregation process for horizon data) to the map unit polygons
  7. See the diagram at the bottom of this page for a graphical summary

SSURGO discontinuity example: Boundary between Glenn and Colusa counties, illustrating differences in soil survey vintage.SSURGO discontinuity example: Boundary between Glenn and Colusa counties, illustrating differences in soil survey vintage.

 
Notes on the Format of SSURGO

  • Adjacent surveys may have been composed by different individuals, and may be of widely different vintages. Any given survey must comply with basic standards, but older surveys reflect a more generalized approach than more modern surveys. The figure to the right illustrates such differences.
  • Polygons represent a repeating pattern of legend entries: groups of map-able soil concepts called map units
  • Map unit data is stored in the mapunit table, and is referenced by the field mukey
  • Pre-aggregated map unit data is stored in the muaggatt table, and is referenced by the field mukey
  • Map units are comprised of multiple, unmapped soil types called 'components'
  • Component data is stored in the component table, and is referenced by the field cokey
  • Soil components (or soil type) are associated with multiple horizons
  • Horizon data is stored in the chorizon table, and is referenced by the field cokey
  • Since there is a 1:many:many (mapunit:component:horizon) relationship between spatial and horizon-level soil property data two aggregation steps are required in order to produce a thematic map
  • This diagram illustrates the hierarchy of scales involved in soil survey information.

SSURGO Database Structure DiagramSSURGO Database Structure Diagram

Logistics: Getting Connected and Executing Queries

 
Building and Saving SQL code snippets
A simple text editor is the best environment for working on (and saving!) your SQL queries. Linux users are encouraged to use either 'Kate' or 'Kwrite'. Windows users should look into Notetab Light or SciTE. Mac users should check out TextWrangler.

 
Connecting to the database
In general, the simplest way to interact with our composite soil survey database is by connecting to the best with an SSH client. Mac/Linux users have this functionality built-in. Windows users will need to use something like Putty, or Xming. Once an ssh connection with the beast has been setup, you can connect to the database with the following:

psql -U soil ssurgo_combined

 
where psql is the postgresql client program, -U soil means connect as the user called "soil", and ssurgo_combined is the name of the actual database.

 
You can quit from the database shell using \q (followed by enter). Typing \? (followed by enter) will give a list of commands available in the psql shell.

 
Query structure and SQL
Numerous resources exisit for learning about SQL. I would recommend looking at the first couple chapters from the PostgreSQL book by Douglas and Douglas (the big purple book on the shelf). For an interactive learning approach SQL Zoo seems like a good start. In general most queries will have the format:

SELECT column_x, column_y, column_z
FROM table_x
WHERE column_x = 'something'
-- optional
GROUP BY column_x
ORDER BY column_x ; -- semi-colon denotes end of SQL statement

 
A simple query: column selection and filtering
Once connected it is possible to issue queries to the database. The results of a query are normally returned to the screen. For example, asking for the horizon boundaries and horizon names from the componenent identified with the given cokey (467038:646635) might look like this:

-- the query
SELECT cokey, hzname, hzdept_r, hzdepb_r
FROM chorizon
WHERE cokey = '467038:646635' ;

-- the results
     cokey     | hzname | hzdept_r | hzdepb_r
---------------+--------+----------+----------
 467038:646635 | Ap     |        0 |       18
 467038:646635 | Bw     |       18 |       41
 467038:646635 | Bk1    |       41 |       69
 467038:646635 | Bk2    |       69 |      109
 467038:646635 | Bk3    |      109 |      145
 467038:646635 | Bk4    |      145 |      183

 
A more complicated query: column calculation and aggregation
Compute the total water holding capacity for a given component, identified by (467038:646635). Note that several operations are being performed on the data:

  1. compute the available water holding capacity (in centemeters) for each horizon [green text]
  2. compute the profile depth (in centemeters) for each horizon [green text]
  3. aggregate the profile data by summing AHC and depth of each horizon [red text]
-- the query
select cokey, sum( (hzdepb_r - hzdept_r) * awc_r) as component_whc, sum( (hzdepb_r - hzdept_r)  ) as depth 
from chorizon 
where cokey = '467038:646635'
group by cokey ;

-- the results
     cokey     | component_whc | depth 
---------------+---------------+-------
 467038:646635 |         27.91 |   183

 
Note that it is also possible to save the results of a query into a new table. This is the usual strategy when performing any analysis that returns geometry (soil polygons, etc.) as geometry data cannot be visualized in a text display. Tables created in this manner can be exported from the database for map creation or further analysis. Spatial tables can be viewed directly by applications like QGIS or Mapserver. A simple method of accessing PostGIS data is not yet possible with ArcGIS.

Learning by Example: Common Queries

 
Selection, Filtering and Sorting

  • Selecting a summary of horizon thickness, sand, silt, clay and AWC for a given component (chorizon table).
    SELECT  hzname AS name, hzdepb_r - hzdept_r AS thickness, sandtotal_r AS sand, silttotal_r AS silt, claytotal_r AS clay, awc_r AS awc
    FROM chorizon
    WHERE cokey = '467038:646635'
    ORDER BY hzdept_r;

    Query Results
    name | thickness | sand | silt | clay | awc
    --------+-----------+------+------+------+------
    Ap | 18 | 35 | 37 | 28 | 0.16
    Bw | 23 | 35 | 40 | 25 | 0.15
    Bk1 | 28 | 35 | 40 | 25 | 0.16
    Bk2 | 40 | 35 | 40 | 25 | 0.16
    Bk3 | 36 | 35 | 40 | 25 | 0.16
    Bk4 | 38 | 60 | 25 | 15 | 0.13
  • Select some common fields from the component table:
     SELECT cokey, majcompflag, comppct_r, wei, weg, tfact
    FROM component
    WHERE mukey = '467038'
    ORDER BY comppct_r DESC;

    Query Results
    cokey | majcompflag | comppct_r | wei | weg | tfact
    ---------------+-------------+-----------+-----+-----+-------
    467038:646635 | Yes | 85 | 48 | 6 | 5
    467038:646636 | No | 5 | | |
    467038:646638 | No | 4 | | |
    467038:646640 | No | 2 | | |
    467038:646639 | No | 2 | | |
    467038:646637 | No | 2 | | |
  • Create a new table 'yolo_comp' using the component table, for a specific survey area (Yolo County), with components sorted by their component percentages.
    CREATE TABLE yolo_comp AS
    SELECT * FROM component
    WHERE areasymbol = 'ca113'
    ORDER BY mukey, comppct_r DESC ;

     
    The resulting table can be copied to a CSV file (in the current working directory) like this: \copy yolo_comp TO 'yolo_comp.csv' CSV HEADER

     
    If you are done with the table, remove it with the following SQL: DROP TABLE yolo_comp ;

Identifying the Largest Components

 
Overview:
Two methods for the selection of the largest components (based on the comppct_r column) within map units. This approach to selecting a single component per map unit is appropriate in situations where a single representative feature or property is sought. The partitioning of components within a map unit is closely related to the map unit type: complex, association, consociation or an "undifferentiated group". A breakdown of the number of components per each map unit type is summarized by the following query (26400 map units / 45971 major components):

SELECT mapunit.mukind, round(avg(n_components)) AS avg, min(n_components), max(n_components)
FROM mapunit JOIN
        (
        SELECT mapunit.mukey, count(component.mukey) AS n_components
        FROM
        mapunit JOIN component
        ON mapunit.mukey = component.mukey
        WHERE component.majcompflag = 'Yes' AND mapunit.mukind IS NOT NULL
        GROUP BY mapunit.mukey
        ) AS a
ON a.mukey = mapunit.mukey
GROUP BY mapunit.mukind ;

Query Results
mukind | avg | min | max
------------------------+-----+-----+-----
Consociation | 1 | 1 | 3
Complex | 2 | 1 | 4
Association | 3 | 1 | 4
Undifferentiated group | 2 | 1 | 4

 
Method 1: Filtering component percentages with the DISTINCT keyword.

  1. Use of the SELECT DISTINCT ON operator. Note that the ordering is done before the DISTINCT filter is applied, resulting in the largest (by component percentage) component. Note that results from ties are unpredictable.
    • SELECT DISTINCT ON (mukey) mukey, cokey, comppct_r, compname, taxorder
      FROM component
      WHERE areasymbol = 'ca654' AND majcompflag = 'Yes'
      ORDER BY mukey, comppct_r DESC ;

      Query Results
      mukey | cokey | comppct_r | compname | taxorder
      ---------+----------------+-----------+------------+-----------
      1487066 | 1487066:638252 | 80 | Auberry | Alfisols
      1487067 | 1487067:638274 | 80 | Auberry | Alfisols
      1487068 | 1487068:638383 | 85 | Crouch | Mollisols
      1487069 | 1487069:632318 | 85 | Cajon | Entisols
      1487070 | 1487070:632372 | 85 | Excelsior | Entisols
      1487071 | 1487071:632519 | 85 | Kimberlina | Entisols
      1487072 | 1487072:632642 | 85 | Nord | Mollisols
      1487076 | 1487076:632832 | 85 | Wasco | Entisols
      1487077 | 1487077:632866 | 85 | Whitewolf | Entisols
      464171 | 464171:640646 | 85 | Academy | Alfisols
    • We can inspect possible ties or other sources of error by looking at the smallest components.

      SELECT * FROM
              (
              SELECT DISTINCT ON (mukey) mukey, cokey, comppct_r, compname, taxorder
              FROM component
              WHERE areasymbol = 'ca654' AND majcompflag = 'Yes'
              ORDER BY mukey, comppct_r DESC
              ) AS a
      ORDER BY a.comppct_r ASC ;

      Query Results
      mukey | cokey | comppct_r | compname | taxorder
      ---------+----------------+-----------+---------------------+-------------
      464432 | 464432:1380237 | 10 | Ramona | Alfisols
      464184 | 464184:640686 | 30 | Ahwahnee | Alfisols
      464211 | 464211:640756 | 30 | Auberry | Alfisols
      464210 | 464210:640752 | 35 | Auberry | Alfisols
      464529 | 464529:641600 | 35 | Vista | Inceptisols
      464530 | 464530:641604 | 35 | Vista | Inceptisols
      464531 | 464531:641609 | 35 | Rock outcrop |

 
Finding the largest component (based on comppct_r) per mukey

  1. Identify how many map units there are in a given region or survey area. In this case San Joaquin County; areasymbol = 'ca077'.
    SELECT count(mukey) FROM mapunit WHERE areasymbol = 'ca113';

    Query Results
    count
    -------
    186
  2. Identify which map units contain components where a conventional, select largest component percentage type approach will not work. This commonly happens when there are several major components with the same comppct_r value, within the same map unit. Any returned mukey values need to be evaluated.
    SELECT component.mukey, count(component.mukey)
    FROM component
            JOIN
            (
            SELECT mukey, max(comppct_r) AS comppct FROM component WHERE majcompflag = 'Yes' AND areasymbol = 'ca077' GROUP BY mukey
            ) AS a
    ON component.comppct_r = a.comppct AND component.mukey = a.mukey
    GROUP BY component.mukey
    HAVING count(component.mukey) > 1 ;

    Query Results
    mukey | count
    --------+-------
    462043 | 2
    462068 | 2
  3. Evaluation of the two "trouble" map units identified above. Note the use of the where mukey in ('462043', '462068') syntax.
    SELECT mukey, cokey, comppct_r, compname, taxorder
    FROM component
    WHERE majcompflag = 'Yes' AND mukey IN ('462043', '462068')
    ORDER BY mukey, comppct_r DESC ;

    Query Results
    mukey | cokey | comppct_r | compname | taxorder
    --------+---------------+-----------+------------+-----------
    462043 | 462043:634006 | 45 | Dumps |
    462043 | 462043:634007 | 45 | Tailings |
    462068 | 462068:634159 | 30 | Honker | Alfisols
    462068 | 462068:634160 | 30 | Vallecitos | Alfisols
    462068 | 462068:634161 | 25 | Gonzaga | Mollisols

Soil Texture Example

 
Premise

Compute a series of weighted-average soil texture fractions (sand, silt, clay), for every component, of every map unit in Yolo County. These values will be further weighted by the spatial distribution of each map unit.

CREATE TABLE yolo_wt_mean_texture AS
-- join with polygons, and compute areas weights
SELECT mapunit_poly.mukey,
sum(ST_Area(wkb_geometry)) / (SELECT ST_Area(wkb_geometry) FROM mapunit_bound_poly WHERE areasymbol = 'ca113') AS area_wt,
sand, silt, clay
FROM
mapunit_poly
JOIN
        (
        -- compute component percent weighted mean
        SELECT mukey,
        sum(comppct_r * sand) / sum(comppct_r) AS sand,
        sum(comppct_r * silt) / sum(comppct_r) AS silt,
        sum(comppct_r * clay) / sum(comppct_r) AS clay
        FROM
        component
        JOIN
                (
                -- compute hz thickness weighted mean
                SELECT cokey,
                sum((hzdepb_r - hzdept_r) * sandtotal_r) / sum(hzdepb_r - hzdept_r) AS sand,
                sum((hzdepb_r - hzdept_r) * silttotal_r) / sum(hzdepb_r - hzdept_r) AS silt,
                sum((hzdepb_r - hzdept_r) * claytotal_r) / sum(hzdepb_r - hzdept_r) AS clay
                FROM chorizon
                WHERE sandtotal_r IS NOT NULL
                AND silttotal_r IS NOT NULL
                AND claytotal_r IS NOT NULL
                AND areasymbol = 'ca113'
                GROUP BY cokey
                ) AS co_agg
        ON component.cokey = co_agg.cokey
        GROUP BY component.mukey
        ) AS mu_agg
ON mapunit_poly.mukey = mu_agg.mukey
GROUP BY mapunit_poly.mukey, sand, silt, clay;

 
Simple Visualization with R

# dump the data:
# echo 'select * from yolo_wt_mean_texture' |  psql -A -F "," -U xxx -h xxx dbname > temp/yolo_texture.csv
#
# remember to trim off the last line!

# load some libs:
library(plotrix)

# read in the data
x <- read.csv('yolo_texture.csv')

# simple soil texture, with symbol size weighted by area weight
soil.texture(x[,3:5], cex=sqrt(50*x$area_wt), pch=16, col.symbol=rgb(65,105,225, alpha=100, max=255),
show.lines=T, show.names=T, col.lines='black', col.names='black', main='Yolo County Soil Textures')

triax.points(cbind(weighted.mean(x$sand, x$area_wt), weighted.mean(x$silt, x$area_wt), weighted.mean(x$clay, x$area_wt)),
col.symbols='orange', pch=16, cex=2)

Yolo County Common Soil Textures: Blue symbols mark common soil textures and their relative size denotes abundance. The orange symbol marks the weighted average soil texture for all of Yolo County.Yolo County Common Soil Textures: Blue symbols mark common soil textures and their relative size denotes abundance. The orange symbol marks the weighted average soil texture for all of Yolo County.

Identification of Dated Surfaces via Soil Series

East Side Alluvial FormationsEast Side Alluvial Formations

 
Overview
A simple association between dated landforms and soil series name [1] was used to extract soil polygons from a composite soil survey database.

Soil Series Associated Formation Approximate Age (1000 yrs ago)
Redding Laguna 1600 - 2000 kya
Corning Laguna 1600 - 2000 kya
Keyes Laguna 1600 - 2000 kya
Whitney Turlock Lake 500 - 700 kya
Montpellier Turlock Lake 500 - 700 kya
Rocklin Turlock Lake 500 - 700 kya
Snelling Riverbank 100 - 300 kya
San Joaquin Riverbank 100 - 300 kya
Exiter Riverbank 100 - 300 kya
Madera Riverbank 100 - 300 kya
Hanford Modesto 10 - 40 kya
Grangeville Holocene < 10 kya
  1. Create a soil series - dated landform lookup table:

    -- create a lookup table
    CREATE TABLE dated_landforms (
    soil_series varchar(20),
    formation varchar(20),
    approx_age varchar(30)
    );
    -- populate table
    INSERT INTO dated_landforms VALUES ('Redding','Laguna','1600 - 2000 kya') ;
    INSERT INTO dated_landforms VALUES ('Corning','Laguna','1600 - 2000 kya') ;
    INSERT INTO dated_landforms VALUES ('Keyes','Laguna','1600 - 2000 kya') ;
    INSERT INTO dated_landforms VALUES ('Whitney','Turlock Lake','500 - 700 kya') ;
    INSERT INTO dated_landforms VALUES ('Montpellier','Turlock Lake','500 - 700 kya') ;
    INSERT INTO dated_landforms VALUES ('Rocklin','Turlock Lake','500 - 700 kya') ;
    INSERT INTO dated_landforms VALUES ('Snelling','Riverbank','100 - 300 kya') ;
    INSERT INTO dated_landforms VALUES ('San Joaquin','Riverbank','100 - 300 kya') ;
    INSERT INTO dated_landforms VALUES ('Exiter','Riverbank','100 - 300 kya') ;
    INSERT INTO dated_landforms VALUES ('Madera','Riverbank','100 - 300 kya') ;
    INSERT INTO dated_landforms VALUES ('Hanford','Modesto','10 - 40 kya') ;
    INSERT INTO dated_landforms VALUES ('Grangeville','Holocene','< 10 kya') ;
  2. Select map units (mukey) by suitible series concepts, associated with major components. Note that the DISTINCT ON (mukey) ... ORDER BY mukey, comppct_r DESC pattern can be used to select the largest component for each map unit. Pattern matching is used to safely join variants with the soil series names in our look-up table: i.e. 'San Joaquin variant' will match 'San Joaquin'.

    -- keep only the largest formation (by total component percent) within each map unit key
    SELECT DISTINCT ON (mukey) mukey, formation, sum(comppct_r) AS formation_pct
    FROM component
    JOIN
    dated_landforms
    -- use pattern matching to also include variants
    ON compname ~~* lower(soil_series || '%') = 't'
    -- subset to select survey areas on the east side of the valley
    AND component.areasymbol IN ('ca654', 'ca651', 'ca649', 'ca644', 'ca648', 'ca077')
    AND majcompflag = 'Yes'
    -- combine components of the same formation
    GROUP BY mukey, formation
    -- use in conjunction with the DISTINCT statement to keep only the largest
    ORDER BY mukey, formation_pct DESC;
  3. Create a new classified table called east_side_all. This query involves 6 survey areas, 11423 polygons, and requires about 11 seconds to complete.

    CREATE TABLE east_side_all AS
    -- select geom column and the feature id
    -- along with our formation names
    SELECT wkb_geometry AS wkb_geometry, ogc_fid, a.*
    FROM mapunit_poly
    JOIN
    (
    SELECT DISTINCT ON (mukey) mukey, formation, sum(comppct_r) AS formation_pct
    FROM component
    JOIN
    dated_landforms
    ON compname ~~* lower(soil_series || '%') = 't'
    AND component.areasymbol IN ('ca654', 'ca651', 'ca649', 'ca644', 'ca648', 'ca077')
    AND majcompflag = 'Yes'
    GROUP BY mukey, formation
    ORDER BY mukey, formation_pct DESC
    ) AS a
    ON mapunit_poly.mukey = a.mukey
    -- limit polygon selection by survey area
    AND mapunit_poly.areasymbol IN ('ca654', 'ca651', 'ca649', 'ca644', 'ca648', 'ca077') ;
  4. Index, register geometry, and setup permissions
     -- indexing
    CREATE INDEX east_side_all_fid_idx ON east_side_all  (ogc_fid) ;
    CREATE INDEX east_side_all_gis_idx ON east_side_all USING gist (wkb_geometry gist_geometry_ops) ;
    &nbsp;
    -- register geometry
    INSERT INTO geometry_columns VALUES ('','public','east_side_all','wkb_geometry',2,9001,'POLYGON');
    &nbsp;
    -- permissions
    GRANT SELECT ON TABLE east_side_all TO soil  ;
  5. Tabulate acreage for each formation. (approximately 1/3 second to complete)
     
    SELECT * FROM
            (
            SELECT round(sum(area(wkb_geometry)) * 0.000247) AS area_ac, formation
            FROM east_side_all
            GROUP BY formation
            ) AS a
    ORDER BY a.area_ac DESC;

    area_ac formation
    295868 Riverbank
    253424 Modesto
    151149 Turlock Lake
    121085 Laguna
    96981 Holocene

    (5 rows)

  6. Dump to local file in SHP format
     ogr2ogr east_side_all.shp PG:"dbname=ssurgo_combined user=xxxx password=xxxx host=xxxx" east_side_all

 
References:

  1. Smith, D.W. & Verrill, W.L. Witham, C.; Bauder, E.; Belk, D.; Ferren Jr., W. & Ornduff, R. (ed.) Vernal Pool-Soil-Landform Relationships in the Central Valley, California 1998, 15-23

Seasonally Wet Soils and Shrink-Swell Potential

PostGIS: ssurgo example- wet polygons
Example 1 map
PostGIS: ssurgo example- shrink swell potential
Example 2 map

 
Example 1: The location of seasonaly wet soils via two methods: hydricrating and USDA Soil Taxonomy interpretation.

-- optionally link polygons here...
-- compute the percent of each map unit that contains components that may be seasonally wet
SELECT mukey, wet_flag, sum(comppct_r) AS pct_mu
FROM
        (
        SELECT mukey, cokey, comppct_r,
        -- slightly less restrictive than hydricrating alone
        CASE WHEN ((taxclname ~~* '%aqu%') = 't') OR (hydricrating = 'Yes') then 1 ELSE 0 END AS wet_flag
        FROM component  
        -- subset to Yolo County
        WHERE areasymbol = 'ca113'
        ) AS yolo_wet_components
-- only keep map unit with some component that meets the criteria
WHERE wet_flag = 1
-- aggregate by map unit, wet_flag
GROUP BY yolo_wet_components.mukey, yolo_wet_components.wet_flag
ORDER BY mukey;

 

 
Example 1.1 Extract a list of wet components, and sum area based on component (series) name

-- compute estimated, total acreage of seasonally wet map units in Yolo Co.
SELECT mapunit.musym, mapunit.mukey, mapunit.muname,
round(sum(ST_Area(wkb_geometry) * (yolo_mu_data.pct_mu::numeric/100.0) * 0.000247)::numeric, 2) AS wet_area_ac
FROM
mapunit_poly
JOIN
        (
        -- compute the percent of each map unit that contains components that may be seasonally wet
        SELECT mukey, wet_flag, sum(comppct_r) AS pct_mu
        FROM
                (
                SELECT mukey, cokey, comppct_r,
                -- slightly less restrictive than hydricrating alone
                CASE WHEN ((taxclname ~~* '%aqu%') = 't') OR (hydricrating = 'Yes') then 1 ELSE 0 END AS wet_flag
                FROM component  
                -- subset to Yolo County
                WHERE areasymbol = 'ca113'
                ) AS yolo_wet_components
        -- only keep map unit with some component that meets the criteria
        WHERE wet_flag = 1
        -- aggregate by map unit, wet_flag
        GROUP BY yolo_wet_components.mukey, yolo_wet_components.wet_flag
        ) AS yolo_mu_data
ON mapunit_poly.mukey = yolo_mu_data.mukey
JOIN mapunit
ON mapunit.mukey = yolo_mu_data.mukey
-- aggregate by map unit
GROUP BY mapunit.mukey, mapunit.muname, mapunit.musym
ORDER BY wet_area_ac DESC;

 musym | mukey  |                             muname                             | wet_area_ac 
-------+--------+----------------------------------------------------------------+-------------
 Sc    | 459268 | Sacramento clay                                                |    35710.90
 Mf    | 459244 | Marvin silty clay loam                                         |    18877.18
 Sg    | 459272 | Sacramento soils, flooded                                      |    12273.51
 Cn    | 459219 | Clear Lake soils, flooded                                      |    11665.84
 Cc    | 459216 | Capay soils, flooded                                           |    11029.83
 Sv    | 459288 | Sycamore complex, drained                                      |     8410.41
 St    | 459286 | Sycamore silty clay loam, drained                              |     6900.77
 Ck    | 459218 | Clear Lake clay                                                |     6737.88
 Sa    | 459266 | Sacramento silty clay loam                                     |     6014.48
 Sp    | 459283 | Sycamore silt loam, drained                                    |     5537.18
 Sw    | 459289 | Sycamore complex, flooded                                      |     5041.96
 Wb    | 459301 | Willows clay                                                   |     4950.34
 Ss    | 459285 | Sycamore silty clay loam                                       |     4859.59
 Pb    | 459254 | Pescadero silty clay, saline-alkali                            |     4700.77
 So    | 459282 | Sycamore silt loam                                             |     3953.42
 Rh    | 459262 | Riverwash                                                      |     3698.88
 Pc    | 459255 | Pescadero soils, flooded                                       |     3589.01
 Tb    | 459292 | Tyndall very fine sandy loam                                   |     3299.03
 Ob    | 459252 | Omni silty clay                                                |     3174.44

[...]

 
Example 2: Classify soils according to shrink-swell capacity of the top 1 meter of soil, weighted by horizon thickness and component percent.

-- set a lower boundary for the query
\SET lwr_bdy 100

-- add a class label, based on NRCS guidelines
SELECT mapunit.musym, mapunit.muname, mapunit.muacres,
round(mu_wt_lep::numeric, 2) AS lep,
CASE WHEN mu_wt_lep < 3 THEN 'Low'
WHEN mu_wt_lep >= 3 AND mu_wt_lep < 6 THEN 'Moderate'
WHEN mu_wt_lep >= 6 AND mu_wt_lep < 9 THEN 'High'
WHEN mu_wt_lep >= 9 THEN 'Very High'
END AS lep_class
FROM
        (
        -- compute map unit lep, weighted by component percent, to set depth
        SELECT component.mukey,
        sum(component.comppct_r * co_wt_mean_lep) / sum(component.comppct_r) AS mu_wt_lep
        FROM
                (
                -- compute a horizon-thickness weighted mean lep to a set depth
                SELECT cokey, sum(thick * lep_r) / sum(thick) AS co_wt_mean_lep
                FROM
                        (
                        -- compute horizon thickness, but only to a set depth
                        SELECT cokey, hzdept_r, hzdepb_r, lep_r,
                        CASE WHEN hzdepb_r > :lwr_bdy THEN (:lwr_bdy - hzdept_r)
                        ELSE (hzdepb_r - hzdept_r) END AS thick
                        FROM chorizon
                        WHERE areasymbol = 'ca113'
                        AND lep_r IS NOT NULL
                        AND hzdept_r <= :lwr_bdy
                        ) AS hz_lep
                GROUP BY cokey
                ) AS co_lep
        JOIN component
        ON co_lep.cokey = component.cokey
        GROUP BY mukey
        ) AS mu_lep
JOIN mapunit
ON mu_lep.mukey = mapunit.mukey
ORDER BY muacres DESC, lep DESC;

 musym |                              muname                              | muacres | lep  | lep_class 
-------+------------------------------------------------------------------+---------+------+-----------
 Ya    | Yolo silt loam                                                   |   39698 | 2.52 | Low
 Sc    | Sacramento clay                                                  |   34886 | 7.50 | High
 Ca    | Capay silty clay                                                 |   33465 | 7.50 | High
 MrG2  | Millsholm rocky loam, 15 to 75 percent slopes, eroded            |   30118 | 1.50 | Low
 Rg    | Rincon silty clay loam                                           |   24580 | 6.36 | High
 BrA   | Brentwood silty clay loam, 0 to 2 percent slopes                 |   23045 | 7.50 | High
 CtD2  | Corning gravelly loam, 2 to 15 percent slopes, eroded            |   22080 | 5.34 | Moderate
 Mf    | Marvin silty clay loam                                           |   20970 | 6.60 | High
 DaF2  | Dibble clay loam, 30 to 50 percent slopes, eroded                |   18612 | 7.11 | High
 SmE2  | Sehorn-Balcom complex, 15 to 30 percent slopes, eroded           |   17794 | 6.17 | High
 TaA   | Tehama loam, 0 to 2 percent slopes                               |   16622 | 3.75 | Moderate
 BdF2  | Balcom-Dibble complex, 30 to 50 percent slopes, eroded           |   16405 | 5.73 | Moderate
 SmD   | Sehorn-Balcom complex, 2 to 15 percent slopes                    |   16117 | 6.50 | High
 BaF2  | Balcom silty clay loam, 30 to 50 percent slopes, eroded          |   12637 | 4.50 | Moderate
 Sg    | Sacramento soils, flooded                                        |   12258 | 6.27 | High
 Cn    | Clear Lake soils, flooded                                        |   11666 | 6.92 | High
 SmF2  | Sehorn-Balcom complex, 30 to 50 percent slopes, eroded           |   11226 | 6.33 | High
 Cc    | Capay soils, flooded                                             |   11030 | 7.50 | High
 Sv    | Sycamore complex, drained                                        |    9241 | 4.18 | Moderate
 Ms    | Myers clay                                                       |    8938 | 7.50 | High
 PfF2  | Positas gravelly loam, 30 to 50 percent slopes, eroded           |    7920 | 5.34 | Moderate
 St    | Sycamore silty clay loam, drained                                |    7839 | 4.50 | Moderate
 Ck    | Clear Lake clay                                                  |    6946 | 7.50 | High
 Ra    | Reiff very fine sandy loam                                       |    6847 | 1.50 | Low

[...]