Analysis of SSURGO Data in PostGIS: An Overview
Note
Several of the pages documenting soil survey related queries have been taken down for a couple of days while I update the syntax.
News
It looks like the NCSS has constructed a web-based, SQL interface to their main database. This new tool was recently highlighted in issue 40 of the NCSS Newsletter and looks like a promising new tool with good documentation. A listing "web services" offered by the NCSS is listed on this page.
Overview
The analysis of SSURGO data is complicated by a series of hierarchical, one-to-many tabular relationships between spatial and attribute data. Understanding the structure of the SSURGO database is critical for correct interpretation and aggregation of soil properties. Before undertaking any SSURGO-based analysis please take some time to become familliar with the details on the product, including intended uses, minimum mapping units sizes, and other important details. In addition, becoming familliar with the SSURGO metadata is a must. The metadata page includes detailed descriptions of table structure, column names and units, as well as important information on the sources of much of the tabular data included in a SSURGO database.
SSURGO data can be downloaded by survey area from The Soil Datamart, with spatial data delivered in shapefile format and attribute data delivered as plain text. Unfortunately, an M.S. Access database template is required to utilize SSURGO attribute data as delivered from the Soil Datamart. For assistance with this procedure please see the NRCS SSURGO page. Using this approach, most analysis of SSURGO must be done survey-by-survey within a GIS environment. For general instructions see this document.
Several online applications allow for simple interaction with the SSURGO database without the need for a GIS or RDBMS knowledge. Some examples include:
An Open Source Approach to SSURGO
We have developed an alternate approach to working with SSURGO data using PostGIS, a spatially-enabled version of the popular and open source relational database management system Postgresql, to store all spatial and tabular data for 138 survey areas. This approach facillitates rapid access, analysis, and aggregation of over half a million soil polygons. SQL (structured query language) is used to directly interact with both soil spatial and attribute data. If other forms of spatial data are imported into PostGIS (such as landcover, climatic data, etc.) nearly all spatial and attribute analysis can be done entirely from PostGIS. A series of examples illustrating common tasks will be presented in the following pages.
General Approach to Working with SSURGO (also outlined in this document) (more ideas on map unit composition)
- Identify the soil properties that are to be included in some analysis
- Decide on the appropriate form of aggregation to be used to summarize horizons
- depth weighted (i.e. to calculate average percent clay through horizons)
- top 1m (i.e. to summarize shrink-swell capacity of the topsoil)
- top horizon (i.e. to summarize surface organic carbon)
- profile sum (i.e. to calculate total water holding capacity)
- most limiting (i.e. to calculate the most limiting hydraulic conductivity within the soil profile)
- Aggregate horizon data
- Decide on the appropriate form of aggregation to be used to summarize components
- component percent weighted (i.e. this will include information from each component, weighted by their estimated percentage of the entire map unit)
- largest component (i.e. this usually result in the selection of the 'dominant' component, however when there are multuple components with the same estimated percentage ties will occur)
- major component flag (i.e. components flagged as a 'major component' by the NRCS represent dominant soil types within a map unit. note that there are sometimes multiple components flagged as 'major components'.)
- dominant condition (i.e. this is usually used for categorical data like hydric conditions. aggregation is performed by selecting the most frequent condition within a map unit)
- Aggregation of component data
- join the above aggregated data (2x aggregation process for horizon data) to the map unit polygons
- See the diagram at the bottom of this page for a graphical summary
SSURGO discontinuity example: Boundary between Glenn and Colusa counties, illustrating differences in soil survey vintage.
Notes on the Format of SSURGO
- Adjacent surveys may have been composed by different individuals, and may be of widely different vintages. Any given survey must comply with basic standards, but older surveys reflect a more generalized approach than more modern surveys. The figure to the right illustrates such differences.
- Polygons represent a repeating pattern of legend entries: groups of map-able soil concepts called map units
- Map unit data is stored in the mapunit table, and is referenced by the field mukey
- Pre-aggregated map unit data is stored in the muaggatt table, and is referenced by the field mukey
- Map units are comprised of multiple, unmapped soil types called 'components'
- Component data is stored in the component table, and is referenced by the field cokey
- Soil components (or soil type) are associated with multiple horizons
- Horizon data is stored in the chorizon table, and is referenced by the field cokey
- Since there is a 1:many:many (mapunit:component:horizon) relationship between spatial and horizon-level soil property data two aggregation steps are required in order to produce a thematic map
- This diagram illustrates the hierarchy of scales involved in soil survey information.
SSURGO Database Structure Diagram
Logistics: Getting Connected and Executing Queries
Building and Saving SQL code snippets
A simple text editor is the best environment for working on (and saving!) your SQL queries. Linux users are encouraged to use either 'Kate' or 'Kwrite'. Windows users should look into Notetab Light or SciTE. Mac users should check out TextWrangler.
Connecting to the database
In general, the simplest way to interact with our composite soil survey database is by connecting to the best with an SSH client. Mac/Linux users have this functionality built-in. Windows users will need to use something like Putty, or Xming. Once an ssh connection with the beast has been setup, you can connect to the database with the following:
psql -U soil ssurgo_combined
where psql is the postgresql client program, -U soil means connect as the user called "soil", and ssurgo_combined is the name of the actual database.
You can quit from the database shell using \q (followed by enter). Typing \? (followed by enter) will give a list of commands available in the psql shell.
Query structure and SQL
Numerous resources exisit for learning about SQL. I would recommend looking at the first couple chapters from the PostgreSQL book by Douglas and Douglas (the big purple book on the shelf). For an interactive learning approach SQL Zoo seems like a good start. In general most queries will have the format:
SELECT column_x, column_y, column_z
FROM table_x
WHERE column_x = 'something'
-- optional
GROUP BY column_x
ORDER BY column_x ; -- semi-colon denotes end of SQL statement
A simple query: column selection and filtering
Once connected it is possible to issue queries to the database. The results of a query are normally returned to the screen. For example, asking for the horizon boundaries and horizon names from the componenent identified with the given cokey (467038:646635) might look like this:
-- the query
SELECT cokey, hzname, hzdept_r, hzdepb_r
FROM chorizon
WHERE cokey = '467038:646635' ;
-- the results
cokey | hzname | hzdept_r | hzdepb_r
---------------+--------+----------+----------
467038:646635 | Ap | 0 | 18
467038:646635 | Bw | 18 | 41
467038:646635 | Bk1 | 41 | 69
467038:646635 | Bk2 | 69 | 109
467038:646635 | Bk3 | 109 | 145
467038:646635 | Bk4 | 145 | 183
A more complicated query: column calculation and aggregation
Compute the total water holding capacity for a given component, identified by (467038:646635). Note that several operations are being performed on the data:
- compute the available water holding capacity (in centemeters) for each horizon [green text]
- compute the profile depth (in centemeters) for each horizon [green text]
- aggregate the profile data by summing AHC and depth of each horizon [red text]
-- the query
select cokey, sum( (hzdepb_r - hzdept_r) * awc_r) as component_whc, sum( (hzdepb_r - hzdept_r) ) as depth
from chorizon
where cokey = '467038:646635'
group by cokey ;
-- the results
cokey | component_whc | depth
---------------+---------------+-------
467038:646635 | 27.91 | 183
Note that it is also possible to save the results of a query into a new table. This is the usual strategy when performing any analysis that returns geometry (soil polygons, etc.) as geometry data cannot be visualized in a text display. Tables created in this manner can be exported from the database for map creation or further analysis. Spatial tables can be viewed directly by applications like QGIS or Mapserver. A simple method of accessing PostGIS data is not yet possible with ArcGIS.
Learning by Example: Common Queries
Selection, Filtering and Sorting
- Selecting a summary of horizon thickness, sand, silt, clay and AWC for a given component (chorizon table).
SELECT hzname AS name, hzdepb_r - hzdept_r AS thickness, sandtotal_r AS sand, silttotal_r AS silt, claytotal_r AS clay, awc_r AS awc
FROM chorizon
WHERE cokey = '467038:646635'
ORDER BY hzdept_r;
Query Results
name | thickness | sand | silt | clay | awc
--------+-----------+------+------+------+------
Ap | 18 | 35 | 37 | 28 | 0.16
Bw | 23 | 35 | 40 | 25 | 0.15
Bk1 | 28 | 35 | 40 | 25 | 0.16
Bk2 | 40 | 35 | 40 | 25 | 0.16
Bk3 | 36 | 35 | 40 | 25 | 0.16
Bk4 | 38 | 60 | 25 | 15 | 0.13
- Select some common fields from the component table:
SELECT cokey, majcompflag, comppct_r, wei, weg, tfact
FROM component
WHERE mukey = '467038'
ORDER BY comppct_r DESC;
Query Results
cokey | majcompflag | comppct_r | wei | weg | tfact
---------------+-------------+-----------+-----+-----+-------
467038:646635 | Yes | 85 | 48 | 6 | 5
467038:646636 | No | 5 | | |
467038:646638 | No | 4 | | |
467038:646640 | No | 2 | | |
467038:646639 | No | 2 | | |
467038:646637 | No | 2 | | |
- Create a new table 'yolo_comp' using the component table, for a specific survey area (Yolo County), with components sorted by their component percentages.
CREATE TABLE yolo_comp AS
SELECT * FROM component
WHERE areasymbol = 'ca113'
ORDER BY mukey, comppct_r DESC ;
The resulting table can be copied to a CSV file (in the current working directory) like this: \copy yolo_comp TO 'yolo_comp.csv' CSV HEADER
If you are done with the table, remove it with the following SQL: DROP TABLE yolo_comp ;
Identifying the Largest Components
Overview:
Two methods for the selection of the largest components (based on the comppct_r column) within map units. This approach to selecting a single component per map unit is appropriate in situations where a single representative feature or property is sought. The partitioning of components within a map unit is closely related to the map unit type: complex, association, consociation or an "undifferentiated group". A breakdown of the number of components per each map unit type is summarized by the following query (26400 map units / 45971 major components):
SELECT mapunit.mukind, round(avg(n_components)) AS avg, min(n_components), max(n_components)
FROM mapunit JOIN
(
SELECT mapunit.mukey, count(component.mukey) AS n_components
FROM
mapunit JOIN component
ON mapunit.mukey = component.mukey
WHERE component.majcompflag = 'Yes' AND mapunit.mukind IS NOT NULL
GROUP BY mapunit.mukey
) AS a
ON a.mukey = mapunit.mukey
GROUP BY mapunit.mukind ;
Query Results
mukind | avg | min | max
------------------------+-----+-----+-----
Consociation | 1 | 1 | 3
Complex | 2 | 1 | 4
Association | 3 | 1 | 4
Undifferentiated group | 2 | 1 | 4
Method 1: Filtering component percentages with the DISTINCT keyword.
- Use of the SELECT DISTINCT ON operator. Note that the ordering is done before the DISTINCT filter is applied, resulting in the largest (by component percentage) component. Note that results from ties are unpredictable.
-
SELECT DISTINCT ON (mukey) mukey, cokey, comppct_r, compname, taxorder
FROM component
WHERE areasymbol = 'ca654' AND majcompflag = 'Yes'
ORDER BY mukey, comppct_r DESC ;
Query Results
mukey | cokey | comppct_r | compname | taxorder
---------+----------------+-----------+------------+-----------
1487066 | 1487066:638252 | 80 | Auberry | Alfisols
1487067 | 1487067:638274 | 80 | Auberry | Alfisols
1487068 | 1487068:638383 | 85 | Crouch | Mollisols
1487069 | 1487069:632318 | 85 | Cajon | Entisols
1487070 | 1487070:632372 | 85 | Excelsior | Entisols
1487071 | 1487071:632519 | 85 | Kimberlina | Entisols
1487072 | 1487072:632642 | 85 | Nord | Mollisols
1487076 | 1487076:632832 | 85 | Wasco | Entisols
1487077 | 1487077:632866 | 85 | Whitewolf | Entisols
464171 | 464171:640646 | 85 | Academy | Alfisols
- We can inspect possible ties or other sources of error by looking at the smallest components.
SELECT * FROM
(
SELECT DISTINCT ON (mukey) mukey, cokey, comppct_r, compname, taxorder
FROM component
WHERE areasymbol = 'ca654' AND majcompflag = 'Yes'
ORDER BY mukey, comppct_r DESC
) AS a
ORDER BY a.comppct_r ASC ;
Query Results
mukey | cokey | comppct_r | compname | taxorder
---------+----------------+-----------+---------------------+-------------
464432 | 464432:1380237 | 10 | Ramona | Alfisols
464184 | 464184:640686 | 30 | Ahwahnee | Alfisols
464211 | 464211:640756 | 30 | Auberry | Alfisols
464210 | 464210:640752 | 35 | Auberry | Alfisols
464529 | 464529:641600 | 35 | Vista | Inceptisols
464530 | 464530:641604 | 35 | Vista | Inceptisols
464531 | 464531:641609 | 35 | Rock outcrop |
Finding the largest component (based on comppct_r) per mukey
- Identify how many map units there are in a given region or survey area. In this case San Joaquin County; areasymbol = 'ca077'.
SELECT count(mukey) FROM mapunit WHERE areasymbol = 'ca113';
Query Results
count
-------
186
- Identify which map units contain components where a conventional, select largest component percentage type approach will not work. This commonly happens when there are several major components with the same comppct_r value, within the same map unit. Any returned mukey values need to be evaluated.
SELECT component.mukey, count(component.mukey)
FROM component
JOIN
(
SELECT mukey, max(comppct_r) AS comppct FROM component WHERE majcompflag = 'Yes' AND areasymbol = 'ca077' GROUP BY mukey
) AS a
ON component.comppct_r = a.comppct AND component.mukey = a.mukey
GROUP BY component.mukey
HAVING count(component.mukey) > 1 ;
Query Results
mukey | count
--------+-------
462043 | 2
462068 | 2
- Evaluation of the two "trouble" map units identified above. Note the use of the where mukey in ('462043', '462068') syntax.
SELECT mukey, cokey, comppct_r, compname, taxorder
FROM component
WHERE majcompflag = 'Yes' AND mukey IN ('462043', '462068')
ORDER BY mukey, comppct_r DESC ;
Query Results
mukey | cokey | comppct_r | compname | taxorder
--------+---------------+-----------+------------+-----------
462043 | 462043:634006 | 45 | Dumps |
462043 | 462043:634007 | 45 | Tailings |
462068 | 462068:634159 | 30 | Honker | Alfisols
462068 | 462068:634160 | 30 | Vallecitos | Alfisols
462068 | 462068:634161 | 25 | Gonzaga | Mollisols
Soil Texture Example
Premise
Compute a series of weighted-average soil texture fractions (sand, silt, clay), for every component, of every map unit in Yolo County. These values will be further weighted by the spatial distribution of each map unit.
CREATE TABLE yolo_wt_mean_texture AS
-- join with polygons, and compute areas weights
SELECT mapunit_poly.mukey,
sum(ST_Area(wkb_geometry)) / (SELECT ST_Area(wkb_geometry) FROM mapunit_bound_poly WHERE areasymbol = 'ca113') AS area_wt,
sand, silt, clay
FROM
mapunit_poly
JOIN
(
-- compute component percent weighted mean
SELECT mukey,
sum(comppct_r * sand) / sum(comppct_r) AS sand,
sum(comppct_r * silt) / sum(comppct_r) AS silt,
sum(comppct_r * clay) / sum(comppct_r) AS clay
FROM
component
JOIN
(
-- compute hz thickness weighted mean
SELECT cokey,
sum((hzdepb_r - hzdept_r) * sandtotal_r) / sum(hzdepb_r - hzdept_r) AS sand,
sum((hzdepb_r - hzdept_r) * silttotal_r) / sum(hzdepb_r - hzdept_r) AS silt,
sum((hzdepb_r - hzdept_r) * claytotal_r) / sum(hzdepb_r - hzdept_r) AS clay
FROM chorizon
WHERE sandtotal_r IS NOT NULL
AND silttotal_r IS NOT NULL
AND claytotal_r IS NOT NULL
AND areasymbol = 'ca113'
GROUP BY cokey
) AS co_agg
ON component.cokey = co_agg.cokey
GROUP BY component.mukey
) AS mu_agg
ON mapunit_poly.mukey = mu_agg.mukey
GROUP BY mapunit_poly.mukey, sand, silt, clay;
Simple Visualization with R
# dump the data:
# echo 'select * from yolo_wt_mean_texture' | psql -A -F "," -U xxx -h xxx dbname > temp/yolo_texture.csv
#
# remember to trim off the last line!
# load some libs:
library(plotrix)
# read in the data
x <- read.csv('yolo_texture.csv')
# simple soil texture, with symbol size weighted by area weight
soil.texture(x[,3:5], cex=sqrt(50*x$area_wt), pch=16, col.symbol=rgb(65,105,225, alpha=100, max=255),
show.lines=T, show.names=T, col.lines='black', col.names='black', main='Yolo County Soil Textures')
triax.points(cbind(weighted.mean(x$sand, x$area_wt), weighted.mean(x$silt, x$area_wt), weighted.mean(x$clay, x$area_wt)),
col.symbols='orange', pch=16, cex=2)
Yolo County Common Soil Textures: Blue symbols mark common soil textures and their relative size denotes abundance. The orange symbol marks the weighted average soil texture for all of Yolo County.
Identification of Dated Surfaces via Soil Series
East Side Alluvial Formations
Overview
A simple association between dated landforms and soil series name [1] was used to extract soil polygons from a composite soil survey database.
| Soil Series |
Associated Formation |
Approximate Age (1000 yrs ago) |
| Redding |
Laguna |
1600 - 2000 kya |
| Corning |
Laguna |
1600 - 2000 kya |
| Keyes |
Laguna |
1600 - 2000 kya |
| Whitney |
Turlock Lake |
500 - 700 kya |
| Montpellier |
Turlock Lake |
500 - 700 kya |
| Rocklin |
Turlock Lake |
500 - 700 kya |
| Snelling |
Riverbank |
100 - 300 kya |
| San Joaquin |
Riverbank |
100 - 300 kya |
| Exiter |
Riverbank |
100 - 300 kya |
| Madera |
Riverbank |
100 - 300 kya |
| Hanford |
Modesto |
10 - 40 kya |
| Grangeville |
Holocene |
< 10 kya |
-
Create a soil series - dated landform lookup table:
-- create a lookup table
CREATE TABLE dated_landforms (
soil_series varchar(20),
formation varchar(20),
approx_age varchar(30)
);
-- populate table
INSERT INTO dated_landforms VALUES ('Redding','Laguna','1600 - 2000 kya') ;
INSERT INTO dated_landforms VALUES ('Corning','Laguna','1600 - 2000 kya') ;
INSERT INTO dated_landforms VALUES ('Keyes','Laguna','1600 - 2000 kya') ;
INSERT INTO dated_landforms VALUES ('Whitney','Turlock Lake','500 - 700 kya') ;
INSERT INTO dated_landforms VALUES ('Montpellier','Turlock Lake','500 - 700 kya') ;
INSERT INTO dated_landforms VALUES ('Rocklin','Turlock Lake','500 - 700 kya') ;
INSERT INTO dated_landforms VALUES ('Snelling','Riverbank','100 - 300 kya') ;
INSERT INTO dated_landforms VALUES ('San Joaquin','Riverbank','100 - 300 kya') ;
INSERT INTO dated_landforms VALUES ('Exiter','Riverbank','100 - 300 kya') ;
INSERT INTO dated_landforms VALUES ('Madera','Riverbank','100 - 300 kya') ;
INSERT INTO dated_landforms VALUES ('Hanford','Modesto','10 - 40 kya') ;
INSERT INTO dated_landforms VALUES ('Grangeville','Holocene','< 10 kya') ;
-
Select map units (mukey) by suitible series concepts, associated with major components. Note that the DISTINCT ON (mukey) ... ORDER BY mukey, comppct_r DESC pattern can be used to select the largest component for each map unit. Pattern matching is used to safely join variants with the soil series names in our look-up table: i.e. 'San Joaquin variant' will match 'San Joaquin'.
-- keep only the largest formation (by total component percent) within each map unit key
SELECT DISTINCT ON (mukey) mukey, formation, sum(comppct_r) AS formation_pct
FROM component
JOIN
dated_landforms
-- use pattern matching to also include variants
ON compname ~~* lower(soil_series || '%') = 't'
-- subset to select survey areas on the east side of the valley
AND component.areasymbol IN ('ca654', 'ca651', 'ca649', 'ca644', 'ca648', 'ca077')
AND majcompflag = 'Yes'
-- combine components of the same formation
GROUP BY mukey, formation
-- use in conjunction with the DISTINCT statement to keep only the largest
ORDER BY mukey, formation_pct DESC;
-
Create a new classified table called east_side_all. This query involves 6 survey areas, 11423 polygons, and requires about 11 seconds to complete.
CREATE TABLE east_side_all AS
-- select geom column and the feature id
-- along with our formation names
SELECT wkb_geometry AS wkb_geometry, ogc_fid, a.*
FROM mapunit_poly
JOIN
(
SELECT DISTINCT ON (mukey) mukey, formation, sum(comppct_r) AS formation_pct
FROM component
JOIN
dated_landforms
ON compname ~~* lower(soil_series || '%') = 't'
AND component.areasymbol IN ('ca654', 'ca651', 'ca649', 'ca644', 'ca648', 'ca077')
AND majcompflag = 'Yes'
GROUP BY mukey, formation
ORDER BY mukey, formation_pct DESC
) AS a
ON mapunit_poly.mukey = a.mukey
-- limit polygon selection by survey area
AND mapunit_poly.areasymbol IN ('ca654', 'ca651', 'ca649', 'ca644', 'ca648', 'ca077') ;
-
Index, register geometry, and setup permissions
-- indexing
CREATE INDEX east_side_all_fid_idx ON east_side_all (ogc_fid) ;
CREATE INDEX east_side_all_gis_idx ON east_side_all USING gist (wkb_geometry gist_geometry_ops) ;
-- register geometry
INSERT INTO geometry_columns VALUES ('','public','east_side_all','wkb_geometry',2,9001,'POLYGON');
-- permissions
GRANT SELECT ON TABLE east_side_all TO soil ;
-
Tabulate acreage for each formation. (approximately 1/3 second to complete)
SELECT * FROM
(
SELECT round(sum(area(wkb_geometry)) * 0.000247) AS area_ac, formation
FROM east_side_all
GROUP BY formation
) AS a
ORDER BY a.area_ac DESC;
| area_ac |
formation |
| 295868 |
Riverbank |
| 253424 |
Modesto |
| 151149 |
Turlock Lake |
| 121085 |
Laguna |
| 96981 |
Holocene |
(5 rows)
-
Dump to local file in SHP format
ogr2ogr east_side_all.shp PG:"dbname=ssurgo_combined user=xxxx password=xxxx host=xxxx" east_side_all
References:
-
Smith, D.W. & Verrill, W.L. Witham, C.; Bauder, E.; Belk, D.; Ferren Jr., W. & Ornduff, R. (ed.) Vernal Pool-Soil-Landform Relationships in the Central Valley, California 1998, 15-23
Seasonally Wet Soils and Shrink-Swell Potential
Example 1 map
Example 2 map
Example 1: The location of seasonaly wet soils via two methods: hydricrating and USDA Soil Taxonomy interpretation.
-- optionally link polygons here...
-- compute the percent of each map unit that contains components that may be seasonally wet
SELECT mukey, wet_flag, sum(comppct_r) AS pct_mu
FROM
(
SELECT mukey, cokey, comppct_r,
-- slightly less restrictive than hydricrating alone
CASE WHEN ((taxclname ~~* '%aqu%') = 't') OR (hydricrating = 'Yes') then 1 ELSE 0 END AS wet_flag
FROM component
-- subset to Yolo County
WHERE areasymbol = 'ca113'
) AS yolo_wet_components
-- only keep map unit with some component that meets the criteria
WHERE wet_flag = 1
-- aggregate by map unit, wet_flag
GROUP BY yolo_wet_components.mukey, yolo_wet_components.wet_flag
ORDER BY mukey;
Example 1.1 Extract a list of wet components, and sum area based on component (series) name
-- compute estimated, total acreage of seasonally wet map units in Yolo Co.
SELECT mapunit.musym, mapunit.mukey, mapunit.muname,
round(sum(ST_Area(wkb_geometry) * (yolo_mu_data.pct_mu::numeric/100.0) * 0.000247)::numeric, 2) AS wet_area_ac
FROM
mapunit_poly
JOIN
(
-- compute the percent of each map unit that contains components that may be seasonally wet
SELECT mukey, wet_flag, sum(comppct_r) AS pct_mu
FROM
(
SELECT mukey, cokey, comppct_r,
-- slightly less restrictive than hydricrating alone
CASE WHEN ((taxclname ~~* '%aqu%') = 't') OR (hydricrating = 'Yes') then 1 ELSE 0 END AS wet_flag
FROM component
-- subset to Yolo County
WHERE areasymbol = 'ca113'
) AS yolo_wet_components
-- only keep map unit with some component that meets the criteria
WHERE wet_flag = 1
-- aggregate by map unit, wet_flag
GROUP BY yolo_wet_components.mukey, yolo_wet_components.wet_flag
) AS yolo_mu_data
ON mapunit_poly.mukey = yolo_mu_data.mukey
JOIN mapunit
ON mapunit.mukey = yolo_mu_data.mukey
-- aggregate by map unit
GROUP BY mapunit.mukey, mapunit.muname, mapunit.musym
ORDER BY wet_area_ac DESC;
musym | mukey | muname | wet_area_ac
-------+--------+----------------------------------------------------------------+-------------
Sc | 459268 | Sacramento clay | 35710.90
Mf | 459244 | Marvin silty clay loam | 18877.18
Sg | 459272 | Sacramento soils, flooded | 12273.51
Cn | 459219 | Clear Lake soils, flooded | 11665.84
Cc | 459216 | Capay soils, flooded | 11029.83
Sv | 459288 | Sycamore complex, drained | 8410.41
St | 459286 | Sycamore silty clay loam, drained | 6900.77
Ck | 459218 | Clear Lake clay | 6737.88
Sa | 459266 | Sacramento silty clay loam | 6014.48
Sp | 459283 | Sycamore silt loam, drained | 5537.18
Sw | 459289 | Sycamore complex, flooded | 5041.96
Wb | 459301 | Willows clay | 4950.34
Ss | 459285 | Sycamore silty clay loam | 4859.59
Pb | 459254 | Pescadero silty clay, saline-alkali | 4700.77
So | 459282 | Sycamore silt loam | 3953.42
Rh | 459262 | Riverwash | 3698.88
Pc | 459255 | Pescadero soils, flooded | 3589.01
Tb | 459292 | Tyndall very fine sandy loam | 3299.03
Ob | 459252 | Omni silty clay | 3174.44
[...]
Example 2: Classify soils according to shrink-swell capacity of the top 1 meter of soil, weighted by horizon thickness and component percent.
-- set a lower boundary for the query
\SET lwr_bdy 100
-- add a class label, based on NRCS guidelines
SELECT mapunit.musym, mapunit.muname, mapunit.muacres,
round(mu_wt_lep::numeric, 2) AS lep,
CASE WHEN mu_wt_lep < 3 THEN 'Low'
WHEN mu_wt_lep >= 3 AND mu_wt_lep < 6 THEN 'Moderate'
WHEN mu_wt_lep >= 6 AND mu_wt_lep < 9 THEN 'High'
WHEN mu_wt_lep >= 9 THEN 'Very High'
END AS lep_class
FROM
(
-- compute map unit lep, weighted by component percent, to set depth
SELECT component.mukey,
sum(component.comppct_r * co_wt_mean_lep) / sum(component.comppct_r) AS mu_wt_lep
FROM
(
-- compute a horizon-thickness weighted mean lep to a set depth
SELECT cokey, sum(thick * lep_r) / sum(thick) AS co_wt_mean_lep
FROM
(
-- compute horizon thickness, but only to a set depth
SELECT cokey, hzdept_r, hzdepb_r, lep_r,
CASE WHEN hzdepb_r > :lwr_bdy THEN (:lwr_bdy - hzdept_r)
ELSE (hzdepb_r - hzdept_r) END AS thick
FROM chorizon
WHERE areasymbol = 'ca113'
AND lep_r IS NOT NULL
AND hzdept_r <= :lwr_bdy
) AS hz_lep
GROUP BY cokey
) AS co_lep
JOIN component
ON co_lep.cokey = component.cokey
GROUP BY mukey
) AS mu_lep
JOIN mapunit
ON mu_lep.mukey = mapunit.mukey
ORDER BY muacres DESC, lep DESC;
musym | muname | muacres | lep | lep_class
-------+------------------------------------------------------------------+---------+------+-----------
Ya | Yolo silt loam | 39698 | 2.52 | Low
Sc | Sacramento clay | 34886 | 7.50 | High
Ca | Capay silty clay | 33465 | 7.50 | High
MrG2 | Millsholm rocky loam, 15 to 75 percent slopes, eroded | 30118 | 1.50 | Low
Rg | Rincon silty clay loam | 24580 | 6.36 | High
BrA | Brentwood silty clay loam, 0 to 2 percent slopes | 23045 | 7.50 | High
CtD2 | Corning gravelly loam, 2 to 15 percent slopes, eroded | 22080 | 5.34 | Moderate
Mf | Marvin silty clay loam | 20970 | 6.60 | High
DaF2 | Dibble clay loam, 30 to 50 percent slopes, eroded | 18612 | 7.11 | High
SmE2 | Sehorn-Balcom complex, 15 to 30 percent slopes, eroded | 17794 | 6.17 | High
TaA | Tehama loam, 0 to 2 percent slopes | 16622 | 3.75 | Moderate
BdF2 | Balcom-Dibble complex, 30 to 50 percent slopes, eroded | 16405 | 5.73 | Moderate
SmD | Sehorn-Balcom complex, 2 to 15 percent slopes | 16117 | 6.50 | High
BaF2 | Balcom silty clay loam, 30 to 50 percent slopes, eroded | 12637 | 4.50 | Moderate
Sg | Sacramento soils, flooded | 12258 | 6.27 | High
Cn | Clear Lake soils, flooded | 11666 | 6.92 | High
SmF2 | Sehorn-Balcom complex, 30 to 50 percent slopes, eroded | 11226 | 6.33 | High
Cc | Capay soils, flooded | 11030 | 7.50 | High
Sv | Sycamore complex, drained | 9241 | 4.18 | Moderate
Ms | Myers clay | 8938 | 7.50 | High
PfF2 | Positas gravelly loam, 30 to 50 percent slopes, eroded | 7920 | 5.34 | Moderate
St | Sycamore silty clay loam, drained | 7839 | 4.50 | Moderate
Ck | Clear Lake clay | 6946 | 7.50 | High
Ra | Reiff very fine sandy loam | 6847 | 1.50 | Low
[...]