Identifying the Largest Components

Overview:
Two methods for the selection of the largest components (based on the comppct_r column) within map units. This approach to selecting a single component per map unit is appropriate in situations where a single representative feature or property is sought. The partitioning of components within a map unit is closely related to the map unit type: complex, association, consociation or an "undifferentiated group". A breakdown of the number of components per each map unit type is summarized by the following query (26400 map units / 45971 major components):

SELECT mapunit.mukind, round(avg(n_components)) AS avg, min(n_components), max(n_components)
FROM mapunit JOIN
        (
        SELECT mapunit.mukey, count(component.mukey) AS n_components
        FROM
        mapunit JOIN component
        ON mapunit.mukey = component.mukey
        WHERE component.majcompflag = 'Yes' AND mapunit.mukind IS NOT NULL
        GROUP BY mapunit.mukey
        ) AS a
ON a.mukey = mapunit.mukey
GROUP BY mapunit.mukind ;
Query Results
mukind | avg | min | max
------------------------+-----+-----+-----
Consociation | 1 | 1 | 3
Complex | 2 | 1 | 4
Association | 3 | 1 | 4
Undifferentiated group | 2 | 1 | 4

 
Method 1: Filtering component percentages with the DISTINCT keyword.

1. Use of the SELECT DISTINCT ON operator. Note that the ordering is done before the DISTINCT filter is applied, resulting in the largest (by component percentage) component. Note that results from ties are unpredictable.

SELECT DISTINCT ON (mukey) mukey, cokey, comppct_r, compname, taxorder
FROM component
WHERE areasymbol = 'ca654' AND majcompflag = 'Yes'
ORDER BY mukey, comppct_r DESC ;
Query Results
mukey | cokey | comppct_r | compname | taxorder
---------+----------------+-----------+------------+-----------
1487066 | 1487066:638252 | 80 | Auberry | Alfisols
1487067 | 1487067:638274 | 80 | Auberry | Alfisols
1487068 | 1487068:638383 | 85 | Crouch | Mollisols
1487069 | 1487069:632318 | 85 | Cajon | Entisols
1487070 | 1487070:632372 | 85 | Excelsior | Entisols
1487071 | 1487071:632519 | 85 | Kimberlina | Entisols
1487072 | 1487072:632642 | 85 | Nord | Mollisols
1487076 | 1487076:632832 | 85 | Wasco | Entisols
1487077 | 1487077:632866 | 85 | Whitewolf | Entisols
464171 | 464171:640646 | 85 | Academy | Alfisols

We can inspect possible ties or other sources of error by looking at the smallest components.

SELECT * FROM
        (
        SELECT DISTINCT ON (mukey) mukey, cokey, comppct_r, compname, taxorder
        FROM component
        WHERE areasymbol = 'ca654' AND majcompflag = 'Yes'
        ORDER BY mukey, comppct_r DESC
        ) AS a
ORDER BY a.comppct_r ASC ;
Query Results
mukey | cokey | comppct_r | compname | taxorder
---------+----------------+-----------+---------------------+-------------
464432 | 464432:1380237 | 10 | Ramona | Alfisols
464184 | 464184:640686 | 30 | Ahwahnee | Alfisols
464211 | 464211:640756 | 30 | Auberry | Alfisols
464210 | 464210:640752 | 35 | Auberry | Alfisols
464529 | 464529:641600 | 35 | Vista | Inceptisols
464530 | 464530:641604 | 35 | Vista | Inceptisols
464531 | 464531:641609 | 35 | Rock outcrop |

 
Finding the largest component (based on comppct_r) per mukey

1. Identify how many map units there are in a given region or survey area. In this case San Joaquin County; areasymbol = 'ca077'.
SELECT count(mukey) FROM mapunit WHERE areasymbol = 'ca113';

Query Results
count
-------
186

2. Identify which map units contain components where a conventional, select largest component percentage type approach will not work. This commonly happens when there are several major components with the same comppct_r value, within the same map unit. Any returned mukey values need to be evaluated.

SELECT component.mukey, count(component.mukey)
FROM component
        JOIN
        (
        SELECT mukey, max(comppct_r) AS comppct FROM component WHERE majcompflag = 'Yes' AND areasymbol = 'ca077' GROUP BY mukey
        ) AS a
ON component.comppct_r = a.comppct AND component.mukey = a.mukey
GROUP BY component.mukey
HAVING count(component.mukey) > 1 ;
Query Results
mukey | count
--------+-------
462043 | 2
462068 | 2

3. Evaluation of the two "trouble" map units identified above. Note the use of the where mukey in ('462043', '462068') syntax.

SELECT mukey, cokey, comppct_r, compname, taxorder
FROM component
WHERE majcompflag = 'Yes' AND mukey IN ('462043', '462068')
ORDER BY mukey, comppct_r DESC ;

Query Results
mukey | cokey | comppct_r | compname | taxorder
--------+---------------+-----------+------------+-----------
462043 | 462043:634006 | 45 | Dumps |
462043 | 462043:634007 | 45 | Tailings |
462068 | 462068:634159 | 30 | Honker | Alfisols
462068 | 462068:634160 | 30 | Vallecitos | Alfisols
462068 | 462068:634161 | 25 | Gonzaga | Mollisols