Simple Python Interface to USGS TNM Elevation Service

Submitted by dylan on Tue, 2008-04-29 06:31.

 
Premise
Wanted a simpler way to access the USGS seamless elevation look-up service. Python seemed like a logical start. Note that the response from the USGS webservice is not correctly identified as valid XML by the python XML-parser. Therefore there is a small amount of scrubbing used to coerce the response into valid XML. Comments on why this is, or is not, a good idea are welcome. Here is a link to an interactive version.
 
Update It looks like the USGS service does not accept POST-style requests. I have made some small changes to the script below.

 
Example Listing

# nice CLI interface
import sys
from optparse import OptionParser

# url access
import urllib, urllib2

# simple XML parsing
from xml.dom import minidom

# CSV file parsing
import csv


# define service URL
base_url = 'http://gisdata.usgs.gov/XMLWebServices/TNM_Elevation_Service.asmx/getElevation?X_Value=%f&Y_Value=%f&Elevation_Units=meters&Source_Layer=-1&Elevation_Only=1'


#process command line (optparse)
parser = OptionParser()
parser.add_option("-f", "--file", dest="infile", help="input csv file containing WGS84 (lon,lat,site_id) record", metavar="FILE")


# process args
(options, args) = parser.parse_args()

#require an input file
if not options.infile:
        print "ERROR: must supply an input file!"
        sys.exit(1)


#open input file
try:
        infile = options.infile
        reader = csv.reader(open(infile, "rb"))
except:
        print "ERROR: Cannot open: " + infile
        sys.exit(1)


# read the csv file
for row in reader:
        lon = float(row[0])
        lat = float(row[1])
        site_id = row[2]
       
        # the base URL of the web service
        url = 'http://gisdata.usgs.gov/XMLWebServices/TNM_Elevation_Service.asmx/getElevation'
       
        # url GET args
        values = {'X_Value' : lon,
        'Y_Value' : lat,
        'Elevation_Units' : 'meters',
        'Source_Layer' : '-1',
        'Elevation_Only' : '1', }
       
        # make some fake headers, with a user-agent that will
        # not be rejected by bone-headed servers
        user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
        headers = {'User-Agent' : user_agent}
       
        # encode the GET arguments
        data = urllib.urlencode(values)
       
        # make the URL into a qualified GET statement:
        get_url = url + '?' + data
       
        # make the request: note that by ommitting the url arguments
        # we force a GET request, instead of a POST
        req = urllib2.Request(url=get_url, headers=headers)
        response = urllib2.urlopen(req)
        the_page = response.read()
       
        # convert the HTML back into plain XML
        for entity, char in (('lt', '<'), ('gt', '>'), ('amp', '&')):
                the_page = the_page.replace('&%s;' % entity, char)

        # clean some cruft... XML won't parse with this stuff in there...
        the_page = the_page.replace('<string xmlns="http://gisdata.usgs.gov/XMLWebServices/">', '')
        the_page = the_page.replace('<?xml version="1.0" encoding="utf-8"?>\r\n', '')
        the_page = the_page.replace('</string>', '')
        the_page = the_page.replace('<!-- Elevation Values of -1.79769313486231E+308 (Negative Exponential Value) may mean the data source does not have values at that point.  --> <USGS_Elevation_Web_Service_Query>', '')
       
        # parse the cleaned XML
        dom = minidom.parseString(the_page)
        children = dom.getElementsByTagName('Elevation_Query')[0]
       
        # extract the interesting parts
        elev = float(children.getElementsByTagName('Elevation')[0].firstChild.data)
        data_source = children.getElementsByTagName('Data_Source')[0].firstChild.data
       
        # print to stdout
        print "%f,%f,%s,%f,%s" % (lon, lat, site_id, elev, data_source)

 
Incantation

python elev_lookup.py -f ca_cities.txt

 
Input File CA cities in WGS84 (lon,lat,id) records

-122.32,37.78,Alameda NAS
-120.53,41.48,Alturas
-124.1,40.98,Arcata
-119.05,35.43,Bakersfield
-121.45,39.13,Beale AFB
-116.95,33.93,Beaumont
-116.62,35.28,Bicycle Lk
-116.68,34.27,Big Bear Apt
-118.6,37.60,Bishop
-120.7,39.28,Blue Canyon
-114.72,33.62,Blythe
-118.37,34.20,Burbank

 
Output

-122.320000,37.780000,Alameda NAS,2.123548,NED Contiguous U. S. 1/3W arc second elevation data
-120.530000,41.480000,Alturas,1331.056396,NED Contiguous U. S. 1/3W arc second elevation data
-124.100000,40.980000,Arcata,63.749836,NED Contiguous U. S. 1/3W arc second elevation data
-119.050000,35.430000,Bakersfield,148.473618,NED Contiguous U. S. 1/3W arc second elevation data
-121.450000,39.130000,Beale AFB,29.950783,NED Contiguous U. S. 1/3W arc second elevation data
-116.950000,33.930000,Beaumont,792.840881,NED Contiguous U. S. 1/3W arc second elevation data
-116.620000,35.280000,Bicycle Lk,711.765869,NED Contiguous U. S. 1/3W arc second elevation data
-116.680000,34.270000,Big Bear Apt,1697.037720,NED Contiguous U. S. 1/3W arc second elevation data
-118.600000,37.600000,Bishop,2209.536865,NED Contiguous U. S. 1/3W arc second elevation data
-120.700000,39.280000,Blue Canyon,1603.887573,NED Contiguous U. S. 1/3W arc second elevation data
-114.720000,33.620000,Blythe,120.626007,NED Contiguous U. S. 1/3W arc second elevation data
-118.370000,34.200000,Burbank,224.287033,NED Contiguous U. S. 1/3W arc second elevation data
-117.350000,33.300000,Camp Pendlet,21.895588,NED Contiguous U. S. 1/3W arc second elevation data
AttachmentSize
ca_cities.txt2.8 KB

Not working for me

This module seems like it should do exactly what I am looking for, but when I run it, I get the following error messages:


Traceback (most recent call last):
File "D:\bc\python\gdalext\src\get_elev.py", line 64, in
response = urllib2.urlopen(req)
File "D:\Python25\Lib\urllib2.py", line 121, in urlopen
return _opener.open(url, data)
File "D:\Python25\Lib\urllib2.py", line 380, in open
response = meth(req, response)
File "D:\Python25\Lib\urllib2.py", line 491, in http_response
'http', request, response, code, msg, hdrs)
File "D:\Python25\Lib\urllib2.py", line 412, in error
result = self._call_chain(*args)
File "D:\Python25\Lib\urllib2.py", line 353, in _call_chain
result = func(*args)
File "D:\Python25\Lib\urllib2.py", line 575, in http_error_302
return self.parent.open(new)
File "D:\Python25\Lib\urllib2.py", line 380, in open
response = meth(req, response)
File "D:\Python25\Lib\urllib2.py", line 491, in http_response
'http', request, response, code, msg, hdrs)
File "D:\Python25\Lib\urllib2.py", line 418, in error
return self._call_chain(*args)
File "D:\Python25\Lib\urllib2.py", line 353, in _call_chain
result = func(*args)
File "D:\Python25\Lib\urllib2.py", line 499, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 500: Internal Server Error

Do I need to configure my Python 2.5 installation to make this request?

I am running Windows XP, if that makes a difference.

small fix

Thanks for the note. It looks like something changed either with the USGS service (no longer accepts POST method), or the default behavior of urlib2. I have made changes to the script that result in the original behavior, specifically forcing urlib2 to send a GET request:

# see the code above for the complete set of commands
get_url = url + '?' + data
req = urllib2.Request(url=get_url, headers=headers)

Thanks, and another approach

I appreciate your quick response and useful fix.

Things do seem to be changing at the USGS service. The first few times I used it, the XML response was in the form that needed cleanup of &< etc. The responses I'm getting now are clean, nicely-formatted XML. That is, instead of &<Elevation&>

Last night, the service took a few minutes to respond to some of my requests, rather than almost instantaneous responses at other times. I don't know if the delay was at their end or my end or somewhere in between. If such delays are going to be frequent, it might be wise to use threads and time out with an error message, but I have no idea how to do that.

I experimented a bit and came up with a simpler approach. I moved the get_elevation portion to a separate module to make it easier to call from other modules, leaving the original program as:

# nice CLI interface
import sys
from optparse import OptionParser

# url access
import urllib

# CSV file parsing
import csv

import usgs_elev

#process command line (optparse)
parser = OptionParser()
parser.add_option("-f", "--file", dest="infile", help="input csv file containing WGS84 (lon,lat,site_id) record", metavar="FILE")

# process args
(options, args) = parser.parse_args()

#require an input file
if not options.infile:
        print "ERROR: must supply an input file!"
        sys.exit(1)

#open input file
try:
        infile = options.infile
        reader = csv.reader(open(infile, "rb"))
except:
        print "ERROR: Cannot open: " + infile
        sys.exit(1)


# read the csv file
for row in reader:
        lon = float(row[0])
        lat = float(row[1])
        site_id = row[2]
       
        elev, data_source = usgs_elev.get_elev_and_source(lat, lon, elev_units='meters')
       
        # print to stdout
        print "%f,%f,%s,%.3f,%s" % (lon, lat, site_id, elev, data_source)
</code>

The get_elevation function is in usgs_elev.py:

<code># url access
import urllib

def get_between(str, left_text, right_text):
    """Returns the text in str that is between left_text and right_text.
   
    Example:
        get_between("
aabbbccc", "aa", "ccc")
        returns 'bbb'
    "
""
    left_index = str.find(left_text)
    if left_index < 0:
            return None
   
    right_index = str.find(right_text)
    if (right_index < 0):
            return None

    return str[left_index + len(left_text) : right_index]

def _get_elev(lat, lon, elev_units='METERS', return_source = False):
    """Returns elevation of a point given its latitude and longitude.
   
    Other choice for elev_units is 'FEET'
    Uses the USGS Elevation Query web service.
    Vertical accuracy (as of August 2008) is 7 - 15 meters"
""
   
    # the URL of the web service, with placeholders to fill in
    url= 'http://gisdata.usgs.gov/xmlwebservices2/elevation_service.asmx/getElevation?X_Value=%s&Y_Value=%s&Elevation_Units=%s&Source_Layer=-1&Elevation_Only=1'
   
    # make some fake headers, with a user-agent that will
    # not be rejected by bone-headed servers
#    BC: not needed for my system - may be needed elsewhere
#    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
#    headers = {'User-Agent' : user_agent}
   
    # make the URL into a qualified GET statement:
    get_url = url % (lon, lat, elev_units)
    print get_url
   
    # make the request:
    response = urllib.urlopen(get_url)
    resp_text = response.read()

    # uncomment the line below to see USGS's response
    # print resp_text  
   
    elev = float(get_between(resp_text, "<Elevation>", "</Elevation>"))
    if return_source:
        source = get_between(resp_text, "<Data_Source>", "</Data_Source>")
        return (elev, source)
    else:
        return elev
     
def get_elev(lat, lon, elev_units='METERS'):
    return _get_elev(lat, lon, elev_units, return_source = False)  

def get_elev_and_source(lat, lon, elev_units='METERS'):
    return _get_elev(lat, lon, elev_units, return_source = True)  

I used '%' formatting to create the request rather than urlencode. Since we're only inserting numbers into those slots, they shouldn't need any special formatting.

I got the results with a brute force string extraction. It would guess that parsing the XML through minidom is slower and generates more objects that need to be garbage-collected. It probably doesn't matter, since the slowest part of the program is the USGS response. But XML parsing seemed like an unnecessary complication in this case.

I'm new to Python and very new to Web services, so please excuse any blunders I made or shortcuts I shouldn't have taken.

I made the GET request through urllib.urlopen() rather than urllib2. I don't know what difference that might make.

I'm using a USGS service at a different URL than the one you used:
http://gisdata.usgs.gov/xmlwebservices2/elevation_service.asmx?op=getElevation
Entering that address in your browser puts you on a page that allows you to hand-enter the parameters and gives information on various ways to access the service.

It seems to provide the same responses. I don't know how it differs from the TNM service your program uses. I don't even remember how I stumbled across the alternate URL.

bad formatting

The Python code I sent is not properly formatted, and will not run as listed.

I put markers around it, but that didn't work. Any suggestions on how to make it format correctly?

formatting

I am not sure if advanced code formatting options are available to anon. posters. I have added the formatting markup: print '%s' % ('hello') (<blockcode type="python">...</blockcode>). I agree that the XML parsing stuff is heavy-handed for such a simple response. I'll try out your approach and post back with some updates. Thanks!

hi, i have a use for this

hi, i have a use for this right now. thanks.
are you sure your try/except is doing anything?

clarification

Thanks for the tip, I moved the relevant file opening code into the try block, as it was originally intended.