Thanks, and another approach

I appreciate your quick response and useful fix.

Things do seem to be changing at the USGS service. The first few times I used it, the XML response came back in a form that needed cleanup of &lt; etc. The responses I'm getting now are clean, nicely formatted XML. That is, instead of &lt;Elevation&gt;, I now get <Elevation>.

Last night, the service took a few minutes to respond to some of my requests, although at other times the responses were almost instantaneous. I don't know if the delay was at their end or my end or somewhere in between. If such delays are going to be frequent, it might be wise to use threads and time out with an error message, but I have no idea how to do that.
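One way to guard against slow responses without threads might be to pass a timeout to the request itself. The sketch below is written for Python 3 (the code in this post is Python 2; in Python 2.6 and later, urllib2.urlopen accepts the same timeout argument). The name fetch_text and its opener argument are hypothetical, and the opener exists only so the function can be tried without network access.

```python
import socket
import sys
import urllib.error
import urllib.request


def fetch_text(url, timeout_seconds=10, opener=urllib.request.urlopen):
    """Return the response body as text, or None on failure or timeout.

    fetch_text and its opener argument are hypothetical names; opener
    exists only so a stand-in can be passed when trying this offline.
    """
    try:
        response = opener(url, timeout=timeout_seconds)
        return response.read().decode("utf-8", errors="replace")
    except (socket.timeout, urllib.error.URLError) as err:
        # A connect timeout usually arrives wrapped in URLError;
        # a timeout during read() can surface as socket.timeout.
        sys.stderr.write("ERROR: request failed or timed out: %s\n" % err)
        return None
```

A caller would check for None and decide whether to skip the row, retry, or stop.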

I experimented a bit and came up with a simpler approach. I moved the get_elevation portion to a separate module to make it easier to call from other modules, leaving the original program as:

<code># nice CLI interface
import sys
from optparse import OptionParser

# url access
import urllib

# CSV file parsing
import csv

import usgs_elev

#process command line (optparse)
parser = OptionParser()
parser.add_option("-f", "--file", dest="infile", help="input csv file containing WGS84 (lon,lat,site_id) record", metavar="FILE")

# process args
(options, args) = parser.parse_args()

#require an input file
if not options.infile:
        print "ERROR: must supply an input file!"
        sys.exit(1)

#open input file
infile = options.infile
try:
        reader = csv.reader(open(infile, "rb"))
except IOError:
        print "ERROR: Cannot open: " + infile
        sys.exit(1)


# read the csv file
for row in reader:
        lon = float(row[0])
        lat = float(row[1])
        site_id = row[2]
       
        elev, data_source = usgs_elev.get_elev_and_source(lat, lon, elev_units='meters')
       
        # print to stdout
        print "%f,%f,%s,%.3f,%s" % (lon, lat, site_id, elev, data_source)
</code>

The get_elevation function is in usgs_elev.py:

<code># url access
import urllib

def get_between(text, left_text, right_text):
    """Returns the part of text that is between left_text and right_text.

    Example:
        get_between("aabbbccc", "aa", "ccc")
        returns 'bbb'
    """
    left_index = text.find(left_text)
    if left_index < 0:
        return None

    # search for right_text only after the end of left_text
    start = left_index + len(left_text)
    right_index = text.find(right_text, start)
    if right_index < 0:
        return None

    return text[start:right_index]

def _get_elev(lat, lon, elev_units='METERS', return_source = False):
    """Returns elevation of a point given its latitude and longitude.
   
    Other choice for elev_units is 'FEET'
    Uses the USGS Elevation Query web service.
    Vertical accuracy (as of August 2008) is 7 - 15 meters
    """
   
    # the URL of the web service, with placeholders to fill in
    url= 'http://gisdata.usgs.gov/xmlwebservices2/elevation_service.asmx/getElevation?X_Value=%s&Y_Value=%s&Elevation_Units=%s&Source_Layer=-1&Elevation_Only=1'
   
    # make some fake headers, with a user-agent that will
    # not be rejected by bone-headed servers
#    BC: not needed for my system - may be needed elsewhere
#    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
#    headers = {'User-Agent' : user_agent}
   
    # make the URL into a qualified GET statement:
    get_url = url % (lon, lat, elev_units)
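    # uncomment the line below to see the request URL
    # (left enabled, it would be mixed into the CSV written to stdout)
    # print get_url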
   
    # make the request:
    response = urllib.urlopen(get_url)
    resp_text = response.read()

    # uncomment the line below to see USGS's response
    # print resp_text  
   
    elev = float(get_between(resp_text, "<Elevation>", "</Elevation>"))
    if return_source:
        source = get_between(resp_text, "<Data_Source>", "</Data_Source>")
        return (elev, source)
    else:
        return elev
     
def get_elev(lat, lon, elev_units='METERS'):
    return _get_elev(lat, lon, elev_units, return_source = False)  

def get_elev_and_source(lat, lon, elev_units='METERS'):
    return _get_elev(lat, lon, elev_units, return_source = True)
</code>

I used '%' formatting to create the request rather than urlencode. Since we're only inserting numbers into those slots, they shouldn't need any special formatting.
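For comparison, here is roughly what a urlencode version might look like. This is a sketch for Python 3 (where urlencode lives in urllib.parse); build_query_url and BASE_URL are hypothetical names, and the parameter names are copied from the URL template in usgs_elev.py.

```python
from urllib.parse import urlencode

# Endpoint taken from the template in usgs_elev.py.
BASE_URL = ("http://gisdata.usgs.gov/xmlwebservices2/"
            "elevation_service.asmx/getElevation")


def build_query_url(lat, lon, elev_units="METERS"):
    """Build the GET URL with urlencode instead of '%' formatting.

    build_query_url is a hypothetical helper, not part of the post's code.
    """
    # A list of pairs keeps the parameters in a fixed order.
    params = [
        ("X_Value", lon),
        ("Y_Value", lat),
        ("Elevation_Units", elev_units),
        ("Source_Layer", -1),
        ("Elevation_Only", 1),
    ]
    return BASE_URL + "?" + urlencode(params)
```

For plain numbers and a units string like 'METERS', this produces the same URL as the '%' template. urlencode would only start to matter if a value contained spaces, '&', or other reserved characters.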

I got the results with a brute force string extraction. I would guess that parsing the XML through minidom is slower and generates more objects that need to be garbage-collected. It probably doesn't matter, since the slowest part of the program is the USGS response. But XML parsing seemed like an unnecessary complication in this case.
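For anyone who wants to compare, a minidom version of the extraction might look something like the sketch below (Python 3). The SAMPLE_RESPONSE layout and its values are invented for illustration, since I am not reproducing an actual USGS response here; only the tag names Elevation and Data_Source come from the code above. first_tag_text is a hypothetical helper name.

```python
from xml.dom import minidom

# Invented sample: the root element and values are assumptions,
# not a captured USGS response.
SAMPLE_RESPONSE = (
    "<Elevation_Query>"
    "<Data_Source>NED</Data_Source>"
    "<Elevation>1609.3</Elevation>"
    "</Elevation_Query>"
)


def first_tag_text(xml_text, tag_name):
    """Return the text inside the first <tag_name> element, or None."""
    doc = minidom.parseString(xml_text)
    nodes = doc.getElementsByTagName(tag_name)
    # No such element, or an empty element: nothing to return.
    if not nodes or nodes[0].firstChild is None:
        return None
    return nodes[0].firstChild.data


elev = float(first_tag_text(SAMPLE_RESPONSE, "Elevation"))
source = first_tag_text(SAMPLE_RESPONSE, "Data_Source")
```

Unlike the string search, this raises an error on malformed XML instead of returning a wrong slice without complaint, which may or may not be worth the extra parsing.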

I'm new to Python and very new to Web services, so please excuse any blunders I made or shortcuts I shouldn't have taken.

I made the GET request through urllib.urlopen() rather than urllib2. I don't know what difference that might make.

I'm using a USGS service at a different URL than the one you used:
http://gisdata.usgs.gov/xmlwebservices2/elevation_service.asmx?op=getElevation
Entering that address in your browser puts you on a page that allows you to hand-enter the parameters and gives information on various ways to access the service.

It seems to provide the same responses. I don't know how it differs from the TNM service your program uses. I don't even remember how I stumbled across the alternate URL.
