Section author: Mike Fitzpatrick <mike.fitzpatrick@noirlab.edu>
3.1. Exception Handling in Data Lab Services
Note
This document is currently in a DRAFT stage.
This document will describe the recommended handling of errors and exceptions in the Python-based Data Lab middleware (i.e. the various ‘manager’ services).
The Java-based servlets (i.e. VOSpace and DALServer) implement IVOA protocols that describe the required exception handling for those protocols. These are not covered in detail, except to describe when and how those protocol exceptions should be handled by the Data Lab server and client code.
3.1.1. Exceptions vs. return values
Two philosophies of error handling are commonly described as EAFP (Easier to Ask for Forgiveness than Permission) and LBYL (Look Before You Leap). Under the LBYL model, the application using the interface, the interface itself, and the service must each anticipate all potential error conditions (by catching system library exceptions and/or by checking in advance that parameters are valid, files exist, etc.) and respond with an error code. Under the EAFP model, exceptions can be handled, ignored, or dealt with at a more appropriate level in the application. The EAFP model also has readability benefits, and so it is the model adopted for Data Lab.
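The difference between the two models can be illustrated with a minimal sketch. The function names and error-code strings below are hypothetical and are not part of any Data Lab interface; they only contrast a status-code return with an exception.

```python
import os
import tempfile


def read_config_lbyl(path):
    """LBYL: check each precondition first and return an error code."""
    if not os.path.exists(path):
        return None, "ERR_NOT_FOUND"
    if not os.access(path, os.R_OK):
        return None, "ERR_NOT_READABLE"
    with open(path) as fd:
        return fd.read(), "OK"


def read_config_eafp(path):
    """EAFP: just try the operation; failures surface as exceptions."""
    with open(path) as fd:
        return fd.read()


# A path inside a freshly created directory is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "missing.cfg")

# The LBYL caller must inspect the status code after every call.
data, status = read_config_lbyl(missing)

# The EAFP caller handles the failure at whatever level makes sense.
eafp_error = None
try:
    data = read_config_eafp(missing)
except OSError as err:
    eafp_error = err
```

Note that the EAFP version is shorter and leaves the decision about how to respond to the caller, which is the readability benefit referred to above.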
Exceptions are, by definition, unexpected behavior of the code, but they can also be raised in response to improper use of the client or service (e.g. missing or invalid input). We require that all interface methods (client and server) describe their calling arguments as well as what (if anything) is returned by the method or service in the method docstrings. …..
3.1.2. Service Architecture Overview
The existing Data Lab service and client code all share a similar structure; however, because they were developed at different times in the project, a number of inconsistencies remain between the services. This section describes the target structure we wish to have for all code following a thorough code review. Notes are used to identify known problems to be addressed in the current release.
3.1.3. Server-side Code
Data Lab services are implemented using the Python Flask microframework. These services define a middleware layer that clients (web-based, programmatic or command-line) access from their interfaces. As middleware, these services, depending on their function, may in turn call lower-level services (e.g. VOSpace, TAP, SIA, etc) or access a resource such as a database directly. The return value of each service is documented in the service implementation docstring.
Note
The code review process is intended to discover those docstrings that don’t yet provide the required service documentation.
As an example, a simple ‘echo’ service might look something like:
    from flask import Flask, request

    app = Flask(__name__)

    @app.route('/echo')
    def echo():
        ''' ECHO - A simple echo service endpoint

        Parameters
        ----------
        arg : str
            The argument to be echoed.

        Returns
        -------
        (Status 200) A string that echoes the argument
        (Status 400) An error message string
        '''
        arg = request.args.get('arg')       # query parameter, None if missing
        if arg is not None:
            return "Hello %s!\n" % arg
        else:
            raise Exception('Missing "arg" parameter')
In this example, simply raising the generic Exception will cause the service to return a 500 (Internal Server Error) to the caller. In order to return a specific error message we need to define an errorhandler() method for the Flask application. For example,
    @app.errorhandler(Exception)
    def handle_invalid_request(error):
        # Use str(error): a generic Exception has no 'message' attribute in Py3.
        return app.make_response(('Error: ' + str(error), 400))
This is better in that it returns a proper HTTP response with the specified error message, but the status code is fixed at 400 (or whatever value is chosen). The solution is to create an exception subclass in the server code (and an associated Flask errorhandler()) that allows us to set the message, status code, and optionally an error payload:
    class dlInvalidRequest(Exception):
        def __init__(self, message, status_code=None, payload=None):
            Exception.__init__(self, message)
            self.message = message
            self.status_code = (status_code if status_code is not None else 400)
            self.payload = payload

        def to_dict(self):
            """ Method to return a JSON formatting of the error. """
            rv = dict(self.payload or ())
            rv['message'] = self.message
            rv['code'] = self.status_code
            return rv

    @app.errorhandler(dlInvalidRequest)
    def handle_invalid_request(error):
        # Return the message with the status code carried by the exception.
        return app.make_response(('Error: ' + error.message,
                                  error.status_code))
The service code then looks like:
    if arg is not None:
        return "Hello %s!\n" % arg
    else:
        raise dlInvalidRequest('Missing argument', 400)
When no argument is provided, the service will return a status 400 response with a specific error message useful to the client.
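The handler shown above returns a plain-text message and does not use the to_dict() method. As a sketch of how that method might be used, the block below builds a JSON error body from the exception. The error_response() helper and the 'param' payload key are hypothetical; the class is repeated only so the example is self-contained, and no Flask objects are involved.

```python
import json


class dlInvalidRequest(Exception):
    """Same shape as the exception class described in the text."""

    def __init__(self, message, status_code=None, payload=None):
        Exception.__init__(self, message)
        self.message = message
        self.status_code = status_code if status_code is not None else 400
        self.payload = payload

    def to_dict(self):
        rv = dict(self.payload or ())
        rv['message'] = self.message
        rv['code'] = self.status_code
        return rv


def error_response(error):
    """Hypothetical helper: build a (body, status, headers) tuple."""
    body = json.dumps(error.to_dict(), sort_keys=True)
    return body, error.status_code, {'Content-Type': 'application/json'}


try:
    raise dlInvalidRequest('Missing argument', 400, payload={'param': 'arg'})
except dlInvalidRequest as err:
    body, status, headers = error_response(err)
```

A (body, status, headers) tuple is one of the return forms a Flask handler accepts, so a helper like this could in principle be called from the errorhandler() method; whether the services should return JSON or plain text is a design decision this sketch does not settle.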
3.1.3.1. Service Return Codes
As RESTful web services, the standard set of HTTP return codes is available to communicate status back to the calling client in addition to any returned data. This provides the flexibility needed to return error messages that explain why the service failed when a standard status message alone may be ambiguous.
Exceptions in the server code should follow a few simple guidelines. Services will return:
- Status 200 (Successful)
When the service performs the requested action without error.
- Status 400 (Bad Request)
When the call fails due to missing or invalid input to the service. When a backend service returns an error status, that status should be returned to the client when it provides a more detailed explanation of the error.
- Status 403 (Forbidden)
When the client does not present the identity token required to access or modify a requested resource.
- Status 404 (Not Found)
When the service cannot access a requested static resource (e.g. a VOSpace URI).
- Status 503 (Service Unavailable)
When the service requires access to a backend resource that cannot be reached (e.g. a database or storage system), preventing the entire service from executing as required.
Backend services (e.g. TAP, VOSpace, database) may have return codes specified by the protocols used, but these will be handled by the service when determining whether the call succeeded. For example, when deleting a file from VOSpace, the protocol requires a 204 status code response indicating the file was deleted; however, the service should return a 200 status to the client because the storeClient.rm() method succeeded. In the event of an error in the VOSpace service that returns a non-204 code, the rm() service can handle or ignore the error or else return a 400 (Bad Request) error along with the specific error message from VOSpace. Similarly, a query that requires a user MyDB table that doesn’t exist will return the error message from the database that identifies the missing table without requiring that every possible database message map into a corresponding HTTP status code.
By limiting the number of trapped error status codes, the client has fewer specific exceptions to catch explicitly and can raise the error to the calling method more easily.
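The translation of backend status codes described above might be sketched as follows. The check_backend() helper is hypothetical, and the exception class is a minimal stand-in for the server exception class described earlier; no real VOSpace call is made.

```python
class dlInvalidRequest(Exception):
    """Minimal stand-in for the server exception class described earlier."""

    def __init__(self, message, status_code=400):
        Exception.__init__(self, message)
        self.message = message
        self.status_code = status_code


def check_backend(status, text, expected=(200,)):
    """Hypothetical helper that hides protocol-specific success codes.

    Returns 200 when the backend status is one of the expected success
    codes; otherwise raises a 400 carrying the backend's own message.
    """
    if status in expected:
        return 200
    raise dlInvalidRequest(text or 'Backend error (status %d)' % status, 400)


# A VOSpace delete answers 204 on success; the client still sees a 200.
ok = check_backend(204, '', expected=(204,))

# Any other code becomes a 400 that carries the backend message unchanged.
try:
    check_backend(404, 'Node not found', expected=(204,))
except dlInvalidRequest as err:
    failure = (err.status_code, err.message)
```

With this approach the client only needs to distinguish the small set of status codes listed above, while the detail of what went wrong travels in the message text.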
3.1.4. Client-side Code
The basic layout of a Data Lab Client interface is something like the following (using the AuthManager as an example):
    import requests

    # svc_url and hdrs are assumed to be defined elsewhere in the module.

    def login(user, password):                  # Module method
        return ac_client.login(user, password)

    class authClient(object):                   # AuthManager Object class
        def __init__(self):
            pass

        def login(self, user, password):        # Class method
            resp = requests.get(svc_url, headers=hdrs)
            return resp

    def getClient():                            # Get a new instance of the authClient
        return authClient()

    ac_client = getClient()                     # Create a default client object
Note
The use of MultiMethod signatures is ignored here for brevity.
This structure is intended to allow applications to directly access the module methods when using the default client instance created by the import of the module, and also to create additional clients when necessary (e.g. when using DEV instances of Data Lab services, or when using different service profiles without having to reset the profile in the default client before each use). For example,
    from dl import queryClient as qc            # standard import

    gp04 = qc.getClient(profile='gp04')         # get new client instance with
                                                # 'gp04' profile
    res1 = qc.query(sql='....')                 # query default service
    res2 = gp04.query(sql='....')               # query 'gp04' service
Note
As of this writing, the use of MultiMethod signatures prevents new client instances (e.g. the ‘gp04’ client above) from working correctly. The solution is understood and will be implemented as part of the MultiMethod docstring work to come.
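The default-client pattern with an additional profile-specific instance can be sketched in a self-contained way as follows. This is a toy stand-in, not the actual queryClient implementation: the class body, the 'default' profile name and the returned strings are assumptions made for illustration, and MultiMethod signatures are again ignored.

```python
class queryClient(object):
    """Toy client that only records the profile it was created with."""

    def __init__(self, profile='default'):
        self.profile = profile

    def query(self, sql=None):
        # A real client would call the service selected by the profile.
        return "[%s] %s" % (self.profile, sql)


def getClient(profile='default'):
    """Return a new, independent client instance."""
    return queryClient(profile=profile)


qc_client = getClient()             # default client created at import time


def query(sql=None):
    """Module-level method that delegates to the default client."""
    return qc_client.query(sql=sql)


gp04 = getClient(profile='gp04')
res1 = query(sql='select 1')        # uses the default profile
res2 = gp04.query(sql='select 1')   # uses the 'gp04' profile
```

Because each instance holds its own profile, the default client is never modified when a second client is created.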
3.1.4.1. Client-side exception handling
You can see from the client code example that the module methods are simply calls to the default client’s class method, which performs all the work of the client interface. Our goal is to catch and/or handle exceptions in this class module and simply raise them to the calling procedure. When writing a default client module method, the code may look something like:
    def login(user, password):
        try:
            resp = ac_client.login(user, password)
        except Exception as e:
            raise
        return resp
In this way, an exception either returned by the service or raised by the class method is simply passed to the caller. On success, the normal return value of the method is returned.
The class method that actually calls the service should use an appropriate try-except block to raise exceptions that occur in the client code or that are returned by the service. Data Lab client interfaces use the requests module to make service calls. All exceptions that Requests raises inherit from the requests.exceptions.RequestException class, making it possible to trap the specific errors returned by a service individually as well as HTTP connection-related issues.
For example, a class method might look something like:
    def login(self, user, passwd):
        resp = None
        try:
            resp = requests.get(url, params={'username': user, 'password': passwd})
            resp.raise_for_status()
        except requests.exceptions.RequestException as err:
            if resp is None:
                raise Exception(str(err))        # connection error
            else:
                raise Exception(resp.content)    # service error
        return resp.content
There are a few things to note in this example:
- The raise_for_status() call is used to raise HTTP errors that inherit from the RequestException object used by the requests module. Service errors (e.g. TimeOut) are returned using HTTP status codes and can be caught in the same block.
- When the response resp is None during the except handling, the exception message is returned to indicate the specific HTTP error. However, if we have a valid resp object in an exception, it was generated by the server and we return the error message in the response content to pass the message back from the service. The resp is initialized to let us differentiate the two exception types so that we can handle HTTP connection problems and service problems differently when needed.
- Here we raise the generic Exception, but in production code we generally create a throwable exception class to be used. For Py2/Py3 compatibility this allows us to assure the ‘string’ type on the error message currently assumed by legacy code, and can be removed later once a full transition to Py3 is complete.
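A throwable exception class of the kind mentioned in the last point might look like the following sketch. The class name dlClientError is hypothetical, as is the choice of UTF-8 decoding; the intent is only to show how the message can be forced to a string type when the service reply arrives as bytes (e.g. from resp.content).

```python
class dlClientError(Exception):
    """Hypothetical client exception that always carries a str message."""

    def __init__(self, message):
        # Service responses may arrive as bytes (e.g. resp.content).
        if isinstance(message, bytes):
            message = message.decode('utf-8', errors='replace')
        self.message = str(message)
        Exception.__init__(self, self.message)

    def __str__(self):
        return self.message


try:
    raise dlClientError(b'Error: invalid user or password')
except dlClientError as err:
    text = str(err)
```

Raising a dedicated class instead of the generic Exception also lets callers catch Data Lab client errors separately from unrelated failures.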
Client code of course does other processing, e.g. validating input parameters, processing return values and so on. These steps may themselves use try-except blocks to do additional error handling and should follow similar concepts when determining which exceptions are raised.
Note
The requests call is so common to each client method that a utility method should be implemented to avoid code duplication and provide a central location for all service-related exception handling. Similar utility methods could be envisioned for parameter validation, ensuring standard string-type conversion, etc.
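As one illustration of the parameter-validation utility suggested in the note, consider the sketch below. The require_params() helper is hypothetical, and the login() body is a placeholder that stands in for the actual service call.

```python
def require_params(**params):
    """Hypothetical validator: raise if any named parameter is empty."""
    missing = sorted(name for name, value in params.items()
                     if value is None or value == '')
    if missing:
        raise ValueError('Missing required parameter(s): ' + ', '.join(missing))


def login(user, passwd):
    require_params(user=user, passwd=passwd)    # fail before any HTTP call
    return 'token-for-%s' % user                # placeholder for the service call


try:
    login('foobar', None)
except ValueError as err:
    message = str(err)
```

Validating in one shared helper keeps the error messages consistent across client methods and avoids a round trip to the service for requests that cannot succeed.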
3.1.5. Coding Style: Examples vs. Applications
In HowTo notebooks, science examples and general documentation the intent is often to convey an example use of an interface or service, implicitly assuming the example will always succeed. On the other hand, application code should be written to catch exceptions that might be raised or that risk aborting a task entirely.
As an example, consider the authClient.login() method that returns an authorization token for the user. In example code this might be written as
    token = authClient.login('foobar', getpass())
where we assume a valid user and password are entered and the token variable then contains a valid auth token for the application. In application code we would want to protect this call to catch a login error with something like:
    try:
        token = authClient.login('foobar', getpass())
    except Exception as e:
        print('Login Error: ' + str(e))
    else:
        print("Logged in as user '%s'" % token.split('.')[0])
The added try-except block in this code snippet, however, distracts from the example use of login() being demonstrated. We therefore recommend that its use be limited either to examples that show explicitly how errors are to be handled or to production-quality code.
Note
In many cases existing client API code now returns a mix of an ‘OK’ string, an error message, or the (valid) return data from the service. We wish to have all methods throw exceptions on an error and return either nothing or valid data from the service.
Client API methods may return objects of various types. Two issues still to be settled are:
- Proper handling of boolean return values, i.e. ensure the Python True/False type is returned and not strings.
- Proper handling of string return types, i.e. enforce ‘string’ or allow for return of ‘byte’ types under Py3 that may require decoding.
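Since both issues are still open, the following is only one possible approach, sketched under stated assumptions: the helper names to_str() and to_bool() are hypothetical, UTF-8 is assumed as the encoding, and the set of accepted boolean spellings is illustrative.

```python
def to_str(value, encoding='utf-8'):
    """Return text for either a bytes or a str service reply."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


def to_bool(value):
    """Map a service reply such as b'True' or 'false' to a Python bool."""
    if isinstance(value, bool):
        return value
    text = to_str(value).strip().lower()
    if text in ('true', '1'):
        return True
    if text in ('false', '0'):
        return False
    raise ValueError('Not a boolean reply: %r' % text)
```

Raising on an unrecognized reply, rather than guessing, is consistent with the goal stated above of having methods throw exceptions on error instead of returning ambiguous strings.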