Section author: Robert Nikutta <robert.nikutta@noirlab.edu>, Alice Jacques <alice.jacques@noirlab.edu>

Version: 20230829

1.5.2. Query Client

1.5.2.1. Overview

The Query Client provides functions to query Data Lab database holdings for data (either synchronously or asynchronously), and to retrieve the query results. Much of its functionality is described in the default Data Lab Jupyter notebook on the Query Client.

1.5.2.2. Help

The most important thing to know when starting to use a new application or program is how to get help when you are stuck. See the Query Client API documentation or post a question on the Data Lab Help Desk for assistance. The available commands for the Data Lab Query Client are:

> from dl import queryClient as qc
> qc?

Help on module dl.queryClient in dl:

NAME
    dl.queryClient

DESCRIPTION
    querymanager.queryClient
    ========================

    Client methods for the DataLab Query Manager Service.

    Query Manager Client Interface:

    isAlive              - Check whether the QueryManager service at the given URL is alive
                           and responding. This is a simple call to the root service URL or
                           ping() method.
    set_svc_url          - Set the Query Manager service URL.
    get_svc_url          - Get the currently-used Query Manager service URL.
    set_profile          - Set the service profile to be used.
    get_profile          - Get the current query profile.
    list_profiles        - Retrieve the profiles supported by the Query Manager service.
    set_timeout_request  - Set the synchronous query timeout value.
    get_timeout_request  - Get the current synchronous query timeout value.
    schema               - Return information about a data service schema.
    services             - List the services available for all datasets.
    query                - Send an SQL or ADQL query to the database or TAP service.
    status               - Get the status of an asynchronous query.
    jobs                 - Get a list of the user's asynchronous jobs.
    results              - Retrieve the results of an asynchronous query, once completed.
    error                - Retrieve the error of an asynchronous query, once completed.
    abort                - Abort the specified asynchronous job.
    mydb_list            - List the tables or table schema in the user's MyDB.
    mydb_create          - Create a table in the user's MyDB.
    mydb_insert          - Insert data into a table in the user's MyDB.
    mydb_import          - Import data into a table in the user's MyDB.
    mydb_truncate        - Truncate the specified table in the user's MyDB.
    mydb_index           - Index the specified column in a table in the user's MyDB.
    mydb_drop            - Drop the specified table from the user's MyDB.
    mydb_rename          - Rename a table in the user's MyDB to a new name.
    mydb_copy            - Copy a table in the user's MyDB to a new name.

1.5.2.3. Database Queries

Data Lab hosts many large datasets that can be easily accessed via the query tools. The qc.query() command will execute the query and return the results directly to Python in whatever format you specify. A simple query looks like this:

> from dl import queryClient as qc
> res = qc.query('select * from smash_dr1.object limit 1000', fmt='pandas')
> len(res)
1000

In this case, the results of 1000 records/rows of the SMASH DR1 object table are returned as a Pandas Data Frame. The basic SQL syntax is SELECT <columns> FROM <catalog>.<table> WHERE <constraints> LIMIT <limit> (see SQL tutorial for more help). Regular SQL (default) and ADQL are supported (specify with qtype='sql|adql').

To see what catalogs are available, use the qc.schema() to print them out:

  > from dl import queryClient as qc
  > print(qc.schema())

    Schema Name   Description
 ---------------   -----------
         allwise   The AllWISE data release
     buzzard_dr1   Buzzard Data Release 1
     catwise2020   CatWise 2020 data
         dad_dr1   DECam Asteroid Database -- Data Release 1
         dad_dr2   DECam Asteroid Database -- Data Release 2
      decaps_dr1   DECaPS Survey Data Release 1
      decaps_dr2   DECaPS Survey Data Release 2
       delve_dr1   First data release of the DELVE survey
       delve_dr2   Second data release of the DELVE survey
         des_dr1   Dark Energy Survey (DES) Data Release 1
         des_dr2   Dark Energy Survey (DES) Data Release 2
        des_sva1   DES SVA1 Data Products
        desi_edr   Early Data Release from the Dark Energy Spectroscopic Instrument
        gaia_dr1   GAIA Data Release 1
        gaia_dr2   Gaia Data Release 2 -- April 25, 2018
        gaia_dr3   Gaia Data Release 3
       gaia_edr3   The early installment of the third Gaia data release
       gnirs_dqs   Gemini North Near-Infrared Distant Quasar Survey
     gogreen_dr1   GOGREEN and GCLASS first data release survey catalogues
       hipparcos   ftp://cdsarc.u-strasbg.fr/pub/cats/I/239/ReadMe
      hipparcos2   A new reduction of the hipparcos astrometric data
 ivoa_decaps_dr1   IVOA SIA/ObsCore Tables for ivoa_decaps_dr1 images
    ivoa_des_dr1   IVOA SIA/ObsCore Tables for Dark Energy Survey DR1 images
    ivoa_des_dr2   IVOA SIA/ObsCore Tables for ivoa_des_dr2 images
 ivoa_des_dr2_se   IVOA SIA/ObsCore Tables for ivoa_des_dr2_se images
   ivoa_des_sva1   IVOA SIA/ObsCore Tables for ivoa_des_sva1 images
     ivoa_des_y1   IVOA SIA/ObsCore Tables for ivoa_des_y1 images
     ivoa_des_y2   IVOA SIA/ObsCore Tables for ivoa_des_y2 images
     ivoa_des_y3   IVOA SIA/ObsCore Tables for ivoa_des_y3 images
     ivoa_des_y4   IVOA SIA/ObsCore Tables for ivoa_des_y4 images
ivoa_gogreen_dr1   IVOA SIA/ObsCore Tables for gogreen_dr1 images
     ivoa_ls_dr8   IVOA SIA/ObsCore Tables for ls_dr8 images
     ivoa_ls_dr9   IVOA SIA/ObsCore Tables for ls_dr9 images
        ivoa_nsa   IVOA SIA/ObsCore Tables for NOAO Science Archive images
    ivoa_nsc_dr2   IVOA SIA/ObsCore Tables for ivoa_nsc_dr2 images
   ivoa_sdss_dr9   IVOA SIA/ObsCore Tables for sdss_dr9 images
  ivoa_smash_dr1   IVOA SIA/ObsCore Tables for SMASH DR1 images
  ivoa_smash_dr2   IVOA SIA/ObsCore Tables for ivoa_smash_dr2 images
  ivoa_splus_dr1   IVOA SIA/ObsCore Tables for ivoa_splus_dr1 images
  ivoa_splus_edr   IVOA SIA/ObsCore Tables for ivoa_splus_edr images
         ls_dr10   Legacy Survey (LS) Data Release 10
          ls_dr8   Legacy Survey Data Release 8
          ls_dr9   Legacy Survey (LS) Data Release 9
        lsst_sim   lsst_sim
            mydb   MyDB table schema
         nsc_dr1   NOIRLab Source Catalog Data Release 1
         nsc_dr2   NOIRLab Source Catalog Data Release 2
         phat_v2   Panchromatic Hubble Andromeda Treasury Version 2
       sdss_dr12   Sloan Digital Sky Survey Data Release 12
       sdss_dr13   Sloan Digital Sky Survey Data Relase 13
       sdss_dr14   Sloan Digital Sky Survey Data Release 14
       sdss_dr16   Sloan Digital Sky Survey Data Release 16
       sdss_dr17   Sloan Digital Sky Survey Data Release 17
         sga2020   Siena Galaxy Atlas 2020
   skymapper_dr2   https://skymapper.anu.edu.au/data-release/
       smash_dr1   SMASH Data Release 1
       smash_dr2   SMASH Data Release 2
          sparcl   SPectra Analysis & Retrievable Catalog Lab
       splus_dr1   The Southern Photometric Local Universe Survey (S-PLUS) DR1
       splus_dr2   The Southern Photometric Local Universe Survey (S-PLUS) DR2
       splus_edr   The Southern Photometric Local Universe Survey (S-PLUS) EDR
        stripe82   Primary object catalogs in Stripe 82 footprint
      tap_schema   TAP Schema Tables
         twomass   2MASS All-Sky Catalogs (2003)
          tycho2   The Tycho-2 Catalogue
 ukidss_dr11plus   Calibrated data from the UKIRT Wide Field Camera (WFCAM).
      unwise_dr1   unWISE DR1 band-merged catalog
            usno   USNO Astrometry Catalogs
         vhs_dr5   VISTA Hemisphere Survey (VHS) band-merged multi-waveband

The available output formats are csv, ascii (tab-delimited), numpy array, numpy structured array (structarray), pandas, Astropy table, and Astropy votable.

It’s also possible to save query results directly to your VOSpace or MyDB. That is done with the out= option. Use a vos:// prefix for VOSpace and mydb:// for MyDB. For VOSpace output the fits and hdf5 formats are also available.

> qc.query('select * from smash_dr1.object limit 1000', fmt='fits', out='vos://smashtest.fits')

> qc.query('select * from smash_dr1.object limit 1000', out='mydb://smashtest')

For longer queries use an async query that will run in the background. Just set async_=True and the returned value will be the JobID. The status can be checked with qc.status(JobID) and the results can be retrieved with res=qc.results(JobID).

> jobid = qc.query('select * from smash_dr1.object limit 8000', async_=True)
> print(f"JobID: {jobid}")
JobID = oawl30k2htvpxn2g

> qc.status(jobid)
'COMPLETED'

> tab = qc.results(jobid)

1.5.2.4. MyDB Tools

The tables in MyDB can also be queried the same way as the standard DL catalogs. Just add a mydb:// prefix to the table in the SQL query.

> tab = qc.query('select * from mydb://smashtest where ra > 100', fmt='table')

There are also a handful of commands for manipulating MyDB tables (see the list of commands under the Help section).

> print(qc.mydb_list())
test2
smashtest
test5

The mydb_import is handy for downloading a table from MyDB to VOSpace to locally. In this example we import a locally stored CSV file called testresult.csv into a MyDB table called testresult.

> qc.mydb_import('testresult', './testresult.csv')