Section author: Robert Nikutta <robert.nikutta@noirlab.edu>, Alice Jacques <alice.jacques@noirlab.edu>
Version: 20230829
1.5.2. Query Client¶
1.5.2.1. Overview¶
The Query Client provides functions to query Data Lab database holdings for data (either synchronously or asynchronously), and to retrieve the query results. Much of its functionality is described in the default Data Lab Jupyter notebook on the Query Client.
1.5.2.2. Help¶
The most important thing to know when starting to use a new application or program is how to get help when you are stuck. See the Query Client API documentation or post a question on the Data Lab Help Desk for assistance. The available commands for the Data Lab Query Client are:
> from dl import queryClient as qc
> qc?
Help on module dl.queryClient in dl:
NAME
dl.queryClient
DESCRIPTION
querymanager.queryClient
========================
Client methods for the DataLab Query Manager Service.
Query Manager Client Interface:
isAlive - Check whether the QueryManager service at the given URL is alive
and responding. This is a simple call to the root service URL or
ping() method.
set_svc_url - Set the Query Manager service URL.
get_svc_url - Get the currently-used Query Manager service URL.
set_profile - Set the service profile to be used.
get_profile - Get the current query profile.
list_profiles - Retrieve the profiles supported by the Query Manager service.
set_timeout_request - Set the synchronous query timeout value.
get_timeout_request - Get the current synchronous query timeout value.
schema - Return information about a data service schema.
services - List the services available for all datasets.
query - Send an SQL or ADQL query to the database or TAP service.
status - Get the status of an asynchronous query.
jobs - Get a list of the user's asynchronous jobs.
results - Retrieve the results of an asynchronous query, once completed.
error - Retrieve the error of an asynchronous query, once completed.
abort - Abort the specified asynchronous job.
mydb_list - List the tables or table schema in the user's MyDB.
mydb_create - Create a table in the user's MyDB.
mydb_insert - Insert data into a table in the user's MyDB.
mydb_import - Import data into a table in the user's MyDB.
mydb_truncate - Truncate the specified table in the user's MyDB.
mydb_index - Index the specified column in a table in the user's MyDB.
mydb_drop - Drop the specified table from the user's MyDB.
mydb_rename - Rename a table in the user's MyDB to a new name.
mydb_copy - Copy a table in the user's MyDB to a new name.
1.5.2.3. Database Queries¶
Data Lab hosts many large datasets that can be easily accessed via the query tools.
The qc.query()
command will execute the query and return the
results directly to Python in whatever format you specify. A simple query looks like this:
> from dl import queryClient as qc
> res = qc.query('select * from smash_dr1.object limit 1000', fmt='pandas')
> len(res)
1000
In this case, the results of 1000 records/rows of the SMASH DR1 object
table are returned as a Pandas Data Frame. The basic SQL syntax is SELECT <columns> FROM <catalog>.<table> WHERE
<constraints> LIMIT <limit>
(see SQL tutorial for more help). Regular
SQL (default) and ADQL are supported (specify with qtype='sql|adql'
).
To see what catalogs are available, use the qc.schema()
to print them out:
> from dl import queryClient as qc
> print(qc.schema())
Schema Name Description
--------------- -----------
allwise The AllWISE data release
buzzard_dr1 Buzzard Data Release 1
catwise2020 CatWise 2020 data
dad_dr1 DECam Asteroid Database -- Data Release 1
dad_dr2 DECam Asteroid Database -- Data Release 2
decaps_dr1 DECaPS Survey Data Release 1
decaps_dr2 DECaPS Survey Data Release 2
delve_dr1 First data release of the DELVE survey
delve_dr2 Second data release of the DELVE survey
des_dr1 Dark Energy Survey (DES) Data Release 1
des_dr2 Dark Energy Survey (DES) Data Release 2
des_sva1 DES SVA1 Data Products
desi_edr Early Data Release from the Dark Energy Spectroscopic Instrument
gaia_dr1 GAIA Data Release 1
gaia_dr2 Gaia Data Release 2 -- April 25, 2018
gaia_dr3 Gaia Data Release 3
gaia_edr3 The early installment of the third Gaia data release
gnirs_dqs Gemini North Near-Infrared Distant Quasar Survey
gogreen_dr1 GOGREEN and GCLASS first data release survey catalogues
hipparcos ftp://cdsarc.u-strasbg.fr/pub/cats/I/239/ReadMe
hipparcos2 A new reduction of the hipparcos astrometric data
ivoa_decaps_dr1 IVOA SIA/ObsCore Tables for ivoa_decaps_dr1 images
ivoa_des_dr1 IVOA SIA/ObsCore Tables for Dark Energy Survey DR1 images
ivoa_des_dr2 IVOA SIA/ObsCore Tables for ivoa_des_dr2 images
ivoa_des_dr2_se IVOA SIA/ObsCore Tables for ivoa_des_dr2_se images
ivoa_des_sva1 IVOA SIA/ObsCore Tables for ivoa_des_sva1 images
ivoa_des_y1 IVOA SIA/ObsCore Tables for ivoa_des_y1 images
ivoa_des_y2 IVOA SIA/ObsCore Tables for ivoa_des_y2 images
ivoa_des_y3 IVOA SIA/ObsCore Tables for ivoa_des_y3 images
ivoa_des_y4 IVOA SIA/ObsCore Tables for ivoa_des_y4 images
ivoa_gogreen_dr1 IVOA SIA/ObsCore Tables for gogreen_dr1 images
ivoa_ls_dr8 IVOA SIA/ObsCore Tables for ls_dr8 images
ivoa_ls_dr9 IVOA SIA/ObsCore Tables for ls_dr9 images
ivoa_nsa IVOA SIA/ObsCore Tables for NOAO Science Archive images
ivoa_nsc_dr2 IVOA SIA/ObsCore Tables for ivoa_nsc_dr2 images
ivoa_sdss_dr9 IVOA SIA/ObsCore Tables for sdss_dr9 images
ivoa_smash_dr1 IVOA SIA/ObsCore Tables for SMASH DR1 images
ivoa_smash_dr2 IVOA SIA/ObsCore Tables for ivoa_smash_dr2 images
ivoa_splus_dr1 IVOA SIA/ObsCore Tables for ivoa_splus_dr1 images
ivoa_splus_edr IVOA SIA/ObsCore Tables for ivoa_splus_edr images
ls_dr10 Legacy Survey (LS) Data Release 10
ls_dr8 Legacy Survey Data Release 8
ls_dr9 Legacy Survey (LS) Data Release 9
lsst_sim lsst_sim
mydb MyDB table schema
nsc_dr1 NOIRLab Source Catalog Data Release 1
nsc_dr2 NOIRLab Source Catalog Data Release 2
phat_v2 Panchromatic Hubble Andromeda Treasury Version 2
sdss_dr12 Sloan Digital Sky Survey Data Release 12
sdss_dr13 Sloan Digital Sky Survey Data Relase 13
sdss_dr14 Sloan Digital Sky Survey Data Release 14
sdss_dr16 Sloan Digital Sky Survey Data Release 16
sdss_dr17 Sloan Digital Sky Survey Data Release 17
sga2020 Siena Galaxy Atlas 2020
skymapper_dr2 https://skymapper.anu.edu.au/data-release/
smash_dr1 SMASH Data Release 1
smash_dr2 SMASH Data Release 2
sparcl SPectra Analysis & Retrievable Catalog Lab
splus_dr1 The Southern Photometric Local Universe Survey (S-PLUS) DR1
splus_dr2 The Southern Photometric Local Universe Survey (S-PLUS) DR2
splus_edr The Southern Photometric Local Universe Survey (S-PLUS) EDR
stripe82 Primary object catalogs in Stripe 82 footprint
tap_schema TAP Schema Tables
twomass 2MASS All-Sky Catalogs (2003)
tycho2 The Tycho-2 Catalogue
ukidss_dr11plus Calibrated data from the UKIRT Wide Field Camera (WFCAM).
unwise_dr1 unWISE DR1 band-merged catalog
usno USNO Astrometry Catalogs
vhs_dr5 VISTA Hemisphere Survey (VHS) band-merged multi-waveband
The available output formats are csv
, ascii
(tab-delimited), numpy array
, numpy structured array
(structarray
), pandas
, Astropy table
, and Astropy votable
.
It’s also possible to save query results directly to your VOSpace or MyDB. That is done with the out=
option.
Use a vos://
prefix for VOSpace and mydb://
for MyDB. For VOSpace output the fits
and hdf5
formats
are also available.
> qc.query('select * from smash_dr1.object limit 1000', fmt='fits', out='vos://smashtest.fits')
> qc.query('select * from smash_dr1.object limit 1000', out='mydb://smashtest')
For longer queries use an async
query that will run in the background. Just set async_=True
and the returned
value will be the JobID
. The status can be checked with qc.status(JobID)
and the results can be
retrieved with res=qc.results(JobID)
.
> jobid = qc.query('select * from smash_dr1.object limit 8000', async_=True)
> print(f"JobID: {jobid}")
JobID = oawl30k2htvpxn2g
> qc.status(jobid)
'COMPLETED'
> tab = qc.results(jobid)
1.5.2.4. MyDB Tools¶
The tables in MyDB can also be queried the same way as the standard DL catalogs. Just add a mydb://
prefix to the
table in the SQL query.
> tab = qc.query('select * from mydb://smashtest where ra > 100', fmt='table')
There are also a handful of commands for manipulating MyDB tables (see the list of commands under the Help section).
> print(qc.mydb_list())
test2
smashtest
test5
The mydb_import
is handy for downloading a table from MyDB to VOSpace to locally. In this example we import
a locally stored CSV file called testresult.csv
into a MyDB table called testresult
.
> qc.mydb_import('testresult', './testresult.csv')