0 votes
9 views

So I tried to retrieve some data from ls_dr9.tractor_n using the following code: 

%%time

aquery = """
        SELECT ra
        FROM   ls_dr9.tractor_n
        WHERE  (dered_mag_z > 16.0 and dered_mag_z < 23.0 and dered_mag_w1 != 'NaN' and dered_mag_w2 != 'NaN'  
               and dered_mag_w1 != 'Infinity' and dered_mag_w2 != 'Infinity' and shape_r < 1.0 and snr_g < 3.0
               and snr_w1 > 3.0 and dered_mag_g > 24.0 or dered_flux_g < 0.0)
        LIMIT 10000
        """

### ls_dr8.tractor is the main catalog combining both north and south

### async mode returns jobid
jobid = qc.query(token, adql=aquery, format='pandas', async_=True, timeout=3600, out='vos://LS_DR8N_part2.csv')
status = ''
while (status != 'COMPLETED' and status != 'ERROR'):
    status = qc.status(token,jobid)
    print(status)
    time.sleep(2)

print ('error message ', qc.error(token,jobid)) 

response=qc.results(token,jobid)
df = convert(response,'pandas')

df.to_csv('LS_DR9N.csv')

But no matter what I select and how I set the limit, it always yields that confusing "error: list index out of range". I thought about the possibility that it's the jobid or convert part which might be wrong, though it did not appear to me how I could change those parts. And I would really appreciate any advice on the code. 

asked Nov 18 by anonymous | 9 views

1 Answer

0 votes
Best answer

Hi, thanks for reaching out. A few fixes to how you invoke your query should fix the problem:

- If you have a LIMIT like 10000, the query runs fast enough for sync mode.

- Note also that to get a pandas dataframe, the argument is 'fmt' (not 'format').

- If you wish to write out the result to a CSV file on vospace, I think you must also use fmt='csv' (which is the default by the way).

- If you are asking for a pandas dataframe as the returned result, you don't do a manual .convert() afterwards, since you already have a dataframe.

- Finally, your query string is SQL, not ADQL (in ADQL you'd use 'SELECT TOP 10000 ...' instead of '... LIMIT 10000' for instance).

So, these all work for me:

df = qc.query(sql=aquery, fmt='pandas') # with LIMIT 10000 takes 0.4s

Or with manual conversion:

res = qc.query(sql=aquery, fmt='csv')
df = convert(res, outfmt='pandas')

If you wanted to write out a CSV file:

res = qc.query(sql=aquery, fmt='csv', out='vos://my_file1.csv')

which you can then retrieve and convert at any time with:

from dl import storeClient as sc
data = sc.get('vos://my_file1.csv')
df = convert(data)

Finally, for longer runs, the async version, writing out to a CSV file on vospace (plus a trick to do the async-wait-loop automatically):

res = qc.query(sql=aquery, fmt='csv', async_=True, wait=True, verbose=True, poll=10, out='vos://my_file2.csv')
df = convert(res)

Hope this is helpful. Let us know if you encounter more problems.

Robert

answered Nov 18 by robertdemo (2,880 points)

282 questions

281 answers

315 comments

1,886 users

Welcome to Data Lab Help Desk, where you can ask questions and receive answers from other members of the community.