Hello Data Lab team,
I am trying to crossmatch three user-uploaded coordinate tables in MyDB against ls_dr10.tractor using asynchronous queries. I would appreciate your advice on whether this workflow is supported, what chunk size and concurrency limits I should use, and whether there may currently be a TAP async or MyDB metadata issue affecting my account.
Input MyDB tables:
mlspecz_north_trainval 395,097 rows
mlspecz_edfn_layered 1,045,139 rows
mlspecz_wide_north_application 9,918,626 rows
Columns:
object_id, ra, dec, sample_name, xmatch_chunk
Target catalog: ls_dr10.tractor
The full async crossmatch query for the smallest table failed immediately with "502 Bad Gateway". No async job ID was returned. This suggests the failure occurred during request handling, query planning, or async job creation, rather than during normal job execution.
The query was:
SELECT
    u.object_id,
    u.ra AS input_ra,
    u.dec AS input_dec,
    u.xmatch_chunk,
    q3c_dist(u.ra, u.dec, t.ra, t.dec) * 3600.0 AS sep_arcsec,
    t.ls_id,
    t.ra,
    t.dec,
    t.type,
    t.brick_primary,
    t.release,
    t.brickid,
    t.objid,
    t.maskbits,
    t.fitbits,
    t.flux_g,
    t.flux_r,
    t.flux_i,
    t.flux_z,
    t.flux_w1,
    t.flux_w2,
    t.flux_w3,
    t.flux_w4,
    t.flux_ivar_g,
    t.flux_ivar_r,
    t.flux_ivar_i,
    t.flux_ivar_z,
    t.flux_ivar_w1,
    t.flux_ivar_w2,
    t.flux_ivar_w3,
    t.flux_ivar_w4,
    t.wisemask_w1,
    t.wisemask_w2
FROM mydb://mlspecz_north_trainval AS u
JOIN ls_dr10.tractor AS t
    ON q3c_join(u.ra, u.dec, t.ra, t.dec, 0.0002777777777777778)
WHERE t.brick_primary = 1
The crossmatch radius is 1 arcsec. I also tried rewriting the join using a WHERE-style q3c condition, but got the same 502 error.
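As a sanity check on the radius literal in the query, here is a minimal sketch of the conversion. q3c_join takes its radius in degrees, so 1 arcsec is divided by 3600. The helper name arcsec_to_deg is only illustrative and is not part of the Data Lab client.

```python
# Illustrative helper (not part of the Data Lab client): convert a match
# radius from arcseconds to the degrees that q3c_join expects.
ARCSEC_PER_DEGREE = 3600.0


def arcsec_to_deg(radius_arcsec: float) -> float:
    """Return the radius in degrees for a radius given in arcseconds."""
    return radius_arcsec / ARCSEC_PER_DEGREE


# 1 arcsec, the radius used in the query above.
radius_deg = arcsec_to_deg(1.0)
print(repr(radius_deg))
```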
To reduce query size, I added object-count chunking using xmatch_chunk and attempted to create MyDB indexes with:
qc.mydb_index(table, 'xmatch_chunk', async_=True)
qc.mydb_index(table, q3c='ra,dec', cluster=True, async_=True)
However, my first chunked notebook submitted many async jobs quickly, which may have overloaded or congested the service. I have now aborted all jobs submitted by me and stopped submitting new chunk batches.
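To avoid repeating this, I am considering a throttled submission loop along the following lines. This is only a sketch under assumptions: run_throttled, max_active and poll_seconds are names I made up, the default cap of 2 is a placeholder until I know your recommended limit, and the terminal status strings (COMPLETED, ERROR, ABORTED) are assumed to be what the service reports.

```python
import time
from collections import deque


def run_throttled(queries, submit, status, max_active=2, poll_seconds=30.0,
                  sleep=time.sleep):
    """Submit queries while keeping at most max_active jobs in flight.

    submit(sql) must return a job ID and status(job_id) a status string.
    Returns a dict mapping each job ID to its final status.
    """
    # Assumed terminal states; anything else is treated as still running.
    done_states = {"COMPLETED", "ERROR", "ABORTED"}
    pending = deque(queries)
    active = []
    finished = {}
    while pending or active:
        # Top up the active set without exceeding the cap.
        while pending and len(active) < max_active:
            active.append(submit(pending.popleft()))
        # Poll each active job once and retire the ones that have ended.
        still_active = []
        for job_id in active:
            state = status(job_id)
            if state in done_states:
                finished[job_id] = state
            else:
                still_active.append(job_id)
        active = still_active
        # Wait before polling again only if jobs are still running.
        if active:
            time_to_wait = poll_seconds
            sleep(time_to_wait)
    return finished
```

In my notebook I would expect to pass something like submit=lambda sql: qc.query(sql=sql, async_=True) and status=qc.status, with one query per xmatch_chunk value, but please correct me if that is not the intended way to use the client.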
I then checked: https://datalab.noirlab.edu/tap/async
There are still many jobs in QUEUED status, but they appear to be mostly from other users. There are currently no jobs with status EXECUTING. Whenever I submit a new job now, it remains in QUEUED.
There is also a separate MyDB/Data Explorer issue.
After I tried to clean up my uploaded MyDB tables (removing and re-uploading some of them), the MyDB panel in the Data Explorer web interface became empty. The MyDB section is still visible, but no table names are listed. However, I can still access the tables from Jupyter notebooks, so the tables themselves do not appear to be lost.
Also, I previously removed this table:
mlspecz_lsdr10_north_trainval
but this table name still appears in the output of qc.mydb_list(), while the table itself is no longer accessible. This looks like a Data Explorer/MyDB metadata listing issue, a stale cache, or a possible MyDB registry inconsistency.
Could you please advise on the following?
1. Is a MyDB-to-ls_dr10.tractor Q3C crossmatch of this size, with user tables larger than 10^5 rows, expected to work through Data Lab async?
2. If the full-table query is not feasible, what chunk size and async-job concurrency limit do you recommend?
3. Can you check whether the TAP async queue is currently stalled or not dispatching jobs?
4. Can you check whether my MyDB metadata is inconsistent, given that my tables are accessible from notebooks but no longer appear in the Data Explorer MyDB panel, and that the removed table mlspecz_lsdr10_north_trainval is still listed?
For now I have paused the full workflow to avoid submitting more jobs until I know the recommended limits and whether the current queue/MyDB behaviour is normal.
Thank you very much.