datalab search + cross match

Question

351 views

Dear colleagues,

I have encountered an issue about the search algorithm. Here is the story: I have a sample of ~500 nearby galaxies, which are extracted from SDSS spectra survey. I would like to extract the objects within 0.25 deg of the target using the function of "q3c_join" and meet the criteria of "ls.dered_mag_r < 18 AND ls.dered_mag_r > 15". However, it takes too long to finish the search in the database and has the issue of timeout. I have already created the database using the 500 target galaxies and below is the full script of the SQL:

sql = '''SELECT lsls.ra, lsls.dec, lsls.dered_mag_g as LS_mag_g, lsls.dered_mag_r as LS_mag_r,
lsls.dered_mag_z as LS_mag_z, lsls.snr_z as LS_snr_z,
lsls.dered_mag_w1 as mag_w1, lsls.dered_mag_w2 as mag_w2,
q3c_dist(ng.ra,ng.dec,lsls.ra,lsls.dec) as dist_deg
FROM mydb://nearbygalaxyz009 AS ng
LEFT JOIN LATERAL (
SELECT ls.*
FROM
ls_dr9.tractor AS ls
WHERE
q3c_join(ng.ra, ng.dec, ls.ra, ls.dec, 0.25) AND ls.dered_mag_r < 18
AND ls.dered_mag_r > 15
ORDER BY
q3c_dist(ng.ra,ng.dec,ls.ra,ls.dec)
ASC LIMIT 1
) as lsls ON true;'''

I am writing here to ask for any help. Maybe there is some other function which could resolve this issue. Thanks.

asked Dec 7, 2022 by anonymous | 351 views

1 Answer

Answer 1 · 2022-12-12T12:04:24+0000

Firstly, q3c_join() is the correct function to use in this case, however a 0.25deg radius seems rather large. You say you want objects within that radius but then use a "LIMIT 1" constraint to make is just a nearest-neighbor search. Remove the LIMIT statement if you want all objects, otherwise a much smaller search radius should suffice and run faster.

Even though your mydb table is only ~500 objects, the inner query with the magnitude constraints is executed for each of those objects. A more efficient method would be to create a second mydb table with just the objects satisfying the magnitude constraint (reducing the total size to ~31M rows) and use that in the query. For the q3c_join() to work efficiently, however, the table needs to have a Q3C index which you can create with the mydb_index() method (see below). Your script would then look something like:

from dl import queryClient as qc

# Create the magnitude-constrained tractor table and indexes
stat = qc.query ('select * from ls_dr9.tractor where dered_mag_r between 15.0 an
d 18.0', out='mydb://ls_dered_mag_r')
qc.mydb_index(token,'mydb://ls_dered_mag_r','ra')
qc.mydb_index(token,'mydb://ls_dered_mag_r','dec')
qc.mydb_index(token,'mydb://ls_dered_mag_r','', q3c='ra,dec', cluster=True)

sql = """
       SELECT lsls.*, q3c_dist(ng.ra,ng.dec,lsls.ra,lsls.dec) as dist_deg
            FROM mydb://nearbygalaxyz009 AS ng
            LEFT JOIN LATERAL (
                  SELECT ls.*
                       FROM mydb://ls_dered_mag_r as ls
                       WHERE
                           q3c_join(ng.ra, ng.dec, ls.ra, ls.dec, 0.25)
                       ORDER BY
                           q3c_dist(ng.ra,ng.dec,ls.ra,ls.dec)
                  ) as lsls ON true;
      """

# Execute the query.
res = qc.query(sql=sql)

Creating the mydb://ls_dered_mag_r table will still take some time and may need to be run async separately, but the final query should execute in a few seconds and produce ~4200 rows.

Hope this helps, let us know if you continue to have problems.

datalab search + cross match

Please log in or register to add a comment.

Your answer

1 Answer

Please log in or register to add a comment.

Categories