Execute SQL query at each HEALpixel

Question 1

Hi datalab! I'd like to use the simdr2 catalog to construct a small, somewhat spatially uniform catalog of brighter stars. I can query like:
```
SELECT ra,dec,rmag,ring256 FROM lsst_sim.simdr2 WHERE rmag > 17 AND rmag < 17.5 AND ring256=564654 LIMIT 10;
```

This works great (adding an ORDER BY rmag would be nice but seems to slow things down a lot), but expanding it into a brute-force loop over all 786432 HEALpixels seems bad. Is there a better way to get a few million stars with uniform coverage of the sky?

yoachim · Answer 1 · 2023-06-02T12:01:22+0000

As suggested by my esteemed colleague Dino, one can use the random_id column to help generate sub-samples of a catalog. Since there is a strong gradient in source density with galactic latitude, it can be helpful to break the query into low and high latitude. For example, a high-latitude query like this:

```
SELECT ra,dec,umag,gmag,rmag,imag,zmag,ymag,ring256
FROM lsst_sim.simdr2
WHERE rmag > 17 AND rmag < 17.3
  AND (galb > 30 OR galb < -30)
  AND random_id < 50
```

generates 700k stars and I can easily trim that further to make the distribution more uniform.
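
For the trimming step, here is a minimal sketch of one way to do it, assuming the query result has already been loaded as a list of row dictionaries that include the ring256 column. The function name `cap_per_pixel` and its parameters are hypothetical, not part of any Data Lab API; it simply keeps at most a fixed number of randomly chosen stars in each HEALpixel.

```python
import random
from collections import defaultdict


def cap_per_pixel(rows, max_per_pixel, pixel_key="ring256", seed=0):
    """Keep at most max_per_pixel randomly chosen rows from each HEALpixel.

    rows is assumed to be a list of dicts, one per star, each carrying
    the pixel index under pixel_key (hypothetical layout for this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the trimming is repeatable

    # Group the rows by their HEALpixel index.
    by_pixel = defaultdict(list)
    for row in rows:
        by_pixel[row[pixel_key]].append(row)

    # Down-sample only the pixels that hold more rows than the cap.
    kept = []
    for pixel in sorted(by_pixel):
        group = by_pixel[pixel]
        if len(group) > max_per_pixel:
            group = rng.sample(group, max_per_pixel)
        kept.extend(group)
    return kept


# Toy data: five stars in one pixel, one star in another.
rows = [
    {"ra": 10.0 + 0.01 * i, "dec": -5.0, "rmag": 17.1, "ring256": 564654}
    for i in range(5)
] + [{"ra": 200.0, "dec": 40.0, "rmag": 17.2, "ring256": 12345}]

trimmed = cap_per_pixel(rows, max_per_pixel=2)
print(len(trimmed))  # 3: two stars from the crowded pixel, one from the other
```

Pixels that hold fewer stars than the cap are left untouched, so the result is only as uniform as the sparsest regions allow.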

Hi Yoachim, very glad you found a workaround! Yes, random_id is there exactly for drawing random samples that are independent of the order in which the rows of the DB table are stored.
Sorry for the late reply; half the team was/is on vacation, and we were working towards the DESI EDR release a few days ago... (robertdemo, Jun 21, 2023)
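
To make the random_id approach a bit more concrete, here is a small sketch that builds the query above for a chosen sampling fraction. It assumes that random_id is uniformly distributed between 0 and 100, so that `random_id < 50` selects roughly half of the rows; please check the column description of the table before relying on that. The helper name `subsample_query` and its parameters are hypothetical.

```python
def subsample_query(fraction, rmin=17.0, rmax=17.3, min_abs_galb=30.0,
                    table="lsst_sim.simdr2"):
    """Build a high-latitude query that keeps roughly `fraction` of the rows.

    Assumption (not confirmed in this thread): random_id is uniform
    on [0, 100), so the cut is simply 100 * fraction.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    threshold = 100.0 * fraction
    return (
        "SELECT ra,dec,umag,gmag,rmag,imag,zmag,ymag,ring256 "
        f"FROM {table} "
        f"WHERE rmag > {rmin:g} AND rmag < {rmax:g} "
        f"AND (galb > {min_abs_galb:g} OR galb < -{min_abs_galb:g}) "
        f"AND random_id < {threshold:g}"
    )


query = subsample_query(0.5)
print(query)
```

The resulting string can then be submitted in the usual way, for example with the Data Lab Python client. Because the cut is on random_id rather than on row order or LIMIT, rerunning the same query should return the same subsample.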
