0 votes
I am querying the nsc_dr2.meas table over TAP. I'm trying to grab measurements grouped in chunks by their associated exposure's date. This is pretty slow. I think it's slow because nsc_dr2.meas's "exposure" column is not indexed, so when I JOIN from nsc_dr2.meas to nsc_dr2.exposure, the database must do a full scan.

Am I missing the right way to access data? Or, could an index be added? Indexing nsc_dr2.meas by exposure would probably help many use cases.
by swnelson (160 points) | 107 views

1 Answer

+1 vote
Best answer
Hi Spencer,

Thanks for your message.  We've added indexes to the 'mjd' and 'exposure' columns of the nsc_dr2.meas table and they have finished building.  I didn't see much in your query that could be optimized for the data that you want, however I *would* suggest you join on the 'mjd' value rather than 'exposure' because it should be significantly faster using floating-point indexes rather than text.  Because a full result set will be 1.7B rows you'll still need to process in chunks using the random-id as you've been doing, but each query should now only in 1-2 sec (in the DB).

Hope this helps, let us know if you continue to have problems or questions.
by fitz (5.5k points)
selected by 0
Terrific! Thank you!

427 questions

441 answers


644 users

Welcome to Data Lab Help Desk, where you can ask questions and receive answers from other members of the community.