0 votes
95 views
I am querying the nsc_dr2.meas table over TAP. I'm trying to grab measurements grouped in chunks by their associated exposure's date. This is pretty slow. I think it's slow because nsc_dr2.meas's "exposure" column is not indexed, so when I JOIN from nsc_dr2.meas to nsc_dr2.exposure, the database must do a full scan.

Am I missing the right way to access data? Or, could an index be added? Indexing nsc_dr2.meas by exposure would probably help many use cases.
by swnelson (160 points) | 95 views

1 Answer

+1 vote
Best answer
Hi Spencer,

Thanks for your message.  We've added indexes to the 'mjd' and 'exposure' columns of the nsc_dr2.meas table and they have finished building.  I didn't see much in your query that could be optimized for the data that you want, however I *would* suggest you join on the 'mjd' value rather than 'exposure' because it should be significantly faster using floating-point indexes rather than text.  Because a full result set will be 1.7B rows you'll still need to process in chunks using the random-id as you've been doing, but each query should now only in 1-2 sec (in the DB).

Hope this helps, let us know if you continue to have problems or questions.
by fitz (5.5k points)
selected by 0
Terrific! Thank you!

416 questions

432 answers

437 comments

636 users

Welcome to Data Lab Help Desk, where you can ask questions and receive answers from other members of the community.

Categories