0 votes

I can't find a method/function in the API documentation for finding out the size of a myDB table or how many rows it holds. Is there some kind of tbl_stat for myDB tables? I tried

from dl import queryClient as qc   # Data Lab query client

query = "SELECT * FROM tbl_stat WHERE schema='mydb' AND tbl_name='mytable'"
info = qc.query(sql=query)
print(info)

but it doesn't work. And since the table is very long, a "SELECT COUNT(*) …" times out. Have I missed something?

by martinkb (720 points) | 195 views

3 Answers

0 votes
Best answer
Thanks for reaching out. The tbl_stat table is approximate, and also not guaranteed to be up to date at any given time; in some cases a particular entry might even be full of zeros. That is, your queries and analysis should not depend on it. And, as you noticed, there is no tbl_stat for user mydb tables.

The count() function often runs faster when run on a single column, e.g. count(ra) versus count(*).

Finally, a count() query can also be run asynchronously. Async queries allow much longer timeouts (whereas sync queries are limited to 10 minutes).
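
Something along these lines should work (a minimal sketch; the keyword is async_ in recent versions of the Python query client, and status()/results() may also accept a token argument, so check your client version):

from dl import queryClient as qc
import time

# Count a single column asynchronously; async jobs allow much longer
# timeouts than the 10-minute synchronous limit.
jobid = qc.query(sql="SELECT count(ra) FROM mydb://mytable", async_=True)

while qc.status(jobid) not in ("COMPLETED", "ERROR"):
    time.sleep(10)   # poll until the job finishes
print(qc.results(jobid))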

Hope this is helpful,
Robert
by datalab (20.7k points)
> The count() function often runs faster when run on a single column …

> Finally, a count() query can be also run asynchronously. …

Thanks for these tips. :)

I just copy-and-pasted the count solution from a forum discussion, and it timed out because it was run synchronously. Didn't realize back then that it could be done asynchronously as well. D'oh. Same goes for limiting the count to just one column.
0 votes

At the present time there is no equivalent of the 'tbl_stat' table for MyDB tables, although it would be a reasonable thing to add to enhance the output of the mydb_list() method. For the moment, you'll need to do a 'select count(*) ...' on the table to get the number of rows. This may be slower than for some published tables, but it should work in all cases. If you encounter a timeout, you may need to submit the query as an async job.
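
For instance, a quick sketch (assuming you're already logged in, so no token needs to be passed explicitly):

from dl import queryClient as qc

# List the tables currently in your MyDB (names only, no row counts).
print(qc.mydb_list())

# Explicit row count for one table; counting a single column (here 'ra',
# as an example) is often faster than count(*).
print(qc.query(sql="SELECT count(ra) FROM mydb://mytable"))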

If the mydb table in question here is 'catwxvhs', the number of rows is 29,145,358.

by datalab (20.7k points)
> If the mydb table in question here is 'catwxvhs' …

It is, so thanks for your foresight and helping out. :)
0 votes

How would I find out the size in (giga)bytes of the table 'catwxvhs'? The first email from 'Data Lab Admin' states that the "Personal MyDB database storage" quota is 250 GB.

> If the mydb table in question here is 'catwxvhs', the number of rows is 29,145,358

I want to crossmatch this myDB table (4 columns) with VHS DR5 (1,374,207,485 rows), and I have no sense of whether 29 million rows is a ridiculously high number that is guaranteed to time out in a crossmatch with a table as large as VHS DR5.

Should I maybe split up the myDB table based on ra values (e.g. 0-90, 90-180, …), or do you have additional tips on how to successfully crossmatch with q3c_radial_query()?

by martinkb (720 points)
Sorry, somehow I only saw your comment just now.

First, the 250 GB quota is soft, i.e. I wouldn't worry about it (for now ;-))

Second, the best way to save space is to keep only the columns you really want in the crossmatch query (i.e. not all the columns).

Finally, if you can apply constraints (with WHERE) during or before the crossmatch query, that can reduce the number of rows, potentially significantly. For instance, maybe you are interested only in rows with enough SNR, or only in objects that are redder than some value?
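
As a rough sketch, using q3c_join as the crossmatch function (the VHS table and column names here, vhs_dr5.vhs_cat_v3 / ra2000 / dec2000, as well as the 1-arcsec radius and the magnitude cut, are assumptions; adjust them to the actual schema and your science case):

from dl import queryClient as qc

# Keep only the needed columns and filter inside the crossmatch itself.
# Table/column names and cuts below are placeholders, not the real schema.
query = """
SELECT m.ra, m.dec, v.japermag3
FROM mydb://catwxvhs AS m, vhs_dr5.vhs_cat_v3 AS v
WHERE q3c_join(m.ra, m.dec, v.ra2000, v.dec2000, 1.0/3600.0)
  AND v.japermag3 < 20.0
"""
jobid = qc.query(sql=query, async_=True)   # a join this size should run async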

As to the question "how big will my table be?": if you're counting columns, you could assign 4 bytes per value for each small int and each float, and 8 bytes for doubles and big ints. Varchars are hard to estimate. I think a good but quick upper bound is Ncols x 8 bytes x Nrows. (Note that if you were to write out a table to e.g. a CSV file, the resulting file size is unknown, since much depends on details such as encoding, presence of char-valued fields, etc.)
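
As a quick worked example (example values; the row count is the one mentioned above):

# Upper-bound estimate: Ncols x 8 bytes x Nrows.
ncols, nrows = 5, 29_145_358           # example: 5 columns, 29,145,358 rows
print(ncols * 8 * nrows / 1e9, "GB")   # -> ~1.17 GB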

Best,
Robert
> the best way to save space is to keep only the columns you really want …

Thanks for the tip. I intend to save only one column from VHS. That makes five columns total with the mydb table.

The four 'catwxvhs' columns are all double_precision and VHS japermag3 is listed as 'REAL'. So Ncols x 8 bytes x Nrows seems to fit very well. In the worst case of all 29,145,358 rows of 'catwxvhs' ending up in the crossmatch (very unlikely), this would mean 5 columns * 8 bytes * 29,145,358 rows = 1,165,814,320 bytes (~1.17 GB), right?

> …if you can apply constraints (with WHERE) during or before the crossmatch query, that can reduce the number of rows, potentially significantly.

I do indeed have multiple constraints, which I've concatenated with AND (WHERE … AND … AND …). Did I understand correctly that you're suggesting moving the q3c_radial_query() call to the end of the line?
