Firstly, q3c_join() is the correct function to use in this case, however a 0.25deg radius seems rather large. You say you want objects within that radius but then use a "LIMIT 1" constraint to make is just a nearest-neighbor search. Remove the LIMIT statement if you want all objects, otherwise a much smaller search radius should suffice and run faster.
Even though your mydb table is only ~500 objects, the inner query with the magnitude constraints is executed for each of those objects. A more efficient method would be to create a second mydb table with just the objects satisfying the magnitude constraint (reducing the total size to ~31M rows) and use that in the query. For the q3c_join() to work efficiently, however, the table needs to have a Q3C index which you can create with the mydb_index() method (see below). Your script would then look something like:
from dl import queryClient as qc
# Create the magnitude-constrained tractor table and indexes
stat = qc.query ('select * from ls_dr9.tractor where dered_mag_r between 15.0 an
d 18.0', out='mydb://ls_dered_mag_r')
qc.mydb_index(token,'mydb://ls_dered_mag_r','ra')
qc.mydb_index(token,'mydb://ls_dered_mag_r','dec')
qc.mydb_index(token,'mydb://ls_dered_mag_r','', q3c='ra,dec', cluster=True)
sql = """
SELECT lsls.*, q3c_dist(ng.ra,ng.dec,lsls.ra,lsls.dec) as dist_deg
FROM mydb://nearbygalaxyz009 AS ng
LEFT JOIN LATERAL (
SELECT ls.*
FROM mydb://ls_dered_mag_r as ls
WHERE
q3c_join(ng.ra, ng.dec, ls.ra, ls.dec, 0.25)
ORDER BY
q3c_dist(ng.ra,ng.dec,ls.ra,ls.dec)
) as lsls ON true;
"""
# Execute the query.
res = qc.query(sql=sql)
Creating the mydb://ls_dered_mag_r table will still take some time and may need to be run async separately, but the final query should execute in a few seconds and produce ~4200 rows.
Hope this helps, let us know if you continue to have problems.