Anomalous geometric clustering of high-pmraerr artifacts in NSC DR2

Question

Hello Astro Data Lab community,

I have been analyzing the NSC DR2 catalog via ADQL, filtering for deep-space candidates and tracking pipeline calculation errors. My goal was to isolate cross-epoch tracklet linking failures by targeting extreme kinematic flags.

The query (given in full below) selects objects with pmraerr > 90.0 or pmdecerr > 90.0, gmag > 90.0, or an extreme total proper motion (pmra*pmra + pmdec*pmdec > 2500.0), and additionally requires a time baseline of deltamjd > 30.0 and ndet >= 3.
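
As a rough illustration of that selection, the sketch below tags a catalog row with whichever of the cuts it satisfies, so that rows selected by a proper-motion error can be told apart from rows selected by the other terms. It assumes each row is a plain dict keyed by the NSC DR2 column names used in the query further down; the function names and the example values are hypothetical, and the thresholds are copied from that query.

```python
def selection_reasons(row):
    """Return the OR-terms of the posted WHERE clause that this row satisfies.

    `row` is assumed to be a dict keyed by NSC DR2 column names
    (hypothetical helper, not part of any Data Lab library).
    """
    reasons = []
    if row["pmraerr"] > 90.0:
        reasons.append("pmraerr")
    if row["pmdecerr"] > 90.0:
        reasons.append("pmdecerr")
    if row["gmag"] > 90.0:
        reasons.append("gmag")
    if row["pmra"] ** 2 + row["pmdec"] ** 2 > 2500.0:
        reasons.append("total_pm")
    return reasons


def passes_cuts(row):
    """Full filter: at least one OR-term, plus deltamjd > 30 and ndet >= 3."""
    return bool(selection_reasons(row)) and row["deltamjd"] > 30.0 and row["ndet"] >= 3


# Made-up example row, for illustration only.
example = {
    "pmra": 3.0, "pmdec": -4.0, "pmraerr": 120.0, "pmdecerr": 5.0,
    "gmag": 21.3, "deltamjd": 45.0, "ndet": 4,
}
print(selection_reasons(example), passes_cuts(example))
```

Counting how many rows fall under each reason would show which term of the filter dominates the extracted sample.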

When I extracted this dataset and projected the coordinates into 3D in Python, I noticed a very unusual statistical distribution. Instead of the pipeline noise and artifacts being scattered randomly across the surveyed sky, the rejected candidates (over 270,000 objects) cluster very tightly, within a 3.0° window, along highly specific, symmetrical geometric axes that form a hexahedral lattice pattern. A Monte Carlo test adjusting for ecliptic survey bias suggests this clustering is highly non-random.
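
For readers who want to reproduce the projection step without the Zenodo scripts, here is a minimal sketch of one common way to do it: converting RA/Dec to unit vectors on the celestial sphere and measuring angular separations between them. It is not the posted processing script; the function names are hypothetical and only the Python standard library is used.

```python
import math


def radec_to_unit_vector(ra_deg, dec_deg):
    """Map equatorial coordinates in degrees to a 3D unit vector (x, y, z)."""
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    return (
        math.cos(dec) * math.cos(ra),
        math.cos(dec) * math.sin(ra),
        math.sin(dec),
    )


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Angle in degrees between two sky positions.

    Uses atan2(|a x b|, a . b), which stays accurate for very small
    and very large separations.
    """
    x1, y1, z1 = radec_to_unit_vector(ra1, dec1)
    x2, y2, z2 = radec_to_unit_vector(ra2, dec2)
    dot = x1 * x2 + y1 * y2 + z1 * z2
    cx = y1 * z2 - z1 * y2
    cy = z1 * x2 - x1 * z2
    cz = x1 * y2 - y1 * x2
    cross_norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    return math.degrees(math.atan2(cross_norm, dot))


# Separation between two of the equatorial box centers in the query.
print(angular_separation_deg(45.0, 0.0, 135.0, 0.0))
```

The same separation function can be used to measure how far each object lies from the nearest box center when quantifying the clustering.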

I have uploaded the specific ADQL query, the Python processing scripts, and the resulting CSV dataset to Zenodo for reproducibility:

https://zenodo.org/records/20178610

My question concerns the data processing pipeline: what algorithmic or instrumental mechanisms could cause pipeline noise and tracklet failures to align in such a highly structured, non-uniform spatial pattern? Could this geometric clustering be a known artifact of the astrometric reduction pipeline, a systematic bias in how the catalog handles extreme cross-epoch deviations, or an anomaly tied to the underlying survey footprint/tessellation?

Any insights into the pipeline architecture that could explain this specific distribution of errors would be greatly appreciated!

SELECT id, ra, dec, pmra, pmdec, pmraerr, pmdecerr,
       gmag, rmag, imag, ndet, mjd, deltamjd, class_star, flags
FROM nsc_dr2.object
WHERE (
    (ra BETWEEN 42.0 AND 48.0 AND dec BETWEEN -3.0 AND 3.0) OR
    (ra BETWEEN 132.0 AND 138.0 AND dec BETWEEN -3.0 AND 3.0) OR
    (ra BETWEEN 222.0 AND 228.0 AND dec BETWEEN -3.0 AND 3.0) OR
    (ra BETWEEN 312.0 AND 318.0 AND dec BETWEEN -3.0 AND 3.0) OR
    (ra BETWEEN 357.0 AND 360.0 AND dec BETWEEN 42.0 AND 48.0) OR
    (ra BETWEEN 0.0 AND 3.0 AND dec BETWEEN 42.0 AND 48.0) OR
    (ra BETWEEN 87.0 AND 93.0 AND dec BETWEEN 42.0 AND 48.0) OR
    (ra BETWEEN 177.0 AND 183.0 AND dec BETWEEN 42.0 AND 48.0) OR
    (ra BETWEEN 267.0 AND 273.0 AND dec BETWEEN 42.0 AND 48.0) OR
    (ra BETWEEN 357.0 AND 360.0 AND dec BETWEEN -48.0 AND -42.0) OR
    (ra BETWEEN 0.0 AND 3.0 AND dec BETWEEN -48.0 AND -42.0) OR
    (ra BETWEEN 87.0 AND 93.0 AND dec BETWEEN -48.0 AND -42.0) OR
    (ra BETWEEN 177.0 AND 183.0 AND dec BETWEEN -48.0 AND -42.0) OR
    (ra BETWEEN 267.0 AND 273.0 AND dec BETWEEN -48.0 AND -42.0)
)
AND (pmraerr > 90.0 OR pmdecerr > 90.0 OR gmag > 90.0 OR (pmra*pmra + pmdec*pmdec) > 2500.0)
AND deltamjd > 30.0
AND ndet >= 3
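
The positional part of the query above is a set of 6° x 6° boxes around twelve centers, with the boxes at RA 0° split in two at the 0°/360° wrap. The sketch below regenerates those clauses from the list of centers and also computes each box's solid angle, which is useful when comparing counts between the equatorial boxes and the ones at dec = ±45°. The names `CENTERS`, `box_clauses`, and `box_area_sq_deg` are hypothetical; the centers and the 3° half-width are read off the query.

```python
import math

# Box centers (RA, Dec) in degrees, read off the query; each box spans +/- 3 deg.
CENTERS = [(45.0, 0.0), (135.0, 0.0), (225.0, 0.0), (315.0, 0.0)] + [
    (ra, dec) for dec in (45.0, -45.0) for ra in (0.0, 90.0, 180.0, 270.0)
]
HALF_WIDTH = 3.0


def box_clauses(ra0, dec0, half=HALF_WIDTH):
    """Return the ADQL clause(s) for one box, splitting it at the RA = 0/360 wrap.

    Only the lower-edge wrap (ra0 - half < 0) is handled, which is the
    only case that occurs for the centers listed above.
    """
    lo, hi = ra0 - half, ra0 + half
    dec = f"dec BETWEEN {dec0 - half:.1f} AND {dec0 + half:.1f}"
    if lo < 0.0:
        return [
            f"(ra BETWEEN {lo + 360.0:.1f} AND 360.0 AND {dec})",
            f"(ra BETWEEN 0.0 AND {hi:.1f} AND {dec})",
        ]
    return [f"(ra BETWEEN {lo:.1f} AND {hi:.1f} AND {dec})"]


def box_area_sq_deg(dec0, half=HALF_WIDTH):
    """Solid angle of one box in square degrees: dRA * (sin(dec_hi) - sin(dec_lo))."""
    d_ra = math.radians(2.0 * half)
    band = math.sin(math.radians(dec0 + half)) - math.sin(math.radians(dec0 - half))
    return d_ra * band * (180.0 / math.pi) ** 2


clauses = [c for ra0, dec0 in CENTERS for c in box_clauses(ra0, dec0)]
where_boxes = " OR\n".join(clauses)
total_area = sum(box_area_sq_deg(dec0) for _, dec0 in CENTERS)
print(where_boxes)
print(f"total area: {total_area:.1f} sq deg")
```

Because a box at dec = ±45° covers less sky than an equatorial box of the same RA/Dec extent, raw object counts per box are better compared as surface densities.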

Thanks, Victor

asked May 19 by lomakez (240 points)

3 Answers

Answer 1 · 2026-05-19T18:27:28+0000

Additional context regarding the scale of the artifacts and 504 Timeout errors:

Just to add some crucial context to my original post: the 270,000 objects I mentioned were isolated using strict, localized RA/Dec bounding boxes.

When I attempt to expand the search parameters or run a global query for these extreme kinematic errors (pmraerr > 90.0, etc.) across the broader catalog, the volume of these pipeline-rejected candidates appears to be in the millions.

In fact, the result set is so massive that querying it synchronously consistently triggers a 504 Gateway Timeout error, as the server cannot process and return that many anomalous rows within the session limit.

Is there a recommended asynchronous method (perhaps via MyDB or TAP jobs) to extract this global dataset of millions of high-pmraerr artifacts without timing out the server? This massive scale makes understanding the structural distribution of these pipeline failures even more critical.
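
One possible workaround, sketched here under stated assumptions rather than as a confirmed recommendation, is to split the global selection into RA slices and submit each slice as its own asynchronous job. The slicing helpers below are hypothetical and use only the standard library. The `submit_async` function assumes the `astro-datalab` Python client is installed and authenticated, and that its `queryClient.query` accepts `sql=` and `async_=True` and returns a job ID; that should be checked against the Data Lab documentation before use.

```python
def ra_slices(width_deg=10.0):
    """Return (ra_min, ra_max) pairs that tile the range 0..360 degrees."""
    n = int(round(360.0 / width_deg))
    return [(i * width_deg, (i + 1) * width_deg) for i in range(n)]


def slice_query(ra_min, ra_max):
    """Build the kinematic-error selection restricted to one RA slice."""
    return (
        "SELECT id, ra, dec, pmra, pmdec, pmraerr, pmdecerr, gmag, ndet, deltamjd "
        "FROM nsc_dr2.object "
        f"WHERE ra >= {ra_min:.1f} AND ra < {ra_max:.1f} "
        "AND (pmraerr > 90.0 OR pmdecerr > 90.0 OR gmag > 90.0 "
        "OR (pmra*pmra + pmdec*pmdec) > 2500.0) "
        "AND deltamjd > 30.0 AND ndet >= 3"
    )


def submit_async(sql):
    """Submit one slice as an asynchronous job and return its job ID.

    Assumption: the astro-datalab client is installed and logged in, and
    queryClient.query supports async_=True. Imported lazily so the rest
    of this sketch runs without the client.
    """
    from dl import queryClient as qc
    return qc.query(sql=sql, async_=True)


queries = [slice_query(lo, hi) for lo, hi in ra_slices(10.0)]
print(len(queries), "slice queries prepared")
```

Narrower slices keep each result set small; the slice width of 10° is an arbitrary starting point, not a tuned value.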

Answer 2 · 2026-05-19T18:30:12+0000

Important clarification regarding the predictive nature of the grid:

I would like to emphasize a crucial methodological point: these specific coordinate zones were not found by blindly scanning the entire catalog to fit a pattern. They were mathematically derived a priori using noncommutative geometry (specifically, through the minimization of the spectral action on a helical manifold).

The targeted ADQL query was locked onto these exact theoretical vectors. Crucially, when running control queries in other areas of the sky outside of these mathematically predicted zones, this tight geometric clustering of high-error artifacts completely disappears. The phenomenon is strictly confined to these specific resonant boundaries.

Answer 3 · 2026-05-19T18:32:07+0000

Conceptual framework: The Ovoid (Egg-shaped) Topological Boundary

To visualize the physical structure behind these mathematical predictions, it helps to look at the macroscopic vacuum architecture as a system of nested membranes—conceptually similar to the structure of an egg.

The inner resonance sphere (the "yolk") and the outer bounding surface (the "shell") are defined by the rigid 3:1 surface area ratio dictated by the hexahedral geometry. However, when we map this rigid mathematical lattice onto the actual physical space, it interacts with the galactic potential and the Solar System's velocity through the interstellar medium.

This interaction deforms the perfect theoretical spheres into an asymmetric, ovoid (egg-like) boundary. The anomalies and pipeline errors I have isolated are not just random noise; they are tracking the structural stress points—the exact topological "fractures"—along the inner and outer membranes of this egg-shaped heliospheric system.
