
Updated stars.dat including Gaia EDR3 data (update 2022-03-03)

Posted: 12.01.2020, 13:45
by ajtribick
Here are the Python scripts for generating a stars.dat from the Gaia EDR3 data, using the subset of stars cross-matched to Hipparcos (primarily HIP2, but some attempt is made to cross-match HIP1 stars not present in HIP2) and TYC2 (including TDSC and TYC2 supplement 1). XHIP is used to fill in the data for missing Hipparcos stars. More details are in the README.

Latest release: v1.1.1 (download celestia-gaia-stardb-1.1.1.zip)
Gaia DR2 based release (previous version): v1.0.4 (download celestia-gaia-stardb-1.0.4.zip)

The repository is here. Note that the scripts may take an hour or two to run and require downloading quite a lot of data, so I'd recommend picking up the zip file from the release instead, unless you want to edit the scripts yourself.

This may well conflict with star definitions in other files, e.g. binary stars, extrasolar planet hosts or CHARM2 stellar radii; I have made no attempt to address this.

Posted: 12.01.2020, 17:14
by Janus
Very nice work.
Readable for python, which is great.
Not my favorite language, or one I like, but an improvement over the perl script used to make the original DB.

I would recommend you check your alpha release download name, output.zip is indistinct.
Makes it easy to misplace.
Just like the places that offer setup.exe, with no clue what it sets up in the name, so you have to extract it to know.

I will test the output.zip data once my current debug run for a customer is done.
I think I have finally found the memory corruption they are having, but I have been wrong before.


Janus.

Posted: 12.01.2020, 17:34
by ajtribick
Good point, I renamed the zip file.

Janus wrote:an improvement over the perl script used to make the original DB.
Well, I do like to believe I've gotten a little better at coding over the last decade or so...

Posted: 12.01.2020, 18:20
by Janus
Just to be clear, I meant no disrespect.
Most python reads like poly macro expanded templated C/C++ code to me; this doesn't.
I dislike & distrust layered macros, and am wary of templates, though I admit the latter have their place.

The overuse of shorthand renders so many scripts unreadable to anyone who doesn't code just like the author.
Which you avoided, and is greatly appreciated.
I am not very good with python, but I could follow it.

I am of the firm opinion that source code is for humans, and the rest is the compiler's problem.
If you can't scan the sourcecode like a book looking for things, then something is wrong.

You might laugh, but I tend to use php for stuff like this, and if I need a quick readout, I then send it on to a browser where js is used for display.
I like that php allows you to set a hard timeout in the config file so a hung script is not an issue.
I also find php more readable, I just get along with it.

A quick question.
Does the Gaia data include stellar motion?
If so, how hard is it to calculate XYZ movement from it?
I know this limits the time the movement is valid for, but I can live with a few million years, it will do.

I ask because I am playing with modifications to star::getposition & star::getorbitbarycenter to use the time to compute a position.
The idea is to multiply the per-year motion by the time in years from 2000, then add the result to the stored position to give a time-adjusted position.
The purpose is to set a time in celestia, then use a script to change the time ten years per second, and watch the stars move.
I will include a time ratio settable from script so you can do things like make a year equal to a thousand, to allow looking at stars from a million years ago, or a million years from now.

Tweaking the database and its reader is simple, I have done so before.
I dislike perl, a LOT!, but I can use it.
I made a DB with double star positions, another where I added Ra/Dec/Dist, and some others.
Using the python scripts will be easier.

Though I can eventually work out the math, it is much simpler to ask if anyone else has already done the work.
I know about the astronexus hyg-database, but the Gaia data would be more accurate.


Janus.

Posted: 12.01.2020, 18:34
by Sirius_Alpha
This is absolutely astounding. The structure of the local Galaxy is very visible.

You can see clusters very easily now from afar. The older poor-accuracy parallaxes and photometric distances were so bad that you really couldn't see clusters beyond the Pleiades. But now all kinds of detail and clusters pop out. There's still some smearing due to the parallax uncertainties, but it's nowhere near as bad as before.

Clusters.png
Several galactic clusters circled in red.


Below is a view of the Praesepe Cluster from the Pr0201 system. We always kinda knew these sorts of views were possible from known extrasolar planetary systems but before this it's not like we could approximate them very well.

Pr0201.png
Praesepe Cluster from Pr0201.

Posted: 12.01.2020, 18:53
by SevenSpheres
I've added these files in this branch of Celestia's GitHub repository (it previously added LukeCEL's star database files). Should I make a pull request from that branch?

Posted: 12.01.2020, 19:18
by onetwothree
SevenSpheres wrote:I've added these files in this branch of Celestia's GitHub repository (it previously added LukeCEL's star database files). Should I make a pull request from that branch?

Maybe not yet, until the cross-index issue is resolved.

Posted: 12.01.2020, 19:30
by ajtribick
Janus wrote:A quick question.
Does the Gaia data include stellar motion?
If so, how hard is it to calculate XYZ movement from it?
I know this limits the time the movement is valid for, but I can live with a few million years, it will do.
Gaia DR2 does contain proper motions and radial velocities. This information is not present in the output files (the generation scripts do use the HIP2 proper motions to update the positions of the non-Gaia stars to the Gaia DR2 epoch though).
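
For anyone wanting to turn those catalogue columns into an XYZ motion, here is a minimal sketch of the usual conversion. It is not taken from the generation scripts. It assumes the proper motion in right ascension already includes the cos(Dec) factor (as Gaia's pmra does), uses a naive 1/parallax distance, and works in the equatorial frame rather than whatever frame Celestia uses internally. The function names are hypothetical.

```python
import math

KMS_PER_AU_PER_YEAR = 4.740470     # 1 AU/yr expressed in km/s
PC_PER_YEAR_PER_KMS = 1.022712e-6  # 1 km/s expressed in parsecs per year


def position_and_velocity(ra_deg, dec_deg, parallax_mas,
                          pmra_mas_yr, pmdec_mas_yr, rv_kms):
    """Return ((x, y, z) in pc, (vx, vy, vz) in km/s), equatorial frame.

    pmra_mas_yr is assumed to include the cos(dec) factor.
    """
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    dist_pc = 1000.0 / parallax_mas  # naive inverse parallax (assumption)

    # Unit vectors: towards the star, towards increasing RA, towards increasing Dec
    r_hat = (math.cos(dec) * math.cos(ra),
             math.cos(dec) * math.sin(ra),
             math.sin(dec))
    ra_hat = (-math.sin(ra), math.cos(ra), 0.0)
    dec_hat = (-math.sin(dec) * math.cos(ra),
               -math.sin(dec) * math.sin(ra),
               math.cos(dec))

    # Tangential velocity components in km/s
    v_ra = KMS_PER_AU_PER_YEAR * pmra_mas_yr / parallax_mas
    v_dec = KMS_PER_AU_PER_YEAR * pmdec_mas_yr / parallax_mas

    pos = tuple(dist_pc * c for c in r_hat)
    vel = tuple(rv_kms * r + v_ra * a + v_dec * d
                for r, a, d in zip(r_hat, ra_hat, dec_hat))
    return pos, vel


def propagate(pos_pc, vel_kms, years):
    """Straight-line motion: position after the given number of years."""
    return tuple(p + v * PC_PER_YEAR_PER_KMS * years
                 for p, v in zip(pos_pc, vel_kms))
```

The straight-line assumption ignores the Galactic potential, so it is only reasonable over time spans short compared with a Galactic orbit.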

SevenSpheres wrote:Should I make a pull request from that branch?
You should keep at least the reference to the source data (see my repository's README.md) around if you're going to merge this add-on into the code base, and provide credit in the appropriate Celestia README files.

Posted: 12.01.2020, 19:39
by LukeCEL
Awesome, thank you!

Posted: 16.01.2020, 06:17
by Janus
@ajtribick

Nice set of scripts.
However, small issue.

They require a gaia account to do the queries.
Something you should warn people about.


Janus.

Posted: 16.01.2020, 07:02
by ajtribick
Janus wrote:However, small issue.

They require a gaia account to do the queries.
Something you should warn people about.
Indeed, that's why it's the second bullet point in the Prerequisites section of the README.md that's displayed on the main page of the repository on GitHub.

Posted: 15.02.2020, 12:19
by ajtribick
Fixed an issue with duplicate stars which resulted from not accounting for duplicate entries in the SAO cross match. The v0.1.2-alpha release is linked in the first post of the thread.

Posted: 15.02.2020, 17:58
by Art Blos
ajtribick wrote:I also haven't tested it in a 32-bit build either, so no guarantees this won't cause everything to crash with out-of-memory.
You underestimate 32-bit systems. Your database is not big enough to cause an out-of-memory crash. :smile:

Can you tell me the exact number of stars?

Posted: 15.02.2020, 18:09
by SevenSpheres
Art Blos wrote:Can you tell me the exact number of stars?

I can tell you: there are 2,463,589 stars according to Celestia's console.

Posted: 15.02.2020, 18:15
by Art Blos
SevenSpheres wrote:I can tell you: there are 2,463,589 stars according to Celestia's console.
Cool! LukeCEL's version had 2,448,220, i.e. +15,369 stars. :clap:

Added after 3 minutes 30 seconds:
I hope the accuracy of the coordinates is also better than in his version. At least a little bit. :smile:

Posted: 15.02.2020, 18:19
by ajtribick
Good to know it works on 32-bit! One way of finding out how many stars there are in a stars.dat file is to take the file size in bytes, subtract 14 (to account for the file header, which incidentally also contains the number of stars) then divide by 20.
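
As a quick sketch of that arithmetic in Python (the 14-byte header and 20-byte record sizes are the ones quoted above; the function names are just for illustration):

```python
import os

HEADER_BYTES = 14  # file header size quoted above
RECORD_BYTES = 20  # bytes per star quoted above


def star_count_from_size(size_bytes):
    """Number of stars implied by the size of a stars.dat file."""
    payload = size_bytes - HEADER_BYTES
    if payload < 0 or payload % RECORD_BYTES != 0:
        raise ValueError("size is not a header plus a whole number of records")
    return payload // RECORD_BYTES


def star_count(path):
    """Same calculation, taking the size from a file on disk."""
    return star_count_from_size(os.path.getsize(path))
```

For example, a file of 49,271,794 bytes corresponds to 2,463,589 stars.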

Posted: 15.02.2020, 18:23
by Art Blos
ajtribick wrote:Good to know it works on 32-bit! One way of finding out how many stars there are in a stars.dat file is to take the file size in bytes, subtract 14 (to account for the file header, which incidentally also contains the number of stars) then divide by 20.
This formula works, thanks.

Are you planning to increase the number of stars?

Posted: 16.02.2020, 09:43
by jujuapapa
Great work ajtribick ! :clap:

=> same question as Art Blos...

#
### UPDATE
#


I saw differences between GrantHutchinson's catalog (found on celestiamotherlode :
http://celestiamotherlode.net/catalog/show_addon_details.php?addon_id=1114) and gaia catalog !

First example: HIP 33562
extended: D = 74.973 ly, ST = K0V
Gaia: D = 0.58163 kpc, ST = G5IV


Second example: TYC 7630-1045-1
extended: D = 110.53 ly, ST = K8V
Gaia: D = 1.3803 kpc, ST = K5

Which are correct? :eek:

Has anyone else seen problems?

Posted: 16.02.2020, 14:32
by Art Blos
jujuapapa wrote:I saw differences between GrantHutchinson's catalog
This catalog was created back in 2005, and also had certain problems with duplicates. Leave it in the past.

Posted: 16.02.2020, 20:48
by ajtribick
jujuapapa wrote:I saw differences between GrantHutchinson's catalog (found on celestiamotherlode :
http://celestiamotherlode.net/catalog/show_addon_details.php?addon_id=1114) and gaia catalog !
This is expected. That catalogue (originally by Pascal Hartmann, converted to Celestia 1.4.0 format by Grant Hutchison) is based on the All Sky Compiled Catalogue (ASCC) and uses distances estimated from the B-V colour index and the apparent magnitude. This was required because most of the stars did not have measured parallaxes.

I also use the ASCC to provide magnitudes for Tycho-2 stars, but I'm not using photometric distance estimates: instead I'm taking distances from either the Gaia distance catalogue (preferred), distances based on cluster membership in XHIP (second choice), or computed distances based on parallaxes in XHIP (third choice). Note that in this third case I'm not using 1/parallax as the distance, I'm using the technique in Astraatmadja & Bailer-Jones (2016), which is consistent with the Gaia distance catalogue.
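
To illustrate why this differs from 1/parallax, here is a minimal sketch of a Bayesian distance estimate. It is not the code from the scripts. It assumes an exponentially decreasing space density prior, one of the priors discussed in that line of work, with an illustrative length scale of 1350 pc, and it finds the posterior mode by a plain grid search. The actual prior, length scale and estimator used may differ.

```python
import math


def log_posterior(r_pc, parallax_mas, sigma_mas, length_scale_pc):
    """Unnormalised log posterior for distance r_pc.

    Prior: r^2 * exp(-r / L) (assumed, see lead-in).
    Likelihood: Gaussian in parallax around the true value 1000 / r.
    """
    predicted_mas = 1000.0 / r_pc
    log_prior = 2.0 * math.log(r_pc) - r_pc / length_scale_pc
    log_like = -0.5 * ((parallax_mas - predicted_mas) / sigma_mas) ** 2
    return log_prior + log_like


def mode_distance_pc(parallax_mas, sigma_mas, length_scale_pc=1350.0,
                     r_max_pc=20000.0, step_pc=0.05):
    """Distance (pc) that maximises the posterior on a regular grid."""
    n_steps = int(r_max_pc / step_pc)
    best_r, best_lp = None, -math.inf
    for i in range(1, n_steps + 1):
        r = i * step_pc
        lp = log_posterior(r, parallax_mas, sigma_mas, length_scale_pc)
        if lp > best_lp:
            best_r, best_lp = r, lp
    return best_r
```

With a precise parallax the result is very close to 1/parallax, while for a noisy or even negative parallax the prior keeps the estimate finite and positive, which 1/parallax cannot do.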

Spectral types are different because I'm again using different sources: in the case of HIP 33562, ASCC and HIP1 give a spectral type of K0. I'm using the XHIP spectral type, which is G5IV. For Tycho stars, I use the Tycho-2 spectral type catalogue. For stars which don't have spectral type entries I'm using a spectral type from the estimated temperature: Gaia (first choice), the Tycho effective temperature catalogue (second choice) or an estimate from the colour indices (B-V, V-I, V-K, J-K and H-K; reddening is not taken into account). I don't attempt to estimate luminosity classes.
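
The temperature fallback can be sketched roughly as below. This is an illustration only: the function and parameter names are hypothetical, and the class boundaries are approximate textbook values, not necessarily those used in the scripts.

```python
# Approximate lower temperature bounds (K) for each spectral class.
# Textbook values for illustration, not taken from the scripts.
CLASS_LOWER_BOUNDS_K = [
    (30000.0, "O"),
    (10000.0, "B"),
    (7500.0, "A"),
    (6000.0, "F"),
    (5200.0, "G"),
    (3700.0, "K"),
]


def first_available(*values):
    """Return the first value that is not None, or None if all are missing."""
    for value in values:
        if value is not None:
            return value
    return None


def spectral_class_from_teff(teff_k):
    """Map an effective temperature to a class letter (no luminosity class)."""
    for lower_bound, letter in CLASS_LOWER_BOUNDS_K:
        if teff_k >= lower_bound:
            return letter
    return "M"


def estimate_class(gaia_teff=None, tycho_teff=None, colour_teff=None):
    """Apply the preference order: Gaia, then Tycho, then colour-based."""
    teff = first_available(gaia_teff, tycho_teff, colour_teff)
    return None if teff is None else spectral_class_from_teff(teff)
```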

Art Blos wrote:Are you planning to increase the number of stars?
No. Celestia's current approach of loading the entire database and building the octree at startup doesn't scale to handling the Gaia catalogue. It would also require changing Celestia's identifier handling: currently the HIP/encoded TYC identifier (which incidentally originated with the Pascal Hartmann catalogue) is treated as fundamental, but that's not going to work with the larger catalogue as the additional stars will not be HIP/TYC stars. Gaia identifiers are 64-bit integers so you can't stuff them into the 32-bit field; in any case you don't want to use those as the fundamental identifier because you need to include non-Gaia stars (notably the stars that are too bright for Gaia to measure).
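
A small sketch of the identifier size problem. The TYC encoding shown here (TYC1 + 10000 × TYC2 + 1000000000 × TYC3) is not stated in this thread, so treat it as an assumption about how the encoded identifier is formed. The 64-bit value in the usage note is an arbitrary placeholder, not a real Gaia identifier.

```python
UINT32_MAX = 2**32 - 1  # largest value a 32-bit unsigned field can hold


def encode_tyc(tyc1, tyc2, tyc3):
    """Pack a TYC1-TYC2-TYC3 designation into one integer (assumed scheme)."""
    return tyc1 + tyc2 * 10_000 + tyc3 * 1_000_000_000


def fits_in_uint32(value):
    """True if the value can be stored in a 32-bit unsigned field."""
    return 0 <= value <= UINT32_MAX
```

Under that assumed scheme, `encode_tyc(7630, 1045, 1)` gives 1010457630, which fits in the 32-bit field, whereas a placeholder 64-bit value such as `2**40` does not.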