
SIMBATCH or how to exploit Simbad in batch...

Posted: 23.08.2008, 18:27
by t00fri
Well...

perhaps not everyone interested in astronomical data is aware of the following:

The powerful possibility to exploit the SIMBAD astronomical world data base
http://simbad.u-strasbg.fr/simbad/ in batch operation for large amounts of data, using simple scripting commands.

I have been doing this for a long time and thought I would point this possibility out in public for once.

Note, the same batch facility also exists for the extragalactic database NED and has been used extensively by me in my deepsky.dsc file.

This batch option can be enormously useful for many applications.

+++++++++++++++++++++++++
Most of you will know that -- besides accuracy -- my main concern is standardization of Celestia's data base. This starts e.g. with the request for exhaustive cross listings of identifiers for celestial objects, with a syntax that e.g. is understood in SIMBAD queries!
++++++++++++++++++++++++++

So here is a specific example of how I derived complete cross listings for the identifiers in my forthcoming globular clusters implementation -- consistent with SIMBAD syntax!

  • One additional command in my globulars.pl Perl script suffices to print out just the main names of my 140 globulars into a file (see the example below).
  • In order to submit this file to SIMBATCH, you merely need to add a format statement on top. It specifies what you want to have displayed and HOW the format should be.

    Here is a simple example of such a format statement along with the first few globular names:

    Code:

    format object form1 "NumberIds: %#IDLIST\nIdentifiers:%IDLIST[%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S):%*(S)]\nRaDec(decimal): %COO(d;A,D;FK5;J2000)\nSpectralType: %SP(S)\nFluxlist(U,B,V,R,I): %FLUXLIST(U,B,V,R,I)[%6.2*(F), ]\n\n"

    NGC 104
    NGC 288
    NGC 362
    NGC 1261
    Pal 1
    AM 1
    ERI Star Cluster
    Pal 2
    NGC 1851
    NGC 1904
    NGC 2298
    NGC 2419
    NGC 2808
    Name E 3
    Pal 3

    ... 140 entries altogether ...


    The syntax is explained with lots of examples in the corresponding HELP:
    http://simbad.u-strasbg.fr/simbad/sim-h ... im-fscript

    This input file can be VERY conveniently submitted to SIMBAD via a browser form:
    http://simbad.u-strasbg.fr/simbad/sim-fscript
    On that page, select the input file on your machine and click "submit file". A few seconds later, the result of my formatted output request is ready. Note that I requested as output the complete listings of
    • alternative identifiers in SIMBAD, along with their number
    • Ra Dec coordinates as a cross-check in decimal degrees
    • Spectral Type
    • All available color info in the (U,B,V,R,I) "fluxlist"

    The result looks like this:

    ::console:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

    C.D.S. - SIMBAD4 rel 1.094b - 2008.08.23CEST20:06:40
    total execution time: 14.123 secs
    simbatch done

    ::data::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

    Code:

    NumberIds: 19
    Identifiers:FAUST 10:CD-72 20:CPD-72 35:HD 2051:2E 0021.8-7221:1RXS J002404.6-720456:1ES 0021-72.3:1E 0021.8-7221:Cl Melotte 1:2MASX J00240535-7204531:[FS2003] 0013:[BM83] X0021-723:RBS 55:NGC 104:GCl 1:C 0021-723:2E 82:* 47 Tuc:* ksi Tuc:
    RaDec(decimal): 006.02362,-72.08128
    SpectralType: G4
    Fluxlist(U,B,V,R,I):      ~,   5.78,   4.91,      ~,      ~,

    NumberIds: 4
    Identifiers:Cl Melotte 3:NGC 288:GCl 2:C 0050-268::::::::::::::::
    RaDec(decimal): 013.18875,-26.57861
    SpectralType: A
    Fluxlist(U,B,V,R,I):      ~,  10   ,   9.37,      ~,      ~,

    NumberIds: 5
    Identifiers:Cl Melotte 4:NGC 362:LI-SMC 158:GCl 3:C 0100-711:::::::::::::::
    RaDec(decimal): 015.80946,-70.84822
    SpectralType: F9
    Fluxlist(U,B,V,R,I):      ~,   7.97,   7.21,      ~,      ~,

    NumberIds: 3
    Identifiers:NGC 1261:GCl 5:C 0310-554:::::::::::::::::
    RaDec(decimal): 048.06396,-55.21681
    SpectralType: F7
    Fluxlist(U,B,V,R,I):      ~,   9.79,   9.10,      ~,      ~,

    NumberIds: 4
    Identifiers:Cl Pal 1:ZW VII 7:LEDA 13165:C 0325+794::::::::::::::::
    RaDec(decimal): 053.33042,+79.58194
    SpectralType: ~
    Fluxlist(U,B,V,R,I):      ~,  17.2 ,      ~,      ~,      ~,

    ... 140 entries output ...

  • So, with this exhaustive cross ID listing, it was just a trivial exercise in Perl to implement the cross identifiers known to SIMBAD into my script that generates the globular.dsc data set for my implementation of Globulars.
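
For readers who want to try this themselves, here is a rough sketch of how the SIMBATCH output shown above could be turned into identifier lists. It is written in Python rather than Perl, it is not the script used for globular.dsc, and the function name and dictionary keys are made up for illustration. It assumes the output was produced with exactly the format statement above, that records are separated by blank lines, and that no identifier contains a colon.

```python
import re


def parse_simbatch_output(text):
    """Split the ::data:: part of a SIMBATCH reply into one dict per object.

    Assumes the layout produced by the format statement in this post:
    'Key: value' lines, one blank line between objects.
    """
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.strip().partition(":")
            if sep:
                fields[key] = value.strip()
        if "Identifiers" not in fields:
            continue  # not an object record (e.g. a stray header line)

        # The format pads the list with empty slots, so drop empty pieces.
        ids = [p.strip() for p in fields["Identifiers"].split(":") if p.strip()]

        ra_text, _, dec_text = fields.get("RaDec(decimal)", "").partition(",")
        try:
            ra, dec = float(ra_text), float(dec_text)
        except ValueError:
            ra = dec = None  # coordinates missing or not numeric

        records.append({
            "n_ids": int(fields.get("NumberIds", len(ids))),
            "ids": ids,
            "ra_deg": ra,
            "dec_deg": dec,
            "sptype": fields.get("SpectralType", "~"),
        })
    return records
```

Comparing "n_ids" with the length of "ids" gives a cheap check on whether the 20 slots in the format statement were enough for a given object.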


But before I release the globular code/data, we should discuss a bit:
++++++++++++++++++++++++++
In some cases, the number of alternative names@SIMBAD is quite HIGH. E.g. 19 for the famous 47 Tuc globular! Usually, it's rather 4-5 entries.

So which ones do we want to include, which ones to skip?
+++++++++++++++++++++++++++

Possibly a clever general strategy would be to make Celestia recognize PRECISELY the names SIMBAD recognizes, but to display only a selection of the most popular alternative names on the top of the OGL canvas!

Let me know what you think....

Fridger

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 23.08.2008, 19:37
by ajtribick
Thanks for the information. This might be very useful for generating the cross-index files.

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 23.08.2008, 19:47
by t00fri
ajtribick wrote:Thanks for the information. This might be very useful for generating the cross-index files.

I thought you might like this ;-)

Fridger

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 23.08.2008, 21:17
by Fightspit
If I understand correctly, you allow Celestia to recognise and use SIMBAD databases (star catalogs, for example?) directly, without manually making ssc or stc files (like deepsky.dsc). Is that correct?

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 23.08.2008, 21:34
by t00fri
Fightspit wrote:If I understand correctly, you allow Celestia to recognise and use SIMBAD databases (star catalogs, for example?) directly, without manually making ssc or stc files (like deepsky.dsc). Is that correct?

No, there is a data file called globular.dsc in Celestia that is completely analogous in concept to deepsky.dsc.

But in this new application, the issue is about the syntax used for the alternative names of a given globular. Here is an example for the globular Pal 4 in the data file globular.dsc. Watch in particular the first line:

Code:

Globular "Pal 4:UGCA 237:GCl 17:C 1126+292"
{
        RA                11.4878  # [hours]
        Dec               28.9736  # [degrees]
        Distance        3.562e+05  # [ly]
        Radius              67.34  # [ly], mu25 Isophote
        CoreRadius           0.55  # [arcmin]
        KingConcentration     0.78  # c = log10(r_t/r_c)
        AbsMag              -6.02  # [V mags]
        Axis          [  0.6096  -0.7505  -0.2554]
        Angle               112.2  # [degrees]
        InfoURL  "http://simbad.u-strasbg.fr/sim-id.pl?Ident=Pal 4"
}


The list of alternative identifiers to Pal 4 was obtained via SIMBATCH. Thus any such identifier you then can use BOTH in Celestia and for querying SIMBAD. Precisely the same writing with spaces, slashes and all those gory details...
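
As an illustration only, the first line of such an entry could be assembled from an identifier list along these lines. This is a Python sketch, not the Perl code behind globular.dsc; the function name is hypothetical, and the only thing taken from the example above is that the names are joined with colons inside double quotes.

```python
def globular_header(main_name, simbad_ids):
    """Build the 'Globular "a:b:c"' line, main name first, no duplicates."""
    names = [main_name]
    for ident in simbad_ids:
        if ident not in names:
            names.append(ident)
    for name in names:
        # A colon or quote inside a name would corrupt the name string.
        if ":" in name or '"' in name:
            raise ValueError("unsupported character in name: " + repr(name))
    return 'Globular "' + ":".join(names) + '"'


print(globular_header("Pal 4", ["UGCA 237", "Pal 4", "GCl 17", "C 1126+292"]))
```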

Fridger

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 08:30
by Guckytos
t00fri wrote:But before I release the globular code/data, we should discuss a bit:
++++++++++++++++++++++++++
In some cases, the number of alternative names@SIMBAD is quite HIGH. E.g. 19 for the famous 47 Tuc globular! Usually, it's rather 4-5 entries.

So which ones do we want to include, which ones to skip?
+++++++++++++++++++++++++++

Possibly a clever general strategy would be to make Celestia recognize PRECISELY the names SIMBAD recognizes, but to display only a selection of the most popular alternative names on the top of the OGL canvas!

Let me know what you think....

Fridger

I think that this sounds like a reasonable idea. Because too many names just clog up the screen. So I think it would be a good idea to let only a maximum of 5 entries be displayed, but Celestia recognising ALL names.

Which names are the most popular ones could on the other hand lead to quite some discussions, perhaps.

So I have 2 ideas here: First, perhaps add an identifier in the Celestia code and the data file, to tell Celestia which of the names is a popular one to be displayed (and of course limit it in the code to 5; so that if someone marks all names [more than 5] as popular, he only gets 5 displayed).
Second, to display a name that has been searched for, and that is not a popular one as the FIRST entry! So that the person knows, that what he is looking at, really is what he searched for.
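
To make the two ideas concrete, here is a small Python sketch of the selection logic. It is not Celestia code; the function name and the "popular" set (standing in for whatever marker would go into the data file) are assumptions for illustration.

```python
def names_to_display(all_names, popular, searched=None, limit=5):
    """Pick the names to show on screen.

    all_names: every identifier the object is known by, in file order
    popular:   the subset marked as popular in the (hypothetical) data file
    searched:  the name the user typed, if any
    """
    shown = []
    # Idea 2: an unpopular name that was searched for goes FIRST.
    if searched is not None and searched in all_names and searched not in popular:
        shown.append(searched)
    # Idea 1: then the popular names, in file order.
    for name in all_names:
        if name in popular and name not in shown:
            shown.append(name)
    # Hard limit, even if more than `limit` names were marked popular.
    return shown[:limit]
```

With the 47 Tuc list and "47 Tuc" plus "NGC 104" marked popular, searching for "FAUST 10" would give FAUST 10 first, followed by NGC 104 and 47 Tuc.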

Just my 2 cents.

Regards,

Guckytos

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 10:40
by Fightspit
Thanks Fridger for clarifying that for me :)

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 16:58
by ajtribick
Ok, thanks to this I now have a file containing all SIMBAD IDs for all Hipparcos stars in SIMBAD. (This has to be stitched together from six SIMBATCH queries, as the maximum number of returned results seems to be just over 20000)

From this it is easy to generate the HD and SAO cross-index files using a Perl script. (I am restricting entries into the HD cross-index to those stars without a component letter, or whose component letter is either A or J, the latter of which seems to be used for HD entries that refer to multiple components in the same star system)

Other cross-indices can also be easily generated from the same file.
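
A rough Python sketch of the two steps described above, for anyone who wants to reproduce them. The actual script is in Perl and is not shown here, so all names below are hypothetical. The batch size of 20000 is taken from the observed limit ("just over 20000"), and the assumption that a component letter directly follows the HD number (e.g. "HD 48915A") should be checked against real SIMBAD output.

```python
import re


def build_batches(format_line, names, max_per_query=20000):
    """Split a long name list into several SIMBATCH input files (as strings).

    Each batch repeats the format statement on top, followed by at most
    max_per_query object names, one per line.
    """
    batches = []
    for start in range(0, len(names), max_per_query):
        chunk = names[start:start + max_per_query]
        batches.append(format_line + "\n\n" + "\n".join(chunk) + "\n")
    return batches


# HD number, optionally followed by a single component letter (assumed syntax).
HD_RE = re.compile(r"^HD\s+(\d+)([A-Z]?)$")


def hd_numbers(ids, allowed_suffixes=("", "A", "J")):
    """Return HD numbers with no component letter, or with letter A or J."""
    result = []
    for ident in ids:
        match = HD_RE.match(ident)
        if match and match.group(2) in allowed_suffixes:
            result.append(int(match.group(1)))
    return result
```

Each string returned by build_batches can be saved to a file and submitted separately; the outputs are then concatenated before the cross-index is generated.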

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 18:01
by t00fri
ajtribick wrote:Ok, thanks to this I now have a file containing all SIMBAD IDs for all Hipparcos stars in SIMBAD. (This has to be stitched together from six SIMBATCH queries, as the maximum number of returned results seems to be just over 20000)

From this it is easy to generate the HD and SAO cross-index files using a Perl script. (I am restricting entries into the HD cross-index to those stars without a component letter, or whose component letter is either A or J, the latter of which seems to be used for HD entries that refer to multiple components in the same star system)

Other cross-indices can also be easily generated from the same file.

I have also experimented further with various cross indices for DSOs. I think this is the way to go in the future, notably also for galaxies, binaries etc. But before that, we need to agree on the logistics about how many and which entries we want to retain, in case their number becomes so large that the alternative names tend to clutter the OGL canvas.

We need to think about a strategy that is ROBUST concerning people's personal judgements of which acronyms might be more "popular" than others...

My previous suggestion still stands that Celestia's command line should be made to recognize ALL entries, but there should be a sensible cut-off to something like 5 entries maximally in the display.

Fridger

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 18:48
by ElChristou
t00fri wrote:...We need to think about a strategy that is ROBUST concerning people's personal judgements of which acronyms might be more "popular" than others...

What would be your personal vision for such a problem? We cannot decide for each body what should or should not be displayed!

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 19:08
by chris
t00fri wrote:My previous suggestion still stands that Celestia's command line should be made to recognize ALL entries, but there should be a sensible cut-off to something like 5 entries maximally in the display.

This seems reasonable to me, and it's a very easy task to limit the number of names shown for DSOs, stars, and solar system objects. The dsc file creation scripts should order the names so that the most familiar designations (something for the creators of the catalogs to agree upon) appear first.

--Chris

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 19:30
by t00fri
chris wrote:
t00fri wrote:My previous suggestion still stands that Celestia's command line should be made to recognize ALL entries, but there should be a sensible cut-off to something like 5 entries maximally in the display.

This seems reasonable to me, and it's a very easy task to limit the number of names shown for DSOs, stars, and solar system objects. The dsc file creation scripts should order the names so that the most familiar designations (something for the creators of the catalogs to agree upon) appear first.

--Chris

What is the most "familiar" alternative name ordering, e.g. for globular clusters? Note we are not talking about star catalogs here, where the issue of catalog familiarity is rather self-evident...

I would NOT ask this here if it were not a highly subjective and not at all trivial issue. Small and dim globulars, which constitute a significant portion of the complete set of 150, appear almost EXCLUSIVELY in all sorts of "unfamiliar" catalogs. On the other hand, big catalog entries like ESO xxxx-xx are much less mnemonic than names like e.g. Lynga 7.

Also SIMBAD name hierarchies, being the most "familiar" ones for professionals, do not necessarily match what Chris Laurel would find "most familiar" ;-)

Fridger

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 20:16
by chris
t00fri wrote:
chris wrote:
t00fri wrote:My previous suggestion still stands that Celestia's command line should be made to recognize ALL entries, but there should be a sensible cut-off to something like 5 entries maximally in the display.

This seems reasonable to me, and it's a very easy task to limit the number of names shown for DSOs, stars, and solar system objects. The dsc file creation scripts should order the names so that the most familiar designations (something for the creators of the catalogs to agree upon) appear first.

--Chris

What is the most "familiar" alternative name ordering, e.g. for globular clusters? Note we are not talking about star catalogs here, where the issue of catalog familiarity is rather self-evident...

I would NOT ask this here if it were not a highly subjective and not at all trivial issue. Small and dim globulars, which constitute a significant portion of the complete set of 150, appear almost EXCLUSIVELY in all sorts of "unfamiliar" catalogs. On the other hand, big catalog entries like ESO xxxx-xx are much less mnemonic than names like e.g. Lynga 7.

Also SIMBAD name hierarchies, being the most "familiar" ones for professionals, do not necessarily match what Chris Laurel would find "most familiar" ;-)

...which is why I didn't suggest any particular prioritization of names. However, a large fraction of Celestia users aren't professionals, but enthusiasts like myself; surely, the preferences of that group should be taken into account. It seems sensible to assign high priority to Messier and Caldwell designations for objects, even if they don't enjoy wide usage among professional astronomers. Beyond this recommendation, I don't have much basis for judging which catalog numbers absolutely must be shown in the Celestia overlay.

--Chris

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 24.08.2008, 20:53
by t00fri
chris wrote:
t00fri wrote:Also SIMBAD name hierarchies, being the most "familiar" ones for professionals, do not necessarily match what Chris Laurel would find "most familiar" ;-)
...which is why I didn't suggest any particular prioritization of names. However, a large fraction of Celestia users aren't professionals, but enthusiasts like myself; surely, the preferences of that group should be taken into account. It seems sensible to assign high priority to Messier and Caldwell designations for objects, even if they don't enjoy wide usage among professional astronomers. Beyond this recommendation, I don't have much basis for judging which catalog numbers absolutely must be shown in the Celestia overlay.

--Chris

Chris,

surely, we don't have to discuss obvious things: if there are Messier or NGC assignments these are coming first, of course. I always coded things like this.

+++++++++++++++++
The problem comes when we go beyond M xx or NGC xxxx!
Notably since many globulars have neither an M xx nor an NGC xxxx assignment.
+++++++++++++++++

Let me just take a few examples: Here are the exhaustive lists of alternative ids that I extracted via SIMBATCH:

a) the famous 47 Tuc globular has these 19 (!) alternative ids:

Code:

FAUST 10:CD-72 20:CPD-72 35:HD 2051:2E 0021.8-7221:1RXS J002404.6-720456:1ES 0021-72.3:1E 0021.8-7221:Melotte 1:2MASX J00240535-7204531:[FS2003] 0013:[BM83] X0021-723:RBS 55:NGC 104:GCl 1:C 0021-723:2E 82:47 Tuc:xi Tuc


or

b) Omega Cen

Code:

1E 1323.8-4713:NGC 5139:HD 116790:GCl 24:GCRV 4762 E:CPD-46 6348:CD-46 8646:C 1323-472:Omega Cen


or

c) Pal 5

Code:

Z 1513.5+0003:Pal 5:[TGM94] 151330+000300:Z 21-61:UGC 9792:Name Serpens Dwarf:MCG+00-39-016:GCl 32:C 1513+000


and so on for 150 globulars. ;-)

++++++++++++++++++++++++
So how does the "enthusiast" order and cut off such VERY diverse lists for 150 objects?? ;-)
++++++++++++++++++++++++

Since virtually every alternative id string involves quite a few new catalog root designations, it's not even simple to devise a sorting algorithm, despite Perl's sorting flexibility, that would take account of subjective criteria like "familiarity" ...

My original globular catalog just had one or two name alternatives per globular. But we should adopt a more general approach that is also extendable to other DSOs and nicely coexists with the SIMBAD syntax. That is what I am "enthusiastic" about....

The same for star cross listings ==> Andrew.

Fridger

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 25.08.2008, 09:27
by Guckytos
t00fri wrote:
chris wrote:
t00fri wrote:Also SIMBAD name hierarchies, being the most "familiar" ones for professionals, do not necessarily match what Chris Laurel would find "most familiar" ;-)
...which is why I didn't suggest any particular prioritization of names. However, a large fraction of Celestia users aren't professionals, but enthusiasts like myself; surely, the preferences of that group should be taken into account. It seems sensible to assign high priority to Messier and Caldwell designations for objects, even if they don't enjoy wide usage among professional astronomers. Beyond this recommendation, I don't have much basis for judging which catalog numbers absolutely must be shown in the Celestia overlay.

--Chris

Chris,

surely, we don't have to discuss obvious things: if there are Messier or NGC assignments these are coming first, of course. I always coded things like this.

+++++++++++++++++
The problem comes when we go beyond M xx or NGC xxxx!
Notably since many globulars have neither an M xx nor an NGC xxxx assignment.
+++++++++++++++++

Well,

I am also just an amateur and would surely not know the more scientific names. Okay, the M xx and/or NGC xxxx should come first, as even an amateur like me knows what those stand for, generally. And even for some objects I know which refers to what.

But generally speaking I would say, that the next designations after M and/or NGC should be NAMES. Just look at the examples below:

t00fri wrote:a) the famous 47 Tuc globular has these 19 (!) alternative ids:

Code:

FAUST 10:CD-72 20:CPD-72 35:HD 2051:2E 0021.8-7221:1RXS J002404.6-720456:1ES 0021-72.3:1E 0021.8-7221:Melotte 1:2MASX J00240535-7204531:[FS2003] 0013:[BM83] X0021-723:RBS 55:NGC 104:GCl 1:C 0021-723:2E 82:47 Tuc:xi Tuc

I definitely know the names 47 Tuc and xi Tuc, but the rest means absolutely nothing to me.

t00fri wrote:b) Omega Cen

Code:

1E 1323.8-4713:NGC 5139:HD 116790:GCl 24:GCRV 4762 E:CPD-46 6348:CD-46 8646:C 1323-472:Omega Cen

Here I only know the name of Omega Cen; the rest means absolutely nothing to me.

t00fri wrote:c) Pal 5

Code:

Z 1513.5+0003:Pal 5:[TGM94] 151330+000300:Z 21-61:UGC 9792:Name Serpens Dwarf:MCG+00-39-016:GCl 32:C 1513+000

Here I only know the name of Serpens Dwarf; the rest means absolutely nothing to me.

t00fri wrote:++++++++++++++++++++++++
So how does the "enthusiast" order and cut off such VERY diverse lists for 150 objects?? ;-)
++++++++++++++++++++++++

Well as stated above the following order would be sensible in my opinion:
  1. M xx
  2. NGC xxxx
  3. Name(s) like Serpens Dwarf, 47 Tuc, Omega Cen ...
  4. Whatever

But as stated in another post, it would be a good idea to display a name that has been searched for, and that is not a popular one as the FIRST entry! So that the person knows, that what he is looking at, really is what he searched for.

Regards,

Guckytos

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 25.08.2008, 10:21
by t00fri
Christian,

thanks for your ideas. So far I have adopted a related but reversed scheme. Since people indeed MAINLY remember the names or Messier designations of DSOs, i.e. not even their NGC xxxx number, I start with

--Popular names or Messier id
--NGC xxxx if existing
--other catalog designations (where largely globulars are listed!), cut-off for now around five entries. I have always given precedence to the Messier Id if it exists along with a name.

A maximum of 5 reasonably short entries (as you suggested) does not yet clutter the canvas, but still allows the complete string of alternative SIMBAD ids to be displayed to 99%.
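
The ordering just described could be sketched like this in Python. This is an illustration under stated assumptions, not the code that generates globular.dsc: the function names are invented, Messier ids are assumed to be written "M xx", and popular names are recognised either by a SIMBAD-style "NAME "/"Name " or "* " prefix or by an explicit set passed in by the caller.

```python
def rank(name, popular=frozenset()):
    """Lower rank = shown earlier."""
    if name.startswith("M "):
        return 0  # a Messier id takes precedence over a name
    if name in popular or name.startswith(("NAME ", "Name ", "* ")):
        return 1  # popular names
    if name.startswith("NGC "):
        return 2
    return 3      # all other catalog designations


def order_and_cut(names, popular=frozenset(), limit=5):
    """Sort by rank, keeping the original order within each rank, then cut."""
    ordered = sorted(names, key=lambda n: rank(n, popular))  # sorted() is stable
    return ordered[:limit]
```

Because the sort is stable, the catalogs within the "other" group keep whatever order the input list had, so any finer preference among them would still have to be decided by whoever prepares that list.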

Fridger

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 25.08.2008, 15:36
by Fightspit
Alternatively, it would be possible to show 2 lines of names in Celestia when you select an object: the first line contains the most common names, like this: Name/M xx/NGC xxx/HD xxx/... ; and the second line contains the full list of IDs, without the names already displayed on the first line.

With one of Fridger's examples, we would get something like this in Celestia when you select 47 Tuc:

47 Tuc/NGC 104/HD 2051
FAUST 10:CD-72 20:CPD-72 35:2E 0021.8-7221:1RXS J002404.6-720456:1ES 0021-72.3:1E 0021.8-7221:Melotte 1:2MASX J00240535-7204531:[FS2003] 0013:[BM83] X0021-723:RBS 55:GCl 1:C 0021-723:2E 82:47 Tuc:xi Tuc

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 25.08.2008, 17:41
by Guckytos
Fightspit wrote:Alternatively, it would be possible to show 2 lines of names in Celestia when you select an object: the first line contains the most common names, like this: Name/M xx/NGC xxx/HD xxx/... ; and the second line contains the full list of IDs, without the names already displayed on the first line.

With one of Fridger's examples, we would get something like this in Celestia when you select 47 Tuc:

47 Tuc/NGC 104/HD 2051
FAUST 10:CD-72 20:CPD-72 35:2E 0021.8-7221:1RXS J002404.6-720456:1ES 0021-72.3:1E 0021.8-7221:Melotte 1:2MASX J00240535-7204531:[FS2003] 0013:[BM83] X0021-723:RBS 55:GCl 1:C 0021-723:2E 82:47 Tuc:xi Tuc

Fightspit, I am not completely sure what you are talking about now. Do you mean the on-screen display when you orbit the DSO, or are you talking about the "Enter/name/Enter" search method?

Just personally speaking, I would never ever want to have that much text on my screen when I go to or orbit an object. This is way too distracting.

The same goes for the "Enter/name/Enter" method. If you use this, you normally know what the name should be; if not, having a whole lot of names on the screen, most of which you have never heard of, wouldn't help either.

Okay, for those who want to know all about everything, perhaps a toggle could be added: "Ultra Info Mode (it shows all we know, so be warned, the screen may clog up with info)" :wink:

But as a normal user I say 5 designators for an object are more than enough.

Regards,

Guckytos

Re: SIMBATCH or how to exploit Simbad in batch...

Posted: 28.08.2008, 09:46
by Fightspit
I am talking about the on-screen display in the top left-hand corner when you select an object, and it would be nice to add an "Ultra info" option that does this, as you said.

But as for the search via "Enter", you don't need to type the "full" name of an object: typing "47 Tuc", "HD 2051" or "FAUST 10", for example, gives the same search result.
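
The idea that every alias leads to the same object can be sketched as a simple lookup table. This is a Python illustration only and says nothing about how Celestia's own name search is implemented; the function names and the normalisation (ignoring case and extra spaces) are assumptions.

```python
def normalize(name):
    """Collapse runs of whitespace and ignore case."""
    return " ".join(name.split()).casefold()


def build_alias_index(objects):
    """Map every alias to the first (primary) name of its object.

    objects: a list of name lists, primary name first.
    """
    index = {}
    for names in objects:
        primary = names[0]
        for alias in names:
            index[normalize(alias)] = primary
    return index


index = build_alias_index([
    ["47 Tuc", "NGC 104", "HD 2051", "FAUST 10"],
    ["Omega Cen", "NGC 5139", "HD 116790"],
])
for query in ("47 Tuc", "HD 2051", "FAUST 10"):
    print(query, "->", index.get(normalize(query)))
```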