binary star orbits update
t00fri (Topic author, Developer)
Hi all,
my latest stc version of the O(200) visual double star orbits from Soederhjelm (1999) may be downloaded for further testing from here
http://www.shatters.net/~t00fri/binaries_1.02.stc.zip
and the latest version of my 40 spectroscopic binary orbits from D. Pourbaix (2000) is here for testing
http://www.shatters.net/~t00fri/binaries-spect.stc.zip
I remind you that for the cases where the visual magnitude is missing in the paper, I have arbitrarily put a value of 5.0 (as a placeholder for now). Unfortunately, unlike in the case of missing spectral classes, Celestia does not yet parse ? or 5.0? correctly for missing magnitudes. That should be trivial to fix in the code, however...
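To make the placeholder convention concrete, here is a minimal Perl sketch (Perl being what my conversion scripts use anyway) of how a secondary's record could be emitted, falling back to 5.0 when the paper gives no visual magnitude. The star name, spectral type and orbital elements are invented for the example, and the layout only roughly follows Celestia's stc syntax:

Code:

#!/usr/bin/perl
# Illustrative sketch only: emit one stc record, falling back to the 5.0
# placeholder when no visual magnitude is given.  All values are invented.
use strict;
use warnings;

sub emit_star {
    my ($name, $barycenter, $spectype, $appmag, %orbit) = @_;
    $appmag = 5.0 unless defined $appmag;   # placeholder for a missing magnitude
    print "\"$name\"\n{\n";
    print "\tOrbitBarycenter \"$barycenter\"\n";
    print "\tSpectralType \"$spectype\"\n";
    print "\tAppMag $appmag\n";
    print "\tEllipticalOrbit {\n";
    for my $k (qw(Period SemiMajorAxis Eccentricity Inclination
                  AscendingNode ArgOfPericenter MeanAnomaly)) {
        printf "\t\t%-16s %s\n", $k, $orbit{$k} if defined $orbit{$k};
    }
    print "\t}\n}\n\n";
}

# secondary whose magnitude is missing from the source paper
emit_star('HIP 99999 B', 'HIP 99999', 'K0V', undef,
          Period => 41.6, SemiMajorAxis => 12.3, Eccentricity => 0.42,
          Inclination => 58.1, AscendingNode => 203.4, ArgOfPericenter => 17.9);

Once Celestia learns to parse ? (or 5.0?) for missing magnitudes, only the fallback line would need to change.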
Bye Fridger
t00fri (Topic author, Developer)
Hi all,
now that this great original site is back...
I would like to stimulate a discussion about possible strategies in
preparing stc files for binary star orbits in Celestia.
While Grant and I have exchanged plenty of emails about this
important issue during the last week or so, I would like to also find
out how people feel in the forum.
Let me set the stage:
Of course, it takes only a fraction of a second
to randomly generate huge stc files with fictitious multiple star
orbits. Thus, for all people who are focusing their interest on
fictional systems, there is no problem whatsoever. Just switch to
another thread.
But the situation changes drastically as soon as we claim to model
large numbers of known multiple star systems as realistically as
possible in Celestia. Grant and I both adhere to the latter philosophy, but
our lines of attack tend to be quite different.
Let me briefly explain (Grant will probably comment in his own way):
1) Grant has completed so far the 'nearstars.stc' file that includes
all prominent multiple star orbits out to some
maximum distance from our planet.
2) I have so far converted two published and much-cited data sets for binary stars,
with close to 250 binary orbits in total:
a) S. Soederhjelm (1999) [visual] and
b) D. Pourbaix (2000) [spectroscopic].
(cf. my post above this one!)
Many more are in the works...
Let me next describe from my point of view the positive and
negative aspects of the methods we both have applied so far:
(1): Grant's approach of including /all/ multiples out to some maximum
distance is very sound as a strategy. I really like it! However, to
achieve his aim of completeness, he has to combine information from
various sources and 'hand-edit' many individual stars in his list
using numerous little pieces of data. The result is a total loss of
transparency of analysis. The origin of his data can hardly be
traced anymore, and the inherent uncertainties are hard to
characterize/localize. On the other hand, the more Celestia's fame grows, the
more conscientious we will have to be about citing data sources...
In summary: from a purely scientific point of view there is a
high price to pay for realizing this beautiful and most sensible aim.
(2): My approach, so far, has rather been to put the idea of "data
purity" first. In practice this means that I "mass-convert"
(well-known) complete scientific catalogues to Celestia format by means of Perl scripts,
without modifying anything compared to the originals.
But here, too, the price to pay is high!
a) these attempts represent a highly random selection of a small
minority of all multiple systems in the sky. This continues to
bother me! It is conceptually a dilemma analogous to the one we face with
galaxies and nebulae.
In XEphem we had instead incorporated the entire WDS catalogue,
hence practically /all/ known multiple systems were identified as
such in our 2d(!) charts. The WDS catalogue, however, contains far
too little information for a 3d orbit reconstruction.
b) Clearly, with my "strict" approach there will be plenty of "empty
slots" in the resulting stc files, since most of the catalogs
lack either the m_B/m_A mass ratios, the spectral classes or the
visual magnitudes of the secondaries here and there. Of course,
there are still quite a few possible measures that would alleviate these
problems, but they would require quite a bit of extra coding
(by Chris, typically).
One can easily merge various entire catalogs together and logically
correlate their complementary contents via the IDs of the stars.
Such a procedure does indeed preserve scientific transparency,
since it is easy to state the applied procedure globally in one or
two sentences. What is a crucial requirement for data purity,
however, is that none of those /input/ catalogues be modified
in any way relative to their published contents. Hence the
"patching" has to be done either through intermediate file
generation or internally. And that's where Chris has to come in...
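To illustrate the kind of ID-based correlation I mean, a minimal Perl sketch; the file names and column layouts below (whitespace-separated fields, HIP number first) are entirely hypothetical, and a real script would of course follow each catalogue's documented record format:

Code:

#!/usr/bin/perl
# Sketch of the ID-based merge meant above: neither input catalogue is touched;
# their complementary fields are only correlated in memory via the HIP number.
# File names and column layouts are hypothetical.
use strict;
use warnings;

my %star;   # HIP number => hash of fields collected from all catalogues

sub load_catalog {
    my ($file, @fields) = @_;
    open my $fh, '<', $file or die "cannot open $file: $!";
    while (<$fh>) {
        next if /^\s*(#|$)/;                  # skip comments and blank lines
        my ($hip, @values) = split;
        @{ $star{$hip} }{@fields} = @values;  # add this catalogue's columns
    }
    close $fh;
}

# e.g. the visual orbits contribute the elements, the spectroscopic set the mass ratio
load_catalog('visual_orbits.dat',  qw(period sma ecc incl node peri));
load_catalog('spectro_orbits.dat', qw(mass_ratio vmag));

for my $hip (sort { $a <=> $b } keys %star) {
    # here one would emit the stc record from the merged fields
    print "HIP $hip: ", join(' ', map { "$_=$star{$hip}{$_}" } sort keys %{ $star{$hip} }), "\n";
}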
So let us know what you think. But I emphasize again: this whole
discussion is based on the premise that we try to do a job as close as
possible to reality...
Oh yes: what I hope for is not a "judgement" of which
method you like better. Rather, I would very much
welcome any suggestions of how you think the described
problems of either method could somehow be cleverly
circumvented!
Bye Fridger
Chris (Site Admin)
Fridger,
So essentially the question is whether to preprocess the data files, or leave them alone and let Celestia handle whatever conversions need to be done at load time? If I understand you correctly, then my concern is that each catalog will require slightly different processing and that the Celestia code will have to be modified each time we want to use a new data source. While I do see the advantage this offers, I'm naturally inclined to choose the path that requires the least effort from me. Anyhow, have I grasped the gist of the issue?
How do we handle overlap when using raw catalogs? If the same star appears in both catalogs, do we just use the one from the last catalog to be loaded? Or somehow select the data with the least error?
--Chris
I would appreciate the possibility to switch between different star catalogs. It could be most helpful for scientific and educational purposes. So I think it's a good idea, on the one hand, to generate clean catalog files for use in Celestia. Of course this would only make sense if some menu entries were included that allow switching between those catalogs instantly from inside the running program.
On the other hand, a maximally complete star dataset would be the best representation as long as you don't want to do any specific astronomical investigations. So Grant's approach would fit common needs best.
I would propose to define a database that holds information about catalog adaptations and corrections of the kind Grant makes. If the data entries could sufficiently describe the changes made to the pure catalogs, an automatic process could combine pure catalogs and changes to form the same star catalog as now used by Celestia. This way all needs are addressed, and hand-made edits no longer result in a loss of transparency. Using a CVS-style database would even allow tracking successive hand edits made to one star.
Grant would have to try to describe a formalization of the changes he usually makes, so that a dataset can be defined.
maxim
t00fri (Topic author, Developer)
Chris,
Chris wrote:If I understand you correctly, then my concern is that each catalog will require slightly different processing and that the Celestia code will have to be modified each time we want to use a new data source.
First of all, it's a strategy/policy decision one has to take: do we want to preserve, as much as possible, a way of using published astronomical data that scientists would also be willing to accept, or not?
If yes, let me give a simplistic example of a perfectly acceptable and /transparent/ type of strategy and citation:
"Celestia's 3800 multiple star orbits include
a) all data from the 6th catalog of orbits of visual binary stars [<reference>],
b) all data from the 9th catalog of spectroscopic binary orbits [<reference>],
c) ...
with the missing star masses estimated from an empirical mass-luminosity relation from [<reference>]."
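Just to make that last point concrete, a crude sketch of what such an estimate could look like, assuming a single main-sequence power law L ~ M^3.5 and treating the visual magnitude as bolometric; the exponent and the Sun's absolute magnitude of 4.83 are standard textbook values, not taken from any particular calibration:

Code:

# Crude illustration only: estimate a missing mass from apparent magnitude and
# distance, assuming a single main-sequence power law L ~ M**3.5 and treating
# the visual magnitude as bolometric (no bolometric correction applied).
use strict;
use warnings;

sub estimate_mass {
    my ($app_mag, $dist_pc) = @_;
    my $abs_mag = $app_mag - 5 * log($dist_pc) / log(10) + 5;  # M = m - 5*log10(d) + 5
    my $lum     = 10 ** ((4.83 - $abs_mag) / 2.5);             # L/Lsun, with Msun(V) = 4.83
    return $lum ** (1 / 3.5);                                  # M/Msun from L ~ M**3.5
}

printf "%.2f solar masses\n", estimate_mass(5.0, 25.0);        # placeholder input values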
It would certainly be cleanest to just add the converted official catalogs to the distribution without any modifications of individual star data, such that everyone interested can check explicitly what is actually used. Whether we combine the various catalogs via intermediate files (like in the case of stars.txt) or do all the merging within the Celestia core code will have to be discussed. In the case of intermediate files, I could easily do /all/ of the work by means of Perl. So NO work for you... One could even add the Perl scripts so as to document the methods used... My Perl scripts are always well commented. This would even amount to an "open source policy" concerning Celestia's data.
Chris wrote:How do we handle overlap when using raw catalogs? If the same star appears in both catalogs, do we just use the one from the last catalog to be loaded? Or somehow select the data with the least error?
Yes, that's an important point. I would vote for the data with the highest confidence, if possible, but only concerning selections between various whole catalogs, NOT individual stars! The advantage of using whole catalogs is that experts have already made a choice among the individual papers. Moreover, by combining /spectroscopic/ and /visual/ orbit catalogs, the overlap should naturally be quite small.
Eventually, for more dedicated users, one could dream of a GUI catalog editor that quickly allows switching between the contents of specialized catalogs.
Precisely as was mentioned by Maxim above.
Bye Fridger
granthutchison (Developer)
chris wrote:If the same star appears in both catalogs, do we just use the one from the last catalog to be loaded? Or somehow select the data with the least error?
I think the bigger problem is just being sure you can identify identical stars in two different catalogues, given that double stars "enjoy" a ridiculous number of different names arising from different catalogues, and the names and catalogue numbers of secondary components are often not provided at all.
Taking Söderhjelm's catalogue as an example, there is no evidence within the catalogue that Hip 73182 is a binary that itself orbits Hip 73184, which in Celestia's stars.dat has the Flamsteed designation 33 Lib. So the components of Hip 73182 should be named 33 Lib B and 33 Lib C (or 33 Lib Ba and 33 Lib Bb), and their existence in the catalogue should trigger the renaming of Hip 73184 as 33 Lib A. But a "pure data" catalogue can only generate a binary called Hip 73182 A and Hip 73182 B, which sits (apparently) coincidentally close to 33 Lib.
Another example, again from Söderhjelm. He doesn't record that Hip 82817 is commonly known as Wolf 630. So a "pure data" catalogue would generate a double star with names Hip 82817 A and Hip 82817 B, which would replace the Wolf 630 in stars.dat - the user loses a commonly used name in exchange for a catalogue designation. Next, imagine we load another catalogue which contains spectroscopic data related to the subcomponents of Wolf 630 B. A smart stc generator might correctly designate them Wolf 630 Ba and Wolf 630 Bb, but Celestia would not know to overwrite Hip 82817 B with the barycentre of the new Wolf 630 B binary - we'd end up with the two systems coexisting as four stars under a variety of different names. And because of the way the system is mangled by the two "pure" catalogues fighting each other, it would be very messy to try to sort it out with another "bespoke" stc add-on depicting the true layout of the system.
So if "pure" datasets are to be introduced, and considered inviolable (no editing of the original data, and all catalogue data included), Celestia is going to have to be very clever on its feet with large look-up tables that allow it to identify stars from different catalogues as being identical, and to pick up a pair in one catalogue as representing the subcomponents of a "single" star listed in a different catalogue.
To be at all user friendly, it would also have to generate, on the fly, good hierarchical designations for individual stars in multiple systems, based on the common names of the star systems (for example, it would have to offer the user 33 Lib A, 33 Lib B, 33 Lib C rather than 33 Lib, Hip 73182 A, Hip 73182 B).
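Roughly the kind of look-up table I mean, as a Perl sketch; only the two examples above are filled in, the helper function is purely illustrative, and a real table would have to be compiled from cross-index catalogues:

Code:

# Sketch of such a look-up table: it maps a catalogue designation to the common
# system name and the component letter it represents there.  Only the two
# examples above are filled in; hierarchical_name() is purely illustrative.
use strict;
use warnings;

my %cross = (
    'Hip 73184' => { system => '33 Lib',   component => 'A' },
    'Hip 73182' => { system => '33 Lib',   component => 'B' },  # itself a close pair -> Ba/Bb
    'Hip 82817' => { system => 'Wolf 630', component => ''  },  # whole system, no letter
);

sub hierarchical_name {
    my ($catalog_id, $sub) = @_;            # $sub: 'A' or 'B' within the catalogue's pair
    my $entry  = $cross{$catalog_id} or return "$catalog_id $sub";
    my $letter = $entry->{component} ? $entry->{component} . lc($sub) : $sub;
    return "$entry->{system} $letter";
}

print hierarchical_name('Hip 73182', 'A'), "\n";   # prints "33 Lib Ba"
print hierarchical_name('Hip 82817', 'B'), "\n";   # prints "Wolf 630 B"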
Grant
t00fri (Topic author, Developer)
Grant,
above, you emphasized the 'name' identification for multiple star systems as a severe problem in the case of my computer conversions of entire catalogs. While your comment is certainly formally correct in the two specific examples /for now/, I don't really see that it is specifically connected to the method I am advocating... Adding further names is a pure "convenience feature" independent of the basic data information.
From a scientific point of view, a system is already uniquely characterized by quoting whatever unique ID for it, e.g. its HIP number. So any /individual/ additions of further (familiar) /names/ to the original catalog data (in whatever form!) would NOT be in conflict with data purity whatsoever! Conflict arises, of course, if one modifies the /data/ that come with the names...
Even if you or anyone else had the patience to add naming conventions individually by hand, it would be fine as far as "data purity" is concerned.
However, I am targeting numbers of multiple systems that are /completely out of reach/ for "hand-editors" (say ~2400 from SB9 and ~1400 from the 6th catalog of visual binaries)!
In view of this, there are clearly compromises to be made. Yet I think they are well worth it! In any case, the number of mistakes or inaccuracies that my computerized method will generate can surely be kept /small/ compared to the huge number of correct entries! The remaining errors will mostly arise from inadequacies of the catalogs and not from me, which is /conceptually/ OK. When the catalogs improve, I can generate Celestia updates from them in far less than a second... Etc.
+++++++++++++++
I am effectively advocating a strict and transparent "open source" policy for all astronomical data used by Celestia. If my Perl scripts were added to the distribution, everyone would be able to check, reproduce and correct Celestia's data! My well commented and /human readable/ Perl scripts can serve as a simple but concise documentation of any combinations, corrections or even modifications of original data sets within Celestia! Perl is part of every Unix/Linux OS and also exists for Windows.
++++++++++++++
Even in the case of very many stars, it is easy for Perl to exploit /all/ available catalogs at once to extract the maximum number of known cross names. I was planning to do this "little" homework once and for all quite soon anyway. This would produce an extensive and most complete lookup table for star names and system topology flags!
The 6th catalog of visual binary orbits, for example, has clear information about the system's "topology" in a column in the form of A, B, AB, ... tags. That can easily be correlated with the "world knowledge" of names... Perl is very powerful /and/ fast... and I know very well how to exploit the power of Perl...
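Just to show the idea of that "homework", a minimal sketch: sweep several catalogues at once and collect, per HIP number, every alternative designation and component tag encountered. The pipe-separated column layout assumed here is made up; each real catalogue would need its documented record format plugged in:

Code:

#!/usr/bin/perl
# Sketch of the cross-naming "homework": sweep several catalogues at once and
# collect, per HIP number, every alternative designation and component tag
# (A, B, AB, ...) encountered.  The pipe-separated layout below is made up.
use strict;
use warnings;

my %names;   # HIP number => { tags => {...}, aliases => {...} }

for my $file (@ARGV) {
    open my $fh, '<', $file or die "cannot open $file: $!";
    while (<$fh>) {
        chomp;
        my ($hip, $tag, @aliases) = split /\s*\|\s*/;   # hypothetical: HIP | tag | names...
        next unless defined $hip and $hip =~ /^\d+$/;
        $names{$hip}{tags}{$tag}  = 1 if defined $tag and length $tag;
        $names{$hip}{aliases}{$_} = 1 for grep { length } @aliases;
    }
    close $fh;
}

for my $hip (sort { $a <=> $b } keys %names) {
    printf "HIP %-7d  tags: %-6s  aliases: %s\n", $hip,
           join(',', sort keys %{ $names{$hip}{tags} }),
           join(', ', sort keys %{ $names{$hip}{aliases} });
}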
As I have emphasized earlier: your "hand-editing" method, as applied to individual stars, will presumably give the more accurate results, but the range of applicability is very limited, updating is a pain, and sources and their errors can hardly be traced anymore by users.
I know that all catalogs contain some errors. Celestia, too, contains some errors. We all have to coexist with errors everywhere and every day...
Bye Fridger
granthutchison wrote:So if "pure" datasets are to be introduced, and considered inviolable (no editing of the original data, and all catalogue data included), Celestia is going to have to be very clever on its feet with large look-up tables that allow it to identify stars from different catalogues as being identical, and to pick up a pair in one catalogue as representing the subcomponents of a "single" star listed in a different catalogue.
It all depends on lookup tables or database entries in the end - that's true.
But this is not at all an unresolvable or instantly huge problem. A lookup table that is filled as needed will grow slowly, but become huge after some time without too much effort - probably huge enough to even serve as a common reference.
The generation of a star table in Celestia format would then be done by some generation program - it may be written in Perl, Java, C++ or whatever, that makes no difference. This program embodies the algorithms for picking, combining, selecting and dropping entries, as you mentioned them.
t00fri wrote:As I have emphasized earlier: your "hand-editing" method as applied to individual stars will presumably give the more accurate results, but the range of applicability is very limited, updating is a pain, and sources and their errors can hardly be traced anymore by users.
Not if the 'hand-editing' is done via a lookup table or database - this way the original data and the changes to them are kept separate and can be easily traced and verified for every single component. It's only the last step - the star table generation process - that combines pure data with 'handmade' changes.
-----------------------------------------------------------------------------------
So let me outline a strategy that I would be willing to support:
Definition of a data structure that formalizes all possible hand-editing processes.
Generation of a MySQL/PHP or MySQL/Java database with a web frontend.
Definition of algorithms for combining 'pure' and 'editing' data (a minimal sketch follows below).
Coding of a star table generation program (with the possibility to choose different generation strategies).
In the end, there would also be the need for a Celestia that supports multiple concurrent star tables.
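As announced in the third point above, a minimal sketch of what combining 'pure' and 'editing' data could look like, assuming the simplest possible form of an edit record (star ID, field, replacement value, author, reason); all IDs and values are placeholders:

Code:

# Minimal sketch: the pure catalogue data stay untouched; edits live in their
# own list and are applied only while the combined table is generated.
# All IDs and values here are placeholders.
use strict;
use warnings;

my %pure = (                                   # 'pure' catalogue data, keyed by ID
    'HIP 82817' => { spectral => 'M3V', app_mag => 9.7 },
    'HIP 73182' => { spectral => 'K5V', app_mag => 6.7 },
);

my @edits = (                                  # hand edits, each one traceable
    { id  => 'HIP 82817', field => 'name', value => 'Wolf 630',
      who => 'grant',     why   => 'common name missing from the source catalogue' },
);

my %combined = map { $_ => { %{ $pure{$_} } } } keys %pure;   # copy, never modify %pure
for my $e (@edits) {
    $combined{ $e->{id} }{ $e->{field} } = $e->{value};
    print "# $e->{id}: $e->{field} set to '$e->{value}' by $e->{who} ($e->{why})\n";
}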
maxim
t00fri wrote:From a scientific point of view, a system is already uniquely characterized by quoting whatever unique ID for it, e.g. its HIP number. So any /individual/ additions of further (familiar) /names/ to the original catalog data (in whatever form!) would NOT be in conflict with data purity whatsoever!
Well, it's good you're happy with this "pre-processing" approach, since it takes considerable load off the Celestia code - but some very ingenious code will still be required, somewhere. Most binary catalogues are poorly cross-referenced, and I believe the reason for this is that it's a difficult task because of the current state of chaos in binary nomenclature. (I'll be happy if it turns out I'm wrong.)
Some coding load will still have to be borne by Celestia, of course, once it picks up identical systems, since at that point some decision will have to be made about which star to retain. This will be particularly important when a single star in one catalogue turns out to have subcomponents listed in another catalogue - we certainly don't want the single star overwriting the multiple components.
t00fri wrote:The 6th catalog of visual binary orbits, for example, has clear information about the system's "topology" in a column in the form of A, B, AB, ... tags.
These are quite outdated in some cases, since they are "discovery designations", and they do not necessarily match the hierarchical designations chosen by the compilers of other catalogues.
maxim wrote:But this is not at all an unresolvable or instantly huge problem. A lookup table that is filled as needed will grow slowly, but become huge after some time without too much effort - probably huge enough to even serve as a common reference.
What I'm saying is that I think it is an "instantly huge" problem - the sheer number of systems Fridger is dealing with, the utterly inadequate provision of cross-indexed designations for secondary components in these systems, and the sheer variety of naming conventions for binary systems and their individual components make this an immediately huge task - I'm impressed that Fridger is willing to take it on.
Grant
Well, let me clarify that I'm currently not talking about binary systems in particular, but about star catalogs in general. I'm not sure which catalog data went into the Celestia default database, and how many entries have been corrected by hand. The strategy I propose remains the same.
What I'm saying is that the buildup of a 'merge, select & change' database needn't be done perfectly from the beginning. How many datasets did you change? Dozens? Hundreds? More? Over what time range? So, starting from pure catalog data, we can work towards a more integral solution without losing track of the changes or merges we did. That's why I think that such a database would make sense. If you are willing to start formalizing the editing processes you have done so far, I could build and initialize an SQL database and write a web frontend for it. The next task would then be to create a pre-processing utility for Celestia format output.
maxim
t00fri (Topic author, Developer)
maxim wrote:...
What I'm saying is that the buildup of a 'merge, select & change' database needn't be done perfectly from the beginning. How many datasets did you change? Dozens? Hundreds? More? Over what time range? So, starting from pure catalog data, we can work towards a more integral solution without losing track of the changes or merges we did. That's why I think that such a database would make sense.
...
maxim
Maxim,
certainly...
But I ask myself why one should have to dive into SQL for this. After all, SourceForge operates on CVS, we all use CVS, and CVS could perfectly well administer an arbitrary number of patch files plus ChangeLogs relative to the original catalog files.
So what's the issue here?
Bye Fridger
There's not much 'diving into' SQL required - it's quite a simple and straightforward kind of language.
(Not to mention that a CVS surely also has an SQL database backend.)
The basic ideas are the following:
- not only one person will work on the database/'edit changes'.
- several people might want to check for certain changes without doing some annoying file scans.
- there might be some science guys who'd like to use the results for their own purposes (that's OK - it's open source).
- there will certainly be several extensions to the dataset because of new situations in catalog merging.
- one might like to create different result sets, depending on special needs.
- one might like to easily create 'patch files', XML files or something else out of the result sets.
- one might even directly create celestia star catalog files.
So the simplest and most straightforward way is to set up a database on a MySQL/PHP/Apache server and create a web frontend for it. It would be public, could be hosted anywhere (Ibiblio could do it, as could I), and everybody would be happy (hopefully).
Did I make myself clearer somehow?
maxim
Added later:
Just to avoid any misunderstandings - reading or making a post on this forum also involves a lot of SQL database access in the background. So a user at the frontend just types his data into the form fields, without having anything to do with SQL itself.
t00fri (Topic author, Developer)
Maxim,
I think I saw the whole thing quite clearly from the start. Personally, I have no problem with SQL. But since developers tend to "live" at SourceForge and SourceForge works with CVS, you still have not answered my original question. The various steps you enumerated in your previous post /all/ apply to CVS as well. So I am looking for an argument from you in favor of SQL that CVS cannot handle.
Bye Fridger
t00fri wrote:The various steps you enumerated in your previous post /all/ apply to CVS as well. So I am looking for an argument from you in favor of SQL that CVS cannot handle.
It's not a question of CVS 'or' SQL, because SQL 'is' only the backend database, not visible to the user, whereas CVS is the logic of using the database.
CVS is primarily designed for dealing with whole files, as needed for (software) development, and the changes made to them. Our problem is single entries in catalogs (= databases) and the combination, identification and corrections applied to them. Each such entry will only result in one line in the resulting 'patch file', but for every single entry a totally different strategy may be necessary for getting a consistent result out of the existing catalogs. While for 95 percent of catalog entries the merging task will of course be totally simple, it could be arbitrarily complex for the remaining 5%, as Grant tried to demonstrate above.
Having a cleverly defined dataset and some clever combination strategies (resulting in some PHP-, Java- or Perl-coded algorithms automatically applied when extracting a certain result set out of the database and showing it on the web frontend) will deal with even exotic needs here.
I can't see that a CVS logic, which is designed for working-group development tasks, can help us here. In particular, I can't see how one will:
- create different result sets, depending on special needs.
- create 'patch files', XML files or something else out of the result sets.
- directly create celestia star catalog files.
with the help of a CVS.
maxim
t00fri (Topic author, Developer)
Maxim,
thanks for your interesting explanations.
t00fri wrote:The various steps you enumerated in your previous post /all/ apply to CVS as well. So I am looking for an argument from you in favor of SQL that CVS cannot handle.
maxim wrote:It's not a question of CVS 'or' SQL, because SQL 'is' only the backend database, not visible to the user, whereas CVS is the logic of using the database.
So what. In any case SQL plays at best a hidden role in CVS, unlike what I think you suggested above.
Maxim wrote:CVS is primarily designed for dealing with whole files, as needed for (software) development, and the changes made to them. Our problem is single entries in catalogs (= databases) and the combination, identification and corrections applied to them.
But when we submit patches to a file via CVS, we often enter "single entries" all over the place in files. What is the formal difference between a "catalog" and a "file", really? Of course, I know, I have many options to query catalogs and the changes made on catalogs etc. with SQL, for example. But this is not where I see the /main problem/:
Rather, we need transparent and sufficiently understandable logistics that tell the outside "user world" precisely what we have done to the original data! I doubt that this problem is alleviated significantly by embedding everything into an SQL environment rather than patching and documenting simply via CVS where necessary.
maxim wrote:Having a cleverly defined dataset and some clever combination strategies (resulting in some PHP-, Java- or Perl-coded algorithms automatically applied when extracting a certain result set out of the database and showing it on the web frontend) will deal with even exotic needs here.
Here you implicitly said it yourself: this may be a cool and fancy setup, but the result coming out of all this at the user level is far from becoming more transparent by means of your proposed technology...
Bye Fridger
t00fri wrote:So what. In any case SQL plays at best a hidden role in CVS, unlike what I think you suggested above.
Hmm, yes and no!
In the end the SQL database would be completely hidden by the frontend interface (a PHP/HTML, Java/HTML or Perl/HTML solution) and totally invisible to the user. Just like phpBB or CVS. The central process, and what I was really focused on talking about (not very obvious, as it seems), is the definition process of the database. That is where all the knowledge is concentrated in the end.
Defining and formalizing a database - that means creating the dataset entries, the tables of datasets and the relationships between them - is the main task of creating it. You have to give some careful thought to what entries will be needed and how they will interdepend, what constraints you will place on dataset entries, and which datasets are necessary and which could occasionally be left empty. That might sound more difficult than it is in the end. The main advantage is that you have to recall what you are doing when you merge catalogs and to formalize that process. This way you might (will) find weaknesses and strengths in your thoughts, and get for yourself a clearer view of the simple and the special cases. You might also find that the special cases are not so overwhelmingly many or so unsolvable as they looked at first glance. In the end you will have a formal definition that everybody else can understand, and that can easily be documented. That's why I asked Grant if he could describe in a formalized way how he does the catalog merging and combining.
After that process is done, the rest is simply a question of coding and implementing the database, and then hiding the SQL access behind a web frontend with a convenient user interface.
A very simple definition example:
Catalog A has the entries a,b,c,d
Catalog B has the entries a,b,e,g
Celestia needs the entries a,b,d
The entry d could be computed by f(e,g)
So our database needs the dataset entries a,b,d,e,g where either d OR e,g could be left empty.
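Translated directly into a small Perl sketch, with f() standing in for whatever real relation connects e and g to d, and arbitrary placeholder numbers:

Code:

# Direct translation of the toy example: the table stores a, b, d, e, g; either
# d or the pair (e, g) may be empty, and a missing d is computed from f(e, g)
# only when the Celestia record is generated.  f() is of course a stand-in.
use strict;
use warnings;

sub f { my ($e, $g) = @_; return $e + $g }     # stand-in for the real relation d = f(e, g)

my @rows = (
    { a => 1.0, b => 2.0, d => 3.0,   e => undef, g => undef },   # came from catalog A
    { a => 1.1, b => 2.1, d => undef, e => 0.5,   g => 2.6   },   # came from catalog B
);

for my $r (@rows) {
    my $d = defined $r->{d} ? $r->{d} : f($r->{e}, $r->{g});      # Celestia needs a, b, d
    printf "a=%.1f b=%.1f d=%.1f\n", $r->{a}, $r->{b}, $d;
}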
t00fri wrote:Of course, I know, I have many options to query catalogs and the changes made on catalogs etc. with SQL, for example. But this is not where I see the /main problem/.
I wouldn't put that aside so easily. Having a database/catalog, and having the intention to make Celestia more scientifically valuable (I think that's something you would also like to see), possibilities for different queries can be very, very helpful. A simple example: imagine someone who does some work in astronomy, knows Celestia and would like to use it in a presentation or for other visualisation purposes. It would be perfect - unfortunately he wants to show stars with a parallax uncertainty of (let's say) 2%, whereas Celestia's default catalogs make a cutoff at 1% (it's only an example, OK). So the tool is unusable for him if he can't create his own Celestia star catalog tailored to his special needs.
t00fri wrote:Rather, we need transparent and sufficiently understandable logistics that tell the outside "user world" precisely what we have done to the original data! I doubt that this problem is alleviated significantly by embedding everything into an SQL environment rather than patching and documenting simply via CVS where necessary.
Whether you present the results in a pre-edited file with some explanation, or have them generated out of a database with your (the user's) own constraints set, doesn't make one version more transparent than the other. Because the definition process explained above and the definition of possible query constraints create their own necessary documentation, there should be sufficient explanation to cover all needs. The user could select between pure catalog data or data merged from several catalogs, with a clear explanation of what should be merged and how (otherwise he wouldn't understand what result his query would produce). So transparency is elemental to the use of a catalog database. Things aren't any more hidden than if you write a Perl script that does the automated editing process for you and say 'So, here are the results'. In fact they should be much less hidden.
All in all I think that the mere necessity of formulating a commonly understandable strategy for working on star catalogs in order to create such a database would make it worth doing.
maxim
t00fri (Topic author, Developer)
Hi all,
I think it makes sense to let people know that my binary star development work has been stagnating for several weeks, since I am still waiting for Chris's promised detailed response to a number of necessary code changes that I proposed to him.
I have resent my original letter to him, but Chris seems to have been busy with other things for some time.
So from my side, I am all set and ready to proceed. But I don't plan to do that work multiple times while the code framework is still unsettled.
So I got involved with making a new Titan texture in the meantime.
Bye Fridger