ASCC

Posted: 17.01.2011, 01:10
by ThinkerX
I am not certain if this is the correct subforum or not for this.

For the past few months, I have been working on using averaged, individually adjusted V, J, and H absolute magnitudes to provide distances to dwarf stars in the Hip/Tycho-based ASCC catalogue, based on those stars' spectral types (F5 down to K7). The newest release of the ASCC included several hundred thousand spectral types (mostly from the HD Catalogue) and assigned J, H, and K band data to each star.

Previous estimates I have seen indicate there are something on the order of half a million dwarf stars in the ASCC. I estimate that when finished, this current list will have between 35,000 and 50,000 stars. If the results of the calibration tables hold true, then about 75% of those stars will have distances accurate to within 20%, and about half should be accurate to within 10%.

I am still not done yet, nor do I expect to be for another couple of months (this is my project this winter).

Would anybody here be interested?

Re: ASCC

Posted: 17.01.2011, 02:35
by selden
I'm not quite sure what you're asking -- if anyone would be interested in the results (of course!) or interested in helping (sorry, too many other projects :( )

Bear in mind that some people can be quite critical of the accuracy of spectrographic distances, so you'll have to provide a detailed description of your reduction methods.

Re: ASCC

Posted: 17.01.2011, 05:35
by ThinkerX
My apologies - I should have been clearer.

I am asking if anybody would be interested in the final (more or less) result, not for help.

As to the methods...

Yes, I am only too well aware of how inaccurate photometric/spectroscopic distances can be - and usually are. A few years ago, I spent a great deal of time comparing the photometric/spectroscopic distances determined back in the 1980's and early 90's with later distances from Hip. I checked probably 2000 or more in this way; for the most part the photometric/spectroscopic distances were out by more than 20% about half the time. There were a few exceptions, though.

The most notable of these was the work of a Dr Edward Weis, whose major work was determining distances to some 2000+ NLTT stars, among others. (Even now, his NLTT distances are the only ones for about half those stars). He also provided distances to over 600 of the VYS (MCC in Simbad, I think) almost all of which eventually ended up with Hip parallaxes.

I began my current project by comparing Weis's VYS distances with the Hip ones. There were 254 VYS stars where the Hip parallax (or on a few occasions a Yale or Gliese parallax instead) had an error within 5%. Weis's distances to those stars were accurate to within 15% for 189 of them, or a hit rate of about 75% - vastly better than the norm for photometric distances.

Weis did two things that were a bit different than the norm:

First, he did extensive calibrations against similar stars with well established parallax distances - most of his papers feature such a table.

Second, he didn't take just one distance; he took three - in the V, R, and I bands, and then averaged the distances together. By doing this, any problems with one distance tended to be cancelled out by the other two.

That was Weis.

What I am doing involves the V, J, and H bands (because of the 2MASS data incorporated into the ASCC). I found a table on Vizier giving 'dead average' absolute magnitudes and B-V values for stars of all spectral types and for classes I, III, and V. I then set about determining dead average values for J and H magnitudes, along with V-J and J-H values for the stars of the spectral types I was interested in by downloading several thousand Hip stars of those types with parallax errors of less than 5% and doing a great deal of tedious arithmetic.

I then took those 'dead average' absolute magnitudes and tweaked them, using B-V, V-J, and J-H respectively.

Consider: You are looking at two G2 type stars that share the same visual magnitude - 5.7 for the sake of argument here. One of these stars has a B-V of 0.55; the other has a B-V of 0.6. Hence, it is more likely - though far from certain - that the star with the B-V of 0.55 is the brighter star, and thus further away. The same also applies to V-J and J-H.

Initially, what I did was to subtract the 'dead average' B-V (for the V Magnitude) from the 'dead average' absolute magnitude for a star of a given spectral type, and then add the actual B-V to the result. This gave me a small range of numbers to work from instead of a single point - but it wasn't enough, so I added a small multiplier to both sides of the equation. The result looks like this:

Dead Average Absolute Magnitude - (5 * Dead Average B-V) + (5 * B-V)

That is for V magnitude; for J and H magnitudes the multiplier (determined via a lot of trial and error) is always 2.5 and it is V-J or J-H respectively. For stars of F5 - G3 the multiplier for V is 3 instead of 5.
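As a rough illustration of the adjustment described above, here is a minimal Python sketch. The function names and the sample G2 numbers (a dead average absolute V magnitude of 4.7 and a dead average B-V of 0.60) are invented for the example and are not taken from the calibration tables.

```python
def v_multiplier(spectral_type):
    """Multiplier for the V-band adjustment: 3 for F5-G3, otherwise 5."""
    letter, digit = spectral_type[0], int(spectral_type[1])
    rank = "FGK".index(letter) * 10 + digit  # F5 -> 5, G3 -> 13, K7 -> 27
    return 3.0 if 5 <= rank <= 13 else 5.0


def adjusted_abs_mag(mean_abs_mag, mean_color, color, multiplier):
    """Dead average absolute magnitude shifted by the star's own color.

    For V use B-V with v_multiplier(); for J and H use V-J or J-H
    with a multiplier of 2.5.
    """
    return mean_abs_mag - multiplier * mean_color + multiplier * color


# Hypothetical G2 values: placeholders, not real table entries.
MEAN_MV_G2, MEAN_BV_G2 = 4.7, 0.60
bluer = adjusted_abs_mag(MEAN_MV_G2, MEAN_BV_G2, 0.55, v_multiplier("G2"))
redder = adjusted_abs_mag(MEAN_MV_G2, MEAN_BV_G2, 0.60, v_multiplier("G2"))
print(round(bluer, 2), round(redder, 2))
```

With these placeholder numbers the bluer star comes out 0.15 magnitudes brighter than the dead average, which is the behavior described for the two G2 stars.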

What all this accomplished was to give me photometric/spectroscopic magnitudes for V, J, and H that, when averaged together, are very close to the actual ones for plus or minus half a magnitude to either side of the dead average value. Within three tenths of a magnitude of the dead average value, the distance from the modulus is almost always accurate to within 10%, and the exceptions are almost always multiple star systems or variables (and much of the time this system will get those to within 20%). For stars with absolute magnitudes of 0.3 - 0.5 greater or lower than the dead average, the distance moduli usually give distances accurate to within 10 - 20%, though there are a few where the distance modulus is almost dead on, and also a few more misses - though usually not by all that much (20-30% instead of 15-20%).

Apart from variables and multiple stars, the biggest problem with this system is superluminous stars of the same spectral type and class - stars with average absolute magnitudes a full point or more 'above' the dead average values I am using. Despite my best efforts, I was not able to identify more than a small number of these stars with the available information.
Were it not for these stars, my accuracy rate would be close to 90%. (I do remove the ones I can identify, though). Subdwarfs or population II stars - or at least much fainter than normal stars - are also a bit of a problem. Generally, my distances for the superluminous stars are about half of what they should be; for the fainter ones they are about double what they should be, though this can and does vary.

Another problem was contamination by subgiant (Class IV) stars. I used B-V, V-J, and J-H filters to weed out the vast majority of class III and IV stars, but some mimic the values for class V stars so closely they get in anyhow. Probably about 10% of the total star count falls into this category (this is a problem with all photometric systems, apparently). However, weirdly enough, this system will actually provide accurate distances to these stars about half the time - usually to within 10%, according to the calibrations.

Re: ASCC

Posted: 17.01.2011, 17:57
by t00fri
Hi,

your project sounds VERY similar to the one by Pascal Hartman > 7 years ago.
Here is the summary of his data sets

http://pascal.hartman.free.fr/

and here a detailed description of his method.

http://pascal.hartman.free.fr/newcat.html

He also used ASCC-2.5 data.

There are certainly people in the Celestia community that don't care much about accuracy and inherent uncertainties. However, the (astro) physicists among us (including myself) certainly care a lot about that!

So unless you found a method that really allows to estimate inherent (systematic) uncertainties of your star distances, I think Pascal Hartman's star data files would be sufficient. I am sure you know that systematic errors do not in general become smaller by averaging...

Fridger

PS: Actually, Pascal Hartman's original data sets were incompatible in format with the more recent versions of Celestia. If you want to experiment with these stars by Pascal H., you have to download the respective data sets from the Celestia Motherlode. Grant Hutchison fixed the incompatibility.

http://www.celestiamotherlode.net/catal ... eator_id=6

Re: ASCC

Posted: 17.01.2011, 18:11
by starguy84
Sounds like a monumental undertaking... I know you aren't asking for help with the project, but I think a Reduced Proper Motion diagram will help with the superluminous stars. (my apologies if you've already thought of this)

Basically, it's an HR diagram where the 'absolute magnitude' is a 'reduced proper motion' made with the proper motion instead of the parallax. Basically, instead of MV = V + 5 log (parallax)+5, you're solving HV = V + 5 log (proper motion) + 5. There's a pretty good explanation of it (including graphs) in Lepine's first LSPM paper (starting on page 37 of this - http://arxiv.org/pdf/astro-ph/041207v1). Fortunately, the ASCC v3 has reliable proper motions in it, so you have an available source.

The way this works is that distance is related to proper motion: the farther away a star is, the slower its apparent motion across the sky. The giants, subgiants, dwarfs, subdwarfs and white dwarfs will separate out into moderately distinct areas on the graph, because (for the same color and apparent magnitude) the giant will be distant and thus have low proper motion; the dwarf will be nearby and thus have high proper motion. White dwarfs will be much closer than an A star with the same apparent magnitude, and should therefore have higher proper motion. And so on...
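To make the reduced proper motion formula concrete, here is a small Python sketch. It assumes the catalogue gives the two proper motion components in milliarcseconds per year (an assumption about the input units, so check the catalogue description), and the two sample stars are made up.

```python
import math


def total_proper_motion(pm_ra_mas, pm_dec_mas):
    """Total proper motion in arcsec/yr from two components in mas/yr."""
    return math.hypot(pm_ra_mas, pm_dec_mas) / 1000.0


def reduced_proper_motion(v_mag, mu_arcsec_per_yr):
    """H_V = V + 5 log10(mu) + 5, with mu in arcsec/yr."""
    return v_mag + 5.0 * math.log10(mu_arcsec_per_yr) + 5.0


# Two made-up stars with the same V magnitude but very different motions.
fast = reduced_proper_motion(8.0, total_proper_motion(300.0, 400.0))  # mu = 0.5 arcsec/yr
slow = reduced_proper_motion(8.0, total_proper_motion(3.0, 4.0))      # mu = 0.005 arcsec/yr
print(round(fast, 2), round(slow, 2))
```

The slow mover ends up 10 magnitudes "brighter" in H_V than the fast one, which is the kind of separation that pushes giants away from dwarfs of the same color on the diagram.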

Obviously this isn't perfect; if nothing else the blobs merge together... but it's another way to weed out your giants.

Hope that helps... otherwise, it looks like a nice update to the huge ASCC-derived catalogs already up on the Motherlode.

Re: ASCC

Posted: 17.01.2011, 22:06
by ThinkerX
Yes, I am using a reduced proper motion cut as an additional means of weeding out the giants.

Actually, my first step after the initial download from the ASCC (I am doing this in blocks of 9999 stars) is to impose a cut that gets rid of stars with proper motion errors of greater than 50%. The last step involves getting rid of stars that have proper motion way too small for the distance I came up with. However, I am cautious here; stellar proper motion does vary quite radically. Or to put it another way, a slower speed does not always mean a greater distance.
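The two proper motion cuts could be sketched in Python roughly as follows. The post does not say how "way too small" is judged, so the minimum tangential velocity used here is purely a hypothetical stand-in; only the 50% error limit comes from the description above. The conversion v_t = 4.74 * mu * d (km/s, with mu in arcsec/yr and d in parsecs) is the standard one.

```python
def passes_pm_error_cut(pm, pm_error, max_fraction=0.5):
    """Keep a star only if its proper motion error is at most 50% of the motion."""
    return pm > 0 and pm_error / pm <= max_fraction


def tangential_velocity_kms(mu_arcsec_per_yr, distance_pc):
    """Tangential velocity in km/s from proper motion and distance."""
    return 4.74 * mu_arcsec_per_yr * distance_pc


MIN_VT_KMS = 5.0  # hypothetical threshold, not a value from the post


def motion_plausible(mu_arcsec_per_yr, distance_pc, min_vt=MIN_VT_KMS):
    """Flag stars whose motion looks too small for the derived distance."""
    return tangential_velocity_kms(mu_arcsec_per_yr, distance_pc) >= min_vt


print(passes_pm_error_cut(20.0, 8.0), motion_plausible(0.1, 50.0))
```

A single velocity threshold is a blunt tool for exactly the reason given above: stellar space velocities vary a lot, so a slow apparent motion does not always mean a large distance.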

So unless you found a method that really allows to estimate inherent (systematic) uncertainties of your star distances, I think Pascal Hartman's star data files would be sufficient. I am sure you know that systematic errors do not in general become smaller by averaging...

All I can say here is that in the calibrations, the combined averaged V, J, and H distance moduli provided distances accurate to within 20% about 10% more often than any of the individual components. Hmmm...if the V, J, and H distance moduli were accurate for about 100 stars each (much overlap, but not always the same), their combined averaged value would provide accurate distances for about 110 stars. This appeared to be fairly consistent. Much of the time, the combined value would still be accurate - even to within 5%! - if just one of the components was correct. (And about 1% of the time, I would end up with an accurate (within 10%) combined averaged distance where *none* of the three components were even close - though I am inclined to regard this as some sort of bizarre statistical fluke).

Apart from that, with the ASCC downloads, I kept only the stars with B, V, J, H, and K (though I didn't use K) errors of less than 0.1. With considerable hesitation - and only because it appeared to be compensated for - I accepted a magnitude scatter as high as 0.6 (this remains one of my bigger concerns - but I am assuming this applies to and was compensated for in the Tycho B and V magnitudes, and the 2MASS J, H, and K magnitudes appear to be of much higher quality).

On top of that, I also check the distances against others, when such exist. The big one so far is the photometric distances employed in the Geneva-Copenhagen survey (though I have severe doubts about them). Thus far (8000 stars, give or take), when at least semi reliable cross checks exist, I seem to be running fairly close to those numbers. Others include Carney (lot of population II stars back in the 90's), Weis (though his work was mostly confined to red dwarfs), the old Yale Parallax catalog (though their parallaxes are only slightly better than the Tycho ones), and Lee (spectroscopic distances to many stars of great proper motion back in the 80's) plus others. For what it's worth (not much) about 10% of the time my distances are fairly close to the Tycho ones.

Another issue was the low quality or 'roughness' of the bulk of the ASCC spectral types. Hmmm...from my first download (7910 stars out of 9999)

Spect    Start    Finish*
F5         997       576
F6           7         N
F7           2         N
F8        1564       944
F9           1         N
G0         802       393
G1           0         N
G2          48        24
G3           3         N
G4           1         N
G5        1465       442
G6           1         N
G7           2         N
G8         103         3
G9           2         N
K0        1830       138
K1           1         N
K2         821       106
K3           8         N
K4           0         N
K5         249         2
K6           0         N
K7           1         N
K8           2         N

*subject to further proper motion cuts.

'N' means I didn't process those stars, either because there were too few of them or because I didn't have calibrations for them.

Also looks like the number of Giants just explodes past G5.
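To put numbers on that impression, the fraction of stars surviving in each processed spectral type can be worked out directly from the Start and Finish columns. This Python sketch only does that arithmetic; note that the surviving fraction reflects every cut applied, not just the removal of giants.

```python
# (start, finish) counts for the processed spectral types, copied from the table.
counts = {
    "F5": (997, 576),
    "F8": (1564, 944),
    "G0": (802, 393),
    "G2": (48, 24),
    "G5": (1465, 442),
    "G8": (103, 3),
    "K0": (1830, 138),
    "K2": (821, 106),
    "K5": (249, 2),
}

# Fraction of each type that survived the cuts.
kept = {sp: finish / start for sp, (start, finish) in counts.items()}
for sp, fraction in kept.items():
    print(f"{sp}: {fraction:.1%} kept")

total_finish = sum(finish for _, finish in counts.values())
print("stars remaining:", total_finish)
```

The F5 to G2 types keep roughly half or more of their stars, while G5 and later types keep well under a third, which matches the observation about giants past G5.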

I could not find enough G4V, G7V, G9V, or K8V stars with accurate enough distances to make reliable tables. I did make - but am less than thrilled with - the tables for G6 and K7 (I had to do those with only about 50 stars each, less than half normal). I've never been entirely certain I solved all the systematic error problems with the F8, G0, G1, and G2 tables - despite my best efforts, I think their accuracy is probably right around 72-74%. The giants seemed easier to weed out for the K's for whatever reason (and there were far fewer superluminous stars to mess things up); with the calibration tables my accuracy with them was close to 90% (it was also pretty good with the G8V's, but those seem to be nearly absent in the ASCC so far). F5, G3, G5 all stand at around 75%, give or take a point or two.

Overall, it looks to me like a great many stars were 'folded' into neighboring spectral types. From what I've seen in subsequent downloads, the above ratios tend to hold fairly true. If the true spectral type is only one notch different (say G1 instead of G2) then my distances are probably still good. I ran a number of tests towards the end of the calibrations where I would apply the numbers for stars of say G0 on the F8 table (with average absolute magnitudes about 0.4 higher) and still came up with accurate distances for about 75% of what the numbers for that table would. Now...if the star in question is G0 in the ASCC and the real spectral type is G5...that would be more serious.

Re: ASCC

Posted: 19.01.2011, 19:31
by starguy84
I think we can make a rough crack at estimating your individual and systematic errors... It can be done based on your results or by error propagation, and because my statistics are not strong I prefer the after-the-fact results method. Besides, if we compare results, we don't have to track down and quantify every single source of error in your calibration tables.

As Fridger pointed out, systematic errors do not disappear when averaged, but individual ones do.

So, if you take the standard deviation of (HIPPARCOS MV - your MV) for every star in ASCC that has a good parallax (10%?), that should give a measure of the systematic errors, or at least some idea of the spread of the data around your fit. Convert that magnitude standard deviation to a percentage difference and you have your systematic error... Another (easier?) way to calculate the systematic errors would be to take the standard deviation of abs(hipparcos distance-your distance)/hipparcos distance. Anyway, I get the impression you've already done this.

As for individual errors, Henry et al. 2004 take the standard deviation of the different photometric distances (in their case, 12 color relations; in yours, 3?) as the error; I think Weis 1984 quotes a similar number for his two relations.

The total resulting error on a distance should be sqrt((distance*systematic%)^2 + individual^2).
I believe, with my limited knowledge of statistics, that the last equation is correct as long as the systematic errors are independent of the individual errors. In theory, they should be; the systematic error describes how well your fit works given perfect data, and the individual error describes the errors in the measurement of the particular star. In practice, they probably aren't, but you should be able to test this too; your distances should be within 2-sigma of the Hipparcos distances 95% of the time: abs(hipparcos distance - your distance) < 2 * (hipparcos error + total resulting error).
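Here is a rough Python sketch of the recipe above, under the stated assumption that the systematic and individual errors are independent. The function names and the sample numbers are made up for illustration; the 10% systematic fraction in particular is just a placeholder.

```python
import math
import statistics


def systematic_fraction(hip_distances, phot_distances):
    """Standard deviation of abs(Hipparcos - photometric) / Hipparcos."""
    rel = [abs(h - p) / h for h, p in zip(hip_distances, phot_distances)]
    return statistics.stdev(rel)


def individual_error(band_distances):
    """Scatter of the per-band distances for one star (e.g. V, J, H)."""
    return statistics.stdev(band_distances)


def total_error(distance, systematic_frac, individual):
    """sqrt((distance * systematic%)^2 + individual^2)."""
    return math.hypot(distance * systematic_frac, individual)


def within_two_sigma(hip_distance, hip_error, distance, total_err):
    """The 2-sigma consistency check against the Hipparcos distance."""
    return abs(hip_distance - distance) < 2.0 * (hip_error + total_err)


# Made-up numbers for one star: V, J and H distances in parsecs.
bands = [48.0, 52.0, 50.0]
d = statistics.mean(bands)
err = total_error(d, 0.10, individual_error(bands))
print(round(d, 1), round(err, 2), within_two_sigma(55.0, 1.0, d, err))
```

Counting how often `within_two_sigma` comes back true over the whole calibration sample would be the test mentioned above: it should be close to 95% if the error estimate is reasonable.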

As for those spectral types, I don't think Cannon defined any more than F5, F8, G0, G2, G5, G8, K0, K2, K3 and K5... any other types were interpolated and developed by Morgan, Keenan, and Kellman in 1943 (that was the one that added I-V for luminosity classes) to mostly match Cannon's types, but they are not quite the same. So, there will be some inherent uncertainty in the actual spectral type because your 'dead average' absolute V magnitudes from VizieR are probably all using the MKK sequence, while ASCC is apparently using a mix of MK(K) and HD types. There are still no K6 or K8 or K9 types though, K7 was defined by Keenan & McNeil (1976) as being halfway between K5 and M0.

There are also some limits to how far you can push spectral types, anyway. I have personally found them to only really be accurate to one defined type, but I work with M stars that are a lot more confusing to get spectral types for.

Re: ASCC

Posted: 19.01.2011, 20:24
by starguy84
Oh, and obviously, comparing YPC parallaxes works too, if you've got them. There's a wide range of accuracy in the YPC, but they were compiling 160 years of parallaxes.

I also think your assumption that G0 stars won't turn out to be G5V stars is reasonable, and your tests that change the spectral types are reassuring. There will always be some oddballs (maybe the telescope was pointed at the wrong star, who knows?) but by and large I'd agree. The giants were probably easier to weed out of the K stars because there's more magnitudes of separation between K dwarfs and K giants (and thus more proper motion difference)... the giant branch turnoff is indeed near F/G stars, so what you're seeing there is also real.

My biggest problem with your method is the scope- by limiting yourself to only stars with spectral types, you're limiting yourself to only a few thousand stars, where Pascal Hartman already did over a million with purely photometric criteria. It'll be worth it if you can prove your distances are more accurate than his.

Re: ASCC

Posted: 20.01.2011, 05:03
by ThinkerX
Thanx, Starguy. I am very interested in this.

So, if you take the standard deviation of (HIPPARCOS MV - your MV) for every star in ASCC that has a good parallax (10%?), that should give a measure of the systematic errors, or at least some idea of the spread of the data around your fit. Convert that magnitude standard deviation to a percentage difference and you have your systematic error... Another (easier?) way to calculate the systematic errors would be to take the standard deviation of abs(hipparcos distance-your distance)/hipparcos distance. Anyway, I get the impression you've already done this.

I'm not sure. What I ultimately did - out of sheer frustration - was to directly match the distance modulus from the Hip parallaxes in the calibration tables with all four of the distance moduli from my averaging scheme. I took the Hip based Distance Modulus (which I assumed to be fully accurate), subtracted the spectroscopic DM's, and then, because I didn't want to deal with negative numbers, added one to the result. Hence 1.0 would be a dead match, and each 0.1 of difference thereafter would be roughly a 5% difference in distance. I counted 1.0 to 1.1 and 1.0 to 0.9 as being accurate to within 5%, 0.8 - 0.9 and 1.1 - 1.2 as being accurate to within 5-10%, etc. To keep things visually simple (and avoid debilitating eyestrain) I actually inserted fields with little 't's or 'h's or what not in them: one such letter if the match was within 15-20%, 2 letters if the match was within 10-15%, etc. I did this for each of the V, J, and H moduli, along with the combined value.
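The "0.1 of difference is about 5% in distance" rule follows from the distance modulus itself, since a difference of delta magnitudes changes the distance by a factor of 10^(delta/5). A small Python sketch of the scoring, with function names invented for the example:

```python
def match_score(hip_dm, phot_dm):
    """Hip modulus minus photometric modulus, plus one; 1.0 is a dead match."""
    return hip_dm - phot_dm + 1.0


def percent_distance_error(hip_dm, phot_dm):
    """Percent by which the larger distance exceeds the smaller one."""
    ratio = 10 ** (abs(hip_dm - phot_dm) / 5.0)
    return (ratio - 1.0) * 100.0


# Percent distance error for modulus differences of 0.1 to 0.4 magnitudes.
for delta in (0.1, 0.2, 0.3, 0.4):
    print(delta, round(percent_distance_error(delta, 0.0), 1))
```

The steps come out slightly under 5% each at first and creep up with larger differences, so the 5%-per-0.1 bands are a fair approximation over the range used here.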

Does that constitute calculating 'standard deviation'? My observation, then and now, was that as long as the star in question was not a close double, variable, underluminous, or superluminous then 90% of the time my distance would be accurate to within 20%, and 75% of the time (give or take a point or two) would be accurate to within 15%. (And within 5% well over half the time.)

The superluminous stars in particular perplex me. How can they be over a full magnitude brighter than the 'average' and still be class V stars? Isn't that verging on subgiant territory? And are there really that many of them (something like 10 - 15% of the total)?

One weird thing I did note which baffles me: both the really underluminous stars (the ones my scheme places at about double their actual distances) and the superluminous stars (which my scheme puts at half or less of their actual distance) tended to have almost but not quite identical J, H, V-J, and J-H values. So if I didn't already know the distance, I could look at the J, H, V-J, or J-H info for such a star and possibly tell it wouldn't work, but I wouldn't be able to tell if it was superluminous or underluminous. That in turn meant that I couldn't attempt a photometric distance - though with the addition of a proper motion evaluation...hmmm...

In a few of the tables I experimented with adding in the K's (which I dropped because the K magnitudes are so close to the H Magnitudes it actually skewed the whole table). I have wondered now and again if the K magnitude might have been a better choice instead of H for these calculations, but by the time that issue arose I already had a great deal of work done involving the H magnitudes, so I stuck with them. I also tried - and gave up on - a number of other schemes, some involving averaging, some not. The only one I actually retained applies to stars of F5 - G3; with those stars I count the J and H Distance Moduli twice instead of once and divide by five instead of three, because they tended to be more accurate for at least the bottom end of the superluminous stars, and still reasonably close for the more normal stars. This resulted in maybe a 5% improvement in the quality of the distances to some of those pesky superluminous stars.
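In code, the plain average and the F5 - G3 variant might look something like this Python sketch. The function names and sample moduli are made up; the only things taken from the description are the weights (J and H counted twice, divided by five) and the usual relation between distance modulus and distance in parsecs.

```python
def combined_modulus(dm_v, dm_j, dm_h, early_type=False):
    """Average of the V, J and H distance moduli.

    For F5 - G3 stars (early_type=True) J and H are counted twice.
    """
    if early_type:
        return (dm_v + 2.0 * dm_j + 2.0 * dm_h) / 5.0
    return (dm_v + dm_j + dm_h) / 3.0


def distance_pc(distance_modulus):
    """Distance in parsecs: d = 10^(DM/5 + 1)."""
    return 10 ** (distance_modulus / 5.0 + 1.0)


# Made-up moduli where V disagrees with J and H.
plain = combined_modulus(3.0, 3.3, 3.3)
weighted = combined_modulus(3.0, 3.3, 3.3, early_type=True)
print(round(plain, 2), round(weighted, 2), round(distance_pc(weighted), 1))
```

The weighted version pulls the result toward the J and H values, which is the intended effect when those two bands are the more trustworthy ones.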

As for individual errors, Henry et al. 2004 take the standard deviation of the different photometric distances (in their case, 12 color relations; in yours, 3?) as the error; I think Weis 1984 quotes a similar number for his two relations.

The total resulting error on a distance should be sqrt((distance*systematic%)^2 + individual^2).
I believe, with my limited knowledge of statistics, that the last equation is correct as long as the systematic errors are independent of the individual errors. In theory, they should be; the systematic error describes how well your fit works given perfect data, and the individual error describes the errors in the measurement of the particular star. In practice, they probably aren't, but you should be able to test this too; your distances should be within 2-sigma of the Hipparcos distances 95% of the time: abs(hipparcos distance - your distance) < 2 * (hipparcos error + total resulting error).

I will have to give this some thought. I would really like to be able to attach some sort of error bar to my distances, and had thought I would have to settle for just flagging the ones most likely to be in error. I will have to give some thought as to just exactly what keys to push on the computer to get a decent error bar scheme to work.

Secondary note here: You are familiar with Weis's work then? I was under the impression that nearly all memory of his work had vanished, even though he probably provided distances to at least as many stars as Gliese, and possibly more accurate distances much of the time as well.

As for those spectral types, I don't think Cannon defined any more than F5, F8, G0, G2, G5, G8, K0, K2, K3 and K5... any other types were interpolated and developed by Morgan, Keenan, and Kellman in 1943 (that was the one that added I-V for luminosity classes) to mostly match Cannon's types, but they are not quite the same. So, there will be some inherent uncertainty in the actual spectral type because your 'dead average' absolute V magnitudes from VizieR are probably all using the MKK sequence, while ASCC is apparently using a mix of MK(K) and HD types. There are still no K6 or K8 or K9 types though, K7 was defined by Keenan & McNeil (1976) as being halfway between K5 and M0.

This explains a great deal. I knew the I-V classes were a later addition, but was not aware that Cannon's types were so limited. However, some of my later ASCC downloads do include K8 type stars.

There are also some limits to how far you can push spectral types, anyway. I have personally found them to only really be accurate to one defined type, but I work with M stars that are a lot more confusing to get spectral types for.

Something else I am only too well aware of. Burnham, in his Celestial Handbook, also complained about the 'authorities' assigning sometimes radically different spectral types to the same star.

Oh, and obviously, comparing YPC parallaxes works too, if you've got them. There's a wide range of accuracy in the YPC, but they were compiling 160 years of parallaxes.

This gets into another project of mine from a few years ago. I started to wonder just how accurate or not some of the parallaxes with *really* high error bars were, so I went and rounded up several hundred of the old Yale parallaxes with high error bars and split them into groups based on that. (One group with errors of around 15%, another group with errors of around 20%, all the way up to and including groups where the errors were greater than the parallaxes themselves). I then went and dug up the best Hip parallaxes for those same stars - ones with errors of less than 5% where possible. It turned out as I recollect (I'll have to see if I can't find my notes on this somewhere) that once the error bar for a given parallax gets past 20%, then the odds of the parallax being correct drop pretty dramatically. Or...if a parallax has an error bar of 25%, there is a 50% chance the parallax is wrong - and sometimes wrong past the limits of the error bar - meaning instead of being off by 25%, it could be off by something like 30 - 40%. Once the error bar hits 50% of the parallax, the odds of the parallax being right drop to about one in three...but weirdly enough, that is about as low as it goes. Even if the error bar tops 100%, the parallax still has about a 30% chance (more or less) of being right (or at least to within 15% or so) - but that also means it has about a 70% chance of being wrong. At least that was the conclusion I reached back then - as I recollect, I was trying to decide how far to trust the Tycho parallaxes. (Answer - not very far).

I also think your assumption that G0 stars won't turn out to be G5V stars is reasonable, and your tests that change the spectral types are reassuring. There will always be some oddballs (maybe the telescope was pointed at the wrong star, who knows?) but by and large I'd agree. The giants were probably easier to weed out of the K stars because there's more magnitudes of separation between K dwarfs and K giants (and thus more proper motion difference)... the giant branch turnoff is indeed near F/G stars, so what you're seeing there is also real.

Thanx. I was wondering about that. It seemed that some of the...limits... I had to impose on V-J and J-H for F8-G2 stars were...not quite where they should be, compared to the rest. I spent a lot of time puzzling over that.

My biggest problem with your method is the scope- by limiting yourself to only stars with spectral types, you're limiting yourself to only a few thousand stars, where Pascal Hartman already did over a million with purely photometric criteria. It'll be worth it if you can prove your distances are more accurate than his.

Actually, this is just part one. I have played around with these B-V, V-J, J-H numbers and the rest for so long, and have defined ranges for them for so many spectral types, I think I could almost do this *without* knowing the spectral type. Basically, pick a spectral type, and then pull the stars that fall within my B-V, V-J, and J-H criteria for stars of that spectral type, and run the numbers from there. I actually did a few preliminary calibration tables on this; it seems that about one third to one half of the stars summoned in such a manner will be of the desired spectral type, and another one third or so will be of the immediately adjoining spectral types to either side, which my prior tests show my system can handle relatively well. The big concern would be overlaps; for example many G5V and G3V stars fall within the same overlapping B-V, V-J, and J-H criteria, meaning the same star could appear in more than one set, with differing distance values. I'd probably get around this by making sure the B-V criteria at least did not overlap. Anyhow, I have a suspicion I could probably add another 40,000 - 50,000 stars to the total, with the added dubious bonus of very rough spectral types. But that will probably be a project for next winter at the earliest.
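A minimal Python sketch of that idea, purely as an illustration: every color range below is a hypothetical placeholder (the real ranges would come from the calibration tables), and the windows are treated as half-open so that a star sitting exactly on a shared B-V edge is assigned to only one type.

```python
# Hypothetical color windows (half-open: low <= value < high).
WINDOWS = {
    "G3": {"bv": (0.62, 0.66), "vj": (1.05, 1.25), "jh": (0.24, 0.36)},
    "G5": {"bv": (0.66, 0.70), "vj": (1.10, 1.30), "jh": (0.26, 0.38)},
}


def candidate_types(bv, vj, jh):
    """Spectral types whose B-V, V-J and J-H windows all contain the star."""
    colors = {"bv": bv, "vj": vj, "jh": jh}
    return [
        sp for sp, win in WINDOWS.items()
        if all(win[key][0] <= colors[key] < win[key][1] for key in colors)
    ]


def bv_windows_overlap(windows):
    """True if any two B-V windows overlap."""
    spans = sorted(win["bv"] for win in windows.values())
    return any(prev[1] > nxt[0] for prev, nxt in zip(spans, spans[1:]))


print(candidate_types(0.64, 1.15, 0.30), bv_windows_overlap(WINDOWS))
```

Because the V-J and J-H windows here do overlap between the two types, it is the non-overlapping B-V windows that keep a star from landing in both sets.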

All that said, aside from Pascal Hartman's efforts (which I did not know about until coming across this site a week or so ago), I am familiar with two previous attempts to identify dwarf stars in the Tycho/ASCC. The first was by Turnbull and Tarter, as part of their second 'HabCat' Catalog. They used proper motion and B-V cuts to define and pull a couple hundred thousand stars they believed to be dwarfs of F5 - K5ish, which they refined further using something called a Gaussian random number generator. Their initial proper motion cut reduced the number of dwarf stars by about half. However, they didn't try to determine actual distances.

The other effort (was it by Aemons or Timmons? can't remember right off) did try to determine distances, using proper motion as a proxy for distance. My personal view is it didn't work very well; the positive and negative error bars for the distances tend to run at about 90% of the distance itself. Because they did the whole ASCC - including the Hip catalogue - I was able to compare many of their distances with near perfect Hip ones; most of the time they were not even close. However, I agree with their results in the broad statistical sense; they claimed something on the order of 600,000 dwarfs in the Tycho/ASCC; when you compare that with Turnbull and Tarter's work and allow for their more stringent criteria the conclusion is about the same -

- there is something on the order of 3-4 giant stars to every dwarf in the Tycho/ASCC. (And my work so far tends to support this).

Pascal Hartman, though...provided distances to over a million Tycho/ASCC stars, on the assumption that the vast majority were dwarfs, which to me doesn't seem to hold up.

Re: ASCC

Posted: 20.01.2011, 08:29
by starguy84
Does that constitute calculating 'standard deviation'?

Not quite, although it's on the right lines. You're right on the basic idea that you don't want negative numbers, but scientists like to force the issue; the two ways to force a number to be positive are to square it, or take the absolute value. The Standard Deviation is the square root of the sum of the squares of the errors divided by the number of measurements (the division happens inside the square root); the Absolute Deviation is the sum of the absolute values of the errors, divided by the number of measurements. They basically do the same thing, although the mean absolute deviation is never larger than the standard deviation (Wikipedia has good articles on the forms of these).

I use scientific software packages that have a stddev() function in them/written for them, so I usually just use that. I don't know what you're using, but Excel and OpenOffice Calc have stddev functions too.
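As a rough illustration of the two definitions, here is a minimal Python sketch. The error values are made up for the example, and the function names are illustrative rather than taken from any of the packages mentioned.

```python
import math

def standard_deviation(errors):
    """Population standard deviation: sqrt of the mean squared deviation."""
    mean = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))

def mean_absolute_deviation(errors):
    """Mean of the absolute deviations from the mean."""
    mean = sum(errors) / len(errors)
    return sum(abs(e - mean) for e in errors) / len(errors)

# Made-up distance errors in parsecs, for illustration only.
errors = [-3.0, -0.5, 0.0, 1.0, 2.5]
sd = standard_deviation(errors)
mad = mean_absolute_deviation(errors)
print(f"standard deviation = {sd:.3f}, mean absolute deviation = {mad:.3f}")
```

Note that spreadsheet STDEV functions typically divide by n-1 rather than n (the sample form), so they will give a slightly larger number than this sketch for small samples.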

How can they be a over a full magnitude brighter than the 'average' and still be class V stars?

There are a couple of effects that may cause your stars to be superluminous.
1.) The main sequence has width, from metallicity effects and age effects. See this: http://www.astro.wisc.edu/~sparke/ast103/hipparcos_HR.jpg. Note how tall the main sequence is at B-V = 0.6; a star with that color could be anywhere from V=2.5 to V=5.5; around B-V=1.5 it could be V=8 to V=12. Obviously B-V is not the best color to use if you want to uniquely determine a V magnitude.
2a.) Lots of stars are multiple (binaries/trinaries). Duquennoy & Mayor (1991) or Raghavan et al. (2009) show nearly half of all G-type stars are multiple, under careful inspection. That number goes to nearly 100% with O, and only 33% with M stars. An equal luminosity binary can appear about 0.75 magnitudes brighter (2.5 x log10 of 2).
2b.) Possible blended doubles. They might be thousands of light years apart (and very different colors!), but happen to fall very close to each other on the sky.
3.) Bad photometry (either poorly measured, or someone mismatched the Tycho and 2MASS objects)
4.) Bad spectral type.
5.) Bad distance determination method
6.) Variable stars. Less common if they're dwarfs.
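To put a number on item 2a, here is a small Python sketch of how two unresolved stars combine in magnitude. It uses only the standard magnitude-flux relation; the example magnitude is arbitrary.

```python
import math

def combined_magnitude(m1, m2):
    """Magnitude of two unresolved stars with individual magnitudes m1 and m2."""
    flux = 10 ** (-0.4 * m1) + 10 ** (-0.4 * m2)
    return -2.5 * math.log10(flux)

single = 5.0                                  # arbitrary magnitude of one component
blend = combined_magnitude(single, single)    # equal-luminosity pair
brightening = single - blend                  # 2.5 * log10(2), about 0.75 mag

# If the blend is mistaken for a single star, a photometric distance
# comes out too short by this factor (about 1.41, i.e. sqrt(2)).
distance_factor = 10 ** (brightening / 5)
print(f"brightening = {brightening:.3f} mag, distance factor = {distance_factor:.3f}")
```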

Isn't that verging on subgiant territory?
It depends on where you are in the H-R diagram. Notice the position of the giant branch on this chart. http://www.astro.wisc.edu/~sparke/ast103/hipparcos_HR.jpg.

both the really underluminous stars and the superluminous stars tended to have almost but not quite identical J, H, V-J, and J-H values
Odd, don't know where that one comes from, although it may just be an area where the main sequence is tall.

You are familiar with Weis's work then?
Not very, but I know the name from his work on M star photometry, which my research group still uses.

started to wonder just how accurate or not some of the parallaxes with *really* high error bars were
You'll find an overabundance of parallaxes with 15 mas errors; I suspect van Altena assigned 15 mas to any parallaxes published without error (as was sometimes done). Anyway, you'll find YPC most useful for calibrating the red end of your relations (K dwarfs, M dwarfs) which are usually dimmer than abs V ~11, where things started getting too faint for Hipparcos.

I think I could almost do this *without* knowing the spectral type
You probably can; photometry is often more descriptive than spectral type especially when you have multiple filters to choose from. If I were to try this I'd try to find combinations of colors and absolute magnitudes where the main sequence is thin and horizontal (such that any specific color corresponds to a NARROW range of absolute magnitudes, and slight differences in color won't change the magnitude much) and work with those.
Obviously that's a lot of work, hopefully you've got a giant spreadsheet you can just chart up differently.

Re: ASCC

Posted: 21.01.2011, 02:44
by ThinkerX
Not quite, although it's on the right lines. You're right on the basic idea that you don't want negative numbers, but scientists like to force the issue; the two ways to force a number to be positive are to square it, or take the absolute value. The Standard Deviation is the square root of the sum of the squares of the errors divided by the number of measurements (the division happens inside the square root); the Absolute Deviation is the sum of the absolute values of the errors, divided by the number of measurements. They basically do the same thing, although the mean absolute deviation is never larger than the standard deviation (Wikipedia has good articles on the forms of these).

I use scientific software packages that have a stddev() function in them/written for them, so I usually just use that. I don't know what you're using, but Excel and OpenOffice Calc have stddev functions too.

Hmmm...I will have to give this some more thought. I went out of my way to select stars where the errors on the B, V, J, H, and K magnitudes were all under 0.1, and had thought about either averaging all those errors together (end result probably under 0.05 for most of them) or simply adding them all up (except maybe for K, since I don't use it). In the second case, most of them would probably have a cumulative error of around 0.1 - 0.2...but I am not sure if or how to work that into a standard deviation. I was just going to flag the ones that topped 0.1 with the first method or 0.2 with the second method as being suspect, and let it go from there.

Also, my system is designed (by dumb luck and accident, apparently) to provide relatively accurate distances for stars with absolute magnitudes within 0.5 of the dead average value. If I limited the claim to this, I could probably claim an 85% accuracy or 15% error deal without lying too overly much. However, where I would get into problems is if I would have to fess up to the system not working very well, if at all, with stars that have a magnitude difference greater than 0.5 from the 'dead average'.

Your explanation brings to mind why I am so suspicious of the claims made by the folks that put the Geneva Copenhagen survey together. As part of that project, they did 'photometric corrections' to a bunch of Hip stars that had errors over 15% and provided photometric distances to a couple thousand stars not in the Hip, and claimed those results were accurate to within 15% as well. Being younger and more naive then, I accepted that at face value. Sometime after that, though, I came across a very good Gliese parallax (error bar less than 10%) for one of the stars they gave photometric distances to. It didn't mesh all that well, so I started wondering just how accurate the Geneva Copenhagen distances really were, so I made a systematic effort and dug up maybe three dozen reasonably good Gliese, Yale, and passable Tycho parallaxes for some of those stars. Turned out that fully half the time, the Geneva Copenhagen distances were out by something like 50% or more, not 15% as claimed. A little after that, the recalculated Hip was released, and most of their 'photometric corrections' for those stars didn't fare all that well either. Since then, they claim to have revamped their catalogue, presumably including the purely photometric distances, but I still have doubts. I have noted that many of the photometric distances of theirs I have run across so far tend to run very close to the better Tycho parallaxes, which makes me wonder if they are not using the Tycho parallaxes to check their photometric work 'on the sly'.

1.) The main sequence has width, from metallicity effects and age effects. See this: http://www.astro.wisc.edu/~sparke/ast10 ... cos_HR.jpg. Note how tall the main sequence is at B-V = 0.6; a star with that color could be anywhere from V=2.5 to V=5.5; around B-V=1.5 it could be V=8 to V=12. Obviously B-V is not the best color to use if you want to uniquely determine a V magnitude.
2a.) Lots of stars are multiple (binaries/trinaries). Duquennoy & Mayor (1991) or Raghavan et al. (2009) show nearly half of all G-type stars are multiple, under careful inspection. That number goes to nearly 100% with O, and only 33% with M stars. An equal luminosity binary can appear about 0.75 magnitudes brighter (2.5 x log10 of 2).
2b.) Possible blended doubles. They might be thousands of light years apart (and very different colors!), but happen to fall very close to each other on the sky.
3.) Bad photometry (either poorly measured, or someone mismatched the Tycho and 2MASS objects)
4.) Bad spectral type.
5.) Bad distance determination method
6.) Variable stars. Less common if they're dwarfs.

Hmmm... I was aware of severe problems with B-V alone, which is one reason I decided to go with the three magnitude scheme. I had also considered both metallicity and double star issues, though I am under the possibly false impression that metal poor stars would also automatically tend to be variables as well. At any rate, few of the superluminals I looked at were obvious variables or doubles. I didn't consider 3, 4, or 5, because these were all Hip stars which I assumed meant those issues to be fairly well settled.

Here is a snippet of that average absolute magnitude/B-V chart I found at Vizier:

Spect   AbV(III)   B-V(III)   AbV(V)   B-V(V)
F8      1.034      0.568      4.000    0.52
F9      0.958      0.609      4.216    0.55
G0      0.900      0.65       4.4      0.58
G1      0.874      0.692      4.557    0.607
G2      0.874      0.733      4.7      0.63
G3      0.886      0.772      4.838    0.641

etc....

....thing is, if you go by B-V alone, like Pascal Hartman did, you can get into serious trouble real fast. Is that star with a B-V of 0.61 a G1V dwarf...or an F9III giant? (Both sit at about 0.61 in the chart above.) A proper motion check might help...or it might not, both because the PM data for the ASCC has severe problems (errors higher than the values for some of them) and because you might be looking at an F9III star that is moving a bit faster than normal, or a G1V star that is moving a bit more slowly.

For myself, I found the parts of the chart I used to be accurate enough. (I spent a great deal of time tweaking all of the numbers in the various calibration tables, including these. I was somewhat suspicious of the average absolute magnitudes he gave for F8 and K0, but eventually went with them anyhow). I also found myself wondering just how he assembled enough data to put parts of the chart together (like hardly any G9 stars, for example, and just how many class I stars are there in Hip).

And because you are into M stars for a living:

Spect   AbV(III)   B-V(III)   AbV(V)   B-V(V)
K9      -0.351     1.551      8.415    1.359
M0      -0.400     1.570      8.8      1.4
M1      -0.512     1.601      9.295    1.443
M2      -0.600     1.644      9.900    1.49
M3      -0.578     1.693      10.607   1.539

Don't know if those numbers look right to you or not. I think it actually runs down to M6 or some such, but I didn't copy that much of it.

Not very, but I know the name from his work on M star photometry, which my research group still uses.

Hmmm...wild guess here (if I'm out of bounds, let me know) --- 'N Stars project' or 'Meeting the Cool Neighbors'? Anyhow, it is heartening to hear that Weis hasn't fallen into complete obscurity.

One other item that might interest you...so far I have turned up a large number of stars (triple digits), K0 to K7, not in any parallax or photometric distance catalogue I am familiar with except Tycho, that appear to be within 50 parsecs. A good twelve or fifteen appear to be within 25 parsecs (and a couple are within 15 parsecs). Many have proper motion a bit lower than what you'd expect for stars so close...but the calibration tables also had quite a few stars of the same spectral class at comparable distances with even lower proper motions. Even if those were discounted, I still have probably half a dozen at least within twenty parsecs that do have proper motions right where they should be, and ...call it several dozen more... within 50 parsecs.

Re: ASCC

Posted: 23.01.2011, 22:17
by starguy84
I went out of my way to select stars where the errors on the B, V, J, H, and K magnitudes were all under 0.1
That's probably a wise and defensible position... As you'll see below, trying to include errors on the magnitudes makes the problem MUCH more difficult, and in any case, Henry 2004 did not consider errors on the magnitudes. I'm not saying that analysis there is 100% correct, but you're pushing into territory successful refereed journal papers haven't.

However, where I would get into problems is if I would have to fess up to the system not working very well, if at all, with stars that have a magnitude difference greater than 0.5 from the 'dead average'.
In a previous post, you suggested that your pure B-V color selection boxes for G3 and G5 overlapped... what if you just push those with differences greater than 0.5 into the next box? Does that work? Or do your relations not work that way?

I have to confess, I also hadn't read the Geneva-Copenhagen survey thoroughly enough to notice that they did photometric corrections... I was always told (in classes, too!) that they used Hipparcos parallaxes of F and G dwarfs alone. Nevertheless, there it is... It does look like, past their 13% limit, Hipparcos parallaxes simply aren't reliable at ALL.

And because you are into M stars for a living:
I don't know about anything but the M dwarfs, but those numbers look about right; the limiting absolute V magnitude for an M dwarf is somewhere between 8.5 and 9 with B-V = 1.4, so that's ok.

Anyway, some math that will help you get hard numbers out of your data so you can concretely say "this is right 45.6% of the time" or even select out only the best data for your Celestia add-on. I hope my interpretation of your project is right. And that my math is right... this really pushes the bounds of my knowledge of statistics.

You have (input data):
B - Johnson B either from converted Tycho Bt, or other ASCC catalog
V - Johnson V either from converted Tycho Vt, or other ASCC catalog
J - 2MASS J from 2MASS
H - 2MASS H from 2MASS
K - 2MASS K-short from 2MASS
1-sigma errors on all the above magnitudes, quoted in magnitudes

SpectralType - usually in the HD system, apparently.
we can guess at an error of +/- 1 type as entered in your calibration tables

PMra - mas/yr, ICRS reference frame
PMdec - mas/yr, ICRS reference frame
1-sigma errors on those measurements, quoted in mas/yr

You have (based on calibration tables) (and correct me if I'm wrong):
distance1 as a function of (B, V, spectral type)
distance2 as a function of (V, J, spectral type)
distance3 as a function of (J, H, spectral type)
distance4 as a function of (filter7,filter8, spectral type)
hipparcosDistance high-quality trigonometric parallax distances from Hipparcos/YPC?
giant/nongiant as a function of a proper motion diagram (pmra,pmdec,filter9,filter10)

You need:
realDistance - The weighted mean average of the spectrophotometric distances for a particular star
realDistanceError - combined systematic error and individual error for that star.

-----------------------------------------------------------------------------------------------------

My previous suggestions on errors ran something like this:

1. Systematic error (this is only dependent on the accuracy of the fit itself, assuming the data is perfect):
Method 1: Calculate your realDistance(s) for all Hipparcos stars you can find and THEN find out how well it agrees with Hipparcos parallaxes (one error calculation):

Code:

realDistance = (distance1 + distance2 + distance3 + distance4) / 4
systematicDistanceError = stddev( (hipparcosDistance - realDistance) / hipparcosDistance )

(Note that hipparcosDistance and realDistance are actually a long string of values)
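Method 1 can be sketched in Python roughly as follows. The star distances here are invented purely for illustration, and `method1` is a hypothetical helper name; the sample standard deviation (dividing by n-1) is an assumption, since the post does not say which form is meant.

```python
import statistics

def method1(distance_sets, hipparcos_distances):
    """Average the per-colour distances for each star, then measure the
    scatter of the fractional difference from the Hipparcos distance.

    distance_sets: one tuple (distance1..distance4) per star, in parsecs.
    hipparcos_distances: matching trigonometric distances, in parsecs.
    """
    real = [statistics.fmean(ds) for ds in distance_sets]
    fractional = [(h - r) / h for h, r in zip(hipparcos_distances, real)]
    return real, statistics.stdev(fractional)

# Invented calibration sample: three stars, four distances each.
hip = [20.0, 35.0, 50.0]
sets = [(19.0, 21.0, 20.5, 19.5),
        (36.0, 38.0, 37.0, 37.0),
        (47.0, 49.0, 48.0, 48.0)]
real_distances, scatter = method1(sets, hip)
print(real_distances, round(scatter, 4))
```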

Method 2: Get spectrophotometric distances for each color (four calculations) for all the Hipparcos stars. This would allow you to weight each color-distance relation by how well it works:

Code:

systematicDistanceError1 = stddev( (hipparcosDistance-distance1)/hipparcosDistance )
systematicDistanceError2 = stddev( (hipparcosDistance-distance2)/hipparcosDistance )
systematicDistanceError3 = stddev( (hipparcosDistance-distance3)/hipparcosDistance )
systematicDistanceError4 = stddev( (hipparcosDistance-distance4)/hipparcosDistance )

Now your calculation for realDistance looks like this:

Code:

realDistance = ( distance1/systematicDistanceError1 + distance2/systematicDistanceError2 + ... )
               / ( 1/systematicDistanceError1 + 1/systematicDistanceError2 + ... )
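A minimal Python sketch of that weighted mean is below. The default follows the formula as written (weights of 1/error); the `power` parameter is an addition for illustration, since the conventional inverse-variance weighting uses 1/error^2 instead. All numbers are invented.

```python
def weighted_distance(distances, errors, power=1):
    """Weighted mean of the distances, with weights 1 / error**power.

    power=1 reproduces the formula as posted; power=2 gives the usual
    inverse-variance weighting.
    """
    weights = [1.0 / e ** power for e in errors]
    return sum(w * d for w, d in zip(weights, distances)) / sum(weights)

# Invented example: four colour-based distances (pc) and the fractional
# scatter of each relation against Hipparcos.
distances = [30.0, 33.0, 36.0, 31.0]
errors = [0.10, 0.15, 0.30, 0.12]
print(weighted_distance(distances, errors))           # weights 1/error
print(weighted_distance(distances, errors, power=2))  # weights 1/error^2
```

With equal errors either choice reduces to the plain average, which is a quick sanity check on a spreadsheet implementation.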


2. Individual error:
If using method 1 (above) the individual error (per star) is

Code:

individualError = stddev( distance1, distance2, distance3, distance4)

and the final combined error is

Code:

realError = sqrt( individualError^2 + systematicDistanceError^2 )
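One practical wrinkle: systematicDistanceError above is a fraction of the distance, while the stddev of the individual distances is in parsecs, so the two need the same units before being combined. The sketch below assumes the individual scatter is converted to a fraction of the mean distance first; that conversion is an assumption for illustration, not something stated in the post.

```python
import math
import statistics

def combined_fractional_error(distances, calibration_scatter):
    """Quadrature sum of a star's own scatter and the calibration scatter.

    distances: the per-colour distances for one star, in parsecs.
    calibration_scatter: fractional scatter against Hipparcos (e.g. 0.12).
    Returns a fractional error on the mean distance.
    """
    mean = statistics.fmean(distances)
    individual = statistics.stdev(distances) / mean  # assumed conversion to a fraction
    return math.sqrt(individual ** 2 + calibration_scatter ** 2)

# Invented example: one star, 12% calibration scatter.
print(combined_fractional_error([28.0, 30.0, 32.0, 30.0], 0.12))
```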


if using method 2 (also above) the individual error is a weighted standard deviation function, which I have to confess I just looked up here:
http://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weightsd.pdf. It looks like a bit of a beast.
Figuring out how to get the realError confuses me. My best guess is that you should do it as in method 1 using the smallest systematicDistanceError of the four, because we weighted the other three measurements less so that they all (maybe?) contribute the same amount of error to the final answer. This is basically where my knowledge of statistics completely breaks down.

Code:

realError = sqrt( individualError^2 + systematicDistanceError# ^2)


-----------------------------------------------------------------------------------------------------------------------

If you really want to go the route of distinguishing the errors for stars with JHK errors > 0.1, it gets even more complicated.

We've thus far been assuming that distance1, distance2, distance3 and distance4 are infinitely precise, but they should have errors dependent on each of their two filters AND their spectral type. To do this right (I think) you would need to determine, for example, what effect changing only BError has on the distance determination; the same for VError, and SpectralType+/-1 (and while I'm at it, there's the question of how likely this thing is to be a giant given your reduced proper motion diagram... if distance error bars can even be attached to that). Hopefully it's a linear relationship such that BError = 1 has five times the effect of BError= 0.2... Add those errors in quadrature:

Code:

distance1Error = sqrt(BDistError^2+VDistError^2+SptypeDistError^2)
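One way to get the individual terms is to nudge each input by its error and see how far the distance moves. The Python sketch below does this for B and V only, using a toy linear colour-magnitude relation with placeholder coefficients (not anyone's actual calibration); the spectral type term is left out for brevity.

```python
import math

def distance_from_bv(b, v, m0=5.1, c0=0.68, slope=5.0):
    """Toy relation: absolute V = m0 + slope * ((B-V) - c0); distance in parsecs.
    m0, c0 and slope are placeholder values for illustration."""
    abs_v = m0 + slope * ((b - v) - c0)
    return 10 ** ((v - abs_v + 5) / 5)

def propagated_error(b, v, b_err, v_err):
    """Shift each magnitude by its 1-sigma error in turn and add the
    resulting distance changes in quadrature."""
    d0 = distance_from_bv(b, v)
    b_dist_err = distance_from_bv(b + b_err, v) - d0
    v_dist_err = distance_from_bv(b, v + v_err) - d0
    return math.sqrt(b_dist_err ** 2 + v_dist_err ** 2)

# Invented star: B = 8.68, V = 8.00, with 0.05 mag errors on each.
print(distance_from_bv(8.68, 8.00), propagated_error(8.68, 8.00, 0.05, 0.05))
```

Because V enters both the colour and the distance modulus, its two effects partly cancel or reinforce depending on the slope, which is one reason to perturb the inputs rather than the colours.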

Once that's done, you now have a distance1Error to deal with. It should affect the weights of the distances in the realDistance determination (so method #2 gets even more complicated) and the values of the individualError calculation.

The problems don't stop there, though; all the distance#Error are dependent on spectral type (and have some filters in common) so they are NOT independent measurements, and you can't add things in quadrature (like the final realError calculation). I'm not actually sure HOW to do it, really.

Re: ASCC

Posted: 24.01.2011, 08:10
by ThinkerX
That's probably a wise and defensible position... As you'll see below, trying to include errors on the magnitudes makes the problem MUCH more difficult, and in any case, Henry 2004 did not consider errors on the magnitudes. I'm not saying that analysis there is 100% correct, but you're pushing into territory successful refereed journal papers haven't

That is encouraging...I hope.

In a previous post, you suggested that your pure B-V color selection boxes for G3 and G5 overlapped... what if you just push those with differences greater than 0.5 into the next box? Does that work? Or do your relations not work that way?

There is some overlap, but maybe not quite in the way you are thinking of. B-V, V-J, and J-H, are used to modify average absolute magnitudes for V, J, and H, respectively. Hmm... a snippet of the master chart.

These are the values I use to fix the initial limits

Spect   B-V min   B-V max   V-J min   V-J max   J-H min   J-H max
G2      0.57      0.685     1.01      1.26      0.24      0.415
G3      0.59      0.736     1.015     1.25      0.18      0.44
G5      0.61      0.826     1.067     1.43      0.24      0.48

You'll notice that the values for G2 seem to be a bit off; all I can say is I reached those values through a great deal of trial and error and comparison. But - those numbers just set the limits. For the distances themselves, I use these dead average values in conjunction with the actual B-V, V-J, and J-H.

Spect   V       B-V     J       V-J     H       J-H
G2      4.7     0.63    3.52    1.145   3.294   0.286
G3      4.838   0.649   3.674   1.159   3.378   0.294
G5      5.1     0.68    3.755   1.225   3.498   0.323

What matters at this point are the magnitudes that are chosen, and those are chosen based on the star's spectral type. Getting these numbers involved a great deal of tedious arithmetic plus a great deal of trial and error.

So...for a G5 star, the formula I use would look like this:

Absolute V Mag = 5.1-(5*0.68)+(5*BmV)
Absolute J Mag = 3.755-(2.5*1.225)+(2.5*VmJ)
Absolute H Mag = 3.498-(2.5*0.323)+(2.5*JmH)

These numbers are then subtracted from the normal V, J, and H magnitudes, giving me a distance modulus for each value, which I then average together for the final, actual distance modulus.
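The G5 recipe above can be written out as a short Python sketch. The three absolute magnitude formulas are copied from the post; the last step, converting the averaged distance modulus to parsecs with d = 10^(modulus/5 + 1), is the standard relation and ignores interstellar extinction. The example star is invented.

```python
def g5_distance(b, v, j, h):
    """Distance in parsecs for a G5 dwarf from B, V, J, H apparent magnitudes."""
    bmv, vmj, jmh = b - v, v - j, j - h

    # Dead-average absolute magnitudes, adjusted by the star's actual colours.
    abs_v = 5.1 - (5 * 0.68) + (5 * bmv)
    abs_j = 3.755 - (2.5 * 1.225) + (2.5 * vmj)
    abs_h = 3.498 - (2.5 * 0.323) + (2.5 * jmh)

    # One distance modulus per band, then the average of the three.
    moduli = [v - abs_v, j - abs_j, h - abs_h]
    mean_modulus = sum(moduli) / len(moduli)
    return 10 ** (mean_modulus / 5 + 1)

# Invented star with exactly the dead-average G5 colours and V = 8.0.
v = 8.0
b = v + 0.68
j = v - 1.225
h = j - 0.323
print(round(g5_distance(b, v, j, h), 1))  # roughly 39 pc
```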

The field names I use have an `m' instead of a minus sign as a sort of shortcut.

Yes, I can and have run the numbers for G5 on G3 - and gotten accurate distances for just about all the stars up to and a bit over the average absolute magnitude for G3.
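The overlap between the selection limits in the first chart can be checked directly. This Python sketch tests a star's colours against the G2, G3 and G5 limits quoted above; treating the limits as inclusive is an assumption, and the example colours are invented.

```python
# Selection limits (min, max) copied from the first chart above.
LIMITS = {
    "G2": {"BmV": (0.57, 0.685), "VmJ": (1.01, 1.26), "JmH": (0.24, 0.415)},
    "G3": {"BmV": (0.59, 0.736), "VmJ": (1.015, 1.25), "JmH": (0.18, 0.44)},
    "G5": {"BmV": (0.61, 0.826), "VmJ": (1.067, 1.43), "JmH": (0.24, 0.48)},
}

def matching_types(bmv, vmj, jmh):
    """Return every spectral type whose three colour ranges all contain the star."""
    colours = {"BmV": bmv, "VmJ": vmj, "JmH": jmh}
    return [
        sp_type
        for sp_type, box in LIMITS.items()
        if all(lo <= colours[name] <= hi for name, (lo, hi) in box.items())
    ]

print(matching_types(0.65, 1.20, 0.30))  # falls in all three boxes
print(matching_types(0.80, 1.30, 0.40))  # only the G5 box
```

A star that lands in more than one box is exactly the case where the same object could end up with two different distances, so a list with more than one entry is a natural thing to flag.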

I am contemplating a kind of rough and ready merging of the two charts to determine star spectral types for stars that do not have such listed. I would use the averaged B-V values from the second chart, playing a bit of leapfrog, so to speak. I would jump from say...the dead average B-V value of G1 to the dead average B-V value of G3, and then use the normal dead average magnitudes and V-J and J-H values for G2. The overlap is such that most of the stars picked by this approach should be G1, G2, or G3 - but as far as actual distance determination goes, I would treat them all as G2, and the vast majority of those distances - at least for the normal stars - would be accurate. The fainter G0's and brighter G4's would also be accurate, I suppose. However, I need to do quite a bit more testing and tweaking before actually going ahead with this, though my initial tests looked promising.

I have a suspicion this bears little resemblance to any other photometric/spectroscopic distance scheme you have seen. However, it did well enough in the calibrations - and so far it's matched up reasonably well with what few alternative distances I've found for the stars in the ASCC.

I have to confess, I also hadn't read the Geneva-Copenhagen survey thoroughly enough to notice that they did photometric corrections... I was always told (in classes, too!) that they used Hipparcos parallaxes of F and G dwarfs alone. Nevertheless, there it is... It does look like, past their 13% limit, Hipparcos parallaxes simply aren't reliable at ALL.

Actually...the Hip parallaxes are probably acceptable up to about an error bar of 15%. Past that point...well...a parallax with a 20% error still has maybe a 70% or 75% chance of being accurate. What got me was the radical shifts in distance some Hip stars went through between the first and second editions of that catalogue: quite a few stars that had error bars in the old catalogue of maybe 15% or 18% relocated by much more than the distance allowed for in that error bar. The vast majority, though, seem to have shifted maybe a quarter parsec, if that, one way or the other, something I found more annoying than useful.

As I pointed out earlier, I also noticed something similar with the old Yale Parallaxes.

1. Systematic error (this is only dependent on the accuracy of the fit itself, assuming the data is perfect):
Method 1: Calculate your realDistance(s) for all Hipparcos stars you can find and THEN find out how well it agrees with Hipparcos parallaxes (one error calculation):

Code:
realDistance = (distance1 + distance2 + distance3 + distance4) / 4
systematicDistanceError = stddev( (hipparcosDistance-realDistance) /hipparcosDistance )
(Note that hipparcosDistance and realDistance are actually a long string of values)

Not quite. Distance 4 is actually the averaged value of the first three distances; hence

Real Distance = (distance1 + distance2 + distance3)/3

For stars of F5 to G3, that formula would look like

Real Distance =(distance1 + distance2 + distance2 + distance3 + distance2) /5

(This is an attempt to try to account for the bottom rung of the superluminous stars).
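In other words, the V-J distance gets triple weight for the earlier types. A small Python sketch of the two averages is below; the function name and the `early_type` flag are hypothetical, and the distances are invented.

```python
def real_distance(distance1, distance2, distance3, early_type=False):
    """Average of the three colour-based distances.

    early_type=True (F5 to G3) counts distance2 three times, giving
    weights of 1 : 3 : 1 out of 5; otherwise it is a plain mean.
    """
    if early_type:
        return (distance1 + 3 * distance2 + distance3) / 5
    return (distance1 + distance2 + distance3) / 3

print(real_distance(30.0, 36.0, 30.0))                   # plain mean
print(real_distance(30.0, 36.0, 30.0, early_type=True))  # pulled toward distance2
```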

Ok...past my bedtime, and I gotta get up and go to work in the morning.

Re: ASCC

Posted: 24.01.2011, 11:18
by t00fri
1. Systematic error (this is only dependent on the accuracy of the fit itself, assuming the data is perfect):
Method 1: Calculate your realDistance(s) for all Hipparcos stars you can find and THEN find out how well it agrees with Hipparcos parallaxes (one error calculation):
Code:
realDistance = (distance1 + distance2 + distance3 + distance4) / 4
systematicDistanceError = stddev( (hipparcosDistance-realDistance) /hipparcosDistance )

I have to disagree here, at least with the nomenclature used. What you called a systematic error is actually the usual statistical error, with the distance taken as a stochastic variable that is distributed according to a normal distribution (Gauss). In such a case N independent measurements <-> 'distance_i', with average <distance>,

<distance> := sum^N_i (distance_i)/N

lead to a decreasing relative error as 1/sqrt(N) of the average.

As the name suggests, systematic errors have a completely different interpretation. Typically, they are associated with errors that are connected with the methods or hardware of the measurement under consideration. Unfortunately, systematic errors are NOT normal-distributed and thus one is NOT allowed to add statistical and systematic errors in square like so:

error = sqrt( statistical ^2 + systematical^2)

The systematical error is usually added linearly and does NOT decrease by making a large number of measurements! A systematical error would be e.g. a certain unavoidable misalignment of the measuring device of the star parallax etc.
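The difference can be seen in a quick simulation. In this Python sketch the scatter term averages down as the number of measurements grows, while a fixed offset (standing in for a calibration shift) does not. All numbers are invented, and the fixed seed is only there to make the run repeatable.

```python
import random
import statistics

def mean_of_measurements(true_value, scatter, offset, n, seed=0):
    """Average of n simulated measurements with Gaussian scatter
    (statistical error) plus a constant offset (systematic error)."""
    rng = random.Random(seed)
    measurements = [true_value + offset + rng.gauss(0, scatter) for _ in range(n)]
    return statistics.fmean(measurements)

true_distance = 50.0   # parsecs
for n in (10, 1000, 100000):
    mean = mean_of_measurements(true_distance, scatter=5.0, offset=2.0, n=n)
    # The mean settles near 52, not 50: the 2 pc shift never averages out.
    print(n, round(mean - true_distance, 3))
```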

In case you happen to have a certain mathematical background, I am happy to quote some standard literature about the matter.

Fridger

Re: ASCC

Posted: 24.01.2011, 17:57
by starguy84
What you called a systematic error is actually the usual statistical error, with the distance taken as a stochastic variable that is distributed according to a normal distribution (Gauss).

Ah... And I thought 'systematic' was the error inherent in the system; here, that the distances can never be more accurate than the width of the main sequence at a particular color.

As the name suggests, systematic errors have a completely different interpretation. Typically, they are associated with errors that are connected with the methods or hardware of the measurement under consideration. Unfortunately, systematic errors are NOT normal-distributed and thus one is NOT allowed to add statistical and systematic errors in square like so:

How would one test for systematic errors, then? The easiest means of testing for systematic errors is having a different, calibrated dataset to compare them to; as such the only thing that can really be checked is Hipparcos distance versus ThinkerX distance (with the expectation that the systematics of the Hipparcos distances are as controlled as possible- I've read van Leeuwen's validation paper where he attempts to prove this)

Would plots of V-J vs (hipparcosDistance-realDistance) /hipparcosDistance (and the same for other colors, magnitudes, or even magnitude errors) work to demonstrate systematic errors? ie, if there are none, there should be no trend in the data, whose average would be zero at every point?
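The same check can be done numerically rather than by eye. This Python sketch computes the mean fractional residual and the least-squares slope of the residual against colour; a mean or slope clearly different from zero would hint at a systematic trend. The data are invented, with a colour-dependent bias built in on purpose.

```python
def residual_trend(colours, hipparcos_distances, estimated_distances):
    """Mean and least-squares slope of the fractional residual
    (hipparcos - estimate) / hipparcos as a function of colour."""
    residuals = [(h - e) / h
                 for h, e in zip(hipparcos_distances, estimated_distances)]
    n = len(residuals)
    mean_colour = sum(colours) / n
    mean_residual = sum(residuals) / n
    sxx = sum((c - mean_colour) ** 2 for c in colours)
    sxy = sum((c - mean_colour) * (r - mean_residual)
              for c, r in zip(colours, residuals))
    return mean_residual, sxy / sxx

# Invented sample: estimates run short by 10% per unit of V-J.
vmj = [1.0, 1.1, 1.2, 1.3]
hip = [20.0, 30.0, 40.0, 50.0]
est = [h * (1 - 0.1 * c) for h, c in zip(hip, vmj)]
print(residual_trend(vmj, hip, est))
```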

I'm sure there are other ways to have systematic errors- scaling errors, say, if there's any overcorrecting the distance in the cases of large differences between the color and the dead average color for that spectral type... How are those dealt with?

Re: ASCC

Posted: 24.01.2011, 19:07
by t00fri
starguy84 wrote:
What you called a systematic error is actually the usual statistical error, with the distance taken as a stochastic variable that is distributed according to a normal distribution (Gauss).

Ah... And I thought 'systematic' was the error inherent in the system; here, that the distances can never be more accurate than the width of the main sequence at a particular color.

Typical systematic errors would be e.g. associated with errors in the calibration of your measurement devices. Such errors would induce systematical shifts in your data, for example. After many measurements the shift would still prevail rather than average out. Errors that come from a "normal" scatter of measurements are so-called statistical errors that do get smaller by fluctuating towards the best estimate which equals the average value of all measurements taken.

Simple Example: throwing a dice. What would be a systematic error and what the statistical uncertainty when you throw the dice N times?

How would one test for systematic errors, then? The easiest means of testing for systematic errors is having a different, calibrated dataset to compare them to; as such the only thing that can really be checked is Hipparcos distance versus ThinkerX distance (with the expectation that the systematics of the Hipparcos distances are as controlled as possible- I've read van Leeuwen's validation paper where he attempts to prove this)
Reliable estimates of systematic errors are usually the hardest part of physics or astro-physics experiments. There are few general rules, beyond what you mentioned: extract the observable in question with two (or more) different methods. Often one can make use of general properties of the resulting data. Suppose you know that your data have to be positive. Finding some negative data then points to a systematic error (shift). Negative star parallaxes are a typical example here ;-)
Suppose your data have to be generally invariant under some rotation by 180 degrees. Then you can repeat the experiment with a detector that has been rotated by 180 degrees. The difference of the two data sets gives directly the respective systematic error. All other systematic errors should cancel in the difference of the two data sets, if you only rotate the detector (no other changes).

Fridger

Re: ASCC

Posted: 24.01.2011, 23:55
by Fenerit
t00fri wrote:Typical systematic errors would be e.g. associated with errors in the calibration of your measurement devices.

Aside from errors in the calibration, if the resolution of the measurement device is too coarse for the quantity we are trying to measure, that is instrumental error, right? Sorry for the intrusion, but I'm very interested in the literature you offered to quote.

Re: ASCC

Posted: 25.01.2011, 01:03
by ThinkerX
Attracting some interest here.

Keeping in mind this is really 'statistical error' and not 'systematic error':

1. Systematic error (this is only dependent on the accuracy of the fit itself, assuming the data is perfect):
Method 1: Calculate your realDistance(s) for all Hipparcos stars you can find and THEN find out how well it agrees with Hipparcos parallaxes (one error calculation):

I have 'ThinkerX' distances for something on the order of a thousand plus stars that also have Hip parallaxes accurate to within 5%. I suppose I could subtract my distances from the Hip distances, convert the result to absolute numbers (the statistical error), and then tally them up (ye Gods, that's a lot of arithmetic!) and average them out for some sort of overall statistical error. I suspect that number would be on the order of 10-15%, which is about what the Geneva-Copenhagen people claim for their photometric distances. (And the Geneva-Copenhagen papers don't give individual errors.) I don't know if that would be good enough, though...
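For what it's worth, the arithmetic described above is only a few lines in a script. The following is a minimal Python sketch; the four distance pairs are invented placeholders, not real catalogue values, and the function names are hypothetical. Besides the average absolute error, it also reports the average signed error, since a signed mean well away from zero would hint at a systematic offset rather than plain scatter.

```python
def fractional_errors(photometric_pc, hipparcos_pc):
    """Signed fractional difference (photometric - Hipparcos) / Hipparcos per star."""
    return [(p - h) / h for p, h in zip(photometric_pc, hipparcos_pc)]

def summarize(photometric_pc, hipparcos_pc):
    """Overall error statistics for a set of stars with both distances."""
    errs = fractional_errors(photometric_pc, hipparcos_pc)
    n = len(errs)
    return {
        # Signed mean: a value far from zero suggests a systematic shift.
        "mean_signed": sum(errs) / n,
        # Mean of absolute values: the overall scatter described in the post.
        "mean_abs": sum(abs(e) for e in errs) / n,
        # Fractions of stars within 10% and within 20% of the Hipparcos distance.
        "within_10": sum(abs(e) <= 0.10 for e in errs) / n,
        "within_20": sum(abs(e) <= 0.20 for e in errs) / n,
    }

# Placeholder distances in parsecs, for illustration only.
photometric = [52.0, 95.0, 31.0, 125.0]
hipparcos = [50.0, 100.0, 30.0, 100.0]
print(summarize(photometric, hipparcos))
```

Reading both columns from the cross-matched list instead of the placeholder lists would give the overall figures, plus the "within 10%" and "within 20%" fractions, in one pass.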

As to actual systematic error...

t00fri wrote:Typical systematic errors would be e.g. associated with errors in the calibration of your measurement devices. Such errors would induce systematic shifts in your data, for example. After many measurements the shift would still prevail rather than average out. Errors that come from a "normal" scatter of measurements are so-called statistical errors; they do get smaller as you take more measurements, since the fluctuations average out towards the best estimate, which equals the average value of all measurements taken.

There is one item I was concerned about, which caused me considerable hesitation: the ASCC gives a value for 'scatter' - light lost due to instrumentation problems, if I understand correctly. The values given for this in the ASCC can be pretty high - up to a full magnitude or more, which is about twenty times the norm - but it also seems to have been compensated for somehow: when I compared the Tycho B and V magnitudes with magnitudes from other sources, they matched closely enough, and scatter doesn't look to have been much of an issue with the 2MASS J, H, and K magnitudes. Ultimately, I decided to allow scatter as high as 0.6 (still way more than I'm comfortable with), even if it is compensated for.

Systematic error in my source data (though I was actually thinking of it in terms of 'scatter') is why I retained only stars with B, V, J, H, and K errors of less than 0.1.

Apart from that, if I understand correctly, any actual systematic (not statistical) error I have would be the result of flaws in the formula I use to determine approximate absolute magnitudes. Hmm... all I can say here is that if the true absolute magnitude is within 0.3 either way of the 'dead average' value, then the final combined distance is almost always accurate to within 10%, barring variables or close doubles. For magnitudes differing from the dead average value by 0.3 - 0.5, the distance is usually, though not always, accurate to within 20%. Past 0.5, my system breaks down and only rarely gets a distance accurate to within 20%. So... would that mean my formula has a systematic error that starts kicking in past 0.3 magnitudes? Or is that just an inherent limitation of the system itself?
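As a point of reference, here is a small Python sketch of how an offset in absolute magnitude carries over into distance through the standard distance modulus, d = 10^((m - M + 5)/5) parsecs. It assumes a single band and no interstellar extinction, so it is not the combined V, J, H procedure described above, which may behave somewhat differently; the function names are illustrative only.

```python
def distance_pc(apparent_mag, absolute_mag):
    """Distance in parsecs from the distance modulus (single band, no extinction)."""
    return 10 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)

def distance_ratio(delta_mag):
    """Estimated / true distance when the star's true absolute magnitude is
    `delta_mag` magnitudes fainter than the value assumed for its spectral type
    (negative delta_mag: the star is brighter than assumed)."""
    return 10 ** (delta_mag / 5.0)

for delta in (0.1, 0.3, 0.5, 1.0):
    too_far = distance_ratio(delta)
    too_near = distance_ratio(-delta)
    # Print the offset and the resulting distance error in percent, both directions.
    print(delta, round(100 * (too_far - 1), 1), round(100 * (too_near - 1), 1))
```

Under these assumptions the distance error grows steadily with the magnitude offset (roughly 15% at 0.3 mag and 26% at 0.5 mag for a star fainter than assumed), so at least part of the breakdown past 0.3 to 0.5 magnitudes may simply be the distance modulus at work rather than a flaw in the formula itself.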

Re: ASCC

Posted: 25.01.2011, 01:20
by t00fri
Fenerit wrote:
t00fri wrote:Typical systematic errors would be e.g. associated with errors in the calibration of your measurement devices.

Aside from errors in the calibration: if the resolution of the measurement devices is too coarse for the quantity we are trying to measure, that counts as the instrumental error, right?

In general, there are only two main categories: systematic and statistical errors.

Too coarse a resolution may well contribute a statistical error, since in image manipulation we know that we can improve the resolution/contrast by superimposing N independent images of the same subject. This is also a well-known technique to improve the contrast of astronomical pictures.
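A toy illustration of the stacking argument in Python, assuming purely Gaussian, independent noise of the same size in every frame (real detectors are messier, and this shows the noise reduction behind the contrast gain, not resolution as such). The flat "subject" value, pixel count, and function name are made up for the sketch.

```python
import random
import statistics

def stacked_noise(n_frames, n_pixels=2000, sigma=1.0, seed=42):
    """Pixel-to-pixel noise left after averaging n_frames noisy frames
    of the same flat subject (hypothetical true value 10.0 everywhere)."""
    rng = random.Random(seed)
    true_value = 10.0
    stacked = []
    for _ in range(n_pixels):
        # Each frame adds independent Gaussian noise to the same true value.
        samples = [true_value + rng.gauss(0.0, sigma) for _ in range(n_frames)]
        stacked.append(statistics.fmean(samples))
    # Remaining scatter across the stacked image.
    return statistics.pstdev(stacked)

for n in (1, 4, 16, 64):
    # The residual noise should fall roughly as sigma / sqrt(n).
    print(n, round(stacked_noise(n), 3), round(1.0 / n ** 0.5, 3))
```

The residual noise falls roughly as 1/sqrt(N), which is the signature of a statistical error. A systematic effect common to all frames (a constant added to `true_value`, say) would survive the stacking unchanged.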

Sorry for the intrusion, but I'm very interested in the papers you proposed to link.

Here are the standard (practical) reviews from the "Particle Data Group"/Univ. of California, Berkeley

http://pdg.lbl.gov/

that we use as a reference for this "bread and butter" subject of statistics/probability.

http://pdg.lbl.gov/2010/reviews/rpp2010 ... bility.pdf
http://pdg.lbl.gov/2010/reviews/rpp2010 ... istics.pdf

and for related Monte Carlo methods

http://pdg.lbl.gov/2010/reviews/rpp2010 ... niques.pdf

Despite being somewhat mundane, statistics/probability techniques are nevertheless very important in my professional research field of (astro-) particle physics.

Fridger

Re: ASCC

Posted: 25.01.2011, 01:48
by Fenerit
Thanks. 8)