Over the past few weeks, I've attempted to determine just how accurate the ThinkerX ASCC distances are, and I've reached some tentative conclusions (although ThinkerX privately told me he can't quite replicate my results... we're not entirely sure why. I will attempt to highlight where I think the points of contention are). Error analysis is not my forte, so Fridger may have lots of issues with this, but here goes...
In summary:
- Used as described (3 relations selected by spectral type, one for each of three colors): 1-sigma error is 19.9% (for dwarf stars).
- Relaxing the requirement that ALL three relations have to work: 1-sigma error is 22.3%.
- Used as a purely photometric method (using any relation that's valid for the particular color range, ignoring spectral type): 1-sigma error is 20.3%.
There are problems with both the cutoff used to weed out giants and a systematic color-dependent error in which blue stars (V-J < 1) are predicted to be closer than they really are.
What I did:
ThinkerX sent me his 'key' document, which lists all 57 color relations (3 colors, 19 spectral types). For each color relation, there's a dead-average absolute magnitude, a dead-average color, a slope, and a range of colors it's valid for. This let me transform them into linear functions.
Seen here, I have plotted them in red on top of my test dataset (described next)
These graphs were made from the results of the purely photometric method.
Blue=More than 1 valid relation, dwarf
Green=More than 1 valid relation, giant
Purple= 1 valid relation, dwarf
Ochre = 1 valid relation, giant
Red= no relations worked
The black lines connect the dead-average colors and magnitudes for each spectral type.
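To make the "linear functions" concrete, here is a minimal Python sketch of how one key entry could be turned into an absolute magnitude and then a distance. This is only an illustration, not the actual program: the class name, the field names and the example numbers are all invented, and it assumes the slope means "change in absolute magnitude per magnitude of color". The only standard piece is the distance modulus, d = 10^((m - M + 5)/5) pc.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColorRelation:
    """One entry of the key (field names are hypothetical)."""
    mean_abs_mag: float   # "dead-average" absolute magnitude
    mean_color: float     # "dead-average" color
    slope: float          # ASSUMED: d(abs mag) / d(color)
    color_min: float      # lower edge of the valid color range
    color_max: float      # upper edge of the valid color range

    def abs_mag(self, color: float) -> Optional[float]:
        """Absolute magnitude from the linear relation, or None if out of range."""
        if not (self.color_min <= color <= self.color_max):
            return None
        return self.mean_abs_mag + self.slope * (color - self.mean_color)


def distance_pc(apparent_mag: float, abs_mag: float) -> float:
    """Distance in parsecs from the distance modulus (no extinction term)."""
    return 10 ** ((apparent_mag - abs_mag + 5.0) / 5.0)


# Invented numbers, NOT taken from ThinkerX's key.
rel = ColorRelation(mean_abs_mag=4.4, mean_color=0.60, slope=5.0,
                    color_min=0.50, color_max=0.70)
m_abs = rel.abs_mag(0.64)        # 4.4 + 5.0 * 0.04, about 4.6
d = distance_pc(9.6, m_abs)      # about 100 pc
```

A color outside the valid range returns None, which is how "don't bother" for a bizarre color can be represented.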
I then wrote a program to apply them to a set of stars, and downloaded all the stars in the revised Hipparcos catalog (van Leeuwen 2007) with parallaxes greater than 10 mas, and errors less than 2 mas (ie, a maximum of 20% error at a maximum of 100 pc). I also pulled the ASCC entries for those stars so I would have the same B, V, J, H colors and spectral types that he had.
(There may be two points of error here: I used revised Hipparcos values, not Hipparcos 1997 values; AND my error is up to 20% unlike his strict 5% limit, which may have inadvertently made ThinkerX's relations only look that accurate).
Throughout this analysis, I will mention calculating the error. What this generally means is that I removed all the stars identified as giants, and then sorted the output table by error to find the percent distance error that 68.2% of the distances were within. (68.2% of normally/gaussian distributed measurements should be within the standard quoted error, ie 1-sigma; 95.4% will be within twice that, ie 2-sigma.) This is less robust than, say, actually fitting a gaussian distribution to the data, but I figure it's roughly representative.
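The sort-and-count error estimate described above can be sketched in a few lines of Python. The function names are made up for this example, and picking the entry at ceil(0.682 * N) in the sorted list is just one reasonable reading of "the error that 68.2% of the distances were within". The percent offset is (predicted - Hipparcos)/Hipparcos x 100.

```python
import math


def percent_offset(predicted_pc, reference_pc):
    """Signed percent offset of a predicted distance from the reference."""
    return (predicted_pc - reference_pc) / reference_pc * 100.0


def one_sigma_percent_error(predicted, reference, fraction=0.682):
    """Smallest absolute percent error that contains `fraction` of the stars."""
    abs_errors = sorted(abs(percent_offset(p, r))
                        for p, r in zip(predicted, reference))
    if not abs_errors:
        raise ValueError("need at least one star")
    index = max(math.ceil(fraction * len(abs_errors)) - 1, 0)
    return abs_errors[index]


# Toy data: ten dwarfs, all truly at 100 pc (giants already removed).
reference = [100.0] * 10
predicted = [95.0, 110.0, 80.0, 102.0, 130.0, 99.0, 120.0, 90.0, 105.0, 60.0]
sigma = one_sigma_percent_error(predicted, reference)   # 20.0 for this toy set
```

Because it only counts stars, this estimate is not thrown off by a few wild outliers the way a plain standard deviation would be.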
Method 1: (spectro-photometric)
The program runs through the list of stars; if there is a valid spectral type, I calculate the distances using the correct color relations for that spectral type. Because ThinkerX gave ranges for which each color relation is valid, I won't calculate it if the color is outside those ranges. For instance, a G0 star with a B-V color of 1.5 is bizarre, so we assume something in the photometry or spectral type is wrong (reddening, wrong star, bad photometry) and don't bother.
Each of the up to 3 distances is assigned a representative 1-sigma error; I use 1/error^2 as a weight and take the weighted average of the values.
I calculated this 1-sigma error by shutting off all but one distance relation, running the code on ALL the stars, and seeing how accurate the distances produced by just that color were. For the spectrophotometric method, V-J vs MJ was the most accurate at 21.6%; then J-H vs MH at 22.0%, and finally B-V vs MV at 23.4%; using weights meant that the probably-less-accurate B-V distance counted less in the final result.
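Here is a rough Python sketch of the 1/error^2 weighting, using the three single-color accuracies quoted above. The function name and the dictionary layout are invented for the example, and it assumes the absolute error of each distance is the fractional error times that distance, which may not be exactly how the real code does it.

```python
# Single-color 1-sigma accuracies quoted above, as fractions of the distance.
REL_ERR = {"B-V": 0.234, "V-J": 0.216, "J-H": 0.220}


def weighted_distance(estimates):
    """Inverse-variance weighted mean of up to three distances (pc).

    `estimates` maps a color name to a distance, or to None when the
    relation for that color was not valid for the star.
    """
    numerator = 0.0
    denominator = 0.0
    for color, dist in estimates.items():
        if dist is None:
            continue
        sigma = REL_ERR[color] * dist      # ASSUMED: absolute error in pc
        weight = 1.0 / sigma ** 2
        numerator += weight * dist
        denominator += weight
    if denominator == 0.0:
        return None                        # no relation worked
    return numerator / denominator


d_all = weighted_distance({"B-V": 50.0, "V-J": 50.0, "J-H": 50.0})
d_two = weighted_distance({"B-V": 60.0, "V-J": 50.0, "J-H": None})
```

In the second call the V-J estimate has the larger weight, so the result lands closer to 50 pc than to 60 pc, whereas a straight average would give 55 pc.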
(This may be another point of contention: if you take the straight average versus a weighted average, your final distances will be different... but it doesn't explain why my distances had HIGHER errors.)
Finally, a photometric distance relation's downfall is that it cannot (easily) distinguish between nearby dwarfs and distant, extremely luminous giants. One popular way to get around this is to use proper motion. All things being equal, a distant star should seem to move slower across the sky than a nearby star. ThinkerX gave values of proper motion (for various distances) he considered too low to be a dwarf; I applied this cut to my stars.
With this done, I found that the weighted average distances for dwarfs were accurate to 22.3% (1-sigma), based on comparing them to the van Leeuwen HIPPARCOS trigonometric parallaxes. Interestingly, when I took the weighted standard deviation (which is SUPPOSED to get you a 1-sigma error) of those three distances I got 9.4% distance errors, which is too small and means there's some systematic effect going on to make them agree so well internally. I added a fudge factor of 20.2% of the distance in quadrature (ie, sqrt(20.2^2 + 9.4^2) = 22.3%) to all the standard deviations to make them more reasonable overall. This DOES mean that the minimum quoted error coming out of my code is 20.2%, regardless of how well the individual distances agree.
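The fudge-factor arithmetic is just adding errors in quadrature, and can be checked with a short Python sketch (the function names are made up for the example):

```python
import math


def fudge_factor(observed_pct, internal_pct):
    """Extra error (percent) that must be added in quadrature to the
    internal scatter to reproduce the observed 1-sigma error."""
    if internal_pct > observed_pct:
        raise ValueError("internal scatter already exceeds the observed error")
    return math.sqrt(observed_pct ** 2 - internal_pct ** 2)


def inflated_error(internal_pct, fudge_pct):
    """Quoted error after adding the fudge factor in quadrature."""
    return math.hypot(internal_pct, fudge_pct)


fudge = fudge_factor(22.3, 9.4)          # about 20.2
quoted = inflated_error(9.4, 20.2)       # about 22.3
floor = inflated_error(0.0, 20.2)        # 20.2, the minimum quoted error
```

The last line shows why no star can come out with a quoted error below the fudge factor itself, even if its three distances agree perfectly.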
For well-behaved stars where all three color relations worked, the weighted average distances were accurate to 19.9%, and I had to add a 17% fudge factor to make them work. This makes sense, because if a relation didn't work, the star has strange photometric colors and probably isn't a normal main-sequence star anyway.
Method 2: (Purely photometric)
This was my own bias, and may be useful if ThinkerX decides to publish a catalog for the entire ASCC, because most of the ASCC catalog does not have spectral types.
Anyway, the program worked as before, except without a spectral type, I could not choose between the different B-V vs MV relations, or any of the others. I therefore attempted to apply ALL of them, and ended up with as many as 30 distance estimates. As before, I shut off the other colors to find out how accurate (for instance) B-V was on its own. This time, I could actually calculate the standard deviation of multiple B-V (etc.) distances for each star, but they underestimated how far they were from the van Leeuwen HIPPARCOS distance to the star, and I had to add more fudge factors.
Once I set the fudge factors for each color, I calculated errors (all relations using a particular color got the same error, and therefore weight) and then the weighted average distance and weighted standard deviation. As before, this means that the colors that are less precise, count less.
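As a rough illustration of the "apply ALL of them" step, the Python sketch below collects every distance from every relation whose color range contains the star's color, ignoring spectral type, and groups the results by color. The table layout and every number in it are invented; only the pairing of colors with magnitudes (B-V with MV, V-J with MJ, J-H with MH) follows the description above.

```python
from collections import defaultdict

# (color, band, mean_abs_mag, mean_color, slope, color_min, color_max)
# Invented entries, NOT ThinkerX's key; the real key has 57 relations.
RELATIONS = [
    ("B-V", "V", 4.4, 0.60, 5.0, 0.50, 0.70),
    ("B-V", "V", 5.0, 0.68, 5.0, 0.60, 0.80),
    ("V-J", "J", 3.3, 1.10, 2.0, 0.90, 1.30),
]


def all_valid_distances(colors, mags):
    """Distances (pc) from every relation valid for the star's colors."""
    by_color = defaultdict(list)
    for color, band, m0, c0, slope, cmin, cmax in RELATIONS:
        value = colors.get(color)
        mag = mags.get(band)
        if value is None or mag is None or not (cmin <= value <= cmax):
            continue
        abs_mag = m0 + slope * (value - c0)
        by_color[color].append(10 ** ((mag - abs_mag + 5.0) / 5.0))
    return dict(by_color)


star_colors = {"B-V": 0.64, "V-J": 1.10, "J-H": 0.30}
star_mags = {"V": 9.6, "J": 8.5}
found = all_valid_distances(star_colors, star_mags)
```

With several distances per color, the scatter within one color can then be computed per star, and all relations of that color can share one error and therefore one weight.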
Overall, THIS method had a 20.3% accuracy; I had to add a 14.1% fudge factor to the weighted standard deviation to make the code produce a 1-sigma error of 20.3% to match. If ThinkerX uses something similar to get distances to ALL of ASCC, they will likely be accurate to 20.3%.
Final thoughts:
This technique compares pretty well to actual scientific photometric distance relations. Henry et al. 2004 (VRIJHKs relations for K and M dwarfs) includes a standard fudge factor of 15.25% and that fudge factor contributes most of the error quoted for any given main-sequence star; Weis was accurate to 20%; Breddels et al. 2010 actually quotes larger (50%!) errors using more information than ThinkerX had. I wish I was more familiar with the PMSU methods (Reid et al. 1995, Hawley et al. 1996) but I believe they aren't as accurate as the Henry et al. 2004 relations.
I assumed the van Leeuwen HIPPARCOS values were absolutely correct, even though they quoted up to 20% errors. This means my errors may be overestimated, ie, I was trying to match a HIPPARCOS distance that was also wrong. ThinkerX used 5% distance errors in his analysis, which may be more reasonable.
Apart from removing everything with a Hipparcos component # greater than 1, I did not filter out any binaries; ThinkerX spent a lot of time doing that.
The purely photometric distances suffer from a problem that M stars legitimately have the same J-H colors as K type stars (notice how the main sequence curls back on itself). Thus, Barnard's Star was placed 16 pc away, not 1.8... If you cut by spectral type, Barnard's Star is discarded because ThinkerX did not define any relations for stars cooler than K7.
The giant cutoff (or at least my implementation of it) accidentally removes a lot of main-sequence F stars. I'm less worried about it leaving a few giants as main sequence stars; they might genuinely be faster moving than average and there's really no way to fix that apart from spectral typing or trigonometric parallax.
What you should see here is the giants (the blob around B-V=1.0, Reduced V = -5) as yellow/green, and nothing else as yellow/green.
(Data from the spectro-photometric method)
'Reduced proper motion' is what you get when you use proper motion instead of parallax in your absolute magnitude formula. It explicitly assumes faster moving things are closer. Note that you can still almost see the main sequence.
Blue=More than 1 valid relation, dwarf
Green=More than 1 valid relation, giant
Purple= 1 valid relation, dwarf
Ochre = 1 valid relation, giant
Red= no relations were valid
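For reference, the usual textbook form of reduced proper motion is H = m + 5 log10(mu) + 5, with mu in arcseconds per year. The Python sketch below uses that form; the function name and the example stars are invented, and the zero point of the plot may differ from this convention.

```python
import math


def reduced_proper_motion(apparent_mag, pm_arcsec_per_yr):
    """Reduced proper motion H = m + 5 log10(mu) + 5, mu in arcsec/yr."""
    if pm_arcsec_per_yr <= 0:
        raise ValueError("proper motion must be positive")
    return apparent_mag + 5.0 * math.log10(pm_arcsec_per_yr) + 5.0


# Two invented stars with the same apparent V magnitude.
slow_star = reduced_proper_motion(8.0, 0.005)   # barely moving: giant-like
fast_star = reduced_proper_motion(8.0, 0.5)     # fast moving: dwarf-like
```

The slow mover comes out 10 magnitudes "brighter" in H than the fast one, which is the sense in which the method assumes faster moving things are closer and therefore intrinsically fainter.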
The bluer stars (V-J < 1) are systematically predicted to be closer than they actually are. They should cluster around the 1:1 line here, but they tend to be on the upper (closer) side. Ignore the green/ochre points: they're giants, and it's not a problem if they're predicted to be a lot closer, because they are much more luminous than dwarfs but have the same colors.
Blue=More than 1 valid relation, dwarf
Green=More than 1 valid relation, giant
Purple= 1 valid relation, dwarf
Ochre = 1 valid relation, giant
Red= no relations worked
Somehow removing this bias (another fudge factor?) would probably reduce the calculated 1-sigma errors.
This also shows up in the overall bias toward closer distances visible in this figure (I did not fit the gaussians, I just drew them in with the proper standard deviations. Note how the gaussian is centered, but the real distribution tends to negative percent offset, ie, closer).
Photometric everything-goes method:
Spectro-photometric method:
Percent offsets are (ThinkerX-van Leeuwen Hipparcos)/(van Leeuwen Hipparcos)x100