The results of the estimation done on 8 June were:
Model data: 34 markers, all in the 111-marker test
Sources: Ballantyne et al, Burgarella & Navascués, YHRD
No. of mutations: 888
No. of meioses: 240417
SS of error term (34 markers): 0.00019511
R-squared (34 markers): 0.426
F-statistic (34 markers): 11.521
Sum of mutation rates (111 markers): 0.432183
Results 1 July 2011:
SS of error term (34 markers): 0.00017187
R-squared (34 markers): 0.505
F-statistic (34 markers): 15.845
Sum of mutation rates (111 markers): 0.409573
Results 24 July 2011:
SS of error term (34 markers): 0.00016189
R-squared (34 markers): 0.541
F-statistic (34 markers): 18.25
Sum of mutation rates (111 markers): 0.413878
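This is starting to look encouraging, and a satisfactory fit may be possible in perhaps a month’s time. The fit appears to have improved as more 111-marker results have become known: R-squared has risen from 0.426 to 0.505 to 0.541 over the three runs, while the SS of the error term has fallen from 0.00019511 to 0.00016189. Nine of the 34 markers included in the fit fall within markers 68-111 of the 111-marker series.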
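For anyone who wants to sanity-check the figures, here is a rough sketch of how R-squared and the F-statistic relate in an ordinary least-squares fit. The post does not state the form of the regression, so the choice of two predictors plus an intercept (a guess based on the two observed-variance series, M222 and L21) is purely an assumption, as are the function and variable names. Only the number of markers (34) and the reported R-squared and F values come from the results above.

```python
def f_from_r2(r2, n, k):
    """Overall F-statistic of an OLS fit with an intercept,
    given R-squared, n observations and k predictors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

N_MARKERS = 34     # from the post
K_PREDICTORS = 2   # ASSUMPTION: M222 variance and L21 variance as predictors

# (R-squared, reported F) for each run, copied from the results above
reported = {
    "8 June": (0.426, 11.521),
    "1 July 2011": (0.505, 15.845),
    "24 July 2011": (0.541, 18.25),
}

for label, (r2, f_reported) in reported.items():
    f_implied = f_from_r2(r2, N_MARKERS, K_PREDICTORS)
    print(f"{label}: implied F = {f_implied:.2f}, reported F = {f_reported}")
```

Under that assumption the implied F values come out close to the reported ones, with the small gaps plausibly due to R-squared being quoted to three decimals. That is consistent with a two-predictor fit but does not confirm it.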
A file with the latest fit is at
http://dl.dropbox.com/u/2733445/MOD2307.xlsx
If you have an old version of Excel, you may have to use this link:
http://dl.dropbox.com/u/2733445/Copy%20of%20MOD2307.xls
The columns are
A DYS/DYF code
B Marker order in 111-marker series
C Observed M222 variance
D Observed L21 variance
E Model mutation rate
F Observed mutation rate
G Square of error term
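As a rough illustration of how columns E, F and G presumably fit together, the sketch below takes column G to be the squared difference between the model and observed rates, and the "SS of error term" to be the sum of column G over the fitted markers. That reading is an assumption, and the marker names and rates in the example are made up for illustration; they are not taken from the spreadsheet.

```python
def squared_errors(rows):
    """Column G (assumed): (model rate - observed rate) squared, per marker."""
    return [(model - observed) ** 2 for _, model, observed in rows]

def ss_error(rows):
    """SS of the error term (assumed): sum of column G over all rows."""
    return sum(squared_errors(rows))

# HYPOTHETICAL rows: (marker, model rate [col E], observed rate [col F])
example_rows = [
    ("MARKER_A", 0.0020, 0.0025),
    ("MARKER_B", 0.0040, 0.0030),
    ("MARKER_C", 0.0010, 0.0010),
]

for (marker, _, _), g in zip(example_rows, squared_errors(example_rows)):
    print(f"{marker}: squared error = {g:.3e}")
print(f"SS of error term = {ss_error(example_rows):.3e}")
```

Only markers with an observed rate (the 34 in the fit) would contribute to the sum. The other markers in the 111-marker series have a model rate but no error term.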
What’s interesting is that the estimated mutation rates of the much-maligned markers CDYa,b are far lower than is commonly suggested, with a higher rate suggested for DYS710 than for CDYa,b.
It would be premature to use these modelled rates for any serious work at this stage.