mch said: A random selection of 150 profiles of users who have posted over 100 times (i.e. 3.3% of registered users - not really a statistically significant figure, I know)
kay said:
An interesting topic - glad you started this. But you can't wave a statistic under the nose of a statistician and expect them not to go off on a discourse that will have everyone else dying of boredom ... ;-)
The 3.3% isn't the real problem - contrary to popular belief, it's the size of the sample rather than what proportion it is of the population that governs the confidence you can have in the result - a sample of 150 from a population of 100,000 is as good as a sample of 150 from a population of 1000.
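To put rough numbers on the sample-size point, here is a minimal sketch of the margin of error for a sample mean, including the finite population correction. The age standard deviation of 12 years is purely an assumption for illustration (it is not taken from the forum data), and the function name is made up for this example. Under these assumptions the two populations give nearly, though not exactly, the same margin for a sample of 150, while increasing the sample size changes it a lot.

```python
import math

def margin_of_error(n, population_size, std_dev, z=1.96):
    """Approximate 95% margin of error for a sample mean.

    Uses the standard error of the mean with the finite population
    correction sqrt((N - n) / (N - 1)). std_dev is an assumed value.
    """
    standard_error = std_dev / math.sqrt(n)
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return z * standard_error * fpc

if __name__ == "__main__":
    ASSUMED_STD_DEV = 12.0  # hypothetical spread of ages, in years
    for n, population_size in [(150, 1_000), (150, 100_000), (600, 100_000)]:
        moe = margin_of_error(n, population_size, ASSUMED_STD_DEV)
        print(f"n={n}, N={population_size}: +/- {moe:.2f} years")
```

The correction only matters once the sample is a sizeable share of the population, which is why 3.3% versus 0.15% makes so little difference here.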
There's more effect from whether your sample is truly random or not. You took a sample from those who've posted over 100 times, so that's clearly not a random sample of registered users, and you can't regard your result as a reliable estimate of the average age of registered users. On the other hand, you may have wanted to look at regular posters only, in which case your "population" was posters of over 100 posts. How reliable your average age is as an estimate of the average age of posters of over 100 posts depends on your sampling technique and the reasons behind the very high number of missing values (e.g. it may be that older posters are less willing than younger ones to reveal their age on a forum, or vice versa) - and of course you can't then extrapolate to the different population of "all regular users".
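The missing-values concern can be illustrated with a small simulation. Everything in this sketch is hypothetical: the age range, the disclosure probabilities (older members assumed less willing to reveal their age) and the function name are invented for the example and say nothing about the actual forum. It shows how a perfectly random sample can still give a skewed average when the chance of a missing value depends on age.

```python
import random
from statistics import mean

def simulate_disclosed_mean(population_size=20_000, sample_size=150, seed=0):
    """Compare the true mean age with the mean among those who disclose it.

    All numbers are made up: ages are uniform on 18-70, and members
    under 40 disclose their age far more often than older members.
    """
    rng = random.Random(seed)
    ages = [rng.randint(18, 70) for _ in range(population_size)]

    # A genuinely random sample of profiles.
    sample = rng.sample(ages, sample_size)

    # Age-dependent disclosure: this is what creates the missing values.
    disclosed = [a for a in sample if rng.random() < (0.9 if a < 40 else 0.3)]

    return mean(ages), mean(sample), mean(disclosed), len(disclosed)

if __name__ == "__main__":
    true_mean, sample_mean, disclosed_mean, n_disclosed = simulate_disclosed_mean()
    print(f"true mean age:            {true_mean:.1f}")
    print(f"mean of random sample:    {sample_mean:.1f}")
    print(f"mean of disclosed ages:   {disclosed_mean:.1f} (n={n_disclosed})")
```

If the assumption were reversed, with younger members being the reluctant ones, the bias would simply point the other way; the sample alone cannot tell you which case you are in.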
2xw said:
I was thinking of writing something in Rcrawler for this sort of analysis to sample the entire user base when I have spare time (although I'd have to ask Badlad nicely and I suspect they might be able to do this sort of analysis themselves...)
teabag said: I've made us a poll.
Nice one teabag! It will be interesting to see how many of the 4,000+ users respond.