...
Currently our mailing list is used to send a weekly update (Stats News), where we include everything we find that may be of interest to researchers at CAMH. We have a Workshop section where we announce upcoming workshops that we offer, among other events. We have a Resources section where we add things we stumble upon on the internet (papers, blogs, videos) that are related to statistics or research and may be of interest to you. We have a Statistical Education section where we talk briefly about a new topic each week, usually related to the kind of biostatistics that researchers at CAMH use. And we have a Software section where we try to keep you informed about software-related material, for example a nice script or a nice R package.
You can subscribe to our mailing list by going to this website!
Below we have a list of the topics that appeared in Stats News.
056 (12/May/2020) – The meanings of Statistical Significance in clinical trials
Today we will comment on a paper that came out in JAMA this past week. We encourage you to take a look at the paper, because it has some nice examples of real trials that help get the point across.
I thought this was a nice paper to comment on because it makes clear that the interpretation of a significant or non-significant result may vary widely. That is just another way of saying that being significant or not means very little by itself.
The key point emphasized in the paper is the need to take into account the cost-related considerations that are always involved when we state that something is significant or not (which is interpreted as making a decision about the effectiveness of the intervention). Not only monetary cost is implied here, but all sorts of costs, important among them the possibility of harm and suffering for patients, as well as the use of resources in general (time, equipment, beds that could be used by other patients…).
I think we can summarise it by saying that statistical significance offers you a certain amount of evidence, but the actual amount of evidence you want for a given treatment depends on a diversity of costs. We should demand more evidence from treatments that we know cause harm, or that tie up lots of resources, because we don’t want to risk incurring a high cost in exchange for no benefit.
In that sense, a non-significant result may be enough evidence for us to change practice if the treatment we are looking at is already known to cause some harm; we may then stop using that treatment. On the other hand, a non-significant result may still be inconclusive, and we may not want to disregard the treatment if it is very cheap and we see no unintended consequences, particularly if the trial design is not very good.
Two points in the paragraph above deserve further discussion. The first concerns the considerations around using a treatment for which statistical evidence has not been shown, something that sometimes has to be done precisely because we lack strong statistical evidence. These considerations are things like ethics and the patient’s right to be informed: the patient must be informed about the evidence as far as possible, and we need to be careful that the use of such a treatment is not interpreted as the treatment being backed by statistical evidence.
The other point is that the meaning of a significant result also depends on the trial design. Here we could get into a long discussion, which is not the goal today, so we will stick to a summary: the significance of a result may mean very little if the data quality is not good, if the power is low, if the trial was not properly conducted, if the measures are not the best ones, if pre-registration does not exist or was not followed, if the statistical analysis was improperly conducted, and so on. Yes, there are lots of things you must look at before you interpret that p-value.
Besides the cost and the technical aspects of trials, another important aspect is the amount of external evidence, particularly if the threshold used for declaring significance is 0.05, which is a very low bar. You can loosely interpret 0.05 as the chance of seeing an effect like ours when there is really nothing but random noise. The thing is that 0.05 is still a high probability for things that are consequential, a risk we don’t want to take. As an example, consider that the probability of a fatality given Covid infection has been put at around 0.005 (0.5%). That is 10 times lower than our beloved 0.05 (5%), but because the cost is so high (loss of life), 0.5% is not a low risk for us, and so we refuse to take that chance, avoiding infection as much as we can.
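To make that interpretation concrete, here is a minimal R sketch (not from the paper, just an illustration with simulated data): if we run many two-arm trials in which the treatment truly does nothing, about one in twenty of them will still cross the 0.05 bar.

```r
# Simulate 10,000 two-arm "trials" in which the treatment has no effect at all
set.seed(123)
pvals <- replicate(10000, t.test(rnorm(50), rnorm(50))$p.value)
mean(pvals < 0.05)  # close to 0.05: about one in twenty null trials looks "significant"
```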
So, yes, 0.05 is too high for consequential decisions, and we will want more evidence, which many times will come from external sources. This external evidence can ultimately be anything, particularly for those into Bayesian statistics, but I would say that the best external evidence is usually replication by independent teams and experiments. However, I can also see situations where the researcher is very confident about the treatment working, based on experience, related literature, animal experiments, theory and anecdotes. Sometimes you see a significant result and you say, “well, no big deal, not surprising at all that this has a positive effect…”. You say that because you have some sort of external evidence, and sometimes that is very important. Other times it will be clear that the significant result is weird, and you will be skeptical. That is lack of external evidence for you. Obviously, you can have a distorted, biased view of things, which has to be factored in too.
So, again, maybe you should take a look at the paper and the examples, but the take-home message in my view is that you must always consider a lot more than whether something is significant or not, and making decisions solely based on significance is going to be associated with poor science practice and a poor understanding of statistics.
055 (05/May/2020) – Models are part art, part science
Last week we pointed you to some sources that looked at models for forecasting the spread of Covid, and it occurred to me that we have an interesting example here of the fact that the modeling endeavour is not really objective. Otherwise, why would we have so many models? So, I just wanted to explore this point a little bit today, as I think it is relevant to how we build models.
The first point is that there is no ready-made recipe out there for how to model your data. Although the more traditional models are frequently used in a sort of step-by-step way of model building, not only do you not need to cling to any specific model as if it were the right model for your data, it may be counterproductive to do so, in the sense that it impairs your ability to do good, creative modeling.
From a more statistics-centric view we may say that there are bad models, and some models are clearly bad, but it is usually unclear what the best model is. What happens is that experience and even creativity will play some role in model selection and in our ability to find a good model.
As I said, there are no recipes as far as I know, and that is why I think we still don’t have algorithms doing a good job of predicting the spread of Covid, or algorithms selecting models in general, for that matter. It is not that simple. In my view you need a substantial amount of experience and knowledge, both of the different alternatives and of the subject matter, to be able to build good models. The experience relates to the ability to visualize the important aspects of the data, the model and the research question, all at once, and to put them together in a reasonable model.
The subjectivity in model building also lies in the assumptions we have to make, as is well exemplified by the many Covid-related models out there. For example, if a given model demands a specific epidemic-spreading parameter, different researchers may reach different values for that parameter depending on their personal view of which data is best used for that end. Some folks may think the available data is of such low quality that their personal guess may be more accurate, taking things to the extreme of subjectivity.
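As a toy illustration of how much the assumptions matter, here is a small R sketch; the numbers are made up and the formula is just the crude fixed-generation-time approximation R0 = exp(r × serial interval). The same observed growth rate implies noticeably different reproduction numbers depending on the serial interval a researcher is willing to assume.

```r
# Hypothetical daily exponential growth rate estimated from case counts
r <- 0.15
# Different, equally defensible assumptions about the serial interval (days)
serial_interval <- c(4, 5, 7)
# Crude approximation assuming a fixed generation time: R0 = exp(r * serial interval)
R0 <- exp(r * serial_interval)
data.frame(serial_interval, R0 = round(R0, 2))
```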
Another major source of assumptions and subjectivity in these models comes up when we must estimate or model the effect of physical distancing. Any mathematical definition of physical distancing will end up simplifying things, to the extent that one must decide which simplifications are less harmful to the precision of the model.
Two weeks ago we talked about ARIMA models here in this section. These can be very objective models, and they are the kind of model that computers don’t have much trouble fitting to the data by themselves, without human help. However, I have seen zero ARIMA models for the Covid trend! It is worth noting that all the models we have seen are actually modeling a series of case counts, so we could use ARIMA, why not?
Well, you probably could try ARIMA models on the seasonal flu series, because it is long enough by now and, more importantly, stable. But in the case of Covid the lack of information is such that objective and simple models like ARIMA would have no chance of doing a good job; experienced researchers don’t even try it. There is also the fact that ARIMA models are not a traditional part of the epidemiological toolbox, so epidemiologists may not think of them for that reason, but there are a lot more folks out there trying models than there are epidemiologists.
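Just to show how mechanical the fit itself is, here is a minimal R sketch using the forecast package, with a simulated stand-in for a case-count series rather than real Covid data. The algorithm picks the ARIMA orders on its own; the difficulty with Covid is not fitting the model, it is that a short, unstable series gives such an objective, simple model very little to work with.

```r
library(forecast)

# Simulated stand-in for a daily case-count series (not real data)
set.seed(1)
daily_cases <- ts(cumsum(rpois(60, lambda = 5)))

fit <- auto.arima(daily_cases)  # orders (p, d, q) chosen automatically
forecast(fit, h = 14)           # two-week-ahead forecast, no human tuning
```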
Those of us who are deep into math may not be very happy when we talk about adding subjectivity to our models. There are still folks against Bayesian statistics because it includes the subjectivity of the priors. But I guess we must recognize that the world out there is messy and that our minds are able to add information to models that is not in the data, information that comes from our brains, through the choice of model and assumptions. And so the computers depend on us. For the time being.
054 (28/April/2020) – Centering predictors in Regression Models
...