Judy Stone. Conducting Clinical Research. Available at http://www.conductingclinicalresearch.com/index.php.

Your student, I suspect, is using the term “reliable” in a generic sense meaning “trustworthy” or “persuasive.” The real question is not whether systematic reviews are reliable, but whether they are valid.

There are a few empirical studies that could possibly help resolve the question of validity.

Jacques LeLorier, Genevieve Gregoire, Abdeltif Benhaddad, Julie Lapierre, Francois Derderian. Discrepancies between Meta-Analyses and Subsequent Large Randomized, Controlled Trials. N Engl J Med. 1997;337(8):536-542.

These authors compared the results of large randomized trials with those of meta-analyses published earlier on the same clinical questions. The agreement was only so-so, but read the article yourself and decide.

Ian Shrier, Jean-Francois Boivin, Robert Platt, et al. The interpretation of systematic reviews with meta-analyses: an objective or subjective process? BMC Medical Informatics and Decision Making. 2008;8(1):19.

This study presented a series of meta-analyses and found that different people had different interpretations of identical results.

Laura Rosen, Michal Ben Noach, Elliot Rosenberg. Missing the forest (plot) for the trees? A critique of the systematic review in tobacco control. BMC Medical Research Methodology. 2010;10(1):34.

This study looked at how much inconsistency there was in how different researchers conducted their meta-analyses and found that there were a lot of problems.

These empirical studies are not such a great endorsement, perhaps, but you have to put things in perspective. Compared to what preceded it, the expert editorial, a systematic overview is a vast improvement. Nothing makes this point better than the following satirical article.

A Caveman. The invited review – or, my field, from my standpoint, written by me using only my data and my ideas, and citing only my publications. J Cell Sci. 2000;113(18):3125-3126.

Does the systematic overview deserve a high spot in the hierarchy of evidence? You could argue that a systematic overview is no better than an observational study. One analogy that I cite is that a systematic overview is like a multi-center study where every center gets to choose its own protocol.

But another argument is that a systematic overview, by synthesizing multiple clinical trials, has to be better than any single clinical trial. If you have multiple clinical trials with potentially conflicting interpretations, how do you handle this in an evidence-based fashion without a systematic overview? Do you choose the single study with the best quality score, or the one with the largest sample size, and ignore everything else? That seems like a recipe for disaster.

Systematic overviews also offer better generalizability. If you find ten studies, each conducted in a different geographical location, and they all say more or less the same thing, then you have stronger evidence that the research will generalize to your geographic setting than any single study could provide.

Another big advantage of systematic overviews is that they represent a repeatable and reproducible form of research. If you lay out your methods carefully, anyone else can repeat your work and get the same results. They can also tweak your analysis if they don’t like your search strategy, your inclusion/exclusion criteria, or your method of pooling results.

Systematic overviews are also self-critical. If there is a limitation due to publication bias, heterogeneity, or lack of good quality studies, the tools of a systematic overview will identify those limitations.

Finally, most of the empirical evidence on the value of blinding, concealed allocation, and other aspects of clinical trials comes from systematic overviews. If you say that you believe in blinding but you don’t believe in systematic overviews, then your belief in blinding must be faith based and not evidence based.

Most statisticians that I know put very little faith in systematic overviews, but I am generally positive on them, especially on the systematic overviews conducted by the Cochrane Collaboration. They’re not perfect, but nothing in evidence based medicine is perfect.

Julia Silge, David Robinson. Text Mining with R (online version). Available at http://tidytextmining.com/.

As I understand it, a cohort is a group of patients defined by an exposure, while a case is a patient who has a particular outcome. So the phrase “case cohort” is ambiguous. Suppose you mine the electronic health records, select every child who received aspirin to control a fever, and compare them to a control group of children who received acetaminophen to control a fever, with the hypothesis that Reye’s syndrome is more likely in the aspirin cohort. That’s a retrospective cohort design. Now suppose you mine the same electronic health records, select every child who came down with Reye’s syndrome, and compare them to a control group of children who did not experience Reye’s syndrome, with the hypothesis that aspirin was taken more often among the Reye’s syndrome group. That’s a case-control design.

It gets a little bit tricky because context is important. Low birth weight is an outcome to an obstetrician, but it is an exposure to a neonatologist. But if there is a temporal order where A precedes B in time and A is thought to be a cause of B, then defining your controls as not A makes it a cohort study and defining your controls as not B makes it a case-control study.

I suspect you are using the word “cohort” in its generic sense meaning a group of people who share something in common. Or maybe you are using the word “case” in its generic sense meaning an instance. But if you are defining a case cohort from an epidemiologic perspective, I would argue that it represents the intersection of A and B (all children who take aspirin AND develop Reye’s syndrome). If you think about this long enough, you realize that a “case cohort” makes no sense from a statistical perspective because no matter how you choose your controls, you’re going to end up missing one of the four cells in the two-by-two classification of A and B.

Another way of thinking about it is the direction of travel. If you retrospectively select on B and not B and go further back in time to see who does or does not have A, you have a case-control design. If you retrospectively select on A and not A and then move forward in time (but not all the way back to the present) to see who does or does not have B, you have a retrospective cohort design.

Now there’s such a thing as a nested case-control design. You study a cohort versus a control (usually prospectively) and you find that some of your cohort develop your outcome and some do not. You’d like to run some additional tests (e.g., genetics) on all of the cases within your cohort, but it’s too expensive to run the tests on all the non-cases within your cohort. So you get a subset of non-cases from your cohort and, because your sample size is smaller, you do some matching. The result is a case-control design nested within a cohort design. Your hypothesis is that a positive result on the expensive test is more likely among cases than controls. Note that both cases and controls are drawn from the exposed cohort; the unexposed control-cohort patients do not participate in a nested case-control study.

It’s also possible to nest a case-control design within a randomized trial. Select all of the patients randomized to the active arm AND who experience the outcome of interest. Run a super-expensive test on these patients. Now pick a small but carefully matched set of patients in the active arm who did not experience the outcome of interest, and run those same tests.

To make things even more complicated, there is also a case-cohort design and a case-crossover design. See “7.2 – Advanced Case-Control Designs” in the STAT 507 course notes for details. I think I’ll stop here before my head explodes.

Nagaraj Venkatesan. R and SQL Server 2016. Presented December 2016 and posted on the SQL Server Central blog. Available at: http://www.sqlservercentral.com/blogs/sql-and-sql-only/2016/12/22/r-and-sql-server-2016/.

American Statistical Association. TSHS Resources Portal. Available at: https://www.causeweb.org/tshs/.

Janice Donahoe. i2b2 Design Document: Ontology Management (ONT) Cell. Available at https://www.i2b2.org/software/files/PDF/current/Ontology_Design.pdf.

A research team took ten scientists and placed them inside an fMRI machine. The fMRI shows which parts of your brain are active as your brain processes different types of information. The scientists were shown a variety of graphs taken from actual peer-reviewed publications.

As you might expect, the part of your brain that activates first when you are presented with an image of a graph is your visual cortex. For most graphs, this was quickly followed by an activation of the parietal lobe, the part of your brain responsible for numerical computations.

But some graphs showed a different pattern. If the graph included a p-value, activation of the visual cortex was followed by activation of an area of the amygdala that is as yet poorly understood. The research team called this portion of the amygdala the p-value receptor.

If your p-value receptor is activated and the p-value is larger than 0.05, the p-value receptor sends strong signals to the pain centers of the brain. This is clearly an adaptive behavior. Scientists who routinely produce p-values larger than 0.05 will not survive and reproduce.

If the p-value receptor is activated and the p-value is 0.05 or smaller, the p-value receptor sends strong signals to the pleasure centers of the brain. Again, this is an adaptive behavior. But the interesting finding is that there is a dose-response effect. The p-value receptor produces about the same level of pleasure stimulation for p-values of 0.05, 0.04, 0.03, and 0.02. But p-values of 0.01 show an increase in stimulation that becomes even stronger for p-values of 0.0099 and smaller. Perhaps there is some pattern associated with p-values that have two zeros to the right of the decimal place that is stronger than a p-value with just a single zero.

The scientists also examined the effect of p-values reported in scientific notation. There was an increase in latency when the p-value receptor was fed a p-value in scientific notation. This probably represents an attempt to decode the scientific notation. But p-values in scientific notation with exponents of -4 or smaller showed an eventual spike in activation of the pleasure centers of the brain comparable to that achieved during orgasm.

The researchers also noted a second important effect of the p-value receptor. Once the p-value receptor is stimulated, the entire cerebral cortex, the portion of your brain associated with logic and complex thinking, is immediately shut down. This ensures that a scientist’s brain will focus only on the pleasure or pain associated with the p-value and will ignore the power of the study, the magnitude of the treatment effect, and other unimportant issues.

The researchers suggest that statisticians who want to earn more consulting income and ensure repeat business should do their best to produce only p-values that stimulate the pleasure centers of the brain.

You can find the code and some supporting text files at

https://github.com/pmean/bayesian-longitudinal

I usually start with a very simple Bayesian analysis because if something is awry, I’ll notice it faster with a simple example. The code below implements a simple beta-binomial model.

Make sure that you have JAGS installed on your computer

http://sourceforge.net/projects/mcmc-jags/files/

as well as the rjags package

https://cran.r-project.org/web/packages/rjags/index.html

Assume you have a beta prior distribution with alpha=4 and beta=16. You collect 60 observations and observe 18 successes and 42 failures.

You don’t really need JAGS for this, because the beta prior is conjugate. The posterior distribution is beta with alpha=4+18=22 and beta=16+42=58. The posterior mean is 22/(22+58)=0.275 and the posterior standard deviation is sqrt((22*58)/(80^2*81)), or about 0.05.

`library("rjags")`

`## Loading required package: coda`

`## Linked to JAGS 4.2.0`

`## Loaded modules: basemod,bugs`

```
fnm <- "jags_beta_binomial.txt"
mon <- c("pi")
out <- list(description="jags output")
dat <- list(a=4, b=16, x=18, n=60)
mod <- jags.model(fnm, data=dat, quiet=TRUE)
out$bb1 <-
coda.samples(mod, variable.names=mon, n.iter=1000, by=1000, progress.bar=NULL)
summary(out$bb1)
```

```
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## 0.273157 0.049814 0.001575 0.001575
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## 0.1811 0.2367 0.2718 0.3086 0.3698
```
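The model file `jags_beta_binomial.txt` is not shown above; it is among the supporting text files in the GitHub repository linked at the top. A minimal version in JAGS syntax would look something like the sketch below (my reconstruction, not necessarily the exact file).

```
# Sketch of what jags_beta_binomial.txt might contain
model {
  x ~ dbin(pi, n)    # likelihood: x successes in n trials
  pi ~ dbeta(a, b)   # conjugate beta prior on the success probability
}
```

Note that `dbin` in JAGS takes the probability first and the sample size second, which is the reverse of the argument order in R’s `dbinom`.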

Now, try to run one of the examples in the BUGS manuals that is reasonably close to your problem. For our problem, the very first example in volume 1 works well.

http://www.openbugs.net/Examples/Rats.html

I had to fix two things to get this example to run properly in JAGS. First, I had to remove some lines of code that used functions like post.p.value. Second, I had to transpose the matrix of data. But with these changes it ran just fine.
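With the posterior predictive check lines stripped out, the model file `jags_rats.txt` should look roughly like this (reconstructed from the OpenBUGS Rats example linked above; compare against the file in the repository):

```
model {
  for (i in 1:N) {
    for (j in 1:T) {
      Y[i, j] ~ dnorm(mu[i, j], tau.c)
      mu[i, j] <- alpha[i] + beta[i] * (x[j] - xbar)   # per-rat growth line
    }
    alpha[i] ~ dnorm(alpha.c, alpha.tau)   # random intercepts
    beta[i] ~ dnorm(beta.c, beta.tau)      # random slopes
  }
  tau.c ~ dgamma(0.001, 0.001)
  sigma <- 1 / sqrt(tau.c)                 # residual standard deviation
  alpha.c ~ dnorm(0.0, 1.0E-6)
  alpha.tau ~ dgamma(0.001, 0.001)
  beta.c ~ dnorm(0.0, 1.0E-6)
  beta.tau ~ dgamma(0.001, 0.001)
  alpha0 <- alpha.c - xbar * beta.c        # intercept at x = 0
}
```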

```
dat <- list(x = c(8.0, 15.0, 22.0, 29.0, 36.0), xbar = 22, N = 30, T = 5,
Y = structure(
.Data = c(151, 199, 246, 283, 320,
145, 199, 249, 293, 354,
147, 214, 263, 312, 328,
155, 200, 237, 272, 297,
135, 188, 230, 280, 323,
159, 210, 252, 298, 331,
141, 189, 231, 275, 305,
159, 201, 248, 297, 338,
177, 236, 285, 350, 376,
134, 182, 220, 260, 296,
160, 208, 261, 313, 352,
143, 188, 220, 273, 314,
154, 200, 244, 289, 325,
171, 221, 270, 326, 358,
163, 216, 242, 281, 312,
160, 207, 248, 288, 324,
142, 187, 234, 280, 316,
156, 203, 243, 283, 317,
157, 212, 259, 307, 336,
152, 203, 246, 286, 321,
154, 205, 253, 298, 334,
139, 190, 225, 267, 302,
146, 191, 229, 272, 302,
157, 211, 250, 285, 323,
132, 185, 237, 286, 331,
160, 207, 257, 303, 345,
169, 216, 261, 295, 333,
157, 205, 248, 289, 316,
137, 180, 219, 258, 291,
153, 200, 244, 286, 324),
.Dim = c(5,30)))
dat$Y <- t(dat$Y)
init <- list(alpha = c(250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250,
250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250),
beta = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6),
alpha.c = 150, beta.c = 10,
tau.c = 1, alpha.tau = 1, beta.tau = 1)
fnm <- "jags_rats.txt"
mon <- c("alpha0", "beta.c", "sigma")
mod <- jags.model(fnm, dat, init, quiet=TRUE)
out$rats <-
coda.samples(mod, variable.names=mon, n.iter=1000, by=1000, progress.bar=NULL)
summary(out$rats)
```

```
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## alpha0 106.283 3.5884 0.113476 0.125482
## beta.c 6.196 0.1072 0.003391 0.004276
## sigma 6.108 0.6291 0.019893 0.033978
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## alpha0 99.495 104.008 106.248 108.670 113.344
## beta.c 5.996 6.128 6.193 6.268 6.399
## sigma 5.205 5.735 6.060 6.402 7.227
```

Now let’s see if we can modify this example to fit our particular setting.

Our longitudinal setting has two measurements: pre and post. So strip the last three columns from the matrix Y. There is one within-subject factor, time, which is represented by X1. There is one between-subjects factor, treatment group, which is represented by X2. Let’s pretend for now that the first twenty observations represent your treatment group and the last ten observations represent your control group.
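The model file `jags_lon.txt` is in the repository. One plausible version extends the Rats model with a treatment effect (beta2) and a time-by-treatment interaction (beta3). The sketch below matches the monitors and initial values used in the code that follows, but treat it as my reconstruction rather than the exact file:

```
model {
  for (i in 1:N) {
    for (j in 1:T) {
      Y[i, j] ~ dnorm(mu[i, j], tau.c)
      mu[i, j] <- alpha[i] + beta1[i] * X1[j] +
                  beta2 * X2[i] + beta3 * X1[j] * X2[i]
    }
    alpha[i] ~ dnorm(alpha.c, alpha.tau)    # random intercepts
    beta1[i] ~ dnorm(beta1.c, beta1.tau)    # random pre-to-post slopes
  }
  alpha.c ~ dnorm(0.0, 1.0E-6)
  beta1.c ~ dnorm(0.0, 1.0E-6)   # average change over time
  beta2 ~ dnorm(0.0, 1.0E-6)     # treatment effect at baseline
  beta3 ~ dnorm(0.0, 1.0E-6)     # time-by-treatment interaction
  tau.c ~ dgamma(0.001, 0.001)
  alpha.tau ~ dgamma(0.001, 0.001)
  beta1.tau ~ dgamma(0.001, 0.001)
}
```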

```
dat <- list(X1 = c(0, 1), X2 = rep(0:1, c(20, 10)), N = 30, T = 2,
Y = structure(
.Data = c(151, 199,
145, 199,
147, 214,
155, 200,
135, 188,
159, 210,
141, 189,
159, 201,
177, 236,
134, 182,
160, 208,
143, 188,
154, 200,
171, 221,
163, 216,
160, 207,
142, 187,
156, 203,
157, 212,
152, 203,
154, 205,
139, 190,
146, 191,
157, 211,
132, 185,
160, 207,
169, 216,
157, 205,
137, 180,
153, 200),
.Dim = c(2,30)))
dat$Y <- t(dat$Y)
init <- list(alpha = rep(150, 30), beta1 = rep(50, 30),
beta1.c = 50, beta1.tau=1,
beta2 = 20, beta3 = 5,
alpha.c = 150,
tau.c = 1, alpha.tau = 1)
fnm <- "jags_lon.txt"
mon <- c("alpha.c", "beta1.c", "beta2", "beta3")
mod <- jags.model(fnm, dat, init, quiet=TRUE)
out$lon <-
coda.samples(mod, variable.names=mon, n.iter=1000, by=1000, progress.bar=NULL)
summary(out$lon)
```

```
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## alpha.c 154.095 2.576 0.08145 0.4494
## beta1.c 50.186 1.136 0.03593 0.1414
## beta2 -5.756 4.172 0.13193 1.4509
## beta3 -1.593 1.723 0.05447 0.5767
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## alpha.c 148.583 152.538 154.2840 155.7871 158.7248
## beta1.c 47.995 49.452 50.1588 50.9269 52.4221
## beta2 -8.877 -7.483 -7.1730 -6.8746 9.7413
## beta3 -5.556 -2.352 -0.9216 -0.6312 0.9299
```

Okay, so far, so good. Your next step would be to put your own data in place of the data from the example. I won’t show that here because I don’t have your data. But I hope you can try the general outline suggested here.

1. Fit the simplest model possible, a beta-binomial.
2. Fit an example taken from the BUGS software that is reasonably close to your example.
3. Modify the example to match some of the structure of your data set, but also keep some of the original data.
4. Substitute your data for the example data.

You might want to iterate step 3, making a small change to match your data a bit more closely, then another small change, and so forth.
