Friday, November 03, 2017

SAD and MAD: Single Ascending Dose and Multiple Ascending Dose first-in-human studies

Acronyms are everywhere in clinical trials. Previously I mentioned that in the 21st Century Cures Act, the acronym RAT was used for the Regenerative Advanced Therapy designation; the term ‘RAT’ was criticized and was later changed to RMAT (Regenerative Medicine Advanced Therapy) in FDA’s implementation.

Now we have a pair of names SAD and MAD commonly used in early phase clinical trials. It does not mean anybody will be sad or mad. A sponsor should be happy (not SAD or MAD) when its development program can progress into the clinical trial stage.
SAD stands for single ascending dose and MAD stands for multiple ascending dose. SAD and MAD studies are typically the first-in-human (FIH) studies. They seek to gain information on safety and tolerability and on general pharmacokinetic (PK) and pharmacodynamic (PD) characteristics, and to identify the maximum tolerated dose (MTD). SAD/MAD studies can also be used to test cardiac safety and evaluate QT/QTc prolongation.

Many dose escalation studies are in effect SAD or MAD studies even though the SAD/MAD terms are not used. For example, the popular 3+3 design is one type of SAD/MAD study with a focus on safety and tolerability.

SAD/MAD studies are usually conducted in healthy volunteers in a clinical research unit (CRU) or phase I unit. But they can be conducted in patients when it is unethical to test the experimental drug (for example, oncology drugs and plasma-derived drugs) in healthy volunteers. SAD and MAD studies can be combined into one study within the same study protocol or conducted as two separate studies.

For SAD studies, the starting dose is based on the pre-clinical and animal studies. For MAD studies, the starting dose is usually based on results from the SAD study.

From the PK assessment standpoint, in SAD studies each subject receives a single dose, and serial PK samples are taken to evaluate the PK profile after a single dose. In MAD studies, each subject receives multiple doses; after steady state is achieved, serial PK samples are taken to evaluate the PK profile at steady state. Both types of study are conducted on a cohort basis, and subjects within each cohort receive the same dose level. With the PK results from SAD/MAD studies, dose linearity and dose proportionality can be evaluated.

From the safety assessment standpoint, in both SAD and MAD studies, the first cohort of subjects receives the lowest dose (starting dose). Subjects are usually confined in a Clinical Research Unit (CRU) with close safety monitoring. After each cohort, safety and tolerability are assessed to determine whether the study should proceed to the next cohort at a higher dose. The safety evaluation after each cohort is usually performed by an internal team within the sponsor, but can also be performed by an independent committee such as a data and safety monitoring board (DSMB). With the safety data, the maximum tolerated dose (MTD) may be identified.

In SAD/MAD studies, within each cohort, placebo control can be added. Depending on whether there is a concurrent placebo control group, the SAD/MAD studies could have the following types.
  • SAD without placebo control
  • SAD with placebo control
  • MAD without placebo control
  • MAD with placebo control

When a placebo group is added to a SAD/MAD study, it is very common to use an n:1 randomization ratio within each cohort so that the placebo group does not become too large in the final analysis. For the final analysis, subjects in the placebo group are pooled across all cohorts.
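The n:1 allocation and the pooling of placebo subjects can be sketched in a few lines of Python. This is only an illustration: the cohort size of 8, the 3:1 ratio, the dose levels, and the function name `randomize_cohort` are assumptions made for the example, not values from any particular protocol.

```python
import random

def randomize_cohort(cohort_size, ratio_active, rng):
    """Return shuffled 'active'/'placebo' labels for one cohort (ratio_active:1)."""
    block = ratio_active + 1
    if cohort_size % block != 0:
        raise ValueError("cohort size must be a multiple of ratio_active + 1")
    n_placebo = cohort_size // block
    labels = ["active"] * (cohort_size - n_placebo) + ["placebo"] * n_placebo
    rng.shuffle(labels)  # random order of assignment within the cohort
    return labels

rng = random.Random(2017)            # fixed seed so the sketch is reproducible
dose_levels_mg = [10, 30, 100, 300]  # hypothetical ascending dose levels

# One cohort of 8 subjects per dose level, randomized 3:1 (6 active, 2 placebo).
allocation = {dose: randomize_cohort(8, 3, rng) for dose in dose_levels_mg}

# Placebo subjects are pooled across cohorts for the final analysis.
pooled_placebo = sum(labels.count("placebo") for labels in allocation.values())

for dose, labels in allocation.items():
    print(dose, "mg:", labels.count("active"), "active,", labels.count("placebo"), "placebo")
print("pooled placebo n =", pooled_placebo)
```

With four cohorts, the pooled placebo group has 8 subjects, comparable in size to each active dose group of 6, which is the point of using an n:1 ratio.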

Here are a couple of examples of SAD/MAD study designs. They are taken from presentation slides I made almost 20 years ago, but are still relevant:


Thursday, October 26, 2017

NIH and FDA Release Protocol Template for Phase 2 and 3 IND/IDE Clinical Trials and e-Protocol Writing Tool

In a previous article, 'Clinical Trial Protocol Template', the draft template by NIH/FDA was mentioned.

NIH/FDA has now finalized the clinical trial protocol template.

NIH and FDA Release Protocol Template for Phase 2 and 3 IND/IDE Clinical Trials
The National Institutes of Health (NIH) and Food and Drug Administration (FDA) developed a clinical trial protocol template with instructional and example text for NIH-funded investigators to use when writing protocols for phase 2 and 3 clinical trials that require Investigational New Drug application (IND) or Investigational Device Exemption (IDE) applications. In March 2016 a draft template was released for public comment generating nearly 200 comments from 60 respondents. All comments were carefully considered and many were incorporated into the final template. The agencies’ goal is to encourage and make it easier for investigators to prepare clinical trial protocols that are organized consistently and that contain all of the information necessary for the review of the protocol. The template follows the International Conference on Harmonisation (ICH) E6 (R2) Good Clinical Practice and is available as a Word document.
The NIH also released a secure web-based e-Protocol Writing Tool that allows investigators to generate a new protocol using the NIH-FDA Phase 2 and 3 IND/IDE Clinical Trial Protocol Template. The e-Protocol Writing Tool fosters protocol writing collaboration by allowing multiple writers and reviewers to participate in the protocol development process. The e-Protocol Writing Tool allows the author to assign writers and collaborators and the tool assists the author with tracking progress and document version control.
The NIH expects to expand the development of the e-Protocol Writing Tool by adding instructional text and sample text for other types of studies, such as behavioral and phase 1 trials. Future releases of this e-Protocol Writing Tool will have improvements and enhanced tool functionality.

Saturday, October 21, 2017

CAR-T Gene Therapy Clinical Trials - Success Story and Beyond

Recently, two CAR-T (Chimeric Antigen Receptor T cell) gene therapies were approved by FDA. The first one is tisagenlecleucel, manufactured by Novartis and indicated for certain pediatric and young adult patients with a form of acute lymphoblastic leukemia (ALL). The second one is axicabtagene ciloleucel by Kite Pharma (now part of Gilead); it is indicated for treating adult patients with certain types of large B-cell lymphoma, a form of non-Hodgkin lymphoma (NHL), who have not responded to or who have relapsed after at least two other kinds of treatment.

Before the approval of tisagenlecleucel, FDA convened an advisory committee meeting on July 12, 2017 to discuss the safety concerns. The official approval came one and a half months after the advisory committee voted unanimously in favor of approval.

The second CAR-T product approval came just one and a half months after the first CAR-T approval and did not need to go through the advisory committee meeting process.

From the clinical trial design standpoint, both approvals were based on a pivotal study with a relatively small but sufficient sample size. Since each study was single-arm with no control group, the results were compared with a historical control (or a commonly accepted criterion).

For Novartis’ tisagenlecleucel, the pivotal study is registered as “A Phase II, Single Arm, Multicenter Trial to Determine the Efficacy and Safety of CTL019 in Pediatric Patients With Relapsed and Refractory B-cell Acute Lymphoblastic Leukemia”. The briefing book for the advisory committee meeting detailed the background information on CAR-T, the study design, and the results.

The primary efficacy endpoint is the overall remission rate (ORR) during the 3 months after tisagenlecleucel administration; ORR includes complete remission (CR) and complete remission with incomplete hematologic recovery (CRi), as determined by independent review committee (IRC) assessment.

The pre-specified primary efficacy endpoint tested the null hypothesis of the ORR being less than or equal to 20% against the alternative hypothesis that the ORR was greater than 20% at an overall one-sided 2.5% level of significance. The study met its primary objective if the lower bound of the 2-sided 95% exact Clopper Pearson confidence intervals (CI) for ORR was greater than 20% (note: I wrote an article about Clopper Pearson confidence interval).

The study showed remarkable results. The ORR was 82.5% (52/63). The lower bound of the 95% exact confidence interval using the Clopper Pearson method is 70.9%, way above the pre-specified criterion of 20%.
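The exact interval used for this decision rule can be reproduced with a short Python sketch. It finds the Clopper Pearson limits by bisection on the binomial tail probabilities, using only the standard library; the function names are my own, and a statistics package would normally obtain the same limits from beta quantiles.

```python
from math import comb

def binom_tail_ge(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def solve_increasing(f, target, iterations=100):
    """Find p in [0, 1] with f(p) = target, for f increasing in p (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    # Lower limit: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: binom_tail_ge(x, n, p), alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha/2,
    # i.e. P(X >= x + 1) equals 1 - alpha/2.
    upper = 1.0 if x == n else solve_increasing(
        lambda p: binom_tail_ge(x + 1, n, p), 1 - alpha / 2)
    return lower, upper

lower, upper = clopper_pearson(52, 63)   # 52 responders out of 63
print(f"ORR = {52/63:.3f}, 95% exact CI = ({lower:.3f}, {upper:.3f})")
print("primary objective met:", lower > 0.20)
```

The study meets its primary objective when the printed lower limit exceeds the null value of 0.20.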

The approval of Kite’s axicabtagene ciloleucel was based on a similar study design. The study was registered as “A Phase 1-2 Multi-Center Study Evaluating the Safety and Efficacy of KTE-C19 in Subjects With Refractory Aggressive Non-Hodgkin Lymphoma (NHL)”, the so-called ‘ZUMA-1’ trial. The primary efficacy endpoint is ORR (objective response rate), consisting of complete response [CR] plus partial response [PR] per the revised International Working Group (IWG) Response Criteria for Malignant Lymphoma.

The study results were announced in Kite’s press release. The results indicated that the ORR is 82% (83/101), with a p value of less than 0.0001. The 95% exact confidence interval for ORR was not provided, but with the Clopper Pearson method we can calculate the lower bound of the 95% confidence interval to be 73%. This is probably way above the pre-specified ORR (the historical control, if one exists) for patients without CAR-T treatment.

While we are excited about the advances in gene therapy, long-term safety should still be followed. Both approved products carry a black box warning, and the main short-term safety issue (which could be life-threatening) is cytokine release syndrome (CRS).

It will be interesting to see how the cost of these CAR-T therapies will be managed and accepted by the community. Novartis’ tisagenlecleucel (brand name Kymriah) is priced at $475,000 for a treatment course. Kite’s axicabtagene ciloleucel (brand name Yescarta) is priced at $373,000 for a treatment course.

The development of therapies like CAR-T requires great collaboration among industry, academia, and government. Novartis’ tisagenlecleucel was mainly developed by the University of Pennsylvania. Kite’s axicabtagene ciloleucel was developed by the National Cancer Institute.

There is an article in New England Journal of Medicine by Dr. Rosenbaum “Tragedy, Perseverance, and Chance — The Story of CAR-T Therapy”.

CAR-T is now very popular in China, where a lot of investigational clinical trials using CAR-T therapy are ongoing. Someone ran a search and found more CAR-T clinical trials in China than in the US.

Both CAR-T successes are in hematologic cancers. The next challenge will be to find a successful CAR-T therapy for solid tumors.

The success of CAR-T should give us hope that someday we can transplant pig organs into humans, an area in which my company is a pioneer. See the article in the New York Times "Gene Editing Spurs Hope for Transplanting Pig Organs Into Humans".

Wednesday, October 18, 2017

Efficient orphan drug development - clinical trial designs in rare disease area

For a long time, drug development in rare diseases was a niche area for biotechnology companies. With the rapid advances in genetics and ‘-omics’ and the arrival of the precision medicine era, researchers are continuously identifying new diseases or disease variants. A prevalent disease can be dissected into many small pieces, and each subset could become a rare disease.
Rare diseases pose challenges in clinical trial designs. Some of the challenges are:
  • Small number of patients affected
  • Small number of patients for clinical trials
  • Heterogeneity of the disease
  • Limited understanding of the disease’s natural history
  • Lack of well-defined study endpoints
  • Limited early phase clinical trial data

There are several initiatives in the US and in European countries that focus on clinical trial designs and methodologies in the rare disease area.

In the US, NORD (National Organization for Rare Disorders) is a patient advocacy organization dedicated to individuals with rare diseases and the organizations that serve them. NORD, along with its more than 260 patient organization members, is committed to the identification, treatment, and cure of rare disorders through programs of education, advocacy, research, and patient services. NORD’s vision includes a culture of innovation that supports basic and translational research to create diagnostic tests and therapies for all rare diseases, and a regulatory environment that encourages development and timely approval of safe, effective diagnostics and treatments.
“Robert Temple, deputy center director for clinical science at CDER, said that one of his long held interests at FDA has been making clinical trials more efficient.
"This is particularly important in orphan territory because there are often few patients in total and there are usually very few near the centers that want to do the studies," Temple said.
Because of the challenges inherent in identifying and enrolling patients with rare disorders in studies, Temple said that having detailed natural history data "can make a tremendous difference in identifying the manifestations you want to try to treat and identifying the patients you should include" in a study.
Another important consideration, Temple said, is whether there are design features that can be built into a trial to make it more efficient.
For instance, Temple said that cross-over studies could be done in situations where enrolling patients is difficult and the disease being studied has a transient effect.
This isn't a new idea, Temple said, pointing to a 1976 cross-over study of the synthetic steroid danazol to treat hereditary angioedema that enrolled only nine patients. In that study the patients were randomized to the drug or a placebo until they had an attack, at which point they were moved to the other study arm.
Temple also said that FDA has seen some success with doing randomized withdrawal studies, particularly in situations where a placebo-controlled arm is not feasible.
"Sometimes in the course of development you'll have a lot of people who for one reason or another have been put on the drug because it's the only game in town…with a lot of people on the drug you can sometimes, if they're willing, do a randomized withdrawal study," Temple said.
But Temple emphasized that these types of studies are only appropriate in certain circumstances, and stressed that sponsors should consider approaches early on.
Billy Dunn, director of CDER's division of neurology products, said that especially for rare diseases FDA and industry need to look at novel approaches to studying drugs.
"We're not the cardiovascular division, we're not accustomed to having these multi-thousand patient trials…many things that might seem controversial or unusual in other settings, sometimes—not always, but sometimes—those can be more run-of-the-mill for us," Dunn said.
Dunn also said that different considerations must be made for studies of progressive diseases, where different stages or forms of a disease can vary considerably. In such situations, Dunn said that sponsors need to consider progression when determining enrollment criteria and, "while difficult to operationalize, having patient individualized outcomes where we attempt to assess, for a given stage or form of a disease, those aspects…that are most impairing."
Wilson Bryan, director of the Office of Tissues and Advanced Therapies at the Center for Biologics Evaluation and Research (CBER), said that one of the challenges for his office is working with academic sponsors and small biotech companies that do not have extensive experience in drug development.
This is an especially challenging issue in the rare disease space, Bryan said, because there are very few patients with a particular disease to begin with and any inefficiency in a study can waste precious time and resources. This is also true for advanced therapies as many are disease modifying and have prolonged or permanent effects, making it difficult for patients to enroll in additional studies in the future.
"Too often we have folks come in with [investigational new drug applications] INDs and they haven't started their natural history study yet, and they haven't started thinking about what their outcome measures are going to be for Phase III," Bryan said.
When should sponsors begin thinking about these things? According to Bryan, at the very early stages of drug development and well before a product is set to begin human studies.
"When I say early on in drug development I don't mean at Phase I, I mean when you first start to do preclinical studies, think about what is the target. The target is not getting into clinical trials … the target is getting a product on the market that's of use to patients," Bryan said.
Julia Beitz, director of CDER's Office of Drug Evaluation III, said that sponsors should also make sure to establish a baseline for patients' clinical, cognitive and developmental status before beginning a study and take repeat measurements of those features throughout the trial.
Beitz also said that sponsors should try, "to the extent that is possible," to stabilize patients' conditions before starting them on a drug, especially when patients are expected to need another intervention, such as surgery or a medical device.
"If these measures are instituted during the trial it becomes difficult to assess the independent effect of the new treatment on the patient," Beitz said.

In the social media era, it is important to have a platform for patients and caregivers to communicate and engage. INSPIRE is such an organization and provides the most authentic platform for patient engagement.

A group of people are forming a DIA working group called NEED (the Nature and Extent of Evidence Needed for Decision) engaging in the discussions about the clinical trial design in rare diseases including the natural history study, historical control, innovative study designs such as adaptive design and Bayesian design,…

In European countries, there are several initiatives focusing on the clinical trial design aspects of the rare diseases or small population group trials.

Saturday, September 16, 2017

Randomized withdrawal design and delayed start design in rare disease clinical trials

Last week at the 4th Duke-Industry Statistics Workshop, I participated in a session "Innovative clinical trial designs" organized by the UNC Professor, Dr Anastasia Ivanova.

I discussed 'randomized withdrawal design and delayed start design in rare disease clinical trials'. The presentation slides are included in the link below.

In rare disease clinical trials, we should think more broadly when designing the trial and should go beyond the typical, conventional RCT (randomized controlled trial).

In the presentation, the following materials were referenced:

Randomized Withdrawal Design or Randomized Discontinuation Trial:

Monday, August 07, 2017

Another Way for Constructing Stopping Rule for Safety Monitoring of Clinical Trials

In my previous post "Constructing stopping rule for safety monitoring", I discussed the use of exact binomial confidence interval as a way to construct a stopping rule, but it was for a single arm study. 

For a randomized, controlled study, a similar approach can be used, but we have to calculate the exact confidence interval for the difference of two binomial proportions. We can then judge whether there is an excessive or elevated risk in the experimental arm that warrants stopping the study for safety reasons.

I recently read an oncology study protocol and noticed the following language describing the stopping criteria:
An independent DMC will review accumulating safety data at scheduled intervals  with attention focused on the percentage of subjects with SAEs, AEs of particular concern, Grade 3 or 4 toxicities, and any Grade 5 toxicity considered at least possibly related to study treatment. Excess risk will be determined according to the lower 97.5% exact lower confidence bound on the difference between incidence rates for Group B minus Group A; a lower bound greater than 0% will be flagged as a possible reason to stop the trial. Incidence calculations will depend on the respective numerators and denominators at the time of each interim look. Wilson scores method will be used to calculate confidence limits.

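To use this approach as a stopping rule, we need to recalculate the confidence interval at each interim look as the data accumulate. Although the protocol names the Wilson score method alongside an "exact" confidence bound, the Wilson score method is an asymptotic (score-based) interval rather than an exact one. There are other methods for this calculation too, including exact ones.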

In a paper by Will Garner (2007), Constructing Confidence Intervals for the Differences of Binomial Proportions in SAS, a total of 17 methods were discussed for calculating the confidence interval for the difference of two binomial proportions, a couple of which give an exact confidence interval. In SAS, proc freq can be used to calculate the exact confidence interval based on the method by Santner and Snell and the method by Chan and Zhang.

In the example below, I constructed a data set with two scenarios:
Scenario #1: 4 out of 10 patients in group A having an event and 0 out of 10 patients in group B having an event.
Scenario #2: 5 out of 10 patients in group A having an event and 0 out of 10 patients in group B having an event.

The lower bound of 95% confidence interval for scenario #1 and scenario #2 will be -0.0856 and 0.0179 respectively based on Santner-Snell exact method. Since the lower bound of 95% confidence interval for scenario #2 is greater than 0, the stopping rule for safety will be triggered.

data testdata;
 input trial treat $ x n alpha;
 datalines;
1 A 4 10 0.05
1 B 0 10 0.05
2 A 5 10 0.05
2 B 0 10 0.05
;
run;
data testdat1;
 set testdata;
 by trial;
 if first.trial then treatn = 1;
 else treatn = 2;
 y = n - x; p = x/n; z = probit(1-alpha/2);
data testdat2a(keep=trial x y z rename=(x=x1 y=y1));
 set testdat1;
 where treatn = 1;
data testdat2b(keep=trial x y rename=(x=x2 y=y2));
 set testdat1;
 where treatn = 2;
data testdat2;
 merge testdat2a testdat2b;
 by trial;
proc transpose data = testdat1 out = x_data(rename=(_NAME_=outcome COL1=count));
 by trial treat;
 var x y;
/* Methods 1, 6 (9.4 only), 10, 12, and 13 (9.4 only) */
ods output PdiffCLs=asymp1;
proc freq data=x_data;
 by trial;
 tables treat*outcome /riskdiff (CL=(WALD MN WILSON AC HA));
 weight count;
data asymp1;
 set asymp1;
 length method $25.;
 if Type = "Agresti-Caffo" then method = "13. Agresti-Caffo";
 else if Type = "Hauck-Anderson" then method = "12. Hauck-Anderson";
 else if Type = "Miettinen-Nurminen" then method = " 6. Miettinen-Nurminen";
 else if index(Type,"Newcombe") > 0 then method = "10. Score, no CC";
 else if Type = "Wald" then method = " 1. Wald, no CC";
 keep trial method LowerCL UpperCL;

/* Method 5: MEE (9.4 only) */

ods output PdiffCLs=asymp2;
proc freq data=x_data;
 by trial;
 tables treat*outcome /riskdiff(CL=(MN(CORRECT=NO)));
 weight count;
data asymp2;
 set asymp2;
 length method $25.;
 method = " 5. Mee";
 keep trial method LowerCL UpperCL;

/* Method 3: Haldane */
data asymp3;
 set testdat2;
 by trial;
 length method $25.;
 method = " 3. Haldane";
 p1 = x1/(x1+y1);
 p2 = x2/(x2+y2);
 psi = (x1/(x1+y1) + x2/(x2+y2))/2;
 u = (1/(x1+y1) + 1/(x2+y2))/4;
 v = (1/(x1+y1) - 1/(x2+y2))/4;
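 /* The end of this statement was cut off in the post; the last two terms are
    restored from Newcombe (1998), method 3, and should be checked against the source */
 w = z/(1+z*z*u)*sqrt(u*(4*psi*(1-psi)-(p1-p2)*(p1-p2)) + 2*v*(1-2*psi)*(p1-p2) +
     4*z*z*u*u*(1-psi)*psi + z*z*v*v*(1-2*psi)*(1-2*psi));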
 theta = ((p1-p2)+z*z*v*(1-2*psi))/(1+z*z*u);
 LowerCL = max(-1,theta - w);
 UpperCL = min(1,theta + w);
 keep trial method LowerCL UpperCL;
/* Method 4: Jeffreys-Perks */
data asymp4;
 set testdat2;
 by trial;
 length method $25.;
 method = " 4. Jeffreys-Perks";
 p1 = x1/(x1+y1);
 p2 = x2/(x2+y2);
 psi = ((x1+0.5)/(x1+y1+1) + (x2+0.5)/(x2+y2+1))/2; /* Same as Haldane, but +1/2
success and failure */
 u = (1/(x1+y1) + 1/(x2+y2))/4;
 v = (1/(x1+y1) - 1/(x2+y2))/4;
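 /* The end of this statement was cut off in the post; the last two terms are
    restored from Newcombe (1998), method 4, and should be checked against the source */
 w = z/(1+z*z*u)*sqrt(u*(4*psi*(1-psi)-(p1-p2)*(p1-p2)) + 2*v*(1-2*psi)*(p1-p2) +
     4*z*z*u*u*(1-psi)*psi + z*z*v*v*(1-2*psi)*(1-2*psi));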
 theta = ((p1-p2)+z*z*v*(1-2*psi))/(1+z*z*u);
 LowerCL = max(-1,theta - w);
 UpperCL = min(1,theta + w);
 keep trial method LowerCL UpperCL;
/* Method 16: Brown and Li's Jeffreys Method */
data asymp5;
 set testdat2;
 by trial;
 length method $25.;
 method = "16. Brown-Li";
 p1 = (x1+0.5)/(x1+y1+1);
  p2 = (x2+0.5)/(x2+y2+1);
 var = p1*(1-p1)/(x1+y1) + p2*(1-p2)/(x2+y2);
 LowerCL = max(-1,(p1-p2) - z*sqrt(var));
 UpperCL = min(1,(p1-p2) + z*sqrt(var));
 keep trial method LowerCL UpperCL;
data asymp;
 set asymp1 asymp2 asymp3 asymp4 asymp5;
/* Methods 2 and 11 */
ods output PdiffCLs=asymp_cc;
proc freq data=x_data;
 by trial;
 tables treat*outcome /riskdiff(correct CL=(wald wilson));
 weight count;
data asymp_cc;
 set asymp_cc;
 length method $25.;
 if index(Type,"Newcombe") > 0 then method = "11. Score, CC";
 else if index(Type,"Wald") > 0 then method = " 2. Wald, CC";
 keep trial method LowerCL UpperCL;
/* Exact methods: Methods 14 and 15 (Exact) */
ods output PdiffCLs=exact_ss;
proc freq data=x_data;
 by trial;
 tables treat*outcome /riskdiff(cl=(exact));
 weight count;
 exact riskdiff;
data exact_ss;
 set exact_ss;
 length method $25.;
 method = "14. Santner-Snell";
 keep trial method LowerCL UpperCL;

data exact;
 set exact_ss;

/* Combine all of the outputs together */
data final;
 set asymp asymp_cc exact;
/* Sort all of the outputs by trial and method */
proc sort data = final out = final;
 by trial method;

proc print data=final;
 title "Methods and 95% Confidence Interval for Difference between two rates";
run;
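For readers without SAS, the stopping rule can also be sketched in Python. Note the substitution: this sketch uses the Newcombe hybrid score interval (built from two Wilson intervals, the "Score, no CC" method in the listing above), which is asymptotic, not the Santner-Snell exact method, so its bounds will not match the exact bounds quoted earlier. The function names and the `should_stop` rule are my own illustration.

```python
from statistics import NormalDist

def wilson(x, n, z):
    """Wilson score interval for a single binomial proportion."""
    p = x / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * (p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5 / denom
    return centre - half, centre + half

def newcombe_diff(x1, n1, x2, n2, alpha=0.05):
    """Newcombe hybrid score interval for p1 - p2 (no continuity correction)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson(x1, n1, z)
    l2, u2 = wilson(x2, n2, z)
    diff = p1 - p2
    lower = diff - ((p1 - l1) ** 2 + (u2 - p2) ** 2) ** 0.5
    upper = diff + ((u1 - p1) ** 2 + (p2 - l2) ** 2) ** 0.5
    return lower, upper

def should_stop(x1, n1, x2, n2, alpha=0.05):
    """Flag excess risk when the lower bound of the difference exceeds 0."""
    return newcombe_diff(x1, n1, x2, n2, alpha)[0] > 0

# Scenario #1: 4/10 vs 0/10; Scenario #2: 5/10 vs 0/10
for label, (x1, x2) in {"scenario 1": (4, 0), "scenario 2": (5, 0)}.items():
    lo, hi = newcombe_diff(x1, 10, x2, 10)
    print(f"{label}: lower = {lo:.4f}, upper = {hi:.4f}, stop = {should_stop(x1, 10, x2, 10)}")
```

In these small samples the score-based lower bound is above 0 in both scenarios, whereas the Santner-Snell bound for scenario #1 was below 0. The choice of interval method therefore changes when the rule fires and should be fixed in the protocol.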

Tuesday, August 01, 2017

Steroid Tapering Design Clinical Trials

In the most recent issue of New England Journal of Medicine, Stone et al published the results from "Trial of Tocilizumab in Giant-Cell Arteritis". The study used a steroid tapering design with the primary efficacy endpoint of "the rate of sustained glucocorticoid-free remission at week 52 in each tocilizumab group as compared with the rate in the placebo group that underwent the 26-week prednisone taper."

There are some chronic diseases where the effective treatment is a high-dose steroid (corticosteroid, prednisone,...). To control the symptoms, patients are usually put on long-term, high-dose steroid treatment. While the steroid treatment may be effective, it can cause serious, irreversible side effects.

The list of side effects of long-term steroid use includes, but is not limited to:
  • mood changes 
  • forgetfulness 
  • hair loss 
  • easy bruising 
  • a tendency toward high blood pressure and diabetes 
  • thinning of the bones (osteoporosis)
  • suppression of the adrenal glands
  • muscle weakness
  • weight gain
  • cataracts 
  • glaucoma

It would be useful to develop an alternative treatment that can replace long-term steroid use or at least minimize the steroid dose required. To investigate the effect of the alternative treatment, a clinical trial can be designed to demonstrate whether the alternative treatment allows the steroid dose to be tapered down to a very low or zero level while keeping the symptoms stable. We call this a steroid tapering or steroid sparing design.

In a steroid tapering design, the purpose of the study is not to pursue the further improvement in disease symptoms. The steroid tapering design will have a study endpoint based on the reduction in the steroid dose while maintaining the stabilized symptoms. The possible efficacy endpoints could be the following:
  • Steroid dose reduction at Week xx from baseline
  • Percent of subjects with zero steroid dose at Week xx
  • Percent of subjects with steroid dose less than xx mg at Week xx
  • Percent of subjects with steroid dose reduction greater than or equal to 50%
  • AUC for steroid dose from week x to week y
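The endpoints listed above are simple to derive from each subject's dose record. The Python sketch below uses made-up data for three hypothetical subjects (daily prednisone-equivalent dose in mg at weeks 0, 4, 8, and 12) and computes the percent reduction from baseline, the 50% responder rate, the proportion on zero dose, and a trapezoidal AUC of the dose curve. The visit schedule, the data, and the function names are assumptions for illustration.

```python
def percent_reduction(baseline_mg, final_mg):
    """Percent reduction in steroid dose from baseline."""
    return 100.0 * (baseline_mg - final_mg) / baseline_mg

def dose_auc(weeks, doses_mg):
    """Trapezoidal area under the daily-dose curve, in (mg/day) x weeks."""
    return sum((w1 - w0) * (d0 + d1) / 2
               for w0, w1, d0, d1 in zip(weeks, weeks[1:], doses_mg, doses_mg[1:]))

weeks = [0, 4, 8, 12]        # hypothetical visit schedule
subjects = {                 # hypothetical daily doses (mg) at each visit
    "S01": [40, 30, 15, 0],
    "S02": [20, 20, 15, 12.5],
    "S03": [30, 20, 10, 5],
}

reductions = {sid: percent_reduction(d[0], d[-1]) for sid, d in subjects.items()}
responders = [sid for sid, r in reductions.items() if r >= 50]   # >= 50% reduction
zero_dose = [sid for sid, d in subjects.items() if d[-1] == 0]   # steroid-free at week 12
aucs = {sid: dose_auc(weeks, d) for sid, d in subjects.items()}

print("percent reduction:", reductions)
print("responder rate:", len(responders) / len(subjects))
print("zero-dose rate:", len(zero_dose) / len(subjects))
print("dose AUC (week 0 to 12):", aucs)
```

A lower AUC means less cumulative steroid exposure over the window, so unlike the responder endpoints it also credits partial and early dose reductions.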

In one of the studies investigating the steroid tapering effect of IGIV in generalized myasthenia gravis, FDA confirmed during the pre-IND meeting that the treatment effect in reducing the steroid dose is meaningful. This study is sponsored by Grifols and is currently ongoing. The sponsor chose "the percent of subjects with steroid dose reduction greater than or equal to 50%" as the primary efficacy endpoint in the following study:
Efficacy and Safety of IGIV-C in Corticosteroid Dependent Patients With Generalized Myasthenia Gravis
When designing a steroid tapering trial, the following issues need to be addressed:
  • Steroid tapering design has a wash-in, wash-out feature. With the effect of new treatment kicking in (if the active treatment is effective), the dose of the steroids will be reduced.
  • The purpose of the study is not the improvement in disease symptoms. The purpose is to maintain the symptoms (no deterioration) while the steroid dose is reduced. 
  • Considering the withdrawal effect of the steroid, a steroid tapering design will include a run-in period: the early period when the new treatment has been added but steroid tapering has not yet started. To ensure patient safety, steroid dose tapering starts only at the end of the run-in period. During the run-in period,
  • Changes or reductions in steroid dose could influence outcomes. The treatment effect of steroid reduction must be established while the disease symptoms are maintained. There should be a rule defining the degree of clinical worsening at which the tapering must be slowed or stopped. There must be a standardized steroid tapering regimen and a standardized rescue measure for when disease symptoms are exacerbated by the steroid tapering.
  • Subjects should be on a stable steroid dose when they enter the study and before randomization. If patients are not on a stable steroid dose when entering the study, it is not possible at the end of the study to tease out whether the steroid dose reduction is due to fluctuation of the steroid dose itself or to the effect of the new treatment.
  • Stratified randomization can be used, with the baseline steroid dose category as a stratification factor, to ensure that within each steroid dose category equal numbers of subjects are randomized to active treatment and placebo control. Patients on a higher steroid dose at baseline are more likely to have a steroid dose reduction, and stratified randomization can minimize the resulting bias.
  • If the endpoint is “the mean change from baseline in steroid dose”, the magnitude of the difference in steroid reduction between the two treatment groups needs to be clinically meaningful.
  • In steroid tapering design, there must be a rescue plan in the case of symptom worsening / deterioration (or exacerbation) due to the decrease in steroid dose.  
  • At the end of the study, there should be a safety follow-up period. 
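The stratified randomization mentioned in the list above can be sketched as permuted blocks within each baseline dose stratum. This is a minimal Python illustration, not a production randomization system: the 20 mg/day cut-off, the block size of 4, the subject data, and the function names are all assumptions.

```python
import random
from collections import Counter

def dose_stratum(baseline_mg, cutoff_mg=20):
    """Classify a subject by baseline steroid dose (hypothetical cut-off)."""
    return "high" if baseline_mg >= cutoff_mg else "low"

def stratified_assignments(subjects, block_size=4, seed=0):
    """Assign 1:1 active/placebo using permuted blocks within each stratum."""
    rng = random.Random(seed)
    open_blocks = {}   # stratum -> unused assignments left in the current block
    arms = {}
    for subject_id, baseline_mg in subjects:
        stratum = dose_stratum(baseline_mg)
        if not open_blocks.get(stratum):
            # Start a new balanced block for this stratum and shuffle it.
            block = ["active", "placebo"] * (block_size // 2)
            rng.shuffle(block)
            open_blocks[stratum] = block
        arms[subject_id] = (stratum, open_blocks[stratum].pop())
    return arms

# Sixteen hypothetical subjects in enrollment order, alternating 10 mg and 40 mg at baseline.
subjects = [(f"S{i:02d}", 10 if i % 2 else 40) for i in range(1, 17)]
arms = stratified_assignments(subjects)
balance = Counter(arms.values())
for key in sorted(balance):
    print(key, balance[key])
```

Because every completed block is balanced, each stratum ends up with equal numbers on active treatment and placebo, so the higher chance of dose reduction among high-baseline patients is spread evenly across arms.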

There is an FDA Guidance for Industry, Systemic Lupus Erythematosus — Developing Medical Products for Treatment, where the steroid tapering design is proposed.
d. Reduction in concomitant steroids
Reducing corticosteroid use is an important goal in treatment of patients with SLE if it occurs in the context of a treatment that effectively controls disease activity. Therefore, for a medical product to be labeled as reducing corticosteroid usage, it should also demonstrate another clinical benefit, such as reduction in disease activity as the primary endpoint. In an add-on trial to test the steroid-sparing potential of a new medical product, patients should be enrolled during a flare and randomized to the addition of the new medical product or placebo to induction doses of corticosteroids. In both study arms, when patients achieve quiescent disease, the corticosteroid dose should be tapered to a maintenance dose that is not usually associated with major toxicities while still maintaining quiescence. The induction steroid dosage and duration of induction therapy and taper schedule should be based on the severity of disease activity in the dominant organ system involved. The evaluation of efficacy should be based on the proportion of patients in treatment and control groups that achieve a reduction in steroid dose to less than or equal to 10 mg per day of prednisone or equivalent, with quiescent disease and no flares (see definition above) for at least 3 consecutive months during a 1-year clinical trial. For a result to be clinically meaningful, the patient population should be on moderate to high doses of steroids at baseline. Trials should also assess the occurrence of clinically significant steroid toxicities.

The steroid tapering design can be used in various disease areas. The following examples show its application in severe refractory asthma, myasthenia gravis, systemic lupus erythematosus, and giant cell arteritis (GCA).

The primary measure of efficacy in our study will be the nine-month prednisone AUC (months 3–12), which measures the total prednisone doses of each patient in nine months. A reduction of prednisone AUC demonstrates that patients improved on clinical grounds so that the prednisone dose could be decreased. If the patients receiving MTX have a smaller prednisone AUC compared to the placebo patients, this will have demonstrated the efficacy of MTX. 
Based on pre-IND discussions with FDA and consultants, it was decided that the primary efficacy variable for the corticosteroid reduction study should be, for patients who were corticosteroid dependent, a reduction of the patients’ current prednisone dose to 7.5 mg/day (upper limit of physiologic levels) or less, without worsening of SLE.
The design of the steroid sparing study was a forced titration; i.e., the patient’s steroid dose at each monthly visit was to be reduced, by algorithm, if her disease activity was stable or improved. However, when a patient worsened or flared, the associated increase in corticosteroid dose, if any, required to treat the patient’s exacerbation was at the physician’s discretion and not by algorithm. The steroid reduction algorithm was based on the patient’s disease activity improving or being stable, which was defined as no change in or a decrease in SLEDAI score in comparison to her previous visit. As such, one of the issues discussed at the pre-study investigator meeting was whether patients with low SLEDAI scores, and especially those with SLEDAI scores of 0, should be enrolled into the study. There was concern that those patients with low SLEDAI scores had inactive disease, and therefore would not be affected by steroid reduction, i.e., might not be steroid dependent. However, some investigators and consultants felt that if patients were truly dependent on steroids, their low SLEDAI scores represented active disease suppressed by corticosteroids, which would worsen or flare as soon as their corticosteroids were reduced. Therefore, because there was no experience with such trials, it was decided not to exclude patients with low SLEDAI scores. The concern regarding enrollment of potentially inactive SLE patients was revisited prior to unblinding of the study. In addition it was recognized that because of the forced downward titration of steroid dose as the patients’ disease improved or remained stable, other evaluations of disease activity such as SLEDAI, etc., would not be expected to improve.
The pivotal study was designed as a double-blind, randomized, placebo-controlled, parallel group trial to evaluate GL701 100 and 200 mg/day versus placebo in female patients with mild to moderate prednisone-dependent systemic lupus erythematosus (SLE).
The study included two primary efficacy variables. The first one was responder rate. A responder was defined as a patient with the achievement of a decrease in prednisone dose to 7.5 mg/day or less sustained for no less than three consecutive scheduled visits, including the termination visit (i.e., two consecutive months), on or after Visit 7. The second primary efficacy variable was percent decrease in prednisone dose determined by comparing the prescribed prednisone (or steroid equivalent) dose at Baseline (Qualifying Visit) and the last visit prednisone dose using the physician prescribed prednisone dose recorded on the Medication Record Form.
  • Belimumab May Have Potential As A Corticosteroid-Sparing Drug When Added To Standard-Of-Care Treatment For SLE, Research Suggests.
Research suggests “the monoclonal antibody belimumab (Benlysta) may have potential as a corticosteroid-sparing drug when added to standard-of-care treatment for systemic lupus erythematosus (SLE).” Investigators found, “in pooled data from two large randomized controlled trials,” that “this blocker of B-lymphocyte stimulator was moderately associated with a higher probability of corticosteroid dose reduction and a greater average dose reduction over” one year. The findings were published in Arthritis & Rheumatology.
This paper described the design and operationalization of a blinded corticosteroid-tapering regimen for a randomized trial of tocilizumab in giant cell arteritis (GCA). The study design is sketched in the diagram below. The primary efficacy endpoint is “Proportion of patients in sustained remission at week 52 following induction and adherence to the protocol-defined prednisone taper regimen”

Monday, July 24, 2017

Excel spreadsheet to calculate the p-value for Fisher Exact test

A colleague of mine wanted a small program to calculate p-values during the course of a randomized, open-label study. For those who don't want to use statistical software such as SAS, an Excel spreadsheet can serve the purpose.

                 Calculating p-value for Fisher exact test
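For readers who prefer a script to a spreadsheet, the same calculation can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet's formula: the function name and the example counts are made up, and the two-sided p-value is taken here as the sum of all hypergeometric table probabilities that are no larger than the probability of the observed table (a common convention, assumed rather than taken from the spreadsheet).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are treatment groups, columns are outcome (e.g. responder / non-responder).
    Hypothetical helper for illustration only.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all margins are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = table_prob(a)
    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    # Sum the probabilities of all tables as likely or less likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        table_prob(x)
        for x in range(x_min, x_max + 1)
        if table_prob(x) <= p_observed * (1 + 1e-7)
    )


if __name__ == "__main__":
    # Made-up counts: 8/10 responders in one group versus 1/6 in the other.
    print(round(fisher_exact_two_sided(8, 2, 1, 5), 4))
```

The exact arithmetic with `math.comb` avoids factorial overflow for the table sizes typically seen in small studies.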

However, this is not a good practice and should be discouraged or prohibited. Even though this is a randomized, open-label study, the study team should refrain from repeatedly testing the accumulating data.

Monday, July 10, 2017

Age Group in Pediatric, Perinatal or Preterm, and Geriatric Subjects

In a previous blog article, I discussed the “Pediatric use and geriatric use of drug and biological products”:
For Pediatric population: according to ICH guidance E11 "Clinical Investigation of Medicinal Products in the Pediatric Population", the pediatric population contains several sub-categories:
  • preterm newborn infants
  • term newborn infants (0 to 27 days)
  • infants and toddlers (28 days to 23 months)
  • children (2 to 11 years)
  • adolescents (12 to 16-18 years (dependent on region))
Notice that in FDA's guidance "General Considerations for Pediatric Pharmacokinetic Studies for Drugs and Biological Products", the age classification is a little bit different. I am assuming that the ICH guidance E11 should be the correct reference.
Geriatric population:
Geriatric population is defined as persons 65 years of age and older. There is no upper limit of age defined.
Recently, I ran into several studies where the age groups needed to be split further.

Pediatric population can be further divided into 2-5 years old, 6-11 years old, and 12-16 years old.
This grouping of the pediatric population was used in a pharmacokinetic study in pediatric patients with primary immunodeficiency. Per FDA’s request, the pediatric patients were further divided into groups of 2-5, 6-11, and 12-16 years old. FDA asked that the study contain subjects in each of the sub-groups so that an assessment could be made of whether the pharmacokinetics are consistent across the pediatric sub-groups (or at least that there is no obvious difference among them).
An Open-label, Single-sequence, Crossover Study to Evaluate the Pharmacokinetics, Safety and Tolerability of Subcutaneous GAMUNEX®-C in Pediatric Subjects With Primary Immunodeficiency
The age definition in the perinatal period is much trickier. In a policy statement by the Committee on Fetus and Newborn, “Age Terminology During the Perinatal Period”, various definitions of age are described:
  • Gestational age (or “menstrual age”) is the time elapsed between the first day of the last normal menstrual period and the day of delivery
  • “Chronological age” (or “postnatal” age) is the time elapsed after birth
  • Postmenstrual age (PMA) is the time elapsed between the first day of the last menstrual period and birth (gestational age) plus the time elapsed after birth (chronological age). Postmenstrual age is usually described in number of weeks and is most frequently applied during the perinatal period beginning after the day of birth.
In clinical trials with pre-term babies, the study endpoints are usually defined based on the post-menstrual age (PMA). For example, in an article by Bassler et al “Early inhaled budesonide for the prevention of bronchopulmonary dysplasia”, the primary outcome was a composite of death or bronchopulmonary dysplasia at 36 weeks of postmenstrual age. While the study treatment will not be given until after the birth, 36 weeks of postmenstrual age is counted from the first day of the last menstrual period. For different babies, the chronological age or observation period (when the death or BPD event is observed) will be different depending on the actual gestational age.

It can be illustrated in the following diagram.

  • Gestational age = birth date – the first day of the last normal menstrual period
  • Chronological age = assessment date or event date – birth date
  • Postmenstrual age = assessment date or event date – the first day of the last menstrual period
  • Postmenstrual age = gestational age + chronological age
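The relationships above are simple date arithmetic, and a short Python sketch can make them concrete. The function name and all dates below are hypothetical and chosen only for illustration: a baby born at 26 weeks of gestation reaches 36 weeks of postmenstrual age at a chronological age of 10 weeks.

```python
from datetime import date, timedelta


def perinatal_ages(last_menstrual_period, birth_date, assessment_date):
    """Return (gestational, chronological, postmenstrual) age in days.

    Hypothetical helper; the three inputs are datetime.date objects.
    """
    gestational = (birth_date - last_menstrual_period).days
    chronological = (assessment_date - birth_date).days
    postmenstrual = (assessment_date - last_menstrual_period).days
    return gestational, chronological, postmenstrual


if __name__ == "__main__":
    lmp = date(2017, 1, 1)                  # made-up first day of last menstrual period
    birth = lmp + timedelta(weeks=26)       # preterm birth at 26 weeks of gestation
    assessment = lmp + timedelta(weeks=36)  # endpoint assessed at 36 weeks PMA

    ga, ca, pma = perinatal_ages(lmp, birth, assessment)
    # Postmenstrual age is always gestational age plus chronological age.
    print(ga // 7, ca // 7, pma // 7)
```

A baby born at 30 weeks in the same trial would be assessed at a chronological age of only 6 weeks, which is why the observation period differs from baby to baby.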

Geriatric population can be further divided: 
In a study of Interstitial Lung Diseases (ILDs) in elderly patients, there was a substantial number of patients aged >= 80 years. The traditional definition of the geriatric population, using 65 years old as the cut point, was not sufficient. We ended up further dividing the geriatric population into 65 - < 80 years old and >= 80 years old. We think this provides a more meaningful sub-grouping to assess the impact of age in the ILD indication.

Monday, July 03, 2017

(Bayes) Success Run Theorem for Sample Size Estimation in Medical Device Trial

In a recent discussion about the sample size requirement for a clinical trial in the medical device field, one of my colleagues recommended using the “success run theorem” to estimate the sample size. The ‘success run theorem’ may also be called the ‘Bayes success run theorem’. In the process validation field, it is a typical method based on a binomial distribution that leads to a defined sample size.

Application of the success run theorem depends on the reliability of the new process (or new device). In medical device trials, the reliability is the probability that an item (i.e. the device) will carry out its function satisfactorily for the stated period when used according to the specified conditions. A reliability of 95% means that a medical device will function without problems 95% of the time.

With the success run theorem, we calculate the number of units that must be tested without any failure in order to claim a given reliability with a given confidence level. Usually, people use a 95% confidence level to demonstrate 95% reliability. With the ‘success run theorem’, the sample size can be calculated as:

                                 N = ln(1-C)/ln( R)

Where N is the sample size needed, C is the confidence level, and R is the reliability.

With the typical 95% confidence level and 95% reliability, the formula gives ln(0.05)/ln(0.95) ≈ 58.4, so a sample size of 59 is needed after rounding up. An Excel spreadsheet was built for calculating the sample size using the success run theorem.

The website below contains an explanation of how the success run theorem formula is derived. With C = 1 – R^(n+1), we would have N = [ln(1-C)/ln(R)] – 1, slightly different from the formula above.
 How do you derive the Success-Run Theorem from the traditional form of Bayes Theorem?
The derivation above is based on a uniform prior for reliability (a conservative assumption), which assumes no information from predicate devices and gives equal weight to every reliability value between 0 and 1.
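Both versions of the formula are easy to evaluate directly. The Python sketch below is a minimal illustration, not the spreadsheet itself: the function name is made up, and rounding up to the next whole unit is an assumption (a fractional device cannot be tested, and rounding down would fall short of the stated confidence).

```python
import math


def success_run_sample_size(confidence, reliability, uniform_prior=False):
    """Number of consecutive successes needed to claim `reliability`
    with `confidence`, per the success run theorem.

    uniform_prior=False uses N = ln(1 - C) / ln(R).
    uniform_prior=True uses the Bayesian variant N = ln(1 - C) / ln(R) - 1,
    which follows from C = 1 - R^(n + 1).
    Hypothetical helper for illustration only.
    """
    if not (0 < confidence < 1 and 0 < reliability < 1):
        raise ValueError("confidence and reliability must be strictly between 0 and 1")
    n = math.log(1 - confidence) / math.log(reliability)
    if uniform_prior:
        n -= 1
    return math.ceil(n)  # assumption: always round up to a whole unit


if __name__ == "__main__":
    print(success_run_sample_size(0.95, 0.95))                      # 59
    print(success_run_sample_size(0.95, 0.95, uniform_prior=True))  # 58
    print(success_run_sample_size(0.90, 0.90))                      # 22
```

The one-unit difference between the two variants shows how little the uniform prior contributes; an informative prior from a predicate device is what would reduce the sample size noticeably.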

In the medical device field, devices evolve and are constantly being improved. When we evaluate a new or next-generation device, there is usually some prior information that can be used. Therefore, instead of a uniform prior for reliability, a Bayesian technique with a mixture of beta priors for reliability can be applied. Using mixtures of beta priors for reliability, we can incorporate historical information from a predicate device to decrease the sample size requirement.

We have seen this application in the field of automotive electronics attribute testing, but have not seen any application in FDA regulatory medical device testing.


Saturday, June 17, 2017

Statistical Analysis Plan in Clinical Trial Registries

Recently, a question came up when I searched the EU Clinical Trials Register, the EU clinical trial registry website and the counterpart of ClinicalTrials.gov in the US. Should clinical trial registries include the statistical analysis plan (for primary and secondary efficacy endpoints)? The statistical analysis plan could include the statistical methods for primary and secondary endpoints, missing data handling, stopping rules for early termination of the study, justification for the sample size estimation, and so on.
For ClinicalTrials.gov in the US, the Protocol Registration Data Element Definitions for Interventional and Observational Studies requires the inclusion of some details about statistical analyses:
Detailed Description Definition:
Extended description of the protocol, including more technical information (as compared to the Brief Summary), if desired. Do not include the entire protocol; do not duplicate information recorded in other data elements, such as Eligibility Criteria or outcome measures. Limit: 32,000 characters. 
For Patient Registries: Also describe the applicable registry procedures and other quality factors (for example, third party certification, on-site audit). In particular, summarize any procedures implemented as part of the patient registry, including, but not limited to the following: 
  • Quality assurance plan that addresses data validation and registry procedures, including any plans for site monitoring and auditing.
  • Data checks to compare data entered into the registry against predefined rules for range or consistency with other data fields in the registry.
  • Source data verification to assess the accuracy, completeness, or representativeness of registry data by comparing the data to external data sources (for example, medical records, paper or electronic case report forms, or interactive voice response systems).
  • Data dictionary that contains detailed descriptions of each variable used by the registry, including the source of the variable, coding information if used (for example, World Health Organization Drug Dictionary, MedDRA), and normal ranges if relevant.
  • Standard Operating Procedures to address registry operations and analysis activities, such as patient recruitment, data collection, data management, data analysis, reporting for adverse events, and change management.
  • Sample size assessment to specify the number of participants or participant years necessary to demonstrate an effect.
  • Plan for missing data to address situations where variables are reported as missing, unavailable, non-reported, uninterpretable, or considered missing because of data inconsistency or out-of-range results.
  • Statistical analysis plan describing the analytical principles and statistical techniques to be employed in order to address the primary and secondary objectives, as specified in the study protocol or plan. 
In the EU, the registry is mainly based on the EudraCT database. As part of the clinical trial application (similar to an IND in the US), the sponsor needs to provide the clinical trial protocol information to be entered into the EudraCT database.

In the guidance “Detailed guidance on the European clinical trials database (EUDRACT Database)”, it asks for the information regarding the clinical trial design, but there is no mention of the statistical analysis plan.

As a matter of fact, all clinical trial registries across different countries are supposed to meet the requirements by International Clinical Trials Registry Platform (ICTRP) from World Health Organization. In the list of elements for WHO Trial Registration Data Set , there is no mention of statistical analysis plan as part of the registration elements.

In any case, there seem to be different understandings about how much detail of a clinical trial should be posted in clinical trial registries. Some companies post very detailed information, including how the clinical trial data will be analyzed. Other companies are very restrained and post as little information as possible.

In terms of the elements regarding the statistical analyses, there are actually more studies with some details in the EU Clinical Trials Register than in ClinicalTrials.gov, even though the requirement regarding the inclusion of the statistical analysis plan is mentioned for ClinicalTrials.gov, not for the EU register. For example, in a study “A Multicenter, Randomized, Double-Blind, Phase 3 Study of Ramucirumab (IMC-1121B) Drug Product and Best Supportive Care (BSC) Versus Placebo and BSC as Second-Line Treatment in Patients With Hepatocellular Carcinoma Following First-Line Therapy With Sorafenib”, a lot of details about the statistical analyses are provided in the registry record.

When I tried to see whether the interim analysis and its corresponding boundary method are mentioned in the registry records, I could clearly see inconsistencies across different trial sponsors.

Here are some studies that the interim analysis and boundary method are mentioned.
Here are some studies that the interim analysis is mentioned, but the boundary method is not.

Sunday, June 04, 2017

Calculating exact confidence interval for binomial proportion within each group using the Clopper-Pearson method

The Clopper-Pearson confidence interval is commonly used as the exact confidence interval for a binomial proportion, an incidence rate, and so on. The confidence interval is calculated for a single group; the Clopper-Pearson method is therefore not for calculating the confidence interval for the difference between two groups.

Many oncology studies have no concurrent control group. For the response rate, the exact confidence interval is constructed (usually through the Clopper-Pearson method) and the lower limit of the 95% confidence interval is then compared with the historical rate to determine whether there is a treatment effect.

Here are some examples in which the Clopper-Pearson method was used to calculate the exact confidence interval:

Medical and statistical review for Venetoclax NDA:
"For the primary efficacy analyses, statistical significance was determined by a two-sided p value less than 0.05 (one-sided less than 0.025). The assessment of ORR was performed once 70 subjects in the main cohort completed the scheduled 36-week disease assessment, progressed prior to the 36-week disease assessment, discontinued study drug for any reason, or after all treated subjects discontinued venetoclax, whichever was earlier. The ORR for venetoclax was tested to reject the null hypothesis of 40%. If the null hypothesis is rejected and the ORR is higher than 40%, then venetoclax has been shown to have an ORR significantly higher than 40%. The ninety-five percent (95%) confidence interval for ORR was based on binomial distribution (Clopper-Pearson exact method). "
Motzer et al (2015) Nivolumab versus Everolimus in Advanced Renal-Cell Carcinoma
"If superiority with regard to the primary end point was demonstrated, a hierarchical statistical testing procedure was followed for the objective response rate (estimated along with the exact 95% confidence interval with the use of the Clopper–Pearson method)"
Foster et al (2015) Sofosbuvir and Velpatasvir for HCV Genotype 2 and 3 Infection
"Point estimates and two-sided 95% exact confidence intervals that are based on the Clopper–Pearson method are provided for rates of sustained virologic response for all treatment groups, as well as selected sub-groups."
Cicardi et al (2010) Icatibant, a New Bradykinin-Receptor Antagonist, in Hereditary Angioedema
"Fisher’s exact test, with 95% confidence intervals calculated for each group by means of the Clopper–Pearson method, was used to compare the percentage of patients with clinically significant relief of the index symptom at 4 hours after the start of the study drug. Two-sided 95% confidence intervals for the difference in proportions were calculated with the use of the Anderson–Hauck correction."
According to SAS manual, the Clopper-Pearson confidence interval is described as below:
The confidence interval using Clopper-Pearson method can be easily calculated with SAS Proc Freq procedure. Alternatively, it can also be calculated directly using the formula or using R function. 
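As a cross-check that needs neither SAS nor R, the interval can also be obtained from its definition: the lower limit is the proportion at which observing at least x responders has probability alpha/2, and the upper limit is the proportion at which observing at most x responders has probability alpha/2. The Python sketch below solves these two equations by bisection on the binomial distribution instead of using F quantiles. The function names are made up for illustration, and the counts (85 responders out of 107) are those of the Venetoclax example discussed in this post.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided confidence interval for x successes in n trials.

    Hypothetical helper for illustration; solved numerically by bisection.
    """

    def solve(k, target):
        # binomial_cdf(k, n, p) decreases as p increases, so bisect on p.
        low, high = 0.0, 1.0
        for _ in range(100):
            mid = (low + high) / 2
            if binomial_cdf(k, n, mid) > target:
                low = mid
            else:
                high = mid
        return (low + high) / 2

    # Lower limit: P(X >= x) = alpha/2, i.e. P(X <= x - 1) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve(x - 1, 1 - alpha / 2)
    # Upper limit: P(X <= x) = alpha/2.
    upper = 1.0 if x == n else solve(x, alpha / 2)
    return lower, upper


if __name__ == "__main__":
    lower, upper = clopper_pearson(85, 107)
    print(f"ORR = {85 / 107:.1%}, 95% exact CI = {lower:.1%} to {upper:.1%}")
```

Because the lower limit stays well above 40%, the sketch agrees with the conclusion quoted above that the null hypothesis of a 40% response rate is rejected.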

Using the Venetoclax NDA as an example, the primary efficacy endpoint ORR (overall response rate) is calculated as 85 / 107 = 79.4%. The 95% confidence interval can be calculated using the Clopper-Pearson method as follows:

Using SAS Proc Freq:  
With proc freq, we should get a 95% confidence interval of about 70.5% to 86.6%.

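/* 85 responders and 22 non-responders out of 107 subjects */
data test2;
  input orr $ count @@;
datalines;
have 85
no 22
;
run;

/* order=data makes "have" the first level, so the exact CI is for the response rate */
proc freq data=test2 order=data;
  weight count;
  tables orr/binomial(exact) alpha=0.05;
run;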

Using formula:

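/* n = number of subjects, n1 = number of responders */
data test;
  input n n1 alpha;
  phat = n1/n;
  /* finv returns F quantiles; the lower limit uses 2*(n-n1+1) denominator df */
  fvalue1 = finv( (alpha/2), 2*n1, 2*(n-n1+1));
  fvalue2 = finv( (1-alpha/2), 2*(n1+1), 2*(n-n1));
  pL =  (1+   ((n-n1+1)/(n1*fvalue1) ))**(-1);
  pU =  (1+   ((n-n1)/((n1+1)*fvalue2) ))**(-1);
datalines;
107 85 0.05
;
run;

proc print;
run;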


Using R: 
f1=qf(1-alpha/2, 2*n1, 2*(n-n1+1), lower.tail=FALSE)
f2=qf(alpha/2, 2*(n1+1), 2*(n-n1), lower.tail=FALSE)