Potencyassay.com | A blog about bioassays, immunoassays, and other potency assays

June 30, 2010

DOE for bioassays

A little blurb about using Design of Experiments (DOE) for bioassay development:

http://www.thefreelibrary.com/DOE+factors+into+bioassay+development,+says+USP+workshop-a0202645042


Hello everyone,

I haven’t been posting much on the blog lately.  That’s because I’ve been busy writing the first chapter in a book on using Design of Experiments (DOE) in assay development.  I have published the first chapter for free here on this site.

Please leave me a comment if you like it, hate it, or have any ideas for topics you’d like to see in future chapters.

Go to the DOE book page to download the first chapter!

Dan


In a previous blog post I stated that a potency value doesn’t mean anything unless the shape of the curves of the reference standard and the unknown are exactly the same: 

[Image]

This condition is known as parallelism (or more correctly mathematical similarity). While that may sound completely logical and simple, the real world (as always) is more complicated.  When we estimate potency we have to rely on a statistical model of the underlying data.   The four parameter logistic model is a common choice in potency assays.  This model is fitted to our data using a regression algorithm of some sort. 
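To make the model concrete, here is a minimal sketch of a four parameter logistic curve in Python. The parameterization (lower asymptote, upper asymptote, slope, EC50) is one common convention and the parameter values are made up for illustration; your software may write the same curve slightly differently.

```python
def four_pl(dose, lower, upper, slope, ec50):
    """Four parameter logistic response at a given dose (dose must be > 0)."""
    return lower + (upper - lower) / (1.0 + (ec50 / dose) ** slope)


# Hypothetical parameter values, chosen only for illustration.
params = {"lower": 0.1, "upper": 2.0, "slope": 1.2, "ec50": 10.0}

# At dose == ec50 the response sits halfway between the two asymptotes.
for dose in (1.0, 10.0, 100.0):
    print(dose, round(four_pl(dose, **params), 3))
```

In a real assay these four numbers are not known in advance. They are estimated from the data by the regression.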

The raw data gathered in an assay and then used in the regression are only a sample of all of the possible data points at each dilution.  Because of this limitation, the fitted model is only an estimate of the "true" underlying curves of our experimental system or assay. 

The consequence of this reality is that we have two estimated curves, and we are trying to use them to tell whether the underlying "true" curves (which we don’t know) are really parallel.  That’s not an easy thing to do with certainty.  For example, is either of these pairs of curves parallel?  How do we know for sure?

[Figure: two pairs of fitted curves]

There are many different approaches available to answer this question.   The simplest method is just to look at the curves.  If you are doing investigational work and you’re fairly familiar with your assay, this may be all you need.  However, in a more regulated environment you will probably need something a little less subjective.  In general, there are two philosophies on how to measure parallelism using statistical methods:  difference testing and equivalence testing.  Let’s discuss each of these in more detail.

Difference testing

Difference testing relies on the creation of a metric for the measure of parallelism.  In theory, such a metric should scale with the degree of non-parallelism.  In other words, the less parallel the curves are, the larger the metric.

Let’s walk through how these metrics are derived.  In potency testing we fit two models to derive parallelism data, the full model and the reduced model.

In the full model, we fit independent parameters for our reference and sample curves.  If we use a four parameter logistic model to estimate the best fit for the upper and lower asymptotes, the slope, and the EC50 parameters for each curve independently, we have a total of eight different parameters.  This model is illustrated in the first graph below.  Notice how the two curves have different shapes.  In this graph the two curves clearly have different upper asymptotes and are therefore not parallel.  Also notice how the two curves fit their own underlying data fairly well.  This is the "full" model:

[Figure: the full model, with the reference and sample curves fitted independently]

In the reduced model, we only allow there to be one common set of upper and lower asymptotes and slope.  Only the EC50 parameter is allowed to be unique to each curve.  This situation is illustrated in the graph below.  It is this model we use to estimate potency.  Notice how the two curves have the same shape, but they don’t fit the data as well:

[Figure: the reduced model, with shared asymptotes and slope]
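Here is a small sketch of what the reduced model buys us. With a shared shape, the only difference between the curves is a horizontal shift, so potency comes down to a ratio of EC50 values. The numbers are hypothetical, and I am assuming the common convention that relative potency is the reference EC50 divided by the sample EC50 (a lower EC50 means a more potent sample).

```python
# Hypothetical fitted values for a reduced (parallel) four parameter logistic model.
reduced_model = {
    "lower": 0.1,            # shared lower asymptote
    "upper": 2.0,            # shared upper asymptote
    "slope": 1.2,            # shared slope
    "ec50_reference": 10.0,  # unique to the reference curve
    "ec50_sample": 8.0,      # unique to the sample curve
}


def relative_potency(model):
    """Potency of the sample relative to the reference (assumed convention)."""
    return model["ec50_reference"] / model["ec50_sample"]


print(len(reduced_model))               # five parameters instead of eight
print(relative_potency(reduced_model))  # 1.25
```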

We can use these two graphs to generate a metric for parallelism.  The first thing we can do is to calculate what’s called the residual or error for each data-point.  The residual is simply the distance from the data point to the curve:

[Figure: residuals, shown as the distance from each data point to the fitted curve]

If we compare the two graphs of the full and reduced model above, it becomes obvious that the full model fits the data at least as well as the reduced model.  The reduced model is just the full model with extra constraints, so taken together its residuals can never be smaller than those of the full model.  We can use this information to generate a metric. 

First, let’s square all of the residuals (so that positive and negative residuals count equally) and sum those squares.  This number is known as the sum of squared errors (SSE).   If we do this for both models, we have two SSE values, one for the full model and one for the reduced model.  As I mentioned above, the SSE for the full model is always smaller than or equal to the SSE for the reduced model.
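The SSE calculation itself is only a few lines. This is a sketch with a made-up curve and made-up responses (built by nudging points off the curve by known amounts), not data from a real assay. The `four_pl` helper and its parameter names are my own choices for illustration.

```python
def four_pl(dose, lower, upper, slope, ec50):
    """Four parameter logistic response at a given dose (dose must be > 0)."""
    return lower + (upper - lower) / (1.0 + (ec50 / dose) ** slope)


def sum_squared_errors(doses, responses, **curve):
    """Square the residual of every data point and add the squares up."""
    residuals = [y - four_pl(x, **curve) for x, y in zip(doses, responses)]
    return sum(r * r for r in residuals)


# Hypothetical curve and doses.
curve = dict(lower=0.0, upper=2.0, slope=1.0, ec50=10.0)
doses = [1.0, 10.0, 100.0]

# Made-up responses: the curve value plus a known offset at each dose.
offsets = [0.1, -0.2, 0.1]
responses = [four_pl(x, **curve) + d for x, d in zip(doses, offsets)]

sse = sum_squared_errors(doses, responses, **curve)
print(round(sse, 6))  # 0.1**2 + 0.2**2 + 0.1**2 = 0.06
```

You would run the same function twice, once with the full model parameters for each curve and once with the reduced model parameters, to get the two SSE values.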

One common use of these metrics is to use an F-test for parallelism.  The following formula is used to calculate an F-statistic:

[Equation: F-statistic formula]
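In symbols, and assuming the standard extra sum-of-squares form of the test, the F-statistic is usually written as follows. Here N is the total number of data points, and the parameter counts of 8 (full) and 5 (reduced) assume four parameter logistic curves as described above.

```latex
\[
F = \frac{\left(SSE_{\text{reduced}} - SSE_{\text{full}}\right) / \left(df_{\text{reduced}} - df_{\text{full}}\right)}
         {SSE_{\text{full}} / df_{\text{full}}},
\qquad
df_{\text{full}} = N - 8, \quad df_{\text{reduced}} = N - 5 .
\]
```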

As its name indicates, this statistic is distributed according to the F-distribution.  We can therefore use this statistic to set up a hypothesis test of parallelism.  The null hypothesis is that the curves are not different (notice that I didn’t say that they are the same), and the alternate hypothesis is that they are different.  We then generate a p-value and compare it to a cutoff (usually 0.05) to decide whether we should reject the null hypothesis.  If the p-value is below the cutoff, we reject the null hypothesis and say that the curves are different and therefore not parallel.

However, as many different authors before me have noted, there’s a weakness to this approach that’s hard to overcome.  In the equation above, the SSE for the full model appears in the denominator.  So what happens if you have a very precise assay and the full model has a very low SSE?  You are then dividing by a very small number and the F statistic gets very large.  This situation can lead to false positives for lack of parallelism.  In effect you are punished for having a very precise assay that follows the model very closely.  The differences between the curves may be small, but your good assay was able to detect them.  In a highly variable assay the opposite occurs: you will accept many more assays just because you don’t have the precision to tell whether they are parallel or not.
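A quick numerical sketch shows the effect. I am assuming the standard extra sum-of-squares form of the F-statistic with eight parameters in the full model and five in the reduced model. The SSE values and the number of data points are invented for illustration.

```python
def f_statistic(sse_full, sse_reduced, n_points, p_full=8, p_reduced=5):
    """Extra sum-of-squares F-statistic for comparing the reduced and full models."""
    df_full = n_points - p_full      # residual degrees of freedom, full model
    df_extra = p_full - p_reduced    # parameters removed by the reduced model
    return ((sse_reduced - sse_full) / df_extra) / (sse_full / df_full)


# Both assays lose exactly the same amount of fit (0.03) when forced to be
# parallel. Only the underlying noise level (SSE of the full model) differs.
noisy = f_statistic(sse_full=0.60, sse_reduced=0.63, n_points=32)
precise = f_statistic(sse_full=0.06, sse_reduced=0.09, n_points=32)

print(round(noisy, 3))    # 0.4
print(round(precise, 3))  # 4.0
```

With 3 and 24 degrees of freedom, the 0.05 critical value from standard F tables is roughly 3.0, so in this made-up example the precise assay fails the test and the noisy assay passes it, even though the departure from parallelism is identical.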

This situation has been remedied by the use of a chi-square statistic.  The formula for calculating it is as follows:

[Equation: chi-square statistic formula]
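One common way to write this statistic, and my assumption about the form intended here, is as the difference between the weighted SSE of the reduced model and the weighted SSE of the full model. Each squared residual is weighted by the inverse of the variance at that data point, and the degrees of freedom equal the number of parameters removed by the reduced model (three for four parameter logistic curves).

```latex
\[
\chi^2 = SSE^{w}_{\text{reduced}} - SSE^{w}_{\text{full}},
\qquad
SSE^{w} = \sum_{i} \frac{\left(y_i - \hat{y}_i\right)^2}{\sigma_i^2},
\qquad
df = 8 - 5 = 3 .
\]
```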

Again, this statistic follows the distribution it’s named after.  The same strategy we employed above can be used to set up a hypothesis test for parallelism using the chi-square metric.  Since this metric doesn’t rely on dividing by the SSE of the full model, it doesn’t suffer from the same issues with assay precision that the F stat does.
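Here is a rough sketch of such a test in Python. I am assuming the statistic is the difference in weighted SSE between the reduced and full models, with three degrees of freedom (the three parameters that the reduced model shares). The p-value uses the closed-form upper tail of a chi-square distribution with exactly three degrees of freedom, so this helper does not generalize to other models. All numbers are invented.

```python
import math


def weighted_sse(observed, fitted, variances):
    """Sum of squared residuals, each divided by the variance at that point."""
    return sum((y - f) ** 2 / v for y, f, v in zip(observed, fitted, variances))


def chi_square_sf_3df(x):
    """Upper tail probability of a chi-square distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


def chi_square_parallelism(wsse_full, wsse_reduced):
    """Return the chi-square statistic and its p-value (3 degrees of freedom)."""
    statistic = wsse_reduced - wsse_full
    return statistic, chi_square_sf_3df(statistic)


# Hypothetical weighted SSE values from the full and reduced fits.
statistic, p_value = chi_square_parallelism(wsse_full=22.0, wsse_reduced=31.0)
print(statistic, p_value < 0.05)
```

Notice that the SSE of the full model is subtracted rather than divided by, so a very precise assay no longer inflates the statistic on its own.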

Unfortunately, there are still some potential problems with using this approach.  First, the regression has to be perfectly weighted in order for this stat to be perfectly chi-square distributed.  Perfect weighting is difficult to achieve.

But beyond the weighting issue, there is also a philosophical problem with this approach.  These types of tests are measuring whether the shapes of the curves are different, but what we need to know for potency is whether they are actually the same.  Not being different is not the same as saying they are equivalent.  We may simply not have good enough information to tell that they are different.

Equivalence testing

Parallelism testing for potency assays has recently switched to focus on testing for curve equivalence rather than difference.  How does this work?  This approach requires us to set a limit on a specific assay parameter that we are willing to accept. 

For example, we can say that as long as the ratio of the slopes from two assays is between 0.8 and 1.25 we will accept the assay.  We can then fit the two curves independently (full model) and calculate a confidence interval on the slope ratio.  If the confidence interval on this metric is contained within the two limits, we say that the curves are equivalent based on our criteria. 
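As a sketch of how this check could look in code, the example below builds an approximate confidence interval for the slope ratio and compares it to the 0.8 to 1.25 limits. The interval uses the delta method on the log scale and assumes the two slope estimates are independent and roughly normal. That is my simplification, not the only way to do it. The 90% confidence level, the slopes, and the standard errors are all hypothetical.

```python
import math
from statistics import NormalDist


def slope_ratio_interval(slope_ref, se_ref, slope_sample, se_sample, confidence=0.90):
    """Approximate confidence interval for slope_sample / slope_ref (delta method, log scale)."""
    ratio = slope_sample / slope_ref
    se_log = math.sqrt((se_ref / slope_ref) ** 2 + (se_sample / slope_sample) ** 2)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)


def is_equivalent(interval, lower_limit=0.8, upper_limit=1.25):
    """True only if the whole interval sits inside the equivalence limits."""
    low, high = interval
    return lower_limit <= low and high <= upper_limit


# Same slope estimates, two different levels of precision (made-up numbers).
precise = slope_ratio_interval(1.20, 0.04, 1.26, 0.05)
noisy = slope_ratio_interval(1.20, 0.15, 1.26, 0.20)

print(is_equivalent(precise))  # True: the narrow interval fits inside the limits
print(is_equivalent(noisy))    # False: the wide interval crosses the limits
```

Here the more precise assay is the one that passes, which is the opposite of what happens with the F-test.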

This type of test has two consequences.  First, we can say that the curves are actually equivalent instead of "not different".  Second, we are no longer punished if the assay is “too” precise, since all that will do is make our confidence interval shorter.   Let’s see what this looks like graphically:

[Figure: confidence intervals compared against the equivalence limits]

As you can see, this type of test makes intuitive sense since we can set our limits based on our knowledge of the assay system without a large data set for determining statistically derived limits.  It also prevents false positives.  In difference testing you have to accept that you will reject some runs that were truly parallel based on chance alone.  This is less likely in equivalence testing since you’re not doing a hypothesis test.

So why not use equivalence testing for everything?  In a simple, linear assay, I would encourage this approach since it’s easy to calculate the confidence intervals for each parameter in the regression.  However, in a non-linear regression the confidence intervals for the equation parameters cannot be solved independently.  The joint confidence regions have very complex shapes and in some cases extend to infinity. 

So for now, we are stuck with difference testing for more complex models.  I’ve recently heard about some interesting work being done that may solve this problem, but I’m sworn to secrecy…  As soon as this work is completed and published in a public forum, I will discuss it here on the blog.

I hope this has been an easy-to-understand introduction to parallelism testing.  If you want to read a little more about these topics, here are two journal articles I recommend to get you started:

http://www.ncbi.nlm.nih.gov/pubmed/15971545
PDA J Pharm Sci Technol.  2005 Mar-Apr;59(2):127-37.
Assessing parallelism prior to determining relative potency.
Hauck WW, Capen RC, Callahan JD, De Muth JE, Hsu H, Lansky D, Sajjadi NC, Seaver SS, Singer RR, Weisman D.

http://www.ncbi.nlm.nih.gov/pubmed/15920890
J Biopharm Stat.  2005;15(3):437-63.
Measuring parallelism, linearity, and relative potency in bioassay and immunoassay data.
Gottschalk PG, Dunn JR.

 

As always, thanks for reading!

Dan


Here’s another good reference from the NIH for developing different types of assays:  Assay Guidance Manual

It’s a good read if you have the time.


June 8, 2010

Online conferences

I just spent some time looking at some of the talks at BioConference Live.

I was originally registered to watch some of them live, but I had conflicts so I wasn’t able to.  I was actually surprised at the quality of the talks and I was pleased with the interactivity of the site.  At a time when travel budgets keep shrinking, this format is not a bad substitute.

Of course, what you miss is the interaction and discussions that happen at face-to-face events.  These conversations are often some of the most valuable things you get out of a conference.  Next time I will make sure to join live so that I can use some of the social features and see how close it gets to the real thing.

Organizers of bioassay conferences, how about trying something like this?  It would be great to have interactions online every few months to discuss any new trends or topics that come up.  I’ll even volunteer to give a presentation!

Thanks for reading,

Dan


Design of Experiments (DOE) can be an extremely powerful way to quickly develop, optimize, and validate potency assays.

This slide deck describes a standardized approach for developing immunoassays using design of experiments, with a case study for one assay: Joelsson ELISA DOE

Thanks for reading!

Dan


This is a slide deck describing the implementation of a standardized, DOE-based immunoassay development process using laboratory automation: Joelsson – Rapid Assay Optimization DOE
This study was also published as an article:

J Immunol Methods. 2008 Aug 20;337(1):35-41. Epub 2008 Jun 12.
Optimizing ELISAs for precision and robustness using laboratory automation and statistical design of experiments.
Joelsson D, Moravec P, Troutman M, Pigeon J, DePhillips P.

Abstract:

Transferring manual ELISAs to automated platforms requires optimizing the assays for each particular robotic platform. These optimization experiments are often time consuming and difficult to perform using a traditional one-factor-at-a-time strategy. In this manuscript we describe the development of an automated process using statistical design of experiments (DOE) to quickly optimize immunoassays for precision and robustness on the Tecan EVO liquid handler. By using fractional factorials and a split-plot design, five incubation time variables and four reagent concentration variables can be optimized in a short period of time.
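For readers who haven’t seen a fractional factorial before, here is a generic sketch of a two-level half-fraction for five factors. This is not the design from the article. The construction (setting the last factor equal to the product of the others) is just a standard textbook way to cut the number of runs in half, and the -1/+1 coding for low/high levels is an assumption for illustration.

```python
from itertools import product


def half_fraction(n_factors):
    """Two-level half-fraction: the last factor is the product of all the others."""
    runs = []
    for base in product((-1, 1), repeat=n_factors - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


design = half_fraction(5)
print(len(design))  # 16 runs instead of the 32 runs of a full factorial
for run in design[:4]:
    print(run)
```

Each row is one experimental run, and each column tells you whether that factor is set to its low or high level.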

Thanks for reading!

Dan


Assessing parallelism in potency assays is a topic that is currently hotly debated.  In most cases, this assessment is performed using some sort of metric that is calculated for each assay.  A cutoff is then established on this metric to determine if two samples are parallel or non-parallel.  Establishing that cutoff can be a very difficult exercise. 

This is a slide deck of a presentation I gave on a novel method we developed for setting the cutoff:  Joelsson – Bootstrap parallelism method

Thanks for reading!

Dan


We recently published an article on the automation of a cell based potency assay for a varicella vaccine.  Here’s the citation and abstract:

Journal of Virological Methods
Volume 166, Issues 1-2, Pages 1-110 (June 2010)

Rapid automation of a cell-based assay using a modular approach: Case study of a flow-based Varicella Zoster Virus infectivity assay
Pages 1-11
Daniel Joelsson, Irina V. Gates, Diana Pacchione, Christopher J. Wang, Philip S. Bennett, Yuhua Zhang, Jennifer McMackin, Tina Frey, Kristin C. Brodbeck, Heather Baxter, Scott L. Barmat, Luca Benetti, Jean-Luc Bodmer

Abstract:

Vaccine manufacturing requires constant analytical monitoring to ensure reliable quality and a consistent safety profile of the final product. Concentration and bioactivity of active components of the vaccine are key attributes routinely evaluated throughout the manufacturing cycle and for product release and dosage. In the case of live attenuated virus vaccines, bioactivity is traditionally measured in vitro by infection of susceptible cells with the vaccine followed by quantification of virus replication, cytopathology or expression of viral markers. These assays are typically multi-day procedures that require trained technicians and constant attention. Considering the need for high volumes of testing, automation and streamlining of these assays is highly desirable. In this study, the automation and streamlining of a complex infectivity assay for Varicella Zoster Virus (VZV) containing test articles is presented. The automation procedure was completed using existing liquid handling infrastructure in a modular fashion, limiting custom-designed elements to a minimum to facilitate transposition. In addition, cellular senescence data provided an optimal population doubling range for long term, reliable assay operation at high throughput. The results presented in this study demonstrate a successful automation paradigm resulting in an eightfold increase in throughput while maintaining assay performance characteristics comparable to the original assay.

If you have any questions or comments about this article, please use the comments button and I will answer them here.

Thanks for reading,

Dan


USP has published three new chapters on bioassays on its website:

http://www.usp.org/meetings/workshops/bioassayGuidance.html

The chapters are:

  • 1032 – Design and Development of Biological Assays
  • 1033 – Biological Assay Validation
  • 1034 – Analysis of Biological Assays

If you’re in the bioassay field, these are pretty much required reading.
