PacBio is basing the new suit on two different U.S. patents, 9,678,056 "Control of enzyme translocation in nanopore sequencing" (issued June 13, 2017 with a priority date of Apr 10, 2009) and 9,738,929 "Nucleic acid sequence analysis" (issued Aug 22, 2017 with a priority date of Mar 28, 2008).
I won't dig into the '056 patent -- it's all biochemistry and I don't have a feel for the prior art here. The key concept is patenting the idea of controlling translocation of a substrate through a nanopore using a protein complex with two different kinetic modes. My quick take is that if there isn't prior art, this is likely to be a solid patent as that seems like a truly inventive step.
The '929 patent simply exasperates me on two different levels. First, there is the form of the patent and then there is the content of the patent, or at least the content which PacBio is trying to use.
What do I mean by the form of the patent? Patents are divided into several parts. There is the title, which is too often all-encompassing ambiguity. That includes many of my patents, but not that '056 patent which actually has an informative title. But "Nucleic acid sequence analysis" is about as generic as they come. Then there are images which help illustrate the patent and its claims. There is a lengthy description section. Then there are the actual claims.
My umbrage with the '929 patent arises from the schizoid nature of the substantial sections of the patent, as the images and description are consistent with each other but utterly divorced from the claims made. The images and descriptions are all about the abandoned approach called strobe sequencing, by which long fragments would be read in the SMRT system in bursts in which the ZMWs are sometimes left dark. During the dark periods, the polymerase translocates but doesn't suffer photodamage -- but also can't generate data. This once seemed a promising approach to getting long-range information, but PacBio has made such spectacular progress on mitigating photodamage that it became irrelevant. But this is what these sections are about, covering hardware, biochemistry and computational approaches to handling such data. It's all about sequencing with an optical system.
But none of the claims have anything to do with optical sequencing. They are about nanopore sequencing, which is barely covered in the description -- a single paragraph, though it is followed by an elastic clause that explicitly says the description is meaningless for establishing claims:
The scope of the invention should, therefore, be determined not with reference to the above description, but should instead be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
That just irritates me -- tons of text that is followed by "if you were sucker enough to read this, too bad chump!".
Okay, so what's in the claims?
Claim 1 describes a nanopore sequencing system which takes double-stranded DNA as input and uses an enzyme to translocate the DNA through the pore, with the variation in electrical signal converted to sequencing information. It specifically mentions sequencing redundant sequence information and computing a consensus from redundant sequencing information.
Claims 2, 3 and 4 are the typical patent language further elaborating Claim 1 to cover protein nanopores, lipid bilayer membranes, and solid state membranes specifically.
Claim 5 goes for altering the rate of enzyme-mediated translocation by changing reaction conditions.
Claims 6 and 7 are the typical practice of setting thresholds in steps so that if one claim is invalidated the other might stand. Claim 6 refines Claim 1 to sequences which are greater than 75% double-stranded DNA and Claim 7 covers 90% or greater double-stranded sequence.
Claim 8 covers covalent linkage of two strands of input. Claim 9 goes for inputs which consist of repeats of the sequence of interest, such as that generated by Rolling Circle Amplification (RCA). Claim 10 covers linkers which contain a nucleotide and Claim 11 those which are an oligonucleotide. Claim 12 covers using the oligonucleotide in Claim 11 as a registration sequence to identify the ends of subreads whereas claim 13 adds the idea of using a nick in the linker. Claim 14 talks of using a synthetic linker to link the two strands and Claim 15 extends Claim 14 to carbon-based linkers.
Claim 16 does actually go back to the idea of sequencing with a period of data collection and non-collection, with translocation being sped up during the non-collection period. So that's a faint echo of what the description spends so much time elaborating. Similarly, the final claim, number 17, extends Claim 16 to the idea of alternating periods of slower translocation with data collection interleaved with faster translocating non-collection periods.
In the lawsuit, PacBio is claiming that Oxford Nanopore is violating the patents in both the 2D kits and 1D^2 ("1D squared") kits. PacBio had previously sued over the 2D sequencing, which linked the two strands with a hairpin, and Oxford subsequently stopped delivering and supporting 2D sequencing. Going after the newer 1D^2 is potentially very troublesome for ONT, as without this they would be stuck with single strand sequencing. One dodge would be to use RCA, an approach published as INC-Seq, but that is covered by Claim 9. So PacBio appears to be going for a total block for Oxford of just about any means of increasing accuracy via redundancy. The two that this patent doesn't address are using multiple copies of templates with Molecular Identification Tags (MIDs) to mark sibling amplicons and the idea of "flossing" the DNA by moving the same molecule repeatedly through the nanopore (which could fall under Claim 5).
But does the patent actually cover 1D^2? This mode, according to ONT, does not involve covalent linkage of the two strands but rather relies on the propensity of their nanopore to follow completing one strand of a doubly-adapted library molecule by grabbing the opposite strand's motor protein. Indeed, the change from R9.4 to R9.5 pore was reported to be simply mutagenesis to enhance this effect. 1D^2 was discovered by serendipity in existing data. If the transition between strands is sufficiently fast, the basecaller sees both as the same read and you end up with a gigantic apparent inverted repeat. I've seen this phenomenon myself -- I called them "mirror reads" -- but it never dawned on me that they were something magical rather than a pain for sequence assembly.
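To make the "mirror read" idea concrete, here is a minimal sketch of how one might flag a read whose second half is the reverse complement of its first half. It is an illustration only, not ONT's basecaller logic or any published tool: the function names and the 0.8 threshold are my own assumptions, and it compares positions directly, whereas real nanopore reads carry indels and adapter sequence between the strands and would need a proper alignment.

```python
# Hypothetical sketch: flag reads that look like an inverted repeat of themselves.
# Assumes error-free, gap-free sequence; real data would require alignment.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mirror_identity(read: str) -> float:
    """Fraction of positions at which the first half of the read matches
    the reverse complement of the second half."""
    half = len(read) // 2
    if half == 0:
        return 0.0
    first = read[:half]
    second_rc = reverse_complement(read[-half:])
    matches = sum(a == b for a, b in zip(first, second_rc))
    return matches / half


def looks_like_mirror_read(read: str, threshold: float = 0.8) -> bool:
    """True if the read resembles template followed by its complement strand.
    The threshold is an arbitrary illustrative value."""
    return mirror_identity(read) >= threshold
```

For example, `looks_like_mirror_read(t + reverse_complement(t))` is true for any template `t`, while a homopolymer such as `"A" * 20` scores zero.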
So 1D^2 doesn't involve any sort of linker nor synthetic duplication of the region of interest nor any manipulation of the translocation conditions, so that would seem to clear Claim 5 as well as Claims 8-17. Claims 2-4 are core nanopore sequencing ideas and must be covered by the earlier patents and publications from Deamer, Branton and Church and others. Claims 6 and 7 (the fraction of double-stranded DNA) are just strange claims and again would seem obvious in light of earlier nanopore publications, but I lack the instant familiarity with that space to be sure.
So that leaves Claim 1. Again, a lot of Claim 1 seems to be necessary setup for the later claims but would be covered by prior nanopore publications. Monitoring electrical signals during translocation through a pore is how the whole concept was first presented to me in 1992. So the core of PacBio's claim would seem to be around the idea of generating a consensus sequence from redundant information, perhaps from both strands of the same molecule, which is explicitly detailed in Exhibit 4 of PacBio's filing.
That would seem to be flying into a mountain of prior art, much of which is cited in the references section. After all, computing such consensi with accuracy estimation was a hallmark feature of phrap, which debuted in the 1990s. A paper from 2003 makes this explicit as well and the idea is also in a 1993 paper. In my mind, trying to somehow claim that it is inventive to restrict this to two complementary strands of the same fragment is preposterous as well. Sanger works with cloned material, but two reads from the same fragment was standard practice in some EST libraries and I certainly saw it as obvious back at Codon in 2007, when I developed code to do this.
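To show just how little machinery the two-strand consensus idea requires, here is a naive sketch. It is emphatically not phrap's algorithm nor the code I wrote at Codon; it is a toy scheme under stated assumptions: both reads cover exactly the same fragment with no indels, qualities are Phred-style integers, agreeing bases get their qualities summed (which presumes independent errors) and disagreements go to the higher-quality base. All names are hypothetical.

```python
# Hypothetical sketch: consensus from the two complementary strands of one fragment.
# Assumes equal-length, indel-free reads with per-base Phred-like quality scores.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_consensus(fwd, fwd_quals, rev, rev_quals):
    """Combine a forward-strand read with a reverse-strand read.

    Returns (consensus_sequence, consensus_qualities) in forward orientation.
    """
    # Bring the reverse-strand read into forward orientation.
    rev_rc = reverse_complement(rev)
    rev_quals_rc = list(rev_quals)[::-1]
    if len(fwd) != len(rev_rc):
        raise ValueError("this toy example requires equal-length reads")

    bases, quals = [], []
    for b1, q1, b2, q2 in zip(fwd, fwd_quals, rev_rc, rev_quals_rc):
        if b1 == b2:
            # Agreement: confidence goes up (naive sum of qualities).
            bases.append(b1)
            quals.append(q1 + q2)
        elif q1 >= q2:
            # Disagreement: trust the better call, but with reduced confidence.
            bases.append(b1)
            quals.append(q1 - q2)
        else:
            bases.append(b2)
            quals.append(q2 - q1)
    return "".join(bases), quals
```

With a forward read `"ACTGT"` carrying a low-quality error at its third base and a reverse read `"ACCGT"`, the sketch recovers `"ACGGT"`, with the lowest consensus quality at the disputed position.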
Watson and Crick's original 1953 double helix paper showed that the two strands contain redundant information, so one might make the claim that this establishes prior art. Of course, at that time reading sequence was a distant dream. But it certainly influenced the idea.
It is interesting that the patent would cover INC-Seq. The idea of using RCA to generate a repetitive template that is then sequenced by high-throughput methods and assembled into a consensus appears to have first been published in December 2013 as "circle sequencing", though it used the Illumina platform and so did not get long arrays of copies into a single read. That's well after the PacBio priority date, so probably isn't invalidating. So PacBio may have control of anyone using the INC-Seq approach to generate redundancy. INC-Seq is particularly attractive for two reasons. First, it relies on a highly processive polymerase and can therefore potentially replicate inserts that are tens of kilobases long. Second, RCA generates many, many replicates of the input sequence and so could potentially give very high levels of redundancy.
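As a rough illustration of how redundancy from an RCA concatemer turns into accuracy, the sketch below chops a read into tandem copies and takes a per-column majority vote. This is not the INC-Seq pipeline: it assumes the repeat unit length is already known and that there are no insertions or deletions, whereas a real implementation would have to find the copy boundaries and align the copies first. The function names are my own.

```python
# Hypothetical sketch: majority-vote consensus over tandem copies in one read.
# Assumes a known unit length and substitution-only errors.
from collections import Counter


def split_copies(concatemer: str, unit_length: int) -> list:
    """Cut a concatemer into full-length tandem copies, dropping any
    partial copy at the end of the read."""
    last_start = len(concatemer) - unit_length
    return [concatemer[i:i + unit_length]
            for i in range(0, last_start + 1, unit_length)]


def majority_consensus(copies: list) -> str:
    """Pick the most common base in each column across the copies."""
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*copies))
```

Three copies of a six-base unit, one of them carrying a substitution, are enough for the vote to recover the original unit; more copies would tolerate more errors per column.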
Obviously I have no official capacity in this case, but I would predict what I have explored above will constitute a key part of ONT's defense of this third lawsuit. ONT's lawyers will, I posit, strip the PacBio arguments away with prior art and obviousness arguments, and certainly in my mind they should succeed.