Thursday, October 27, 2016

Segmental Duplications and Deletions in Books of History and Life

I have long had a historical interest in the U.S. Civil War.  In fifth grade we had an all-day field trip to the Gettysburg Battlefield, which is very well preserved, and it enthralled me.  A few years later I would hike all over the battlefield with my Boy Scout troop, which is probably even closer to experiencing a taste of what it was like to be a soldier. Of course, we had good hiking boots - a proximate cause of the battle occurring there was an attempt by the Confederates to raid a shipment of shoes which had just arrived in the town.  My great-grandfather served in the Union Army, though entirely on garrison duty.  However, his two older brothers saw action and one lost an arm at Chickamauga.  I just failed for arguably the fourth time to get to that site, but I've toured a few other key fields (Antietam and Petersburg).

Mostly I read about the Civil War at irregular intervals; I'm now diving in yet again.  It was truly a major force in shaping the United States, both for good and evil, and so deserves attention.  The Civil War is also the exception to the rule that victors write the history; a concerted effort by ex-Confederates to shape the interpretation of events echoes to this day.  I had very good grade school teachers spout absurd, but commonly accepted, ideas that flow from those re-writings of history. In particular, I might have been taught that secession was not about preserving slavery, but at the time the leaders of that act were extremely explicit that this was the crux of the matter.  As with any war, there are fascinating "what ifs" and missed opportunities.  If the war had ended early, what would have happened to the horror of slavery? How could a general possess his opponent's detailed plans and still not win a decisive victory? Could the South have forced a settlement?  What if Lincoln had lost the 1864 election? What if Lincoln had not been assassinated?

This isn't the place to go into depth on that or fight those battles.  What lands the topic here is a collection of essays on the Civil War by different professional historians, titled With My Face to the Enemy.  Please take a moment to see if your library system has it, because even if you don't enjoy reading about this war you might want to help me research an issue with the book.

I don't remember how I discovered this; it might have been via the odd transition if I was reading the collection through, or it might have been that I wanted to read a particular essay such as the one on George Thomas, the Rock of Chickamauga who saved the Union Army from total disaster there (and perhaps my great-great-uncle from captivity?).  In either case, somehow I found my way to this part of the book:
You'll note that the title of the chapter on the left page doesn't agree with the title listed at the top of the right page; left is Grant and right is Thomas.  That triggered me to look some more and ultimately I found Grant again, but this time paired with Stonewall Jackson.

In the shot below, the first two pages are successive leaves of the book and the last two pages succeed each other as well, but between the middle two are a lot of pages.  Nothing else is out of order; the duplicated section appears in correct order.  Alas, pages 299 to 331 are nowhere to be found; a complete swap was made.

It turns out our library system has two copies of the book, and I had wondered in the past if both had the defect.  I conspired with TNG this week to obtain both copies from the MVLC (we both requested the title), and the exact same problem is in both.  If you are so inspired and can find a copy from another system, could you let me know if it is flawed as well?

In biology we would call this a segmental duplication with a reciprocal deletion.  Identifying sub-chromosomal insertions and deletions, particularly relatively small ones, has become a major by-product of having complete human genome sequences and dense maps of the genome.  Before such maps existed, the presence of small insertions and deletions was beyond the bounds of any detection technology.  Large changes could be seen on chromosome stains, but this is a challenging technique requiring very skilled labor and not suitable for high-throughput studies.  Copy number variants are proving to be major sources of genetic diversity and also are increasingly implicated in human genetic traits and conditions.  Copy number variation, particularly deletions of regions containing tumor suppressors and amplifications of regions containing oncogenes, is common in tumors.
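For the programmers in the audience, here is a toy sketch of how one might scan a book for this kind of defect, treating the page numbers as the "sequence".  The function names (to_ranges, scan_book) and the ten-page example book are made up for illustration; this is an analogy, not a real genomics tool.

```python
from collections import Counter


def to_ranges(numbers):
    """Collapse a sorted list of integers into inclusive (start, end) runs."""
    ranges = []
    for n in numbers:
        if ranges and n == ranges[-1][1] + 1:
            # Extend the current run by one page.
            ranges[-1] = (ranges[-1][0], n)
        else:
            # Start a new run.
            ranges.append((n, n))
    return ranges


def scan_book(observed_pages, first_page, last_page):
    """Return (duplicated_ranges, missing_ranges) for a bound book.

    observed_pages is the list of page numbers in the order they are bound.
    """
    counts = Counter(observed_pages)
    duplicated = sorted(p for p, c in counts.items() if c > 1)
    missing = [p for p in range(first_page, last_page + 1) if counts[p] == 0]
    return to_ranges(duplicated), to_ranges(missing)


# Hypothetical 10-page book in which pages 5-6 were replaced
# by a second copy of pages 3-4.
observed = [1, 2, 3, 4, 3, 4, 7, 8, 9, 10]
duplicated, missing = scan_book(observed, first_page=1, last_page=10)
print("duplicated:", duplicated)
print("missing:", missing)
```

A book makes this easy because every page carries its own coordinate; a genome offers no such page numbers, which is why the detection problem is so much harder.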

Reliably detecting such copy number variants can be difficult with short read sequencing, particularly if one wishes to get the boundaries correct.  Naive alignment of whole genome sequencing reads to a reference genome can lose unusual copy number variants; keeping these is one piece of the push to graph representations of genomes.  Finding novel segments missing from common references is often a highlight of papers performing whole genome sequencing on human populations that were previously absent or poorly represented in genome databases.
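To make the boundary problem concrete, here is a minimal sketch of one common idea, read-depth calling: count the reads in fixed windows, compare each window to the typical depth, and round to an integer copy number.  Everything here is an assumption for illustration (the function names, the window depths, a diploid baseline taken as the median); real callers also have to correct for GC bias, mappability and noise, which this ignores.

```python
from statistics import median


def call_copy_number(window_depths, ploidy=2):
    """Estimate an integer copy number per window from read depth.

    Assumes the median window is at normal copy number (ploidy)
    and that the median depth is non-zero.
    """
    baseline = median(window_depths)
    return [round(ploidy * depth / baseline) for depth in window_depths]


def variant_segments(copy_numbers, ploidy=2):
    """Merge adjacent windows sharing the same abnormal copy number.

    Returns (first_window, last_window, copy_number) tuples.
    """
    segments = []
    for i, cn in enumerate(copy_numbers):
        if cn == ploidy:
            continue
        if segments and segments[-1][1] == i - 1 and segments[-1][2] == cn:
            # Extend the previous segment by one window.
            segments[-1] = (segments[-1][0], i, cn)
        else:
            segments.append((i, i, cn))
    return segments


# Made-up mean depths for ten consecutive windows.
depths = [30, 31, 29, 45, 46, 30, 15, 14, 30, 29]
copy_numbers = call_copy_number(depths)
print(copy_numbers)
print(variant_segments(copy_numbers))
```

Note that the breakpoints come out only to the nearest window edge, which is one reason getting exact boundaries from depth alone is hard.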

Because conventional short read sequencing can miss these regions, precisely reading out segmental duplications and deletions has been a target for companies with either long read sequencing technologies (PacBio) or high-throughput physical mapping approaches such as BioNano Genomics or Nabsys 2.0.  Better resolving such regions is also the promise of linked read ventures such as 10X Genomics and iGenomX. On the other end, ascertaining copy number variation in clinical samples quickly and reliably is one of the counting markets targeted by SeqLL; if you know what to look for then measuring it is a lot easier.

If I ever get over my fear of obsessing over my own genome, then I will get myself sequenced.  Odds are I'll have many copy number variants relative to the current reference genome, though by then I expect graph representations will dominate.  Obviously I haven't inherited anything particularly devastating, as I am here and free of any strongly detrimental phenotypes (though I suspect the extreme myopia and periodic acid reflux run in the family).  On the other hand, if I can't successfully find a complete copy of With My Face to the Enemy, I'll never get to read the first part of the essays on Grant and Stuart - and on the Rock of Chickamauga.


Rick said...

Not at any of the libraries around here -- both city and Purdue University. Many for sale on Amazon. I should check my local bookstore first.

Cliff Beall said...

The main library at Ohio State has it and it doesn't have the duplication.

I ran into an identical duplication recently with one of the Jo Nesbo Harry Hole books. Unfortunately someone had given it to me used so I wasn't able to return it, but it was good enough that I bought a correct copy.

Cliff Beall said...

Also, I recently read a book about the Chickamauga campaign called "Failure in the Saddle" - it was pretty good.

Unknown said...

We have a copy here in the main library of the University of Hong Kong without the duplication. It is the Berkley trade paperback edition / May 2002.

Anonymous said...

Paperback editions, which use a glue binding (the so-called "perfect binding", which is as bad a piece of deceitful marketing as the "red delicious" apple), are unlikely to have a duplicated stretch of 32 pages; that is the sort of error you are more likely to find in a sewn binding, where one signature (a group of pages folded together and sewn in as a unit) gets put in twice.