Naturally, I linked our monitor and our MDT via Bluetooth and transmitted a 12-Lead from a rhythm generator. When the file landed on the MDT, I looked for an application to view the 12-Leads and rhythm strips, but none appeared to be able to use the file as-is.
For the non-technical, the ECGs are shipped compressed--somewhat like a ZIP file--which contains all of your monitored vital signs, printed rhythm strips, and your 12-Leads. The format of the 12-Leads is an Open Standard; Philips Healthcare provides most of the details needed to use the files. The 12-Lead data is also compressed to save space. Unfortunately, there is no documentation which tells you how to decompress the 12-Lead data.
The technically-faint-of-heart should skip these next bits.
For the technical, the ECGs are contained in a Gzip'd TAR archive. The 12-Leads are stored inside in an XML format known as the Sierra ECG format (currently at version 1.03 or 1.04, as far as I can tell). Inside this XML is Base64-encoded, XLI-compressed data comprising the leads acquired during a 12-Lead (it appears that up to 16 leads can be stored).
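The container layers described above can be peeled apart with Python's standard library. This is only a minimal sketch: the `.xml` member filter and the element name passed as `tag` are assumptions made for illustration (the post does not name the element that holds the Base64 payload), so check them against a real file.

```python
import base64
import io
import tarfile
import xml.etree.ElementTree as ET


def iter_xml_members(archive_bytes):
    """Yield (name, raw bytes) for each XML file in a Gzip'd TAR archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            # Assumption: the Sierra ECG documents carry an .xml extension.
            if member.isfile() and member.name.lower().endswith(".xml"):
                yield member.name, tar.extractfile(member).read()


def decode_waveform_payload(xml_bytes, tag):
    """Base64-decode the text of the first element whose local name is `tag`.

    `tag` is supplied by the caller; the real element name is not given
    in the post, so any value used here is a placeholder.
    """
    root = ET.fromstring(xml_bytes)
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        if local_name == tag:
            text = "".join((element.text or "").split())  # strip line breaks
            return base64.b64decode(text)
    return None
```

The bytes returned by `decode_waveform_payload` are still XLI-compressed at this point; the rest of the post deals with that layer.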
I searched for a description of the XLI compression format, but I was only able to find a reference implementation for Microsoft Windows which simply decoded the files. No code or description was provided, and the implementation itself is not portable. (ed: it appears this may be a reference to the HP PageWriter XLi which Philips acquired)
At this point I decided my only option was to reverse engineer the XLI compression format, and I began with simple guesses. I tried decompressing the data using Deflate, Zip, and RLE without any progress. I was able to determine that the first 8 bytes of the compressed data included a compressed length and some uncompressed data, and that each of the 12 to 16 leads was stored in a chunk with one of these headers:

    offset   2        4        6        8        ...
    +--------+--------+--------+--------+--------+--------+--------+
    |      Size       |  Unk.  | Delta? |    Compressed data...    |
    +--------+--------+--------+--------+                          |
    |                       ... [Size bytes]                       |
    +--------+--------+--------+--------+--------+--------+--------+
    |                     Next lead chunk ...                      |

Once the simple guesses were ruled out, I began exploring the behavior of the reference implementation provided for the Sierra ECG format. Using OllyDbg, I noticed certain code tells which made me believe the decompression algorithm read 10 bits at a time:

    SHR EAX, 16h   ; reduce EAX to the 10-bit code word
    SHL ECX, Ah    ; prepare to read 10 more bits from the input

The compressed data also did not appear to contain a compression dictionary referenced by the code. At this point I suspected I was looking at a form of Lempel-Ziv-Welch, or LZW, compression. LZW is a popular, lossless compression scheme which builds its compression dictionary on the fly. It is used by the GIF and TIFF image formats, and its use in GIF became the subject of controversy due to patent licensing requirements.
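Before getting to the LZW part, the chunk header layout from the diagram above can be walked with `struct`. Treat this as a sketch under stated assumptions: the field widths (4-byte size, 2-byte unknown, 2-byte delta) are read off the diagram, while the little-endian byte order and the signedness of the delta field are guesses that the post does not confirm. The names `LeadChunk` and `iter_chunks` are made up for this example.

```python
import struct
from typing import Iterator, NamedTuple

# Assumption: little-endian; 4-byte size, 2-byte unknown, signed 2-byte delta.
HEADER = struct.Struct("<IHh")


class LeadChunk(NamedTuple):
    size: int          # number of compressed bytes that follow the header
    unknown: int       # purpose not identified in the post
    first_delta: int   # the "Delta?" field, used later when decoding
    payload: bytes     # the compressed data for one lead


def iter_chunks(data: bytes) -> Iterator[LeadChunk]:
    """Walk consecutive lead chunks: 8-byte header, then `size` bytes."""
    offset = 0
    while offset + HEADER.size <= len(data):
        size, unknown, first_delta = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + size]
        if len(payload) < size:
            raise ValueError("chunk at offset %d is truncated" % offset)
        yield LeadChunk(size, unknown, first_delta, payload)
        offset = start + size  # the next lead chunk starts right after
```

Each `payload` is what gets handed to the decompressor, and `first_delta` is kept aside for the delta decoding step.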
In my quest to quickly reach a conclusion, I found an excellent LZW implementation in C from Mark Nelson, and it successfully decompressed the data. In fact, the structure of the C code was so familiar that I realized the reference implementation from Philips used the exact same code!
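For readers who want to see the idea without reading C, here is a small fixed-width LZW decoder in Python. It is a generic sketch, not a port of Mark Nelson's code: it assumes 10-bit codes packed most-significant-bit first, literals in codes 0 to 255, new dictionary entries starting at 256, the highest code reserved as an end-of-stream marker, and a dictionary that simply stops growing when full. `pack_codes` exists only so the example can build its own input.

```python
CODE_BITS = 10                     # code width suggested by the disassembly
END_CODE = (1 << CODE_BITS) - 1    # assumption: top code marks end of stream
MAX_CODE = END_CODE - 1            # assumption: last code the dictionary may use


def pack_codes(codes, bits=CODE_BITS):
    """Pack integer codes MSB-first into bytes (helper for the demo only)."""
    buffer, count, out = 0, 0, bytearray()
    for code in codes:
        buffer = (buffer << bits) | code
        count += bits
        while count >= 8:
            count -= 8
            out.append((buffer >> count) & 0xFF)
        buffer &= (1 << count) - 1
    if count:  # flush the last partial byte, zero-padded on the right
        out.append((buffer << (8 - count)) & 0xFF)
    return bytes(out)


def read_codes(data, bits=CODE_BITS):
    """Yield fixed-width codes from an MSB-first packed byte string."""
    buffer, count = 0, 0
    for byte in data:
        buffer = (buffer << 8) | byte
        count += 8
        while count >= bits:
            count -= bits
            yield (buffer >> count) & ((1 << bits) - 1)
        buffer &= (1 << count) - 1


def lzw_decompress(data):
    """Decode fixed-width LZW, rebuilding the dictionary on the fly."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    out = bytearray()
    prev = None
    for code in read_codes(data):
        if code == END_CODE:
            break
        if code in table:
            entry = table[code]
        elif prev is not None and code == next_code:
            entry = prev + prev[:1]  # code refers to the entry being built
        else:
            raise ValueError("invalid LZW code %d" % code)
        out += entry
        if prev is not None and next_code <= MAX_CODE:
            table[next_code] = prev + entry[:1]
            next_code += 1
        prev = entry
    return bytes(out)


demo = pack_codes([65, 66, 256, 258, END_CODE])
print(lzw_decompress(demo))  # b'ABABABA'
```

Note that no dictionary travels with the data: the decoder reconstructs the same table the encoder built, which matches the observation that the compressed stream contained no dictionary.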
If you've reached this step while following along at home, you'll notice the decompressed data seems front-loaded with 0's. This is a case of intelligently streaming the data to the compression algorithm to take advantage of data duplication.
The uncompressed data represents 16-bit delta codes, of which the majority include 0x00 or 0xFF in their most significant byte (MSB). This is because they are either small and positive or small and negative, and as ECG data is rhythmic the delta codes are likely to retain the same sign for numerous samples.
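To make that byte pattern concrete, the sketch below splits a handful of made-up signed 16-bit deltas into their high and low bytes. The function name and the sample values are illustrative only and do not come from a real ECG.

```python
import struct


def msb_lsb_split(deltas):
    """Return (msb_bytes, lsb_bytes) for a list of signed 16-bit deltas."""
    packed = [struct.pack(">h", d) for d in deltas]  # big-endian: MSB first
    msbs = bytes(p[0] for p in packed)
    lsbs = bytes(p[1] for p in packed)
    return msbs, lsbs


msbs, lsbs = msb_lsb_split([3, -2, 1, -1])
print(msbs.hex())  # 00ff00ff
print(lsbs.hex())  # 03fe01ff
```

Small positive values always yield a 0x00 high byte and small negative values a 0xFF high byte, so the high bytes form a very repetitive stream.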
To take advantage of this fact during compression, the delta codes are first deinterleaved into two halves. The first half includes each MSB and the second half includes each LSB. The pseudo-code for interleaving the decompressed data looks like the following:

    # input contains the decompressed data
    # output will contain the interleaved 16-bit delta codes
    fun unpack( input[], output[], nSamples )
        for i <- 1..nSamples
            output[i] <- (input[i] << 8) | input[nSamples + i]
        endfor
    endfun

At this point the delta compression scheme will need to be decoded to produce the actual signal data for each of the leads. The delta compression scheme is a simple recurrence relation (a second order difference relation) using the prior two delta codes:

    # output contains the 16-bit delta codes
    # first is the 16-bit delta code from the chunk header
    fun deltaDecompression( output[], nSamples, first )
        x <- output[1]
        y <- output[2]
        prev <- first
        for i <- 3..nSamples
            z <- (2 * y) - x - prev
            prev <- output[i] - 64    # is -64 to 64 the range?
            output[i] <- z
            x <- y
            y <- z
        endfor
    endfun

Now that you have the actual per-lead signal data, all you need to do is recreate leads III, aVR, aVL, and aVF. This is done using the data from leads I and II, as on most ECG machines. I've omitted the actual formulas for brevity.
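The two pseudo-code routines above translate to Python fairly directly. This is a sketch with a couple of assumptions worth flagging: the interleaved codes are interpreted as signed 16-bit values, no 16-bit wraparound is applied to the decoded samples, and indices are 0-based rather than 1-based. The function names are mine.

```python
def unpack(data):
    """Re-interleave the MSB half and LSB half into signed 16-bit codes."""
    n_samples = len(data) // 2
    codes = []
    for i in range(n_samples):
        value = (data[i] << 8) | data[n_samples + i]
        if value >= 0x8000:  # assumption: codes are two's-complement 16-bit
            value -= 0x10000
        codes.append(value)
    return codes


def delta_decode(codes, first):
    """Apply the second-order recurrence from the post.

    `first` is the delta code taken from the chunk header. The first two
    entries of `codes` are kept as-is and seed the recurrence.
    """
    out = list(codes)
    if len(out) < 3:
        return out
    x, y = out[0], out[1]
    prev = first
    for i in range(2, len(out)):
        z = (2 * y) - x - prev
        prev = out[i] - 64  # stored codes carry a +64 bias, per the pseudo-code
        out[i] = z
        x, y = y, z
    return out


print(unpack(bytes([0x00, 0xFF, 0x01, 0xFE])))  # [1, -2]
print(delta_decode([1, 2, 64, 64], first=0))    # [1, 2, 3, 4]
```

In the second call every stored code equals the bias of 64, so the residual is zero and the output continues the straight line set by the first two samples.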
Using my reference implementation of the decompression algorithm I was able to feed the original acquired 12-Lead to the Philips ECG to SVG converter, with the following results:
If you'd like to start playing with my code, I welcome you to join my GitHub project: sierra-ecg-tools. I am also working on a C implementation, and likely an Android implementation. Stay tuned, and apologies for the technical post.
The author has no financial ties to Philips Healthcare and received no compensation for this work.