Sunday, July 24, 2011

Validating CDA Documents

I received sample CCD files from a trading partner. I ran them through the NIST validator, and the first file had over 50 errors against the base CDA schema. They had used the data type PQ for every value in the results section, even though some of those results were not physical quantities. For example:

<value unit="ML/MIN/1.73M2" value=">60" xsi:type="PQ"/>

<value unit="MIU/ML" value="<2" xsi:type="PQ"/>

<value unit="UNK" value="///" xsi:type="PQ"/>

<value unit="UNK" value="NOT DETECTED" xsi:type="PQ"/>

<value unit="UNK" value="Negative" xsi:type="PQ"/>

<value unit="UNK" value="neg" xsi:type="PQ"/>
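This kind of problem is easy to spot mechanically. Here is a minimal sketch, not the NIST validator's logic, that lists every PQ whose value attribute is not a plain decimal number. The function name, the regular expression, the `<results>` wrapper and the third (valid) sample line are my own assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree spells xsi:type with the full namespace URI.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"
# Assumption: treat a PQ value as valid only if it is a plain decimal.
DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def find_bad_pq_values(xml_text):
    """Return the value attributes of PQ elements that are not plain decimals."""
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter():
        if elem.get(XSI_TYPE) == "PQ":
            value = elem.get("value", "")
            if not DECIMAL.fullmatch(value):
                bad.append(value)
    return bad


# Hypothetical fragment; the last line is an invented valid result.
SAMPLE = """<results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <value unit="MIU/ML" value="&lt;2" xsi:type="PQ"/>
  <value unit="UNK" value="neg" xsi:type="PQ"/>
  <value unit="mg/dL" value="182" xsi:type="PQ"/>
</results>"""

print(find_bad_pq_values(SAMPLE))
```

Run against a whole document, this gives a quick list of the values that need a different data type.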

The test codes were no better. There is a LOINC code for serum cholesterol, for example, yet every code in the document was either "0" or "UNK".

I went in and manually fixed the data type errors.

This is invalid:

<value unit="UNK" value="neg" xsi:type="PQ"/>

"neg" is not a physical quantity. Send this as a code, instead.

This is correct:


<value xsi:type="CD" code="260385009" displayName="Negative" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED-CT"/>

This is invalid:

<value unit="UNK" value="pos" xsi:type="PQ"/>

"pos" is not a physical quantity. Send this as a code, instead.

This is correct:

<value xsi:type="CD" code="10828004" displayName="Positive" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED-CT"/>

This is invalid:

<value unit="UNK" value="NOT DETECTED" xsi:type="PQ"/>

"not detected" is not a physical quantity. Send this as a code, instead.

This is correct:

<value xsi:type="CD" code="260415000" displayName="Not Detected" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED-CT"/>

This is invalid:

<value unit="MIU/ML" value="<2" xsi:type="PQ"/>

"<2" is not a physical quantity. Use interval of physical quantity as the data type, instead, with an exclusive upper bound so that 2 itself is not included.

This is correct:

<value xsi:type="IVL_PQ">
<high unit="MIU/ML" value="2" inclusive="false"/>
</value>

This is invalid:

<value unit="ML/MIN/1.73M2" value=">60" xsi:type="PQ"/>

">60" is not a physical quantity. Use interval of physical quantity as the data type, instead, with an exclusive lower bound so that 60 itself is not included.

This is correct:

<value xsi:type="IVL_PQ">
<low unit="ML/MIN/1.73M2" value="60" inclusive="false"/>
</value>
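The manual fixes above follow a pattern, so they can be sketched as a small conversion routine. This is only an illustration of the mapping, not what the trading partner's system does. The function name and lookup table are my own. I also mark the interval boundary as exclusive with inclusive="false", on the assumption that "<2" means strictly less than 2. Anything the routine does not recognise (such as "///") comes back as None so that a person can look at it.

```python
import re
from xml.sax.saxutils import quoteattr

SNOMED_OID = "2.16.840.1.113883.6.96"
# Raw strings seen in the sample file, mapped to the SNOMED CT codes used above.
QUALIFIERS = {
    "neg": ("260385009", "Negative"),
    "negative": ("260385009", "Negative"),
    "pos": ("10828004", "Positive"),
    "positive": ("10828004", "Positive"),
    "not detected": ("260415000", "Not Detected"),
}
NUMBER = r"-?\d+(?:\.\d+)?"


def to_value_element(raw, unit):
    """Return a CDA <value> string for a raw result, or None if unrecognised."""
    text = raw.strip()
    coded = QUALIFIERS.get(text.lower())
    if coded:
        code, name = coded
        return ('<value xsi:type="CD" code="%s" displayName="%s" '
                'codeSystem="%s" codeSystemName="SNOMED-CT"/>'
                % (code, name, SNOMED_OID))
    match = re.fullmatch(r"([<>])\s*(%s)" % NUMBER, text)
    if match:
        # "<n" becomes an upper bound, ">n" a lower bound.
        bound = "high" if match.group(1) == "<" else "low"
        return ('<value xsi:type="IVL_PQ"><%s unit=%s value="%s" '
                'inclusive="false"/></value>'
                % (bound, quoteattr(unit), match.group(2)))
    if re.fullmatch(NUMBER, text):
        return '<value xsi:type="PQ" unit=%s value="%s"/>' % (quoteattr(unit), text)
    return None  # e.g. "///": needs a human decision


for raw, unit in [("<2", "MIU/ML"), (">60", "ML/MIN/1.73M2"),
                  ("neg", "UNK"), ("///", "UNK")]:
    print(raw, "->", to_value_element(raw, unit))
```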

There are many undefined codes in the document. These need to be provided.

<code code="0" codeSystem="2.16.840.1.113883.6.12" codeSystemName="CPT-4" displayName="UNK">

<code code="UNK" codeSystem="2.16.840.1.113883.5.83" displayName="HDL CHOLESTEROL"/>
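A quick way to find all of these is to walk the document and collect every code element whose code attribute is a placeholder. The sketch below is my own illustration. The set of placeholder values is an assumption based on what I saw in this file, and the third element in the sample is an invented well-formed entry for contrast.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the placeholder values worth flagging.
PLACEHOLDERS = {"", "0", "UNK"}


def find_placeholder_codes(xml_text):
    """Return (code, displayName) for every <code> with a placeholder code."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        # Strip any namespace so "{urn:hl7-org:v3}code" matches too.
        if elem.tag.rsplit("}", 1)[-1] != "code":
            continue
        if elem.get("code", "") in PLACEHOLDERS:
            hits.append((elem.get("code", ""), elem.get("displayName", "")))
    return hits


# Hypothetical wrapper around the two codes shown above.
SAMPLE = """<section>
  <code code="0" codeSystem="2.16.840.1.113883.6.12" codeSystemName="CPT-4" displayName="UNK"/>
  <code code="UNK" codeSystem="2.16.840.1.113883.5.83" displayName="HDL CHOLESTEROL"/>
  <code code="30954-2" codeSystem="2.16.840.1.113883.6.1" displayName="Relevant diagnostic tests and/or laboratory data"/>
</section>"""

print(find_placeholder_codes(SAMPLE))
```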

I asked the HL7 Structured Documents mailing list how they would deal with this document. Some vendors got defensive. It was an interesting exchange.

Saturday, July 2, 2011

Re-identification and Patient Consent

I am working with a customer, a state-wide Health Information Exchange (HIE), to extract data from the state's clinical data repository (CDR) and send it to a Business Intelligence (BI) system that will be used to conduct population-based research into clinical outcomes.

This type of research has been difficult to perform because most of the data is on paper charts and has to be re-keyed into the BI system. We will be able to extract data from the state's CDR, normalize it and then create research "data marts" that can be further manipulated. This will give us actual data that can be used to determine which treatments work and which are a waste of time and money.

The state has an "opt out" privacy policy, which means that patients' data will be shared unless they explicitly choose to "opt out" of data sharing. Other states that have adopted this type of consent policy have reported that only two to three percent of patients choose to "opt out". These states also report that, of the patients who have opted out, seven percent change their status to "opt in" each year.

Here is the tie-in.

The state wants to send *all* patient data to the researchers after it has been de-identified. I have cautioned them that we must exclude the data of patients who have "opted out" from the feed to the researchers.

The state responded, "Well, the data is de-identified, so we haven't really shared the patient's data." I see two problems with that argument:

1. Some data cannot be de-identified enough to actually mask the patient's identity. There may only be one patient in a city or zip code with a rare disease. No amount of de-identification would truly hide that patient's identity.

2. Researchers have the ability to "re-identify" the data and find out who the patient is if they discover something unusual and decide that they would like to contact the patient to ask further questions.

Both of these cases mean that we would be sharing identified data about the patient against the patient's explicit wishes. Imagine that you have chosen to opt out of sharing data and then receive a phone call from a researcher asking to talk to you about the effectiveness of your herpes treatments.
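The first point can be tested before any data leaves the CDR. One common approach (my suggestion, not something the state is doing) is a small-cell check: count how many records share each combination of attributes and flag any combination that falls below a threshold. The field names, the threshold and the sample rows below are all invented for illustration.

```python
from collections import Counter


def small_cells(records, keys, k):
    """Return the attribute combinations shared by fewer than k records."""
    counts = Counter(tuple(r[key] for key in keys) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Invented de-identified rows: no names, but location and diagnosis remain.
records = [
    {"zip3": "123", "diagnosis": "common condition"},
    {"zip3": "123", "diagnosis": "common condition"},
    {"zip3": "123", "diagnosis": "rare disease"},
]

print(small_cells(records, ("zip3", "diagnosis"), k=2))
```

Any combination this returns describes so few patients that removing names alone does not hide who they are.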

I am advising the state to exclude the data of patients who have "opted out" of data sharing from the research project. If necessary, I will involve our lawyers. I don't know what the penalty is in this state for disclosing a patient's data against their wishes, but I hope not to find out.
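In pipeline terms, the order of operations is what matters: the opt-out filter has to run before de-identification, while the patient identifier is still available to match against the consent list. A minimal sketch, assuming hypothetical field names ("patient_id", "name") and a hypothetical set of opted-out identifiers:

```python
def strip_identifiers(record):
    """Hypothetical de-identification step: drop direct identifiers."""
    return {k: v for k, v in record.items() if k not in {"patient_id", "name"}}


def research_feed(records, opted_out_ids, deidentify=strip_identifiers):
    """Drop opted-out patients first, then de-identify what remains."""
    return [deidentify(r) for r in records
            if r["patient_id"] not in opted_out_ids]


# Invented sample rows.
records = [
    {"patient_id": "A", "name": "Patient A", "result": "Negative"},
    {"patient_id": "B", "name": "Patient B", "result": "Positive"},
]

print(research_feed(records, opted_out_ids={"B"}))
```

Doing it the other way around, de-identifying first, leaves nothing to match against the consent list.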