In the last weeks and months, I have had a lot of (e-mail)
discussions with Clem McDonald (Chief Health Data Standards Officer at NIH/NLM),
Daniel Vreeman (Regenstrief Institute, known from LOINC and NLM), and CDISC
representatives. Clem has been a pioneer in medical informatics: he was already working in the
field when I was still in high school and had never heard of computers.
The discussions were all about LOINC and UCUM.
CDISC has been late in endorsing LOINC. SDTM has always had LBLOINC as a permissible variable, but essentially no sponsor was using it, as it was (and still is) … permissible.
Everything changed when the FDA announced that it would start requiring LOINC coding for laboratory results for new studies starting from 2020. This triggered CDISC, the FDA and Regenstrief to sit together regularly, and led to the document "Recommendations for the Submission of LOINC Codes in Regulatory Applications to the U.S. Food and Drug Administration".
Also, CDISC is now finalizing the development of a mapping
between the most used LOINC codes and CDISC-CT for LBTESTCD, LBTEST, LBSPEC. As
soon as it is published, we will make a RESTful web service available for it,
allowing sponsors to automatically populate LBTESTCD, LBTEST and LBSPEC from
the LOINC code. In my opinion, however, sponsors should NOT need to populate
LBTESTCD and such when LBLOINC is populated, as all the necessary background
information is in the LOINC code itself, and can easily be retrieved and
visualized using one of the many RESTful web services for LOINC.
It is important to note here that UCUM is a notation, whereas
the CDISC [UNIT] codelist is just a list (without any systematics).
When comparing the two, one immediately sees that some
UCUM units are identical to CDISC units, but mostly this is not the case.
Whereas the CDISC [UNIT] codelist is limited in size (the CDISC CT version
2018-09-28 has 692 terms), the number of units in UCUM is in principle infinite,
as UCUM is a notation.
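The difference between "a list" and "a notation" can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the three-term `CDISC_UNIT_SUBSET`, the handful of prefixes and unit atoms (a tiny excerpt of what UCUM defines), and the simplified grammar, which only understands prefix + atom + optional exponent, joined by "/". The real UCUM grammar is richer.

```python
import re

# Illustrative subset only; the real CDISC [UNIT] codelist has hundreds of terms.
CDISC_UNIT_SUBSET = {"mg/dL", "IU", "g/L"}

# Illustrative subset of UCUM prefixes and unit atoms (not the full specification).
PREFIXES = ("k", "d", "c", "m", "u", "n", "p")
METRIC_ATOMS = ("g", "L", "m", "[IU]", "m[Hg]")   # may carry a prefix
NON_METRIC_ATOMS = ("[lb_av]",)                   # may not carry a prefix


def is_component(text):
    """True if text is [prefix]atom[exponent] within the subset above."""
    match = re.fullmatch(r"(.+?)(\d+)?", text)
    if match is None:
        return False
    body = match.group(1)
    if body in METRIC_ATOMS or body in NON_METRIC_ATOMS:
        return True
    return any(body == p + a for p in PREFIXES for a in METRIC_ATOMS)


def is_ucum_like(unit):
    """A list lookup knows only its entries; a grammar accepts any valid combination."""
    return all(is_component(part) for part in unit.split("/"))


if __name__ == "__main__":
    for unit in ("mg/dL", "pg/cL", "mm[Hg]", "d[IU]/cm3/[lb_av]", "IU"):
        print(unit, unit in CDISC_UNIT_SUBSET, is_ucum_like(unit))
```

The list can only answer "is this exact string one of my entries?", while the grammar accepts combinations nobody ever wrote down in advance, which is why the number of UCUM units is unbounded.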
The recent discussion between me, NLM, Regenstrief and
CDISC was about allowing UCUM notation in CDISC SDTM submissions. If you now
submit a blood pressure with "mm[Hg]" as the unit (e.g. in VSORRESU),
validation software used by the FDA will throw an error, as "mm[Hg]"
is not in the CDISC [UNIT] codelist, and the SDTM-IG expects you to map this to the corresponding CDISC term, even when the blood pressure came from an
electronic health record (EHR), where UCUM notation is almost always used. The
same applies to the lab unit "international units", for which the CDISC unit is
"IU", whereas the UCUM notation is "[IU]". From the SDTM-IG (3.2 and 3.3):
"The
variable, LBORRESU uses the UNIT codelist. This means that sponsors should be
submitting a term from the column "CDISC Submission Value" in the
published Controlled Terminology List that is maintained for CDISC by NCI EVS.
When sponsors have units that are not in this column, they should first check
to see if their unit is mathematically synonymous with an existing unit and
submit their lab values using that unit".
This means that the SDTM-IG forces us to "translate"
"[IU]" to "IU" in an SDTM submission when the data point
comes from an EHR, which is not only tedious, but also error-prone.
During the discussion with Clem McDonald, he asked me to
come up with one or more use cases where the use of UCUM notation in SDTM would be
advantageous for FDA reviewers, as this could be a "tipping point"
for allowing UCUM notation in SDTM. I didn't have to think long …
The --STRESN variables (Numeric Result/Finding in Standard
Units) are such a use case. These variables are meant to "standardize"
results for the same test to a single unit (even though SDTM has no good mechanism to define what "the same test" is).
For example, when (for the same study) one local lab
delivers the results in "mg/dL", another one in "pg/cL" and
yet another one in "mg/cL", these all have to be recalculated for
LBSTRESN to a single unit, e.g. "mg/dL". How is this done?
In the classic way, using CDISC units, converting --ORRES to --STRESN requires a lot of
programming by the SDTM developer. This is not only tedious work, but also extremely
error-prone. The reason is that CDISC-CT does not deliver the conversion
factors, such as those between "mg/dL", "pg/cL" and
"mg/cL". So the programmer needs to find
them out himself or herself and program them all (there can be hundreds of them).
If we know that all our "source" units (from
--ORRESU) are in UCUM notation, this can be fully
automated: the UCUM "ucum-essence.xml" file contains everything needed to convert between units for the same
"property" (a concentration in this case), and a RESTful web service
such as the one provided by the NLM can do the conversion for us.
So, the program that generates the SDTM data sets can simply
call that service to perform these conversions. This saves the SDTM programmer hours or even days
of programming effort, and leads to far fewer errors.
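Why such a conversion needs no hand-maintained table of factors can be shown with a minimal sketch. It assumes only a small excerpt of the UCUM prefixes and two unit atoms ("g" and "L"); the function names `component_factor`, `parse` and `convert` are hypothetical and are not part of any UCUM library or of the NLM service. The factor between two units is derived from their notation alone.

```python
from fractions import Fraction

# Illustrative subset of UCUM prefixes, given as powers of ten.
PREFIX_EXP = {"k": 3, "d": -1, "c": -2, "m": -3, "u": -6, "n": -9, "p": -12}

# Unit atom -> (factor to the reference unit, dimension). Subset for this example.
ATOMS = {"g": (Fraction(1), "mass"), "L": (Fraction(1), "volume")}


def component_factor(text):
    """Return (factor, dimension) for an optionally prefixed atom such as 'mg'."""
    if text in ATOMS:
        return ATOMS[text]
    prefix, atom = text[:1], text[1:]
    if prefix in PREFIX_EXP and atom in ATOMS:
        factor, dimension = ATOMS[atom]
        return factor * Fraction(10) ** PREFIX_EXP[prefix], dimension
    raise ValueError(f"not understood by this sketch: {text!r}")


def parse(unit):
    """Return (factor, dimensions) for a unit written as a/b/c."""
    factor, dims = Fraction(1), {}
    for index, part in enumerate(unit.split("/")):
        part_factor, dimension = component_factor(part)
        sign = 1 if index == 0 else -1      # everything after a "/" divides
        factor *= part_factor ** sign
        dims[dimension] = dims.get(dimension, 0) + sign
    return factor, {d: e for d, e in dims.items() if e != 0}


def convert(value, source, target):
    """Convert value from source to target; both must describe the same property."""
    source_factor, source_dims = parse(source)
    target_factor, target_dims = parse(target)
    if source_dims != target_dims:
        raise ValueError(f"{source} and {target} are not commensurable")
    return Fraction(str(value)) * source_factor / target_factor


if __name__ == "__main__":
    print(float(convert(1, "mg/cL", "mg/dL")))            # 10.0
    print(float(convert(5000000000, "pg/cL", "mg/dL")))   # 50.0
```

Exact fractions are used instead of floats so that powers of ten do not introduce rounding noise into the standardized result; a production program would rely on a complete UCUM implementation or web service rather than on this excerpt.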
In the above case (mg/dL), "this is our lucky
day"! Here, the UCUM notation coincides with the term in the CDISC list,
so we could use the NLM RESTful web service to automate the conversion even when starting from CDISC units.
For the other example ("IU" versus "[IU]") we cannot use
it.
What is interesting in this case is that CDISC-CT made a big effort to cover a large number of units in which "IU" is present: there are over 60 of them in the list, from "100 IU/mL" to "uUI/mL/kg". However, to cover all suitable combinations, CDISC-CT would need to add thousands of "IU" units to the list.
When using UCUM, however, there is no need for all of this,
as UCUM is a notation. So "u[IU]/mL/kg" is just as valid as "p[IU]/mL/kg"
or "m[IU]/dL/dg", or any other combination of "[IU]",
"L" and "g", or better said, any combination of
"International Units", a volume and a mass. So
"d[IU]/cm3/[lb_av]" (where "[lb_av]" is the symbol for
"pounds") is also a valid UCUM unit! And every value in
"d[IU]/cm3/[lb_av]" can automatically be converted and
standardized to e.g. "u[IU]/mL/kg".
But how do we know whether a provided unit is in UCUM
notation? Here too, the NLM provides a RESTful web service, which validates whether a given unit is valid UCUM notation.
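Calling such a validation service from a program could look like the sketch below. The base URL and the "isValidUCUM" path are assumptions based on my recollection of the NLM UCUM service and should be checked against the current NLM documentation; the function names are hypothetical, and the response body is returned as raw text because its exact format is not assumed here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the NLM UCUM web service; verify before relying on it.
BASE_URL = "https://ucum.nlm.nih.gov/ucum-service/v1"


def validation_url(unit):
    """Build the (assumed) validation URL for a unit such as 'mm[Hg]'."""
    # "/" stays literal, as it is the UCUM division operator inside the path;
    # square brackets are percent-encoded.
    return f"{BASE_URL}/isValidUCUM/{quote(unit, safe='/')}"


def fetch_validation(unit, timeout=10):
    """Call the service and return the raw response text (format not assumed)."""
    with urlopen(validation_url(unit), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(validation_url("mm[Hg]"))
    print(validation_url("d[IU]/cm3/[lb_av]"))
```

Separating the URL construction from the network call makes it easy to check the encoding of the square brackets without needing a connection.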
Not very many sponsors have yet asked for better CDISC
support for LOINC and UCUM. One of those that have is SHIRE. Veena Nataraj and Diane Piper
gave a presentation at the 2018 CDISC International Interchange titled "Delivering
LOINC Codes in Future – Bridging
Gaps within clinical lifecycle". You can find their paper here, and
their slides here.
Some interesting statements from their paper are:
concerning LBTESTCD, LBTEST, LBSPEC, …:
"Using CDISC conventions, all of these attributes are required to uniquely and accurately identify a given test and its results. This can be challenging, and the variability of how these attributes are managed and implemented make it difficult to determine if a given test conducted in one study is comparable to a test in another study, even within studies conducted by a single Sponsor company, never mind across the industry. In contrast, LOINC incorporates all of the SDTM attributes as well as others to properly and uniquely identify a test in a single term".
This corresponds to my statement (made years ago already) that
only a LOINC code can uniquely define a lab test.