Wednesday, June 26, 2019

CDISC Validation: PointCross – Pinnacle21 v.3 comparison: Part 3: hot topics 1

In our third part, we make comparisons between Pinnacle21 Validator 3.0 and MySEND 1.0 for some "hot topics", i.e. topics that were highly problematic in earlier versions of the Pinnacle21 validator (for MySEND, we can't say, as there are no earlier versions).

Labels: PESTRESC

A "hot topic" that comes back over and over again in the Pinnacle21 user forum is the labels for variables and datasets. The famous message "label mismatch" (35 "hits" in the forum) is very well known… One of the reasons is that, in some cases, variable labels have been published that are longer than 40 characters (the limit for SAS-XPT), and that Pinnacle21 then took the liberty of deciding for itself what the label should be. The most famous example is the variable PESTRESC.
With the recent release of the "CDISC Library", which is (according to CDISC itself) the "CDISC truth", this issue should essentially be resolved. An overview of our test results is given below.


| SDTM-IG version | PESTRESC label according to SDTM-IG and CDISC Library | Validation result MySEND (FDA rules) | Validation result Pinnacle21 v.3.0 (FDA rules) |
|---|---|---|---|
| 3.1.2 | Character Result/Finding in Std Format | OK | OK |
| 3.1.3 | Character Result/Finding in Std Format | OK | OK |
| 3.2 | Character Result/Finding in Standard Format | SDTM/dataset variable label mismatch (FALSE POSITIVE) | OK*1 |

*1 When a define.xml is provided, Pinnacle21 v.3.0 seems to compare the variable label with the "ItemDef Description" from the define.xml. If the two do not match, it gives an error with a clear error message.
If no define.xml is provided, it does not give an error for the label "Character Result/Finding in Standard Format". If the define.xml is present and the label is "Character Result/Finding in Standard Format" in both, no error is thrown either.
If the label is completely wrong in both the dataset and the define.xml (e.g. using "test" as the label for PESTRESC), it gives the error "SDTM/dataset variable label mismatch".
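
The behaviour described in this footnote can be summarized as a small decision function. This is only a sketch of what we observed from the outside, not Pinnacle21's actual implementation. The function name `check_label`, the wording of the define.xml mismatch message, and the contents of the accepted-label list are assumptions made for illustration.

```python
from typing import Iterable, Optional

# Placeholder wording: the real message text is not quoted in this post.
DEFINE_MISMATCH = "dataset label differs from define.xml description"
# Message text as reported by the validators in our tests.
LABEL_MISMATCH = "SDTM/dataset variable label mismatch"


def check_label(dataset_label: str,
                define_label: Optional[str],
                accepted_labels: Iterable[str]) -> str:
    """Mimic the observed label checks (hypothetical helper).

    define_label is None when no define.xml was provided.
    """
    # Step 1: with a define.xml, dataset and define.xml must agree.
    if define_label is not None and dataset_label != define_label:
        return DEFINE_MISMATCH
    # Step 2: the label must be one the validator accepts for this variable.
    if dataset_label not in set(accepted_labels):
        return LABEL_MISMATCH
    return "OK"


# Assumption: only the SDTM-IG 3.2 label is listed as accepted here.
accepted = ["Character Result/Finding in Standard Format"]

print(check_label(accepted[0], None, accepted))         # no define.xml
print(check_label(accepted[0], accepted[0], accepted))  # same label in both
print(check_label("test", "test", accepted))            # wrong in both
```

The three calls correspond to the three situations in the footnote: the first two return "OK", the third returns the label mismatch message.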

So it looks as if Pinnacle21 has made progress here: it no longer throws an error when the label for PESTRESC does not correspond to what they think it should be, whereas MySEND still seems to follow what Pinnacle21 did in earlier releases.

Note that the 40-character limitation is an artificial one: it exists only because the FDA (and PMDA) still require the completely outdated XPT format to be used. In modern times, the transport format is independent of the content standard and does not limit what the content can be. XPT is a disaster in this sense. HL7-FHIR, however, shows how it can be done: one standard, three transport formats (XML, JSON, Turtle).
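
To make the 40-character problem concrete, here is a minimal sketch that measures the two PESTRESC labels from the table above against the XPT limit. The helper name `label_fits_xpt` is hypothetical, and the check assumes plain ASCII labels, so that one character equals one byte.

```python
XPT_MAX_LABEL_LENGTH = 40  # label limit of the SAS-XPT format, as discussed above


def label_fits_xpt(label: str) -> bool:
    """Return True when the label fits in an XPT variable label (hypothetical helper)."""
    return len(label) <= XPT_MAX_LABEL_LENGTH


labels = {
    "SDTM-IG 3.1.2 / 3.1.3": "Character Result/Finding in Std Format",
    "SDTM-IG 3.2": "Character Result/Finding in Standard Format",
}

for version, label in labels.items():
    # Prints the version, the label length and whether it fits.
    print(version, len(label), label_fits_xpt(label))
```

The abbreviated label is 38 characters long and fits; the SDTM-IG 3.2 label is 43 characters long and cannot be stored in XPT without truncation or abbreviation.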

Order of variables: EPOCH in SV ("Subject Visit")

Another hot topic that pops up over and over again is the correct order of variables in a dataset, especially when "timing variables" are added to an "observation" dataset.
The correct order for timing variables is:
SDTM 1.4 (base for SDTM-IG 3.2): VISITNUM, VISIT, VISITDY, TAETORD, EPOCH, --DTC, --STDTC, --ENDTC, --DY, --STDY, --ENDY, --DUR, --TPT, --TPTNUM, --ELTM, --TPTREF, --RFTDTC, --STRF, --ENRF, --EVLINT, --EVINTX, --STRTPT, --STTPT, --ENRTPT, --ENTPT, --STINT, --ENINT, --DETECT.
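
As a rough illustration of what an order check for timing variables could look like, the sketch below compares the relative order of the timing variables found in a dataset with the SDTM 1.4 order listed above. It is not how either validator implements the rule. The function name `timing_order_violations` and the example SV variable list are hypothetical, and the check only looks at timing variables relative to each other, not at their position relative to other variable classes.

```python
# SDTM 1.4 timing variables in model order; "--" stands for the domain code.
TIMING_ORDER = [
    "VISITNUM", "VISIT", "VISITDY", "TAETORD", "EPOCH",
    "--DTC", "--STDTC", "--ENDTC", "--DY", "--STDY", "--ENDY", "--DUR",
    "--TPT", "--TPTNUM", "--ELTM", "--TPTREF", "--RFTDTC", "--STRF", "--ENRF",
    "--EVLINT", "--EVINTX", "--STRTPT", "--STTPT", "--ENRTPT", "--ENTPT",
    "--STINT", "--ENINT", "--DETECT",
]


def timing_order_violations(domain, dataset_vars):
    """Return adjacent pairs of timing variables that are out of order."""
    # Replace the "--" placeholder by the two-letter domain code, e.g. SVSTDTC.
    expected = [domain + v[2:] if v.startswith("--") else v for v in TIMING_ORDER]
    rank = {name: i for i, name in enumerate(expected)}
    # Keep only the timing variables, in the order they occur in the dataset.
    present = [v for v in dataset_vars if v in rank]
    return [(a, b) for a, b in zip(present, present[1:]) if rank[a] > rank[b]]


# Hypothetical SV dataset in which EPOCH was appended at the end.
sv_vars = ["STUDYID", "DOMAIN", "USUBJID", "VISITNUM", "VISIT", "VISITDY",
           "SVSTDTC", "SVENDTC", "EPOCH"]
print(timing_order_violations("SV", sv_vars))  # [('SVENDTC', 'EPOCH')]
```

An empty list means that the timing variables present in the dataset follow the model order.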