Friday, May 3, 2019

CDISC Submission Validation Software: There is a new Kid in Town


A few days ago, I was informed that there is a new free validation software available for validating SDTM, SEND and ADaM submissions to the FDA and PMDA.
The company behind this free software is named PointCross Life Sciences, located in California, and the product is named "MySEND". The name may be a bit confusing, as it suggests the tool is only for SEND submissions. That is not correct: one can validate SEND submissions as well as SDTM and ADaM (1.1 not supported yet) submissions. The latest release notes (2019-05-01), however, suggest that there will be a name change to "eDataValidator".
The software is free but not open source. It is written in Java, but currently only runs on Windows systems.
In this blog, I will describe my first impressions and experiences with the software.
Until recently, there was only one submissions validation software freely available, the "Pinnacle21 Community" validation software. The last release (v.2.2.0) was in September 2016, with no intermediate bug fixes available. A new major version (3.0.0) has however been announced for the coming weeks. 

Most of you know my opinion about this software, well described in a number of my previous blogs, so I will not repeat anything here, but concentrate on the new PointCross MySEND software. Once the new Pinnacle21 release is available, I am intending to make a comparison: comparing the new MySEND with the old Pinnacle21 v.2.2.0 would not be fair.

After starting the software, the graphical user interface is displayed: 


For "Validation", there are three parts: the upper part allows you to drag and drop the submission files, including the define.xml (it is automatically recognized, even when the name is different).
The second part allows you to set the validation parameters. This is how it looks today (May 5th):
 
 


And, near the bottom:
 


Currently, only FDA and PMDA rules can be used; CDISC rules (such as those for SDTMIG 3.2 and the ADaM Conformance Rules v2.0 of March 2019) are not (yet) available.
A nice feature I soon discovered and liked is that when a new version of the software or a bug fix has been released, a message appears when starting the software, asking whether you want to install the new version or bug fix. According to the release notes, there seem to have been four such updates in the last year. Timely bug fixes are so important in validation software!
 

Also, I was informed that a similar mechanism will apply for updates of controlled terminology files. That is of course a very good idea, especially as there have been discussions about changed CT files with the Pinnacle21 validator. 


After clicking the "Validate" button, validation starts (it is very fast, although I prefer quality over speed) and an Excel report becomes available. A feature I personally would like to see is an option for making the validation report additionally available as an XML document, which can then be transformed into almost anything.


I did only a few validations, mostly on SEND and SDTM datasets, and my impression (no metrics yet) is that there are considerably fewer false positives than I am used to with the Pinnacle21 software. I am, however, planning to collect some real metrics once the next Community version of Pinnacle21 is available.
 


One of the things that struck me is that the CDISC Controlled Terminology files come as binary files. This may have been done for efficiency reasons, but it makes it impossible to check whether the CT files have been manipulated, as has happened in other cases in the past.


I cannot say anything about define.xml validation yet. When I drag and drop only a define.xml into the "files" field and validate, the define.xml does not seem to be validated on its own. Also, whether and to what degree the define.xml is accepted by the tool as "the sponsor's truth" is not yet clear to me. This can, for example, be tested by having extensions to codelists in the define.xml, in which case the validator should not complain when such extension values are found in the submission datasets.
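The expected behaviour in that codelist-extension test can be sketched in a few lines of Python. This is only an illustration of the outcome one would hope for, not a description of how MySEND or any other validator works internally. The function name `find_unexpected_values` and all sample terms are hypothetical, and the sketch assumes the codelist in question is extensible.

```python
def find_unexpected_values(dataset_values, ct_terms, define_extensions):
    """Return values that are neither published CT terms nor sponsor extensions.

    dataset_values    -- values found in a submission dataset variable
    ct_terms          -- terms of the published (extensible) CDISC codelist
    define_extensions -- extra terms the sponsor declared in the define.xml
    """
    # The define.xml is treated as "the sponsor's truth": declared
    # extensions are just as acceptable as the published terms.
    allowed = set(ct_terms) | set(define_extensions)

    seen = set()
    unexpected = []
    for value in dataset_values:
        # Report each offending value once, in first-seen order.
        if value not in allowed and value not in seen:
            seen.add(value)
            unexpected.append(value)
    return unexpected


# Hypothetical sample data, for illustration only.
ct_terms = {"mg", "g", "kg"}
define_extensions = {"tablet-eq"}  # extension declared in the define.xml
values = ["mg", "tablet-eq", "kg", "spoonful"]

print(find_unexpected_values(values, ct_terms, define_extensions))
# ['spoonful']
```

A validator that ignores the define.xml would behave as if `define_extensions` were empty and would also report `tablet-eq`, which is exactly the kind of false positive the test is meant to reveal.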



Another nice feature is that one can generate an SDRG (Study Data Reviewer's Guide) template from the file with validation messages. On the other hand, this once again shows that the SDRG has become a "garbage can" for reporting and explaining everything that such validation tools found non-conformant (true positives and false positives). The results of the Phuse "SDRG in XML" project do not seem to have found their way into this software yet.


For most of the features (I only used the "validation" feature) I did not find good documentation yet. That is a pity. Something else I am missing (but maybe it will come) is a "command line interface" (CLI). This would make it possible to start the validator from within other software, such as our SDTM-ETL mapping software, passing the file locations as arguments.


You will probably ask yourself "Is the FDA aware of this additional validation tool?" and "Is the FDA using it?". According to the authors of the software, the FDA is well aware of this alternative offering, and the tool is being used by FDA reviewers.
 

What remains regrettable, however, is that some of the FDA and PMDA rules are complete nonsense, like the famous rule FDAC154: "Missing value for --ORRESU, when --ORRES is provided". Whether and how this "nonsense" rule is implemented in MySEND still needs to be tested. It would be better, however, if the FDA withdrew this rule. Also, as this software is not "open source", there is no way we can check the implementation ourselves. This is yet another argument for working on "Open Rules for CDISC Standards".
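To make the objection concrete, here is a naive Python sketch of a check that follows the rule text literally. It is an assumption about how such a rule could be coded, not the actual implementation in any validator. The function name `check_missing_result_unit` and the sample lab records are hypothetical; records are plain dictionaries keyed by SDTM variable names.

```python
def check_missing_result_unit(records, prefix):
    """Flag records where --ORRES is populated but --ORRESU is empty.

    records -- list of dicts, one per dataset row
    prefix  -- two-letter domain code replacing "--", e.g. "LB"
    Returns the indices of the flagged records.
    """
    orres = prefix + "ORRES"
    orresu = prefix + "ORRESU"
    findings = []
    for index, record in enumerate(records):
        # Treat missing keys, None and whitespace-only strings as empty.
        has_result = bool((record.get(orres) or "").strip())
        has_unit = bool((record.get(orresu) or "").strip())
        if has_result and not has_unit:
            findings.append(index)
    return findings


# Hypothetical lab records, for illustration only.
lab_records = [
    {"LBTESTCD": "GLUC", "LBORRES": "5.4", "LBORRESU": "mmol/L"},
    {"LBTESTCD": "PH", "LBORRES": "7.4", "LBORRESU": ""},        # pH has no unit
    {"LBTESTCD": "COLOR", "LBORRES": "YELLOW", "LBORRESU": ""},  # qualitative result
    {"LBTESTCD": "GLUC", "LBORRES": "", "LBORRESU": ""},         # no result at all
]

print(check_missing_result_unit(lab_records, "LB"))
# [1, 2]
```

Both flagged records are perfectly valid data: a pH value and a qualitative result simply have no unit. A literal implementation of the rule therefore produces false positives by design, which can only be verified when the rule implementation is open to inspection.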

In the following months, and after Pinnacle21 has released the new version 3.0.0, I will do more extensive testing of both tools, and let you know about my findings.


2 comments:

  1. I would like to see proof that the FDA is using this software... If they are, I'd like to know whether it's used for clinical or nonclinical.

  2. Being located in Europe, I of course cannot prove that. Please reach out to Mohit Mathew at PointCross.
