The second
project of my own that I have been working on over the last few days is extending the open
source "Smart Submission Dataset Viewer" with features using the
"CDISC Library API". Essentially, the "CDISC
Library" is the "CDISC single source of truth", especially about
the metadata of electronic submissions. The "define.xml" file that
one must also submit to the regulatory authorities is the "sponsor's
truth" about the submission. But of course, the contents of the define.xml
(the sponsor's truth) must also comply with the "CDISC truth", and that
is exactly what we have the "CDISC Library" for. So, I added some new
features to the "Smart Submission Dataset Viewer" that query the
CDISC Library and compare the responses with what is in the define.xml.
In the new
version, when one clicks the "options" button and then selects the
"CDISC Library features" tab, this is what the user will see:
In many
cases, you will want to check all checkboxes. The first checkbox ensures
that, for each variable in your submission (SDTM, SEND or ADaM, as defined in
the define.xml), the software queries the CDISC Library for the properties of
that variable and displays them as a tooltip when the user hovers the mouse over
the column header:
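To give an idea of what such a lookup could look like outside the viewer, here is a minimal Python sketch. It is not the viewer's actual code. The base URL, the path pattern, the `api-key` header name and the property names used for the tooltip are assumptions on my side and should be checked against the CDISC Library API documentation; the function names are made up for this illustration.

```python
import json
import urllib.request

# Assumed base URL of the CDISC Library API.
BASE_URL = "https://library.cdisc.org/api"


def variable_url(standard, version, dataset, variable):
    """Build the (assumed) path for one variable of one dataset."""
    return (f"{BASE_URL}/mdr/{standard}/{version}"
            f"/datasets/{dataset}/variables/{variable}")


def fetch_json(url, api_key):
    """GET a resource as JSON. The 'api-key' header name is an assumption."""
    request = urllib.request.Request(
        url, headers={"api-key": api_key, "Accept": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def tooltip_text(properties):
    """Format selected properties as the lines of a column-header tooltip."""
    # Assumed property names; any that are absent are simply skipped.
    keys = ("name", "label", "simpleDatatype", "role", "core")
    return "\n".join(f"{key}: {properties[key]}"
                     for key in keys if key in properties)
```

With a valid key, `tooltip_text(fetch_json(variable_url("sdtmig", "3-3", "VS", "VSORRESU"), my_key))` would produce the text for one column header, where `my_key` stands for your own API key.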
The next
two checkboxes ensure that the variable properties as defined in the
define.xml, and the codelists assigned to them, are compared with the information
from the CDISC Library. If they do not match, the software generates a
"discrepancy" and displays it in the tooltip as well:
We don't
use labels like "Error" or "Warning", as we think it is not
up to us to judge what such a discrepancy means or how bad (or good) it is.
After all, there may have been a very good reason to deviate from the standard!
The last
checkbox lets you generate a report (as HTML and/or XML) in addition to having
the tooltips. An example of such a report is below:
For
example, for row 7 (VSORRESU), a discrepancy is found stating that the wrong
codelist was assigned to VSORRESU. In fact, the define.xml states that the
"UNIT" codelist (C71620) is assigned to VSORRESU, whereas the CDISC
Library states that the codelist with NCI code C66770 should be assigned.
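The VSORRESU case can be reproduced with a very small comparison routine. The following Python sketch only illustrates the idea and is not the code used in the viewer: the `Discrepancy` class, the `compare` function and the property key `codelist` are hypothetical names, and both inputs are assumed to be plain dictionaries that were already extracted from the define.xml and from the Library response.

```python
from dataclasses import dataclass


@dataclass
class Discrepancy:
    """One property on which the define.xml and the CDISC Library disagree."""
    variable: str
    prop: str
    define_value: str
    library_value: str

    def message(self):
        # Neutral wording: a discrepancy is reported, not judged.
        return (f"{self.variable}: {self.prop} in define.xml is "
                f"'{self.define_value}', CDISC Library has "
                f"'{self.library_value}'")


def compare(variable, define_props, library_props):
    """Return one Discrepancy per property present in both sources
    whose values differ."""
    found = []
    for prop, library_value in library_props.items():
        define_value = define_props.get(prop)
        if define_value is not None and define_value != library_value:
            found.append(
                Discrepancy(variable, prop, define_value, library_value))
    return found
```

Calling `compare("VSORRESU", {"codelist": "C71620"}, {"codelist": "C66770"})` returns a single discrepancy for the codelist. Properties that are missing from the define.xml are skipped here; whether a missing property should itself be reported is a design decision that this sketch leaves open.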
But what is the codelist C66770? When first queried for VSORRESU, the CDISC Library does not provide any details of the codelist C66770 (not even its name). Instead, it delivers a link (reference) to where that information can be found, i.e. it provides the link for the next possible query to the CDISC Library:
In the
report, this reference is provided. So, if the user wants more details about
the codelist C66770 (the one the library says should be used for VSORRESU),
he/she clicks the hyperlink, and a new query is sent to the CDISC Library using
the link that was provided in the prior response.
We call this "chaining". It means that one response contains the links to any other information about that "object", including information about any earlier and later versions of that "object". Computer scientists know this as the "HATEOAS" principle (Hypermedia as the Engine of Application State). It is a great way to generate your own "network" of all the "things" in your submission, which is a partial copy of the "network of things" in the CDISC Library.
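In code, chaining comes down to reading the links out of one response and using them as the next requests. The Python sketch below shows the idea under a few assumptions: that links sit under a `_links` key, grouped by relation name, and that each link carries an `href` (a HAL-style layout). The helper names `linked_hrefs` and `follow` are invented for this example, and `fetch` stands for whatever function performs the actual authenticated request.

```python
def linked_hrefs(resource, relation):
    """Collect the hrefs stored under one link relation of a response.

    Assumes a HAL-style '_links' object; a relation may hold either
    a single link or a list of links.
    """
    links = resource.get("_links", {}).get(relation, [])
    if isinstance(links, dict):
        links = [links]
    return [link["href"] for link in links if "href" in link]


def follow(resource, relation, fetch):
    """Yield the resources that the given relation points to.

    'fetch' is any callable that takes an href and returns the parsed
    response, so the network access can be replaced in tests.
    """
    for href in linked_hrefs(resource, relation):
        yield fetch(href)
```

Repeating `follow` on each returned resource gradually builds up your own partial copy of the network. Because `fetch` is passed in, the same code works against the live API or against a local cache of earlier responses.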
What are
the limitations of using the CDISC Library in the "Smart Submission
Dataset Viewer"?
Essentially,
the only limitation is creativity! We only added a very small number of
features that use the CDISC Library. But there can be so much more. So,
if you have ideas about features you would like to see added to the "Smart
Submission Dataset Viewer", just drop me an email, and we can implement them.
After all, developing and testing new methods using the "CDISC Library API"
usually takes only a few hours, as the CDISC Library API is so extremely easy
to implement.