Creation of ADSL Dataset - Day:02 of Onco-ADaM Project

Introduction

Hey everyone, welcome to day 2 of the Onco-ADaM project! Today, I'll be working on creating the ADSL dataset, which plays a role in ADaM similar to the one DM plays in SDTM. Let's keep it short and simple.

ADSL stands for Subject-Level Analysis Dataset. It contains one record per subject, regardless of the clinical trial design. The main source dataset for creating ADSL is SDTM.DM. We won't carry over every variable, as that's not the purpose of ADSL; it is designed to hold only the variables needed for analysis.

ADSL includes demographic information, key date variables, randomization factors, planned and actual treatment variables, subject-level population flags, subgrouping variables, and baseline values.

Variables in ADSL

I am not listing the general variables like USUBJID, SITEID, AGE, SEX, etc. Note that the variables I mention below are examples, not a complete list.

a. Key Date variables

  • RFICDT (Informed Consent Date)
  • RANDDT (Randomization Date)
  • TRTSDT (Treatment Start Date)
  • TRTEDT (Treatment End Date)
  • FVISDT (First Visit Date)
  • LVISDT (Last Visit Date)
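
SDTM stores dates as ISO 8601 character strings (for example DM.RFICDTC), while the ADaM date variables above are numeric dates. Here is a minimal Python sketch of that conversion. The function name dtc_to_date is hypothetical, and leaving partial dates missing is an assumption, since imputation rules are study-specific.

```python
from datetime import date
from typing import Optional


def dtc_to_date(dtc: Optional[str]) -> Optional[date]:
    """Convert an ISO 8601 --DTC string to a date; None if missing or partial."""
    if not dtc or len(dtc) < 10:
        # Partial dates such as "2023-05" are left missing here (assumption).
        return None
    try:
        # Keep only the YYYY-MM-DD part and drop any time component.
        return date.fromisoformat(dtc[:10])
    except ValueError:
        return None


rficdt = dtc_to_date("2023-05-14T09:30")  # date part of the consent date/time
print(rficdt)
```

Returning None instead of raising keeps a single bad date from stopping the whole derivation, but those subjects should still be reviewed.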

b. Subject-Level Population Flag

  • SCRNFL (Screening Flag)
  • SCRFFL (Screen-Failure Flag)
  • ENRLFL (Enrolled Population Flag)
  • RANDFL / ITTFL (Randomization Flag / Intent-To-Treat Flag)
  • SAFFL (Safety Flag)
  • FASFL (Full Analysis Set Flag)
  • PPROTFL (Per Protocol Flag)
  • PKFL (Pharmacokinetic Flag)
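
Population flags are usually "Y"/"N" values driven by a rule in the analysis plan. The sketch below shows the idea for two of them. Both rules (randomized if a randomization date exists, safety set if a treatment start date exists) are assumptions for illustration, and the function name derive_flags is hypothetical; the real definitions come from the study's SAP.

```python
from datetime import date
from typing import Dict, Optional


def derive_flags(randdt: Optional[date], trtsdt: Optional[date]) -> Dict[str, str]:
    """Return RANDFL and SAFFL as "Y"/"N" from two subject-level dates."""
    return {
        # Assumed rule: randomized if a randomization date exists.
        "RANDFL": "Y" if randdt is not None else "N",
        # Assumed rule: in the safety set if any study treatment was taken.
        "SAFFL": "Y" if trtsdt is not None else "N",
    }


# A subject who was randomized but never dosed.
flags = derive_flags(date(2023, 5, 20), None)
print(flags)
```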

c. Treatment related variables

  • Parallel group study
    • TRTSDT (Treatment Start Date)
    • TRTEDT (Treatment End Date)
    • TRT01P (Planned treatment for Period 01)
    • TRT01A (Actual Treatment for Period 01)
  • Cross-over study (two period)
    • TRTSDT (Treatment Start Date)
    • TRT01P (Planned Treatment for Period 01)
    • TRT01A (Actual Treatment for Period 01)
    • TR01SDT (Date of First Exposure in Period 01)
    • TR01EDT (Date of Last Exposure in Period 01)
    • TRT02P (Planned Treatment for Period 02)
    • TRT02A (Actual Treatment for Period 02)
    • TR02SDT (Date of First Exposure in Period 02)
    • TR02EDT (Date of Last Exposure in Period 02)
    • TRTEDT (Treatment End Date)
    • TRTSEQP (Planned Treatment Sequence)
    • TRTSEQA (Actual Treatment Sequence)
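
For a parallel-group study, one common approach is to take the treatment variables straight from DM. The sketch below assumes TRT01P comes from ARM, TRT01A from ACTARM, and the treatment dates from RFXSTDTC and RFXENDTC. That mapping is an assumption (many studies derive the dates from EX instead), and the helper names to_date and derive_treatment are hypothetical.

```python
from datetime import date
from typing import Any, Dict, Optional


def to_date(dtc: str) -> Optional[date]:
    """Date part of an ISO 8601 string; None when missing or partial."""
    try:
        return date.fromisoformat(dtc[:10]) if len(dtc) >= 10 else None
    except ValueError:
        return None


def derive_treatment(dm_row: Dict[str, str]) -> Dict[str, Any]:
    """Derive parallel-group treatment variables from one DM record."""
    return {
        "USUBJID": dm_row["USUBJID"],
        "TRT01P": dm_row.get("ARM", ""),     # planned arm (assumed source)
        "TRT01A": dm_row.get("ACTARM", ""),  # actual arm (assumed source)
        "TRTSDT": to_date(dm_row.get("RFXSTDTC", "")),
        "TRTEDT": to_date(dm_row.get("RFXENDTC", "")),
    }


# Made-up DM record for illustration only.
dm_row = {
    "USUBJID": "S-001",
    "ARM": "Drug A",
    "ACTARM": "Drug A",
    "RFXSTDTC": "2023-05-21T08:00",
    "RFXENDTC": "2023-08-13",
}
print(derive_treatment(dm_row))
```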

d. Grouping variables

  • BMIGR1 (Pooled BMI Group 1)-(Character)
  • BMIGR1N (Numeric)
  • AGEGR1 (Pooled Age Group 1)-(Character)
  • AGEGR1N (Numeric)
  • SITEGR1 (Pooled Site Group 1)-(Character)
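
Each character grouping variable normally travels with a numeric partner that fixes the sort order. A small sketch for AGEGR1 and AGEGR1N follows. The cut-point of 65 years and the function name age_group are hypothetical; the actual categories come from the study specification.

```python
from typing import Optional, Tuple


def age_group(age: Optional[float]) -> Tuple[str, Optional[int]]:
    """Return (AGEGR1, AGEGR1N) for a hypothetical cut-point at 65 years."""
    if age is None:
        # Missing age stays missing in both variables.
        return "", None
    if age < 65:
        return "<65", 1
    return ">=65", 2


agegr1, agegr1n = age_group(71)
print(agegr1, agegr1n)
```

Deriving the pair in one place keeps the character and numeric versions from drifting apart.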

e. End of Treatment & Study variables

  • EOSSTT (End of Study Status)
  • EOSDT (End of Study Date)
  • DCSREAS (Reason for Discontinuation of Study)
  • EOTSTT (End of Treatment Status)
  • EOTDT (End of Treatment Date)
  • DCTREAS (Reason for Discontinuation of Treatment)
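
The end-of-study variables typically come from the DS domain. The sketch below assumes the rows passed in are one subject's study-level records, that the disposition event is identified by DSCAT = "DISPOSITION EVENT", and that a subject without such a record is still ongoing. Those rules and the function name derive_eos are assumptions for illustration.

```python
from typing import Dict, List


def derive_eos(ds_rows: List[Dict[str, str]]) -> Dict[str, str]:
    """Derive EOSSTT and DCSREAS from one subject's DS records."""
    events = [r for r in ds_rows if r.get("DSCAT") == "DISPOSITION EVENT"]
    if not events:
        # No disposition record yet: treat the subject as still on study.
        return {"EOSSTT": "ONGOING", "DCSREAS": ""}
    decod = events[-1]["DSDECOD"]  # last disposition record in input order
    if decod == "COMPLETED":
        return {"EOSSTT": "COMPLETED", "DCSREAS": ""}
    # Any other decode is taken as the discontinuation reason.
    return {"EOSSTT": "DISCONTINUED", "DCSREAS": decod}


ds_rows = [
    {"DSCAT": "PROTOCOL MILESTONE", "DSDECOD": "INFORMED CONSENT OBTAINED"},
    {"DSCAT": "DISPOSITION EVENT", "DSDECOD": "ADVERSE EVENT"},
]
print(derive_eos(ds_rows))
```

The same pattern would apply to EOTSTT and DCTREAS once the treatment-level disposition records are separated from the study-level ones.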

f. Baseline variables

  • HEIGHTBL (Height at Baseline)
  • WEIGHTBL (Weight at Baseline)
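
Baseline values such as WEIGHTBL are usually picked from VS. The sketch below assumes baseline means the last non-missing result dated on or before the treatment start date, and that all dates are complete ISO 8601 strings. The rule and the function name baseline_value are assumptions; the protocol or SAP defines the real baseline.

```python
from typing import Any, Dict, List, Optional


def baseline_value(vs_rows: List[Dict[str, Any]], testcd: str, trtsdtc: str) -> Optional[float]:
    """Last non-missing VSSTRESN for testcd dated on or before trtsdtc."""
    candidates = [
        r for r in vs_rows
        if r["VSTESTCD"] == testcd
        and r.get("VSSTRESN") not in (None, "")
        and r["VSDTC"][:10] <= trtsdtc[:10]  # complete ISO dates sort as text
    ]
    if not candidates:
        return None
    latest = max(candidates, key=lambda r: r["VSDTC"])
    return float(latest["VSSTRESN"])


# Made-up VS records for one subject.
vs = [
    {"VSTESTCD": "WEIGHT", "VSSTRESN": 70.5, "VSDTC": "2023-05-01"},
    {"VSTESTCD": "WEIGHT", "VSSTRESN": 71.0, "VSDTC": "2023-05-19"},
    {"VSTESTCD": "WEIGHT", "VSSTRESN": 69.0, "VSDTC": "2023-06-10"},
    {"VSTESTCD": "HEIGHT", "VSSTRESN": 172, "VSDTC": "2023-05-01"},
]
weightbl = baseline_value(vs, "WEIGHT", "2023-05-20")
heightbl = baseline_value(vs, "HEIGHT", "2023-05-20")
print(weightbl, heightbl)
```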

Basic process

Begin by identifying the necessary SDTM datasets to create the ADSL dataset. In my case, these include DM, DS, PK, VS, and SV (there might be more, but these are the main ones).

Start by creating variables from the DM dataset, then proceed to the other datasets.

Finally, merge all the derived datasets by subject and save the resulting ADSL dataset to the specified folder.

Steps

  1. Establishing a library.
  2. Merging each supplementary domain with its parent domain (typically SUPPDM with DM).
  3. Creating variables:
    1. Converting dates.
    2. Establishing grouping variables.
    3. Defining treatment variables.
    4. Assigning population flags.
    5. Identifying baseline variables.
  4. Merging all datasets.
  5. Organizing required variables in the specified order and labeling them appropriately.
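
Steps 2 and 4 can be sketched in plain Python as follows. This assumes each dataset is a list of dictionaries and that SUPPDM can be joined to DM on USUBJID alone, since DM has one record per subject; other supplemental domains would also need IDVAR and IDVARVAL. The names merge_supp and merge_by_subject, and the sample QNAM, are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def merge_supp(parent: List[Row], supp: List[Row]) -> List[Row]:
    """Add each SUPP-- QNAM/QVAL pair as a column on the matching parent row."""
    extra: Dict[str, Row] = {}
    for s in supp:
        extra.setdefault(s["USUBJID"], {})[s["QNAM"]] = s["QVAL"]
    return [{**p, **extra.get(p["USUBJID"], {})} for p in parent]


def merge_by_subject(base: List[Row], *pieces: List[Row]) -> List[Row]:
    """Left-join derived pieces onto base by USUBJID, one row per subject."""
    merged = {r["USUBJID"]: dict(r) for r in base}
    for piece in pieces:
        for r in piece:
            if r["USUBJID"] in merged:  # subjects not in base are ignored
                merged[r["USUBJID"]].update(r)
    return list(merged.values())


# Made-up data for illustration only.
dm = [{"USUBJID": "S-001", "AGE": 60}, {"USUBJID": "S-002", "AGE": 70}]
suppdm = [{"USUBJID": "S-001", "QNAM": "RACEOTH", "QVAL": "OTHER TEXT"}]
flags = [{"USUBJID": "S-002", "SAFFL": "Y"}]

dm_full = merge_supp(dm, suppdm)
adsl = merge_by_subject(dm_full, flags)
print(adsl)
```

Using DM as the base of the left join is what keeps the result at one record per subject.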

Challenges

  1. Sometimes I forget to specify the length of a variable, which truncates the data.
  2. Derived values can drift from the source, so always cross-verify against the SDTM datasets to ensure consistency in variable values.
  3. Try to identify the main trunk (the driving variable) and its dependent branches (the variables derived from it) first; this makes the code more efficient and easier to understand.
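
The first two points can be partly automated. The sketch below checks that ADSL has one record per subject and that variables copied as-is still match DM, which is where truncation tends to show up. The function name check_adsl and the list-of-dictionaries layout are assumptions for illustration.

```python
from collections import Counter
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def check_adsl(adsl: List[Row], dm: List[Row], copied_vars: Sequence[str]) -> List[str]:
    """Return a list of readable problems; an empty list means no findings."""
    issues: List[str] = []
    # ADSL must hold exactly one record per subject.
    for subj, n in Counter(r["USUBJID"] for r in adsl).items():
        if n > 1:
            issues.append(f"{subj}: {n} records in ADSL")
    dm_by_subj = {r["USUBJID"]: r for r in dm}
    for row in adsl:
        src = dm_by_subj.get(row["USUBJID"])
        if src is None:
            issues.append(f"{row['USUBJID']}: not found in DM")
            continue
        # A copied variable must match DM exactly; a mismatch often means truncation.
        for var in copied_vars:
            if row.get(var) != src.get(var):
                issues.append(f"{row['USUBJID']}: {var} differs from DM")
    return issues


# Made-up example where ARM was truncated on its way into ADSL.
dm = [{"USUBJID": "S-001", "ARM": "Placebo plus Drug A"}]
adsl = [{"USUBJID": "S-001", "ARM": "Placebo plus"}]
print(check_adsl(adsl, dm, ["ARM"]))
```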
