What Is Clinical Trial Data?

Clinical trial data is the information generated by and about clinical trials. Its subtypes include raw data, analyzable data, metadata, and summary data.

Raw data is the granular information about individuals in a single trial. Analyzable data is derived once raw data collection concludes: for example, the measured efficacy of a drug intervention. Metadata provides context for the clinical trial, organizing information into workable categories such as study type or primary vs. secondary outcomes. Finally, summary data comprises the summaries of individual studies written for lay readers or for publication.

Where Does Clinical Trial Data Come From?

Study authors and participants generate this data themselves. Authors then publish their findings in scholarly journals and on the websites of universities, pharmaceutical companies, government agencies, and other organizations. In the past few years, there has been a growing push for open-access publication of clinical trial data. The Bill and Melinda Gates Foundation is one such source.

What Types of Columns/Attributes Should I Expect When Working with This Data?

Common clinical trial metadata include trial phase, trial status, condition studied, trial location, intervention type (such as a drug), use of a placebo, and the age and sex of individual study participants.

Another important attribute is the study's funder or sponsor, which can affect whether the study reaches completion.
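
As a rough illustration, the attributes above could be modeled as one record per trial. This is a minimal sketch, not a standard schema: every class, field, and function name below (`Participant`, `TrialRecord`, `count_by_status`, and so on) is hypothetical, and real providers will name and structure their columns differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Participant:
    """One study participant (hypothetical fields)."""
    age_years: int
    sex: str  # free-text label as supplied by the provider


@dataclass
class TrialRecord:
    """Trial-level metadata; all field names are assumptions for illustration."""
    trial_id: str
    phase: str               # e.g. "Phase 2"
    status: str              # e.g. "Completed", "Terminated"
    condition: str
    location: str
    intervention_type: str   # e.g. "Drug"
    placebo_used: bool
    sponsor: str
    participants: List[Participant] = field(default_factory=list)


def count_by_status(records: List[TrialRecord], sponsor: str) -> Dict[str, int]:
    """Count one sponsor's trials per status, e.g. to see how many completed."""
    counts: Dict[str, int] = {}
    for record in records:
        if record.sponsor == sponsor:
            counts[record.status] = counts.get(record.status, 0) + 1
    return counts
```

Grouping trial status by sponsor in this way is one simple means of exploring how sponsorship relates to completion within a given dataset.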

What Is This Data Used For?

Public health information, such as the efficacy of certain treatments for disease or injury, is of interest to medical professionals, public policy makers, and individuals around the globe.

Additionally, pharmaceutical and medical device companies use the data to submit clinical study reports to regulatory bodies to receive certification to market their products within certain countries.

How Should I Test the Quality of Clinical Trial Data?

Researchers collect and transcribe raw data throughout the trial lifecycle. They then complete and cleanse the data, metadata, and summary data to the exacting standards required for publication or FDA/EMA approval. Therefore, there is little testing to do unless your goal is to build a database of super-metadata.

If that is the case, simply bear in mind the principles of accuracy, relevancy, completeness, timeliness, and consistency for your dataset. Take care to select recent metadata that suits your needs and whose data remains consistent across studies.
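
For a database of combined metadata, three of these principles (completeness, timeliness, and consistency) lend themselves to simple automated checks. The sketch below is one possible approach under stated assumptions: the field names, the one-year staleness threshold, and the set of accepted phase labels are all hypothetical choices, not requirements of any regulator or data provider.

```python
from datetime import date

# Hypothetical required columns and accepted phase labels for this sketch.
REQUIRED_FIELDS = ("trial_id", "phase", "status", "condition", "sponsor", "last_updated")
KNOWN_PHASES = {"Phase 1", "Phase 2", "Phase 3", "Phase 4"}


def quality_issues(row, today, max_age_days=365):
    """Return a list of human-readable problems found in one metadata row."""
    issues = []

    # Completeness: every required field must be present and non-empty.
    missing = [name for name in REQUIRED_FIELDS if row.get(name) in (None, "")]
    if missing:
        issues.append("incomplete: missing " + ", ".join(missing))

    # Timeliness: flag rows not updated within the allowed window.
    updated = row.get("last_updated")
    if isinstance(updated, date) and (today - updated).days > max_age_days:
        issues.append("stale: last updated " + updated.isoformat())

    # Consistency: phase labels should use one shared vocabulary across studies.
    phase = row.get("phase")
    if phase not in (None, "") and phase not in KNOWN_PHASES:
        issues.append("inconsistent phase label: " + repr(phase))

    return issues
```

Accuracy and relevancy are harder to automate, since they depend on comparing rows against the source publications and against your own use case.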

Interesting Case Studies and Blogs to Look Into

The Clinical Trial Life Cycle and When to Share Data – Sharing Clinical Trial Data
How to Find Results of Studies

Tangible Examples of Impact

As companies were having to handle more data from more sources, locking databases in clinical trials was found to be taking far longer. “There was a 40% increase from 2017 to 2019 for the ‘Last Patient Last Visit to Database Lock’ cycle time metric for those companies using more than four data sources,” says Rocchio [CMO at eClinical Solutions].

Pharmaceutical Technology: Learning to handle disparate and complex data sources in clinical trials

Relevant datasets

IBM MarketScan Research Databases

by ibm-the-weather-company

IBM MarketScan Research Databases provides one of the oldest continually updated collections of health claims data in the USA. Organizations use this data to prove their value to healthcare professionals, insurers, and private individuals.

The data includes drug claims, dental claims, lab results, hospital discharges, and EMR data for millions of people in the country. It also contains workplace productivity data, telling institutions how many workplace absences they suffer and how many of their healthcare workers suffer disability due to their work.
