Small Data in Predictive Toxicology: Towards Automated Hypothesis Generation and Testing for Hepato- and Nephrotoxicity

Drug development suffers from high failure rates due to unexpected toxic effects, some of which are discovered only after a drug has reached the market. Detecting every toxic effect of a drug is difficult because clinical trials cannot capture the full range of lifestyle influences and person-to-person genetic variability.

In this project, we explore social media as a promising and currently largely untapped source of clinical data. Platforms such as Twitter offer a wealth of data in which users report their own experiences. We are laying the groundwork for exploiting social media to find hints of connections between drug use and side effects (the Small Data).

Texts indicative of adverse effects will be compared against toxicogenomics databases, which report the effects of drugs and pollutants on biological samples (human and animal cell lines, and animal organs). In particular, expression profiles that track changes across thousands of genes give a detailed view of the effects of chemicals at the molecular level. We will focus on hepatotoxicity and nephrotoxicity, where the interplay between the chemical, its metabolized derivatives, and the involvement of other tissues (e.g. the kidney clearing metabolized derivatives) generates scenarios that are difficult to relate to the biological models used. Here we will use social media to identify the models and cohorts best suited to addressing particular aspects of drug toxicity.