Solutions: Data Exploration

Data is the lifeblood of the scientific process. Data is used to prove or disprove hypotheses and to guide decision-making. Its lifespan starts with data generation and ends with achieving concrete insights. There is a step in the middle, however, that is often overlooked: data exploration - the stage of hypothesis generation and decision-making during which the researcher takes a high-level, exploratory view of the available data.


Data acquisition → Data exploration → Data analysis


This data exploration stage is often used to determine the availability of "interesting data" that can be used in further analysis. An example exploratory question could be:

How many patients are there in the various data sources in my enterprise who have been diagnosed with a given type of cancer, are on a certain medication, and have had a certain therapy and outcome? From these patients, how many primary and metastatic tumor samples do we have? Do we have any matching primary and metastatic samples? Of these samples, what gene expression and genotyping data are available?

How do you answer questions like these with your current query tool? Do questions get "thrown over the wall" to database experts? How long does it take to get the answer? If your questions change, does the whole process restart from square one?

With Labmatrix, you can explore your data:

  • Easily - There is no need to understand database concepts. Without knowing how data is organized, or even what data is available, you begin by simply asking a question. Labmatrix will guide you through an exploratory process to get to the answers.
  • Intuitively - Labmatrix understands how researchers think. Constructing a query is as easy as drawing on a canvas. The format is familiar, similar to that of a pathway diagram. There are many "business intelligence" tools out there, which are just that, business tools designed for business people. Labmatrix provides scientific intelligence designed for researchers.
  • Independently - Due to its intuitive nature, Labmatrix allows you to explore your data on your own, without relying on expert IT support. Researchers at all levels of the enterprise (most of whom are software-averse) tell us that "Labmatrix is so easy even I can use it".
  • Holistically - Labmatrix allows you to query across multiple domains using a single, secure interface. You determine what data will be shared and with whom. You no longer have to go to one application for clinical data, another for gene expression, yet another for genotyping, and so on. Everything is at your fingertips, pre-integrated.

BUILDING A QUERY USING LABMATRIX

1. Start with a question: Do we have samples for patients who have breast cancer and are on Herceptin?

2. Search the terms of interest: "breast cancer" and "Herceptin". Labmatrix guides you to the datasets that contain information about these topics.

Simply click on a term and Labmatrix begins building a graphical query automatically.

3. Drag-and-drop to combine multiple lines of inquiry: Once you have found the datasets of interest, you can query across them through a simple drag-and-drop operation. With built-in scientific knowledge, Labmatrix presents a few intelligent choices phrased in plain English (no need for any database knowledge!). For example, when you drag the "Breast Cancer" dataset onto the "Subjects" dataset, one of the options that appear is: Find all Subjects with a corresponding record in the specified "Breast Cancer" dataset. This produces a dataset containing all the subjects with breast cancer, as indicated:

4. Divide and conquer: Understanding steps (2) and (3) above is all that is required to master the Labmatrix Query Builder! To answer sophisticated operational or scientific questions, just repeat these steps in an iterative fashion. Complex inquiries can be deconstructed into a collection of simple questions. Labmatrix guides the user through a divide-and-conquer process that is similar to the way biological systems work - a pathway may be complex but each individual step in the pathway is very simple.

The following is an example of a more complex query that returns a set of patients conforming to a predefined clinical profile (breast cancer patients who are on Herceptin), and a set of molecular profiles (High Score 1, Low Score 2, and either High Score 3 or High Score 4).
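For readers who think in code, the logic behind steps (3) and (4) can be sketched with plain set operations. This is only an illustration of the kind of question being asked, not how Labmatrix works internally; the dataset names, subject IDs, and the `with_matching_record` helper are all hypothetical.

```python
# Hypothetical in-memory datasets of subject IDs; names and values are
# illustrative only and do not come from Labmatrix.
subjects = {"S1", "S2", "S3", "S4", "S5"}
breast_cancer = {"S1", "S2", "S3", "S9"}  # subjects with a breast cancer record
on_herceptin = {"S2", "S3", "S5"}         # subjects with a Herceptin record
high_score_1 = {"S2", "S3", "S4"}
low_score_2 = {"S1", "S2", "S3"}
high_score_3 = {"S2"}
high_score_4 = {"S5"}


def with_matching_record(base, dataset):
    """Keep the members of `base` that have a corresponding record in `dataset`."""
    return base & dataset


# Step 3: "Find all Subjects with a corresponding record in the Breast Cancer dataset."
subjects_with_bc = with_matching_record(subjects, breast_cancer)

# Clinical profile: breast cancer patients who are on Herceptin.
clinical = with_matching_record(subjects_with_bc, on_herceptin)

# Molecular profile: High Score 1 AND Low Score 2 AND (High Score 3 OR High Score 4).
molecular = high_score_1 & low_score_2 & (high_score_3 | high_score_4)

# Step 4: combine the simple questions into the final answer.
result = clinical & molecular
print(sorted(result))
```

Each intermediate set (`subjects_with_bc`, `clinical`, `molecular`) plays the role of one node in the query diagram, so any of them can be inspected on its own before the final combination.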

5. Visualize/Analyze Results: Once the query is constructed, it can be run in its entirety or only up to any intermediate node in the diagram. Query results can be displayed immediately or exported in a variety of formats for further analysis.


LABMATRIX QUERY BUILDER CASE STUDY

The Challenge: A pharmaceutical company sponsored a large-scale collaboration with an academic research institute. The research institute provided de-identified clinical information and properly consented patient samples. The company performed gene expression and other molecular analyses on these samples and wished to derive more value from its investment. The existing solution posed two main challenges: 1) there was no way to present clinical data to research scientists in a meaningful fashion, and 2) there was no robust and user-friendly system to enable integration of clinical and molecular analytical data. Excel was often the only option.

The Solution: Labmatrix was implemented at this company as "the Clinical Data Repository for research". In addition, Labmatrix was integrated with other existing research systems containing molecular data. This integration was accomplished through a CDISC SDTM messaging interface, as well as a data federation engine. Controlled vocabulary from SNOMED was applied to terms in the clinical dataset. Researchers were then trained to use the system in one or two short sessions.

The Result: Researchers were able to explore data from both the clinical perspective and the molecular perspective. There was a groundswell of support from individual researchers all the way up to corporate R&D leadership. Adoption of the system moved beyond the initial scope of academic collaboration to other therapeutic areas and data streams. Scientists are now able to explore available data easily, intuitively, independently and holistically.


Please send questions or comments to: info@biofortis.com

BioFortis provides the revolutionary Labmatrix™ software platform for bioscience discovery to biomedical, biotechnology, federal, and pharmaceutical life sciences laboratories. - Jian Wang, Ph.D., BioFortis CEO
©2009 BioFortis, Inc. All Rights Reserved.