Tutorials - Abstracts

 

• Visualization and Analysis of Biological Networks NEW

• Asking Translational Research Questions Using Ontology Enrichment Analysis

• Systems Biology in Epigenetic Regulation

• Bioinformatics Workflows on the Cloud with Unipro UGENE CANCELLED

• Integrated Genomic Analysis and Visualization Using the UCSC Cancer Genomics Browser

 


 

Visualization and Analysis of Biological Networks - NEW

John "Scooter" Morris
University of California San Francisco

Abstract
Networks have long been used to represent important biological processes. Many of us remember memorizing the Krebs (TCA) cycle, which is usually shown as a directed graph, itself a type of network. Recently, however, the use of networks in biology has changed from purely illustrative and didactic to more analytic, even including hypothesis formulation. This shift has resulted, in part, from the confluence of advances in computation, informatics, and high-throughput techniques in systems biology. Today the analysis and visualization of biologically relevant networks has become commonplace, whether the networks represent metabolic, regulatory, or signaling pathways; protein-protein or genetic interactions; or more abstract connections between similar proteins or similar ligands. Networks are now routinely used to show relationships between biologically relevant molecules, and analysis of those networks is proving valuable for helping us understand those relationships and formulate hypotheses about biological function.

For the purposes of this tutorial, I will classify biological networks into two major categories: pathways and interaction networks. Pathways include metabolic, regulatory, and signaling networks. Consider, for example, a pathway containing the genes involved in glioblastoma multiforme, a major form of brain cancer. These genes were identified by a large-scale genetic analysis of copy number variation and genetic changes in 206 glioblastoma multiforme patients. The study was conducted as part of The Cancer Genome Atlas (TCGA) project. Notably, the study demonstrated that there was no single genetic defect responsible for glioblastoma multiforme, but that all of the cases showed significant pathway changes – strongly suggesting that this form of cancer is a “pathway disease.” From a visualization standpoint, the real power is the ability to map expression, mutation, or copy number variation data onto pathways to reveal (or suggest) how the pathway and its components function under different sets of conditions, including disease states. Thus, the ability to analyze a variety of data sources and types and to map that data onto pathways is crucial. There are also techniques for deriving putative pathways from expression data and for modeling the kinetics of biological processes that are beyond the scope of this talk.
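
The "pathway disease" observation can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the gene, pathway, and sample names are hypothetical placeholders, and the alteration calls are toy data, not TCGA results. It compares how often each single gene is altered with how often each pathway contains at least one altered gene.

```python
# Toy data only: every gene, pathway, and sample name is a hypothetical placeholder.
pathways = {
    "pathway_1": {"GENE_A", "GENE_B", "GENE_C"},
    "pathway_2": {"GENE_D", "GENE_E"},
}

# Genes called as altered (mutation or copy number change) in each sample.
altered = {
    "sample_1": {"GENE_A", "GENE_D"},
    "sample_2": {"GENE_B"},
    "sample_3": {"GENE_C", "GENE_E"},
    "sample_4": {"GENE_A"},
}


def gene_frequency(altered):
    """Fraction of samples in which each gene is altered."""
    n_samples = len(altered)
    counts = {}
    for genes in altered.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene: count / n_samples for gene, count in counts.items()}


def pathway_frequency(pathways, altered):
    """Fraction of samples with at least one altered gene in each pathway."""
    n_samples = len(altered)
    return {
        name: sum(1 for genes in altered.values() if genes & members) / n_samples
        for name, members in pathways.items()
    }


gene_freq = gene_frequency(altered)
pathway_freq = pathway_frequency(pathways, altered)
```

In this toy set no single gene is altered in more than half of the samples, yet every sample has an alteration somewhere in pathway_1, which is the pattern that mapping data onto pathways is meant to reveal.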

Interaction networks comprise the second category. In these networks, nodes represent biological entities and edges represent some form of interaction or relationship. A common example of this type is a protein-protein interaction (PPI) network. Analogous networks have been generated from ligand similarities, protein similarities, and drug-target relationships. Generally, this class of biological networks can present as a “hair ball,” where there is so much information that the meaningful relationships are difficult to discern. There is good evidence that analysis of a PPI network to find highly connected “hubs” can be used to predict protein complexes, and clustering of protein similarity networks can provide clues to protein family (and hence functional) assignments.

A variety of analytical techniques can help to elucidate interaction networks. Clustering methods such as MCL have proven valuable, although several algorithms more specific to various types of interaction networks have also been developed. In addition to clustering, a variety of metrics can be applied to an interaction network or nodes within the network. The average density (node degree) of the network, average shortest-path distance, number of connected components, measures of centrality, and the extent to which the network fits a scale-free model are all useful descriptors for the analysis of an interaction network. Altering the layout and visual attributes of the network can also be helpful.
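
As a rough illustration of a few of these descriptors, the Python sketch below computes node degree, connected components, and average shortest-path distance for a small undirected network using only the standard library. The edge list and node names are hypothetical, and this is not how Cytoscape or any particular plugin implements these metrics.

```python
from collections import deque

# Hypothetical undirected interaction network; node names are placeholders.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "F")]


def build_adjacency(edges):
    """Build an adjacency map {node: set of neighbours} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def connected_components(adj):
    """List of node sets, one per connected component."""
    seen, components = set(), []
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            components.append(component)
    return components


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)


def average_shortest_path(adj):
    """Mean distance over all reachable pairs of distinct nodes."""
    total = pairs = 0
    for node in adj:
        for other, d in bfs_distances(adj, node).items():
            if other != node:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


adj = build_adjacency(edges)
degree = {node: len(neighbours) for node, neighbours in adj.items()}
hub = max(degree, key=degree.get)  # most highly connected node
components = connected_components(adj)
```

Unreachable pairs are simply left out of the average here; other conventions exist, such as reporting the metric per component.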

Cytoscape is an open-source application for the visualization and analysis of (biological) networks. During the tutorial, I will use Cytoscape to demonstrate some of the techniques for visualizing and analyzing biological networks. In addition, I will demonstrate some ways that biological networks can be combined with other data to help elucidate function or the possible implications of changes in biological function due to perturbation, mutation, or infection. If time allows, participants will be able to spend some time using Cytoscape by following a hands-on tutorial.

Intended Audience
This tutorial targets informatics researchers and computational biologists who are interested in an overview of the analysis and visualization of biological networks. On a practical level, this tutorial will also be of interest to bench and computational biologists who are interested in using Cytoscape as an analysis platform for biological data.

Goals and Outcomes
This tutorial will provide a basic understanding of the use of networks in biology and some of the common analytical techniques currently in use. At the end of this tutorial participants should have a basic understanding of the types of biological networks and some of the analytical and visualization techniques appropriate for each network type.
This tutorial will also provide a basic introduction to Cytoscape as a tool for analyzing and visualizing biological networks, including some usage tips and suggestions for useful plugins. I will bring installers for the latest released version of Cytoscape and attendees are encouraged to bring their laptops for the second half of the tutorial, which will include some time working with the application and a couple of the plugins.

Attendees: Downloadable materials for the session
Slides

 


 


 

Asking Translational Research Questions Using Ontology Enrichment Analysis

Nigam Shah
Center for Biomedical Informatics Research, Stanford University

Abstract
Advanced statistical methods used to analyze high-throughput data such as gene-expression assays result in long lists of “significant genes.” One way to gain insight into the significance of altered expression levels is to determine whether Gene Ontology (GO) terms associated with a particular biological process, molecular function, or cellular component are over- or under-represented in the set of genes deemed significant. This process, referred to as enrichment analysis, profiles a gene set, and is relevant for and extensible to analysis of data from other high-throughput measurement modalities such as proteomics, metabolomics, and tissue-microarray assays.

The canonical example of enrichment analysis is when the output dataset is a list of genes differentially expressed in some condition. To determine the biological relevance of a lengthy gene list, the usual solution is to perform enrichment analysis with the GO. We can aggregate the annotating GO concepts for each gene in this list, and arrive at a profile of the biological processes or mechanisms affected by the condition under study.
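
One common way to score over-representation of a single term is a one-sided hypergeometric test. The abstract does not say which statistic the tutorial uses, so the Python sketch below is an assumption chosen for illustration, and the function name and counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value for over-representation of a term.

    population: number of genes in the background set
    annotated:  background genes annotated with the term
    selected:   size of the gene list of interest
    overlap:    genes in the list that carry the term
    Returns P(X >= overlap) when `selected` genes are drawn without replacement.
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 annotated with the term,
# 4 genes in the significant list, 3 of which carry the term.
p_value = hypergeom_enrichment_p(population=20, annotated=5, selected=4, overlap=3)
```

In practice the test is repeated for every term in the ontology, so the resulting p-values need a multiple-testing correction before they are interpreted.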

While GO has been the principal target for enrichment analysis, the methods of enrichment analysis are generalizable. We can conduct the same sort of profiling along other ontologies of interest. Just as scientists can ask "Which biological process is over-represented in my set of interesting genes or proteins?" we can also ask "Which disease (or class of diseases) is over-represented in my set of interesting genes or proteins?" For example, by annotating known protein mutations with disease terms from the ontologies in BioPortal, Mort et al. recently identified a class of diseases—blood coagulation disorders—that were associated with a 14-fold depletion in substitutions at O-linked glycosylation sites.

With the availability of tools for automatic ontology-based annotation of datasets with terms from disease ontologies, there is no reason to restrict enrichment analyses to the GO. We will discuss methods to perform enrichment analysis using any ontology available in the biomedical domain. We will review the general methodology of enrichment analysis and its associated challenges, and discuss the novel translational analyses enabled by the use of disease ontologies.

Intended Audience
This tutorial targets informatics researchers wanting to use bio-ontologies in their research. Advanced graduate students, post-docs and faculty members transitioning to informatics from other disciplines (e.g. computer science, biology) would find this of value.

Goals and Outcomes
Participants will develop a thorough understanding of the issues around "enrichment analysis," its limitations and strengths, and how to apply this well-developed methodology to diverse datasets using multiple biomedical ontologies.

Attendees: Downloadable materials for the session
Slides Part 1 PPTX
Slides Part 2 PPTX

 


 


 

Systems Biology in Epigenetic Regulation

Yunlong Liu, Pearlly Yan, Kun Huang
Center for Computational Biology and Bioinformatics, Indiana University School of Medicine

Abstract
The term “epigenetics” was originally coined to describe the development of phenotype from genotype. That term has evolved to describe all heritable gene regulatory events distinct from primary DNA sequence, and the field of epigenetics now encompasses DNA methylation, covalent modifications of histones, nucleosome-DNA interactions, small inhibitory RNA molecules (microRNAs), and most recently, chromosome looping. With the advance of high-throughput technologies such as next-generation sequencing, genome-wide epigenetic profiles are available for many biological systems and developmental stages. This tutorial will be divided into three components: an introduction to the biological theories of epigenetic regulation, a review of high-throughput technologies for epigenetic studies, and a summary of computational efforts in epigenetic research.

Intended Audience
This introductory tutorial is designed for both bench scientists and computational scientists.

Goals and Outcomes
For bench scientists, this tutorial will develop their understanding of computational science and the available computational tools for epigenetic research. It is anticipated that trainees will be aware of different types of computational tools. They may apply these approaches to study the role of epigenetic regulation in their research area.

For computational scientists, this tutorial will provide basic biological knowledge of epigenetic regulation. It is anticipated that they will increase their understanding of the complexity of the epigenome and will appreciate the bench work needed to gather data for computational analyses. They will become proficient in using molecular biology terms for communicating with bench scientists.

Prerequisites
Conceptual understanding of programming languages, probability/statistics, cell biology, molecular biology, genetics, and algorithms.

Attendees: Downloadable materials for the session
Slides

 


 


 

Bioinformatics Workflows on the Cloud with Unipro UGENE - CANCELLED


Mikhail Fursov, Ivan Efremov
Center of Information Technologies UNIPRO, Novosibirsk
Konstantin Okonechnikov
Novosibirsk State University

Abstract
Unipro UGENE (http://ugene.unipro.ru) is an open-source, multiplatform software suite that aims to integrate the most popular bioinformatics algorithms and analysis methods within a single visual user interface. The project is constantly evolving and already includes more than 20 popular bioinformatics tools such as HMMER, Muscle, KAlign, BowTie, and Phylip.

The key features of UGENE are:
• Visual and interactive process of analysis of biological data
• Support for a wide range of popular data formats
• Algorithms are optimized for parallel execution on modern computers (CPU, GPU)
• User friendly access to “cloud” computing
• Connection with remote data sources
• A language to build reusable computational workflows with a few mouse clicks

One of the most significant UGENE features is Workflow Designer, an environment for building and executing custom workflows. Designing and implementing parallel algorithms and workflows in contemporary programming languages requires special skills and experience. A lack of such skills can become a barrier to the effective use of computational resources. Researchers who are not familiar with modern parallel programming techniques can use Workflow Designer to build computational workflows that run in parallel both on a local multicore system and on a remote cloud system.

This tutorial will cover remote execution of bioinformatics tasks in UGENE and introduce participants to Workflow Designer. The first part of the tutorial will explain the principles of cloud computing and the use of bioinformatics workflows. Participants will then be introduced to the UGENE project, including the concepts of workflow construction and remote computation with UGENE. The power of Workflow Designer will be demonstrated in several general-purpose scenarios.

Intended Audience
Molecular biologists interested in modern bioinformatics methods

Goals and Outcomes
Participants will be introduced to the Unipro UGENE (http://ugene.unipro.ru) project and its powerful extension, Workflow Designer. They will become acquainted with cloud computing and learn how it can be used to speed up popular bioinformatics tasks. The principles of building custom workflows for solving multilevel tasks in molecular biology will be explained. In addition, some commonly used workflow scenarios will be demonstrated and worked through in detail.

Prerequisites
Some experience in molecular biology, genetics, and algorithms would be helpful. Additionally, conceptual understanding of structural biology is recommended.

Attendees: Downloadable materials for the session
Slides PDF
Software
Supplementary video

 


 


 

Integrated Genomic Analysis and Visualization Using the UCSC Cancer Genomics Browser

Steve Benz
University of California Santa Cruz

Abstract
The UCSC Cancer Genomics Browser (http://genome-cancer.ucsc.edu) is a tool designed to allow hypothesis-generating exploration in an intuitive and real-time fashion. The browser was originally designed as an extension to the widely popular UCSC Genome Browser (http://genome.ucsc.edu) and currently benefits from much of the underlying technology developed for the genome browser. This browser displays a whole-genome-oriented view of genome-wide experimental measurements for individual samples and sets of samples alongside their associated clinical information. The browser also enables investigators to order, filter, aggregate, classify, and display data interactively based on any given feature set, including clinical features, annotated biological pathways, and user-edited collections of genes. Standard statistical tools are integrated to provide quantitative analysis of whole-genome data or any of its subsets. The browser is optimized to display multidimensional datasets such as the data generated by The Cancer Genome Atlas (TCGA), and is designed to allow users to manipulate and explore the data through a web browser.

In addition to learning how to visualize and analyze data on the Cancer Genomics Browser, participants will be given a brief tutorial on the UCSC Genome Browser and other bioinformatic tools and resources hosted at UCSC.

Intended Audience
Computational Biologists, Biologists

Goals and Outcomes
Participants will become familiar with visualizing data using the UCSC Genome Browser and UCSC Cancer Genomics Browser. In addition, techniques for analyzing whole-genome high-throughput data will be presented, and participants will take part in developing a gene signature that can be quickly applied across multiple studies.

Prerequisites
Conceptual knowledge of basic statistics, microarray data and sequencing data, and molecular biology is helpful.

Attendees: Downloadable materials for the session
Slides PDF

 
