Please join the eScience Institute on Thursday, October 24, at 4:00 pm in CSE-305. Refreshments will be provided.

*Valerie Daggett (UW):* Valerie Daggett obtained her BA from Reed College in Portland, Oregon, in 1983. A couple of years later she went to the University of California, San Francisco, for her PhD (awarded in 1990) and was then a postdoctoral fellow at Stanford University. She joined the faculty at the University of Washington in 1993 and has been there ever since. She is a professor in the Department of Bioengineering, College of Engineering and School of Medicine. She also holds adjunct positions in the Biochemistry Department and the Biomedical and Health Informatics Program. She has published over 220 papers and maintains active research programs in protein dynamics and folding, simulation, protein misfolding diseases, and the general area of bioinformatics. She is a Senior Editor of Protein Engineering, Design and Selection as well as a board member for numerous other scientific journals.

*DIVE: Data Intensive Visual Engine for molecular simulation data*

Data-driven research is rapidly becoming common across scientific disciplines. Recently, with the proliferation of inexpensive commodity computing clusters, synthetic data sources such as modeling and simulation have become capable of producing a continuous stream of terascale data. Confronted with this data deluge, domain scientists are in need of data-intensive analytic environments. Dynameomics is a terascale simulation-driven research effort designed to enhance our understanding of protein folding and dynamics through molecular dynamics simulation and modeling. The project routinely involves exploratory analysis of 100+ terabyte datasets using an array of heterogeneous structural biology-specific tools.

To accelerate the pace of discovery for the Dynameomics project, we have developed DIVE, a framework that allows for rapid prototyping and dissemination of domain-independent (e.g., clustering) and domain-specific analyses in an implicitly iterative workflow environment. The information in the data warehouse is classified into three categories: raw data, derived data, and state data. Raw data are generated from simulations and models, derived data are produced through tools operating on the raw data, and state data constitute the record of the exploratory workflow, which has the added benefit of capturing the provenance of derived data. DIVE empowers researchers by reducing the overhead associated with shared tool use and heterogeneous datasets. Furthermore, the workflow provides a simple, interactive, and iterative data-oriented investigation paradigm that tightens the hypothesis generation loop. The result is an expressive, flexible laboratory informatics framework that allows researchers to focus on analysis and discovery instead of tool development.

Upcoming Seminars:

* November 6, 4 PM (233 Sieg Hall): Clark Gaylord (Virginia Tech), Data Science Meets Infrastructure: Strategic Highway Research Program (SHRP 2)