Abstract
Recent developments in genome and gene analysis technology have enabled
the rapid accumulation of biological data. To make use of such large data
sets, a biologist needs a resource-rich computing environment and
user-friendly invocation of analysis tools. To respond to these requirements,
we designed and implemented a virtual lab, named Virtual Collaborative
Lab (V-Lab-Protein), using an efficient and flexible computing
resource management and workflow engine with a user-friendly graphical
workflow composer. The utility of our system is demonstrated by analyzing
sample protein sequence sets. To our knowledge, this is the first system of its kind
that combines flexible workflow systems and on-demand compute and data
resources (Amazon EC2/S3 in this case). We believe that this system
design principle will be a new and effective paradigm for small
biology research labs to handle the ever-increasing volume of biological data.