Abstract
Recent developments in genome and gene analysis technology have enabled the rapid accumulation of biological data. To make use of such large volumes of data, a biologist needs a resource-rich computing environment and a user-friendly way to invoke analysis tools. To respond to these requirements, we designed and implemented a virtual lab, named Virtual Collaborative Lab (V-Lab-Protein), using efficient and flexible computing resource management and a workflow engine with a user-friendly graphical workflow composer. We demonstrate the utility of our system by analyzing sample protein sequence sets. This is the first system of its kind that combines flexible workflow systems with on-demand compute and data resources (Amazon EC2/S3 in this case). We believe that this system design principle will be a new and effective paradigm for small biology research labs to handle the ever-increasing volume of biological data.