Research

Distributed Triplestores

Data is often dynamically structured, highly diverse, and multifaceted, and such diverse, complex information frequently needs to be linked. Conventional datastores, such as relational databases, do not conveniently accommodate dynamically varying structures, because frequently modifying database schemas is not feasible. RDF triplestores offer a flexible alternative: any property of an entity can be described by a triple consisting of a subject, a predicate, and an object. Data is also inherently distributed, owing to where it originates, who owns it, and many other reasons. Moreover, storing data in triplestores creates its own need for distribution, because of the large number of triples that results from, for example, migrating existing data out of a relational database. We present our work on designing index structures that facilitate efficient querying of a distributed triplestore.
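To make the triple model concrete, here is a minimal single-node sketch of how subject-predicate-object triples can be indexed for pattern queries. It is not the index structure from the paper: the `TripleIndex` class, its three permutation indexes, and the `ex:` identifiers are assumptions chosen purely for illustration, and distribution across nodes is left out entirely.

```python
from collections import defaultdict


class TripleIndex:
    """Toy in-memory triple index (illustrative only, not the authors' design)."""

    def __init__(self):
        # Three permutation indexes, so a lookup can start from whichever term is bound.
        self.spo = defaultdict(set)  # subject   -> {(predicate, object)}
        self.pos = defaultdict(set)  # predicate -> {(object, subject)}
        self.osp = defaultdict(set)  # object    -> {(subject, predicate)}

    def add(self, s, p, o):
        """Store one triple in all three indexes."""
        self.spo[s].add((p, o))
        self.pos[p].add((o, s))
        self.osp[o].add((s, p))

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            candidates = {(s, p2, o2) for p2, o2 in self.spo.get(s, ())}
        elif p is not None:
            candidates = {(s2, p, o2) for o2, s2 in self.pos.get(p, ())}
        elif o is not None:
            candidates = {(s2, p2, o) for s2, p2 in self.osp.get(o, ())}
        else:
            candidates = {(s2, p2, o2)
                          for s2, pairs in self.spo.items()
                          for p2, o2 in pairs}
        # Filter on any remaining bound terms.
        return {t for t in candidates
                if (p is None or t[1] == p) and (o is None or t[2] == o)}


idx = TripleIndex()
idx.add("ex:alice", "ex:knows", "ex:bob")
idx.add("ex:alice", "ex:worksAt", "ex:lab")
idx.add("ex:carol", "ex:knows", "ex:bob")

# Pattern query: (?, ex:knows, ex:bob)
who_knows_bob = idx.match(p="ex:knows", o="ex:bob")
```

Adding a new kind of property only requires adding another triple; no schema change is involved, which is the flexibility the paragraph above describes.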

Publications:
  • Index Structures for efficient querying with Distributed Triplestores. Tharaka Devadithya and Kenneth Chiu. In Proceedings of the Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007). December 10-13, 2007. Bangalore, India. (To appear)

Fast Binary Serialization for Grid Systems with XBS

The efficient serialization and deserialization of data is a fundamental operation in many grid systems. Some serializers are message-based, while others are stream-based, with no inherent message boundaries. Streaming serializers can be more scalable and flexible than message-oriented serializers because they promote free-form conversations that are not fixed to any particular static structure. XBS is a C++ binary serialization library. It is freely available and provides an object-oriented API based on generic programming techniques.
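The difference between the two serializer styles can be sketched as follows. This is not the XBS API (XBS is a C++ library); the function names and the 4-byte length prefix are assumptions chosen for illustration, using only Python's standard `struct` and `io` modules.

```python
import io
import struct


def write_message(out, payload: bytes) -> None:
    """Message-based: a 4-byte big-endian length prefix marks each boundary."""
    out.write(struct.pack(">I", len(payload)))
    out.write(payload)


def read_message(inp) -> bytes:
    """Read one length-prefixed message."""
    (length,) = struct.unpack(">I", inp.read(4))
    return inp.read(length)


def write_double(out, value: float) -> None:
    """Stream-based: values are appended one at a time, with no framing."""
    out.write(struct.pack(">d", value))


def read_double(inp) -> float:
    """Read the next 8-byte double from the stream."""
    (value,) = struct.unpack(">d", inp.read(8))
    return value


# Message-based: the receiver must know the whole payload before it is sent.
buf = io.BytesIO()
write_message(buf, b"hello")
framed_size = buf.tell()  # 4-byte prefix + 5 payload bytes

# Stream-based: the sender can keep appending values as they become available.
stream = io.BytesIO()
for v in (1.5, 2.25, -3.0):
    write_double(stream, v)
stream.seek(0)
values = [read_double(stream) for _ in range(3)]
```

In the streaming case, the two sides have to agree on what comes next by some other means, since the byte stream itself carries no boundaries.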

Publications:
  • Fast Binary Serialization for Grid Systems with XBS. Tharaka Devadithya and Kenneth Chiu. In Proceedings of Parallel and Distributed Computing and Systems (PDCS 2007). Cambridge, MA, November 19-21, 2007. (To appear)

Reflection for C++

Currently, reflection is not available in the languages commonly used in high performance computing. While there have been several attempts to incorporate reflection into C++, all of them are either intrusive or not fully compliant with the C++ standard. We show how reflection can be implemented efficiently and robustly in languages such as C++. Our implementation uses code generation to add metadata and is fully compliant with the C++ standard. The reflection library is open source and is available at http://www.extreme.indiana.edu/reflcpp.

Publications:
  • C++ Reflection for High Performance Problem Solving Environments. Tharaka Devadithya, Kenneth Chiu, and Wei Lu. In Proceedings of High Performance Computing Symposium (HPC 2007). Norfolk, Virginia, March 25-29, 2007. (pdf)

Binary XML for Scientific Applications (BXSA)

XML provides flexible, extensible data models and type systems for structured data, and has found wide acceptance in many domains. XML processing can be slow, however, especially for scientific data, which has led to the conventional wisdom that XML is not appropriate for such data. Instead, data is stored in specialized binary formats and transmitted via work-arounds such as attachments and base64 encoding. Though these work-arounds can be useful, they relegate scientific data to second-class status within the web services framework, and they generally require yet another API, data model, and type system. An alternative is to use more efficient encodings of XML, often known as “binary XML”. Using XML uniformly throughout an application simplifies and unifies design and development. Binary XML for Scientific Applications (BXSA) is a binary XML format, together with an implementation, for scientific data. Its performance has been observed to be comparable to that of commonly used scientific data formats such as netCDF. These results challenge the prevailing practice of handling control and data separately in scientific applications, with web services for control and specialized binary formats for data.
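As a rough illustration of the cost of the base64 work-around mentioned above, the sketch below packs an array of doubles into raw bytes and compares its size with the base64-encoded form. It does not use or model the BXSA format; the sample values are made up, and only Python's standard `struct` and `base64` modules are involved.

```python
import base64
import struct

# 1000 made-up double-precision samples standing in for scientific data.
samples = [float(i) * 0.5 for i in range(1000)]

# Native binary form: 8 bytes per double, little-endian.
raw = struct.pack(f"<{len(samples)}d", *samples)

# Base64 work-around: every 3 input bytes become 4 ASCII characters.
encoded = base64.b64encode(raw)

# The receiver has to undo the encoding before it can use the numbers.
decoded = list(struct.unpack(f"<{len(samples)}d", base64.b64decode(encoded)))

overhead = len(encoded) / len(raw)  # roughly 4/3
```

The encoded payload is about one third larger than the raw bytes, and both sides pay for an extra encode or decode pass, which is part of why a native binary encoding is attractive for bulk numeric data.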

Publications:
  • A Binary XML for Scientific Applications. K. Chiu, T. Devadithya, W. Lu, and A. Slominski. In Proceedings of the IEEE International Conference on e-Science and Grid Computing (e-Science 2005). December 5-8, 2005. Melbourne, Australia. (pdf)
  • BXSA for Fast Processing of Scientific Data. Tharaka Devadithya, Zongde Liu, Nayef Abu-Ghazaleh, Wei Lu, Kenneth Chiu, and Stephane Ethier. In Proceedings of High Performance Computing Symposium (HPC 2007). Norfolk, Virginia, March 25-29, 2007. (pdf)

Common Instrument Middleware Architecture (CIMA)

Instruments and sensors, together with their accompanying actuators, are essential to the conduct of scientific research. In many cases they provide observations in electronic format and can be connected to computer networks with varying degrees of remote interactivity. These devices vary in their architectures and in the types of data they capture, and may generate data at various rates. The Common Instrument Middleware Architecture (CIMA) is a framework for making instruments and sensors network accessible in a standards-based, uniform way, and for interacting remotely with instruments and the data they produce. The issues CIMA addresses include flexibility in network transport; efficient, high-throughput data transport; the availability (or lack) of computational, storage, and networking resources at the instrument or sensor platform; the evolution of instrument design; and the reuse of data acquisition and processing codes.

Publications:
  • The Common Instrument Middleware Architecture: Overview of Goals and Implementation. T. Devadithya, K. Chiu, K. Huffman, and D.F. McMullen. In Proceedings of the International Workshop on Scientific Instruments and Sensors on the Grid. December, 2005. Melbourne, Australia. (pdf)
  • Instrument Monitoring, Data Sharing, and Archiving Using Common Instrument Middleware Architecture (CIMA). Randall Bramley, Kenneth Chiu, Tharaka Devadithya, Nisha Gupta, Charles Hart, John C. Huffman, Kianosh Huffman, Yu Ma, and Donald F. McMullen. Journal of Chemical Information and Modeling. Vol. 46, No. 3 (May 2006). (pdf)
  • Integrating Instruments and Sensors into the Grid with CIMA Web Services. D.F. McMullen, T. Devadithya, and K. Chiu. In Proceedings of the Third APAC Conference on Advanced Computing, Grid Applications and e-Research (APAC05). September 25-30, 2005. Gold Coast, Australia. (pdf)

Grid Assisted Image Guided Neurosurgery: Feasibility Study - Internship at San Diego Supercomputer Center (SDSC), Summer 2005

The grid provides scientific and commercial applications with access to high-end resources that are not generally available at many sites. While some jobs demand high throughput, others demand quick response times. Sometimes the usefulness of the requested resources depends on their being available within a given time frame. This project examined the feasibility of using supercomputing resources accessed over the grid for image guided neurosurgery (IGNS).

Publications:
  • On-demand High Performance Computing: Image Guided Neuro Surgery Feasibility Study. T. Devadithya, K. Baldridge, A. Birnbaum, A. Majumdar, Dong Ju Choi, R. Wolski, S. Warfield, and N. Archip. In the Second International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems, published in the Proceedings of the 12th International Conference on Parallel and Distributed Systems (ICPADS'06). July, 2006. (pdf)

Middleware for In-Home Sensor Networks - Independent Study, Spring 2005

The purpose of the course was to develop middleware for sensor networks that can be used in homes. The ultimate goal is to support a care network (family and friends) in trying to keep a loved one out of an assisted living facility. The middleware allows the sensors in a home to be viewed and managed as a collection, and is used to specify what data should be gathered and stored, so that “privacy” can be tuned. (report)

 

Last updated on September 16, 2007.