dQUOB (dynamic QUery OBjects)

Table of Contents

dQUOB
Current Research
Releases
Publications
Contact Us
Documentation

dQUOB

dQUOB is a middleware system providing continuous evaluation of queries over time-sequenced data. The system provides access to data in data streams by means of SQL queries. The queries are dynamically embedded into the data streams at runtime, and managed remotely during application execution. SQL queries have the power to filter and aggregate data, combine streams, and create new streams. Support for embedded user-defined functions provides data transformation capabilities. dQUOB is targeted towards data-streaming parallel and distributed computations such as scientific visualization, performance monitoring, and large-scale sensor data. The dQUOB system has been applied to such diverse applications as a safety-critical autonomous robotics simulation and scientific software visualization for global atmospheric transport modeling.

[Top]

Current Research

Grid-based streaming architecture: some stream systems, such as sensor networks and other continuous data production networks, can be viewed as a distributed data resource. The best way to bring these systems onto the grid is still an open problem. Our approach, explained in "Using Global Snapshots to Access Data Streams on the Grid", leverages OGSA-DAI, the grid service access framework developed at the e-Science Institute in Edinburgh.

Calder Project: Visit the Calder project webpage for more information. Calder is the next generation of dQUOB.
[Top]

Releases

Current Release: dQUOB v1.0 - Download here.
For future releases visit Calder webpage: Calder Project
[Top]

Publications

For recent publications on this project, please visit the Publications page.
[Top]

Contact Us

Beth Plale: email, http
Nithya Vijayakumar: email, http
Ying Liu: email, http
[Top]

Documentation

Installation Instructions
Instructions to run examples
Writing SQL Queries
Programmer's Guide
[Top]

Installation Instructions

To use the dQUOB system, you need:
  • Tcl interpreter (v8.3 or later)
  • Linux (version 2.6.9 or later)
  • GNU C++ compiler (v3.2.3 or later)
  • PBIO (v3.3)
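
As a rough aid, the prerequisites listed above can be checked with a short script. This is only a sketch: it assumes the Tcl interpreter is installed as tclsh and the GNU C++ compiler as g++, which may not match your system, and it does not check for PBIO, which is a library rather than a command.

```shell
#!/bin/sh
# Report whether each prerequisite tool can be found on PATH.
# The tool names (tclsh, g++) are assumptions about a typical Linux setup.
have() {
    command -v "$1" >/dev/null 2>&1
}

check_prereqs() {
    echo "kernel: $(uname -r)"
    if have tclsh; then
        # 'info patchlevel' is a standard Tcl command that returns the version.
        echo "tcl: $(echo 'puts [info patchlevel]' | tclsh)"
    else
        echo "tcl: not found"
    fi
    if have g++; then
        echo "g++: $(g++ -dumpversion)"
    else
        echo "g++: not found"
    fi
}

check_prereqs
```

The script only prints what it finds; comparing the reported versions against the minimums in the list is left to the reader.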

Download and install PBIO (Portable Binary Input Output) in the same top-level directory as dQUOB. Add the PBIO home directory to LD_LIBRARY_PATH. Our software has been tested with PBIO v3.2.64 and a few earlier versions.

To install dQUOBEC

  • Change to dQUOB directory
  • ./configure --prefix=`pwd`
  • make -C src/dquobEC
  • make install -C src/dquobEC

To install dQUOB

  • Change to dQUOB directory
  • ./configure --prefix=`pwd` (or specify the path where dQUOB is to be installed)
  • make
  • make install
Note: The configure and make scripts were created using Autoconf 2.58, Automake 1.7.8 and Libtool 1.5.8. If you run into problems during compilation, please download the latest version of GNU Autotools and run the dquob-1.0/build.sh script before re-running the above steps.
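
The two installation sequences above could be collected into one shell function, as sketched below. DQUOB_DIR is a hypothetical variable naming the unpacked source tree, and the function name install_dquob is illustrative only. The function stops at the first step that fails.

```shell
#!/bin/sh
# Assumed location of the unpacked dQUOB source tree; adjust as needed.
DQUOB_DIR="${DQUOB_DIR:-$PWD/dquob-1.0}"

# Runs in a subshell (note the parentheses) so the caller's working
# directory is left unchanged. Each step runs only if the previous one
# succeeded.
install_dquob() (
    cd "$DQUOB_DIR" &&
    ./configure --prefix="$(pwd)" &&
    make -C src/dquobEC &&
    make install -C src/dquobEC &&
    make &&
    make install
)
```

Nothing is built until install_dquob is called explicitly, for example as DQUOB_DIR=/path/to/dquob-1.0 followed by install_dquob.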
[Documentation]

Instructions to run examples

Prior to running the examples:
  • Compile and install PBIO.
  • Compile and install dquobEC for the provider and receiver.
  • Compile and install dquob to run the quoblet.
The provider, receiver and quoblet can be run on different machines. In our examples, the main data event is a logical grouping of the CPU and memory information of a host. Request events carry values that are matched against the host information. Please note that the examples distributed with the dquob release need to be compiled separately using dquob-1.0/examples> make. Duplicates in the result are automatically eliminated after 2 * RANGE seconds, where RANGE is the value specified in the input SQL query.
Set the following environment variables in all consoles:
CHANNEL_SERVER_HOST {hostname}
CHANNEL_SERVER_PORT {port}
Add the PBIO home directory and dquob-1.0/lib/dquob to LD_LIBRARY_PATH.
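
In a POSIX shell, the environment setup above might look like the following sketch. The host name, port number, and the two install locations (PBIO_HOME, DQUOB_HOME) are placeholder assumptions; substitute the values for your own installation.

```shell
#!/bin/sh
# Placeholder values: use the host and port where channel_server runs.
export CHANNEL_SERVER_HOST="localhost"
export CHANNEL_SERVER_PORT="8000"

# Hypothetical install locations for PBIO and dQUOB.
PBIO_HOME="${PBIO_HOME:-/opt/pbio}"
DQUOB_HOME="${DQUOB_HOME:-/opt/dquob-1.0}"

# Prepend both directories, keeping any existing LD_LIBRARY_PATH entries.
export LD_LIBRARY_PATH="$PBIO_HOME:$DQUOB_HOME/lib/dquob${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Saving these lines in a file and sourcing it in each console is one way to keep the provider, receiver and quoblet consoles consistent.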

Example 1: Simple Select: Demonstrates the select operator, which filters data events based on a condition.
1. Change to dquob-1.0/bin directory.
 cd dquob-1.0/bin
2. Create an SQL query and place it in a file (in this example, scripts/simple_select.sql).
3. Convert the SQL query into a Tcl script.
dquob-1.0/bin> ./sql2tcl -f ../scripts/simple_select.sql -t ../scripts/simple_select.tcl
4. Start the channel server
dquob-1.0/bin> ./channel_server
To start the provider (console 1):
1. Change to dquob-1.0/examples directory.
2. Start the provider. It takes two parameters: -channel is the name of the channel on which data is sent, and -rate is a double that gives the frequency at which the provider sends data.
dquob-1.0/examples> ./provider -channel data_ev -rate 1
To start the receiver (console 2):
1. Change to dquob-1.0/examples directory.
2. Start the receiver. To see the receiver's input parameters, run ./receiver --help. -data_down is the channel on which the receiver gets data from the quoblet.
dquob-1.0/examples> ./receiver -data_down data_ev_down
To start the quoblet (console 3):
1. Start the quoblet. -script specifies the query to be executed. -primitive specifies the name of the input data stream. -data_down specifies the name of the output data channel.
dquob-1.0/bin> ./quob -script ../scripts/simple_select.tcl -primitive data_ev  -data_down data_ev_down
Result:
The data from the provider is filtered by the quoblet according to the query. The result is then sent to the receiver and printed on the screen.

Example 2: Select_Project: Demonstrates the project operator, which projects a subset of a data event into a different event.
1. Change to dquob-1.0/bin directory.
 cd dquob-1.0/bin
2. Create an SQL query and place it in a file (in this example, scripts/select_project.sql).
3. Convert the SQL query into a Tcl script.
dquob-1.0/bin> ./sql2tcl -f ../scripts/select_project.sql -t ../scripts/select_project.tcl
4. Start the channel server
dquob-1.0/bin> ./channel_server
To start the provider (console 1):
1. Change to dquob-1.0/examples directory.
2. Start the provider. It takes two parameters: -channel is the name of the channel on which data is sent, and -rate is a double that gives the frequency at which the provider sends data.
dquob-1.0/examples> ./provider -channel data_ev -rate 1
To start the receiver (console 2):
1. Change to dquob-1.0/examples directory.
2. Start the receiver. To see the receiver's input parameters, run ./receiver --help. -mem_data_down is the channel on which the receiver gets the projected data from the quoblet.
dquob-1.0/examples> ./receiver -mem_data_down data_ev_down
To start the quoblet (console 3):
1. Start the quoblet. -script specifies the query to be executed. -primitive specifies the name of the input data stream. -mem_data_down specifies the name of the output data channel. (In this example, the output is mem_data events).
dquob-1.0/bin> ./quob -script ../scripts/select_project.tcl -primitive data_ev  -mem_data_down data_ev_down
Result:
The data from the provider is filtered according to the query, and the memory information is projected into mem events by the quoblet. The result is then sent to the receiver and printed on the screen.

Example 3: Select And: Demonstrates the AND operator, which combines more than one filter on the data events.
1. Change to dquob-1.0/bin directory.
 cd dquob-1.0/bin
2. Create an SQL query and place it in a file (in this example, scripts/select_and.sql).
3. Convert the SQL query into a Tcl script.
dquob-1.0/bin> ./sql2tcl -f ../scripts/select_and.sql -t ../scripts/select_and.tcl
4. Start the channel server
dquob-1.0/bin> ./channel_server
To start the provider (console 1):
1. Change to dquob-1.0/examples directory.
2. Start the provider. It takes two parameters: -channel is the name of the channel on which data is sent, and -rate is a double that gives the frequency at which the provider sends data.
dquob-1.0/examples> ./provider -channel data_ev -rate 1
To start the receiver (console 2):
1. Change to dquob-1.0/examples directory.
2. Start the receiver. To see the receiver's input parameters, run ./receiver --help. -data_down is the channel on which the receiver gets the data from the quoblet.
dquob-1.0/examples> ./receiver -data_down data_ev_down
To start the quoblet (console 3):
1. Start the quoblet. -script specifies the query to be executed. -primitive specifies the name of the input data stream. -data_down specifies the name of the output data channel.
dquob-1.0/bin> ./quob -script ../scripts/select_and.tcl -primitive data_ev  -data_down data_ev_down
Result:
The data from the provider is filtered by the quoblet according to the query. The result is then sent to the receiver and printed on the screen.

Example 4: Select Project Join: Demonstrates the join operator. Combines data and request events and projects only the CPU information into a new data event.
1. Change to dquob-1.0/bin directory.
 cd dquob-1.0/bin
2. Create an SQL query and place it in a file (in this example, scripts/select_project_join.sql).
3. Convert the SQL query into a Tcl script.
dquob-1.0/bin> ./sql2tcl -f ../scripts/select_project_join.sql -t ../scripts/select_project_join.tcl
4. Start the channel server
dquob-1.0/bin> ./channel_server
To start the provider (console 1):
1. Change to dquob-1.0/examples directory.
2. Start the provider. It takes two parameters: -channel is the name of the channel on which data is sent, and -rate is a double that gives the frequency at which the provider sends data.
dquob-1.0/examples> ./provider -channel data_ev -rate 1
To start the receiver (console 2):
1. Change to dquob-1.0/examples directory.
2. Start the receiver. To see the receiver's input parameters, run ./receiver --help. -cpu_data_down is the channel on which the receiver gets the CPU data from the quoblet. -upstream specifies the name of the channel on which request events are sent to the quoblet. -rate takes a double value that specifies the rate at which request events are sent.
dquob-1.0/examples> ./receiver -cpu_data_down data_ev_down -upstream req_ev -rate 1
To start the quoblet (console 3):
1. Start the quoblet. -script specifies the query to be executed. -primitive specifies the name of the input data stream. -cpu_data_down specifies the name of the output data channel. -action_up specifies the channel on which request events arrive.
dquob-1.0/bin> ./quob -script ../scripts/select_project_join.tcl -primitive data_ev -cpu_data_down data_ev_down -action_up req_ev
Result:
The data from the provider is filtered by the quoblet according to the query. The result is then sent to the receiver and printed on the screen.

Example 5: Select Project Join Or: Demonstrates the OR operator. Joins two streams based on two conditions.
1. Change to dquob-1.0/bin directory.
 cd dquob-1.0/bin
2. Create an SQL query and place it in a file (in this example, scripts/select_project_join_or.sql).
3. Convert the SQL query into a Tcl script.
dquob-1.0/bin> ./sql2tcl -f ../scripts/select_project_join_or.sql -t ../scripts/select_project_join_or.tcl
4. Start the channel server
dquob-1.0/bin> ./channel_server
To start the provider (console 1):
1. Change to dquob-1.0/examples directory.
2. Start the provider. It takes two parameters: -channel is the name of the channel on which data is sent, and -rate is a double that gives the frequency at which the provider sends data.
dquob-1.0/examples> ./provider -channel data_ev -rate 1
To start the receiver (console 2):
1. Change to dquob-1.0/examples directory.
2. Start the receiver. To see the receiver's input parameters, run ./receiver --help. -cpu_data_down is the channel on which the receiver gets the CPU data from the quoblet. -upstream specifies the name of the channel on which request events are sent to the quoblet. -rate takes a double value that specifies the rate at which request events are sent.
dquob-1.0/examples> ./receiver -cpu_data_down data_ev_down -upstream req_ev -rate 1
To start the quoblet (console 3):
1. Start the quoblet. -script specifies the query to be executed. -primitive specifies the name of the input data stream. -cpu_data_down specifies the name of the output data channel. -action_up specifies the channel on which request events arrive.
dquob-1.0/bin> ./quob -script ../scripts/select_project_join_or.tcl -primitive data_ev -cpu_data_down data_ev_down -action_up req_ev
Result:
The data from the provider is filtered by the quoblet according to the query. The result is then sent to the receiver and printed on the screen.
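
The five examples follow one pattern and differ only in the query name, the output flag, and whether request events are used. The sketch below prints the command for each console from those three inputs. The function name example_commands and its argument convention are hypothetical; the commands and flags it prints are the ones shown in the examples above.

```shell
#!/bin/sh
# Print the commands for one example. Arguments:
#   $1  query name, e.g. simple_select
#   $2  output flag without the leading dash
#       (data_down, mem_data_down or cpu_data_down)
#   $3  "join" if the example also uses request events (optional)
example_commands() {
    name=$1
    out=$2
    mode=${3:-}
    printf '%s\n' "setup:     ./sql2tcl -f ../scripts/$name.sql -t ../scripts/$name.tcl"
    printf '%s\n' "setup:     ./channel_server"
    printf '%s\n' "console 1: ./provider -channel data_ev -rate 1"
    if [ "$mode" = "join" ]; then
        # Join examples add a request channel on both receiver and quoblet.
        printf '%s\n' "console 2: ./receiver -$out data_ev_down -upstream req_ev -rate 1"
        printf '%s\n' "console 3: ./quob -script ../scripts/$name.tcl -primitive data_ev -$out data_ev_down -action_up req_ev"
    else
        printf '%s\n' "console 2: ./receiver -$out data_ev_down"
        printf '%s\n' "console 3: ./quob -script ../scripts/$name.tcl -primitive data_ev -$out data_ev_down"
    fi
}

example_commands simple_select data_down
```

For instance, example_commands select_project_join cpu_data_down join prints the command set for Example 4. The function only prints; each command still has to be run from the directory given in the corresponding example.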
[Documentation]

Writing SQL queries

The dQUOB compiler takes queries from a file. The file has two parts: the first is the relation definition part, in which you define all referenced relations; the second is the query part, in which the query is written in an SQL-like extended continuous query language.

1. The relation definition uses SQL's CREATE TABLE statement: CREATE TABLE tablename (attributeName datatype, ......); Each relation definition must have a corresponding definition in the data format file ~/formats/ec_formats.h. The order of the attribute definitions should be reversed between the two. Currently, the supported data types are integer and float.

2. The SQL-like extended continuous query language has the following syntax:
CREATE RULE node_name IF
SELECT result_relation_name attribute_list
FROM relation_list
WHERE condition_list
START query_start_time
EXPIRE query_expire_time
RANGE join_window_size
THEN user_defined_function

Each file can contain several queries, each contained within a node; node_name denotes this node. By the naming rule, the first node is named C:1, the second C:2, and so on. "SELECT", "FROM" and "WHERE" behave like the standard SQL clauses, except that the result relation name must be given explicitly after "SELECT" and the wildcard sign "*" is not supported in "SELECT".

The keywords "START" and "EXPIRE" specify the lifetime of the query. "RANGE" is followed by the sliding window size for the join operation. The extended language accepts a user-defined function, specified after the keyword "THEN". If you don't have such a function, put "levelAll" as the user_defined_function.

For more details, please refer to the example SQL script files.
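
To make the two-part file layout concrete, the sketch below writes a hypothetical query file from the shell. Everything inside the file is an assumption made for illustration: the relation name data_ev, the attribute names, the result relation high_load, the comma-separated attribute list, and the numeric START, EXPIRE and RANGE values. It has not been checked against the dQUOB compiler, so treat the bundled example scripts as the authority on exact syntax.

```shell
#!/bin/sh
# Write a hypothetical query file following the two-part layout:
# relation definitions first, then one rule named C:1.
# All names and values inside the file are placeholders.
query_file="${TMPDIR:-/tmp}/hypothetical_select.sql"

cat > "$query_file" <<'EOF'
CREATE TABLE data_ev (cpu_load float, mem_free integer);

CREATE RULE C:1 IF
SELECT high_load cpu_load, mem_free
FROM data_ev
WHERE cpu_load > 0.5
START 0
EXPIRE 100
RANGE 5
THEN levelAll
EOF

echo "wrote $query_file"
```

The result relation name (high_load here) follows SELECT directly, no wildcard is used, and levelAll stands in for a user-defined function, as the rules above require.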

[Documentation]

Programmer's Guide

1. What is dQUOB?
dQUOB is a framework for executing continuous SQL queries on data streams. The quoblet is the query processor of dQUOB. It accepts queries as Tcl scripts. The dQUOB compiler converts SQL queries into Tcl scripts. Continuous queries are queries that run for a long time. The output of the dQUOB system is also a data stream, consisting of the qualifying tuples.

2. What is the dQUOB architecture?
The dQUOB framework has the following components:
1. Compiler - Converts SQL queries into Tcl scripts.
2. dQUOBEC - Event channel communication system used by dQUOB. All communications to dQUOB are made through the interface provided by dQUOBEC.
3. Quoblet - The query processor of dQUOB.
Input parameters to this process: input channels, output channels, scripts to be executed.
Output: a stream of output data on each respective output channel.

3. How to use the interface of dQUOB?
dQUOBEC provides the interface to dQUOB. dQUOBEC is an event channel communication system capable of transferring binary data. It uses PBIO as the underlying data format. For more information about using dQUOBEC to send events, requests and queries please refer to the user guide section.

4. How to use the compiler?
The compiler accepts an SQL query (.sql file) and creates a Tcl script (.s file) as output. This Tcl script needs to be placed in the dQUOB-1.0/scripts directory for the quoblet to find it.

5. How to pass the query to quoblet?
There are two methods to pass a tcl script to the quoblet. The first method is to pass the name of the script to be executed as an input argument to the quoblet. The second method is to pass the name of the tcl script to an already running quoblet through the management channel using dQUOBEC. We support only the first method in this release. The next release will provide functionality to dynamically deploy a query into an existing quoblet.
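
The first method can be wrapped in a small helper, sketched below, that compiles a query and then starts a quoblet on the resulting script. The function name start_quoblet is hypothetical. The flags and the channel names data_ev and data_ev_down are the ones used in the bundled examples, and the .tcl extension follows those examples as well. The helper assumes it is run from the dquob-1.0/bin directory.

```shell
#!/bin/sh
# Hypothetical helper: compile a query and start a quoblet on it.
# Run from dquob-1.0/bin; $1 is the query name without its extension.
start_quoblet() {
    name=$1
    sql="../scripts/$name.sql"
    tcl="../scripts/$name.tcl"
    # Refuse to continue if the query file does not exist.
    [ -f "$sql" ] || { echo "missing query file: $sql" >&2; return 1; }
    # Convert the SQL query into a Tcl script; stop if conversion fails.
    ./sql2tcl -f "$sql" -t "$tcl" || return 1
    # Pass the script name to the quoblet as an input argument.
    ./quob -script "$tcl" -primitive data_ev -data_down data_ev_down
}
```

A call such as start_quoblet simple_select would then correspond to steps 3 and onward of Example 1, with the channel server, provider and receiver started separately.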

6. What data event types are supported by dQUOB?
dQUOB currently supports a few data event types: EC_Data_EV (The main data), EC_Request_EV(Event with values to be matched against the data event) and two subsets of the Data Event to project data out.

7. Can I introduce new data events?
Yes, but this involves changing dQUOB code. Future releases will support new data events dynamically. If you would like to use dQUOB and need more event types than are currently available, please email the dQUOB support group for instructions and help.

8. What data formats are supported?
A data format is the actual format of the data inside each event type. It consists of a C structure and a corresponding PBIO description. All examples in dQUOB currently use EC_Data_EV (CPU load and memory information of a host), EC_Request_EV (request event with matching values), EC_Mem_Data_EV (subset of EC_Data_EV containing only the memory information of a host), and EC_Cpu_Data_EV (containing only the CPU information of a host).

9. Can I change the data formats?
Yes. To change the data format, edit dQUOB-1.0/formats/ec_protocol.h and dQUOB-1.0/formats/ec_formats.h. You also need to change dQUOB-1.0/src/quoblet/pickfuncs.C. These changes are very simple. For queries or concerns, please contact the dQUOB support group. In a future release, there will be support for XML data formats and dynamic addition of new formats.

10. What operators are supported by dQUOB?
dQUOB supports the SELECT, PROJECT, JOIN, AND and OR operators. It also supports a user-defined function that acts on the output stream through an ACT operator.

11. What relational operators are supported by dQUOB?
dQUOB supports the following relational operators: =, !=, >, <, >=, <= on integers and floats.

[Documentation]