Fall 2007
The Honors Seminar is listed as H498 in both the Computer Science
Department and the School of Informatics. It is taught jointly by
George Springer in the Computer Science Department and Santiago
Schnell in Informatics. The faculty members in the Computer Science
Department and the School of Informatics are engaged in research
projects that are investigating highly interesting problems that will
influence computing in the future. Most undergraduate students do not
have an opportunity to hear about this fascinating work in their
normal coursework. The goal of this seminar is to have these
professors present their research programs to interested juniors and
seniors in a way that will be easily understood and, when possible, to
offer the students the chance to participate in the projects.
The seminar meets each Monday evening from 7:00 to 9:00 p.m. in LH
101. Any student who meets the following requirements may attend the
seminar meetings whether or not they are enrolled for credit. The
requirements are (1) be an undergraduate majoring in computer science
or informatics, (2) have a GPA of at least 3.3, and (3) have junior or
senior standing (sophomores may attend or enroll with permission of
one of the instructors). Those who meet these three requirements may
enroll for one hour of credit each semester.
Each week a different professor lectures about his/her research during
the first hour. The second hour is devoted to an informal discussion
of the research topic or any other questions that come up. This is a
great way to get to know the faculty members more personally. To earn
the grade for this course, the student must select one of the speakers
he/she heard in the seminar and write a report of approximately ten
pages about the goals of that speaker's project, what has been
accomplished on the project to date, and what the plans are for the
future. Attendance and participation in the discussions will also
influence the grade.
The speakers change from semester to semester, so one may take the
seminar several times to get a broader view of the research being done
at IU. The speakers are mostly from Computer Science and Informatics,
but some have come from such fields as astronomy, cognitive science,
psychology, library science, law, chemistry, biology, and physics. For
your information, the speakers for the entire Spring Semester 2007 are
listed here along with their research interests and an abstract of
their lecture. A complete list of speakers for the Fall semester is
given here, with their research interests, the title, and a brief
abstract of the talk.
August 27, 2007: Margaret Dolinsky, Assistant Professor of
Fine Arts. Research Interests: Margaret
Dolinsky researches, designs, and creates for the CAVE Automatic
Virtual Environment (CAVE). She investigates visual metaphors for
navigation and guiding participants' roles in completing an art
experience in a virtual environment.
September 3, 2007: Dirk Van Gucht, Professor of Computer
Science. Research Interests: Database
theory and systems, data mining, machine learning.
September 10, 2007: Steve Johnson, Professor of Computer
Science. Research Interests: Formal
methods for systems, design derivation, parallel symbolic computation,
scientific instrumentation.
September 17, 2007: Catherine A. Pilachowski, Professor and
Daniel Kirkwood Chair of the Department of Astronomy; Research Interests:
September 24, 2007: Minaxi Gupta, Assistant Professor of
Computer Science. Research Interests: Computer
Networks and Distributed Systems, Security and Performance of Computer
Networks.
October 1, 2007: Jonathan Mills, Associate Professor of
Computer Science, also Leverhulme Trust Professor (United Kingdom),
Research Fellow (The University of the West of England, Bristol,
UK). Research Interests: Physics and
philosophy of Rubel's extended analog computer, the EAC; Wittgenstein’s
picture theory; composing and performing classical, trance and jazz
music on his many digital and analog synthesizers and workstations.
October 8, 2007: John Paolillo, Associate Professor of
Informatics (INFO) and Information Science (SLIS). Research Interests:
Sociolinguistics and language acquisition, computational linguistics,
second language acquisition, and South Asian languages.
October 15, 2007: L. Jean Camp, Associate Professor of
Informatics. Research Interests: Discovering
and understanding the organizational, social, economic and technical
interactions underlying technologies of trust.
October 22, 2007: Yuqing (Melanie) Wu, Assistant Professor
of Informatics. Research Interests: Database
systems, XML, database query languages, query optimization, data
integration, data mining and knowledge discovery.
October 29, 2007: Luis Rocha, Associate Professor of
Informatics. Research Interests: Complex
Systems Modeling: Network Analysis (Biological, Social and
Knowledge Networks), Agent-based Modeling, Collective Knowledge
Organization, Dynamical Systems. Computational and Mathematical
Biology: Bioinformatics, Microarray Data Analysis, Automatic
Functional Annotation, RNA Editing, Network Models, Systems Biology,
Evolutionary Systems, Origin of Codes. Distributed Artificial
Intelligence and Artificial Life: Adaptive and Evolutionary
Computation, Cellular Automata, Emergent Computation, Embodied
Cognition, Models of Cognitive Categorization, Origin of
Representations and Symbols. Informatics: Intelligent
Information Retrieval, Recommendation Systems, Knowledge Management,
Data-Mining, Knowledge Discovery, Bioinformatics, Internet
Development. Uncertainty Modeling: Fuzzy Set Theory, Evidence
Theory, Measures of Uncertainty, Interval Computation, Evidence Sets,
Fuzzy Graphs, Decision-Support Systems.
November 5, 2007: Kalpana Shankar, Assistant Professor of
Informatics. Research Interests:
Recordkeeping and Scientific Memory, the Nature of Expertise.
November 12, 2007: Paul Purdom, Professor of Computer
Science. Research Interests: Analysis of
Algorithms, Rewriting Systems, Compilers, Game Playing.
November 26, 2007: Alessandro Flammini, Assistant Professor
of Informatics, Adjunct Assistant Professor of Physics, College of
Arts and Sciences, Affiliated Researcher, the Biocomplexity
Institute. Research Interests: Complex
Networks, Applications to Information Systems, Ecological Systems, and
Biological Systems.
December 3, 2007: Peter Ortoleva, Distinguished Professor of
Chemistry, Adjunct Professor of Informatics and Geological Sciences,
Director of the Center for Cell and Virus Theory. Research Interests: The Center
for Cell and Virus Theory is a research institute, the main objective
of which is to develop mathematical and computational models of the
physical and chemical processes underlying cell and virus behavior. We
are addressing the challenge of understanding the workings of life on
multi-, single- and sub-cellular scales. The interdisciplinary
approach of the Center integrates methods from statistical mechanics,
quantum chemistry, chemical kinetics, cell physiology, virology,
biochemistry and computational sciences. Information theory is used to
integrate models with data to arrive at a revolutionary automated
model development, calibration and risk assessment approach.
Title: The use of visual analogy as a navigation tool in virtual
environments.
Abstract: This research focuses on projection-based interactive
arts and the development of navigational strategies enhanced by visual
imagery. Display theaters discussed include CAVE and CAVE-like devices
and live performance. The CAVE is a virtual reality (VR) theater
that uses stereo graphics and audio with head and hand tracking. The
images and objects in the CAVE art environment act as navigational
icons that show the visitor the way through a narrative. Perceptual
shifts in cognition can be facilitated by a subversive confrontation
with 3D worlds that are rich with the provocation of visual
analogy. Virtual experience is enhanced by artistic applications
combined with the use of such digital technologies as the CAVE,
network collaboration and the international GRID.
Title: The Golden Age of Databases
Abstract: We find ourselves surrounded by systems and tools that
allow us to generate and access large data sets that come in all kinds
of shapes and varieties. In this talk, I will describe the effort that
has gone into developing data systems, from those of the 1960s to
those we have now. This effort has arguably been the most
important contribution to society by computer scientists, computer
engineers, and informaticians. But more importantly, we find ourselves
in a situation where there is a tremendous amount of work and
discovery to be done by your generation. Thus, you find yourselves in the
Golden Age of Databases, and I look forward to your contributions!
Title: ERTS: a robotic platform for education and research in Embedded
Systems.
Abstract: The laboratory for CSCI P545---Embedded and Real-Time
Systems---is a computer-controlled golf car called ERTS. Each class
undertakes projects to enhance or refine the ERTS architecture: last
Spring we implemented GPS navigation; this Fall the goal is adding
obstacle-avoidance capability. Ultimately, we hope to develop a
vehicle that can autonomously make its way in the real world. I'll
talk generally about embedded systems, ERTS itself, and the higher
aims of research it enables.
Haldan N. Cohn, Professor of Astronomy;
Eric Ost, Senior Analyst/Programmer;
Scott Michael, Graduate Student.
Caty Pilachowski, Stellar Compositions, Stellar Evolution and
Nucleosynthesis, The Origin of the Elements in the Milky Way, Stellar
Seismology. In addition to her astronomical research, Professor
Pilachowski has been active in the areas of light pollution,
astronomical instrumentation, large telescope design and
construction, and electronic publications. She has served on numerous
national and international boards and committees, and served as
President of the American Astronomical Society from 2002-2004.
Haldan N. Cohn, Dynamical Evolution of Globular Star Clusters,
Galactic Nuclei and Clusters of Galaxies, Optical and X-Ray Studies of
Binary X-Ray sources, GRAPE-6 N-body Simulations of Globular Star
Clusters.
Title: Computing among the Stars.
Abstract: Computing has always been an essential tool of
astronomy, beginning with the measurement of time and the prediction
of eclipses and continuing through modern-day simulations of highly
complex physical systems. At the turn of the 20th century,
computing in astronomy was done by women hired as "computers" who
carried out all of the detailed calculations needed to measure orbits,
coordinates, and variability in stars. Astronomers were among the
first scientists to take advantage of the development of computers,
and IU's own Wrubel Computer Center is named for Marshall Wrubel, a
pioneer in astronomical computations of stellar atmospheres in the
1950s and 1960s. Today, astronomers use computing in all aspects of
the field, from real-time processing for telescope and instrument
control to image processing and analysis, simulations and modeling,
theory, and even publications. Astronomy offers some of the largest
public domain, non-commercial data archives in the world, and proposed
new instruments will dramatically increase requirements for
information storage, access, and visualization. The International
Virtual Observatory and its U.S. counterpart, the National Virtual
Observatory, will seamlessly combine observations from diverse sources
and over a broad wavelength range to allow astronomers to access all
data available for astronomical sources from radio through gamma rays.
Four members of the Astronomy Department will describe various aspects
of computing used in the field today.
Title: Watch Your Downloads.
Abstract: Peer-to-peer (P2P) networks, such as Limewire,
continue to be a popular means of trading content. However, very little
protection is in place to make sure that the files exchanged in these
networks are not infected with malicious software, a.k.a. malware. In
this talk, I will describe a seven-month measurement study we
conducted at Indiana University that measured malware on two distinct
P2P networks, Limewire and OpenFT. Our study found that quite a few
different types of malware reside in these networks. However, only a
few types of malware account for most of the observed infections. I will also
discuss simple and effective approaches to filter malware.
Title: What is computing?
Abstract: The Δ-digraph is a semantic modeling tool
that structures nature, mathematics and computer architecture into
hierarchical levels, and clearly defines the two fundamental types of
computing machines: conventional computers that are based on the
familiar paradigm algorithm, and unconventional computers based on the
paradigm analogy. We will examine some properties of these paradigms
as expressed with the Δ-digraph, and see how the simple
question, “What is computing?” leads to the uncomfortable answer
that, under the possible interpretations of quantum mechanics, we do
not know what it means to compute even though we use computers
daily. Indeed, the two paradigms of computing may lead to experiments
that provide a new understanding of nature and physical law.
Title: YouTube: Network, Structure, and Content.
Abstract: In this talk, I present results of an empirical
investigation into the social structure of YouTube, addressing friend
relations and their correlation with tags applied to uploaded videos.
Results indicate that YouTube producers are strongly linked to others
producing similar content. Furthermore, there is a socially cohesive
core of producers of mixed content, with smaller cohesive groups
around Korean music videos and anime music videos. Thus, social
interaction on YouTube appears to be structured in ways similar to
other social networking sites, but with greater semantic coherence
around content. These results are explained in terms of the
relationship of video producers to the tagging of uploaded content on
the site.
Title: Privacy in ubiquitous and home computing: design,
social, technical, and policy issues
Abstract: Ubiquitous computing (ubicomp) is moving from
theory to true ubiquity. Individuals increasingly interact with
ubiquitous computing by providing data streams: video, location
from their mobile handsets, personal data from RFID passports, and
data in RFID-enabled retail locations. Ubiquitous computing offers new
challenges to privacy and information autonomy. The nature of privacy
with respect to ubicomp is hotly contested.
In designing for privacy in ubicomp systems the common practice is to
select a framing of privacy from the range of definitions, and to use
that to inform design. The classic sociological approach to addressing
privacy is to develop hypotheses about communities, and then use large
samples to predict what an individual might define as privacy. The
idea of privacy as data protection requires clear delineation of data,
data flows, and contractual consent.
In the case of in-home ubicomp, the subject of the data is the optimal
source for the conceptualization of privacy. While data protection can
offer valuable limits, and sociology can predict variance in
population, only the particular individual can determine if his or her
interaction with technology enhances his or her life.
Design for values can be utilized to leverage the complexity of
privacy to improve ubicomp designs. In design for values, also called
value-sensitive design, every party that interacts with a system
participates in developing a values statement. Design for values
conceives of participants in ubicomp as stakeholders rather than as
users and designers, while acknowledging that the interaction between
different parties is limited by domain-specific knowledge. To support
value-sensitive design in ubicomp and enhance the construction of a
values statement, the paper presents an abbreviated overview of the
various legal and philosophical constructs of privacy.
In summary, this presentation discusses privacy in ubicomp as a
design, social, technical, and policy issue and outlines the
research program at IU that is designed to meet the technical and
social challenges of using sensor networks as a monitoring technology.
Title: What can we learn from the back of the book, GPS systems,
and Google search?
Abstract: You may have noticed that most books have a few pages of
table of contents at the front and several pages of index at the
end. They help us locate things quickly in a heavy 900-page
book. Have you noticed that the table of contents and the index work
quite differently from each other? Most importantly, both work
very well.
Why can a GPS device identify your location
precisely and instantly? Why can it locate the closest highway exit,
gas stations, and restaurants relative to your current location?
Why is Google search so fast and precise? Why does typing in a few
keywords return all the relevant web pages?
Believe it or not, the idea behind all of these is the same: indexing,
which is one of the most important techniques in managing data,
especially for efficient data retrieval. We will discuss different
types of indices, and how they help us in managing billions and
trillions of data entries.
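As a rough illustration of the indexing idea described above, here is a minimal sketch of an inverted index, the back-of-the-book style of index that maps each word to the places where it occurs. The sample pages and the function names `build_inverted_index` and `search` are hypothetical and chosen only for this example; they are not taken from the talk.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the ids of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the pages for the first word, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data, for illustration only.
pages = {
    1: "database systems and query optimization",
    2: "query languages for XML",
    3: "GPS navigation and obstacle avoidance",
}
index = build_inverted_index(pages)
hits = search(index, "query optimization")
```

A lookup touches only the entries for the query words instead of scanning every page, which is why an index pays off as the collection grows.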
Title: A Computational Model of RNA Editing and the Evolution
of Regulation and Memory in Dynamic Environments
Abstract: Evolutionary algorithms rarely deal with ontogenetic,
non-inherited alteration of genetic information because they are based
on a direct genotype-phenotype mapping. In contrast, in Nature several
processes have been discovered which alter genetic information encoded
in DNA before it is translated into amino-acid chains. Ontogenetically
altered genetic information is not inherited but extensively used in
regulation and development of phenotypes, giving organisms the ability
to, in a sense, re-program their genotypes according to environmental
cues. An example of post-transcriptional alteration of gene-encoding
sequences is the process of RNA Editing. We introduce a novel
agent-based model of genotype editing and a computational study of its
evolutionary performance in static and dynamic environments.
Our agent-based model of genotype editing is defined by two distinct
genetic components: a coding portion encoding phenotypic solutions,
and a non-coding portion used to edit the coding material. This setup
leads to an indirect, stochastic genotype/phenotype mapping which
captures essential aspects of RNA editing found in Nature. Previously,
we have established the quantitative performance advantages of
genotype editing against the canonical evolutionary algorithm in
static and dynamic environments of various types. In this talk, we
present a study of the qualitatively different evolutionary solutions
attainable via genotype editing in drastically changing
environments. In particular, we show how genotype editing leads to the
emergence of regulatory signals, and allows agents to evolve an
emergent memory of previous environments---a capacity not attainable by
evolutionary algorithms that use only coding genetic material.
More information about our model:
http://informatics.indiana.edu/rocha/editing/
Title: The social lives of bits: understanding the creation,
sharing, and meanings of this thing we call 'data'
Abstract: We consider data management and recordkeeping (DM/RK)
as fundamental to the conduct of research. Serious consequences to
research and even human health can result from deliberate or
accidental lapses. DM/RK and the use of information technology, in
turn, reflect the complex intersections of policy, pedagogy, and
practice. For social informatics, scientific DM/RK raises numerous
intriguing questions around formal and informal policymaking,
scientific practice, and ethics. Even more importantly, perhaps, data
is not something that is only created by scientists. Data has become
an important, almost invisible, part of all of our lives. In this
talk, I’ll discuss how my interest in data and records has been
shaped by theory and methodology, and some of the projects I am
currently working on that explore the meanings, roles, and social
dimensions of “data”.
Title: Missing Values and Algorithms Similar to Singular Value
Decomposition
Abstract: Paul Purdom in Computer Science and Dan Maki in
Mathematics have been collaborating in a research project that
attempts to extract useful information from systems of linear
equations in which some of the data is missing. This kind of problem
arises in many disciplines such as understanding microarray data in
biology, editing the photographic image of a scene taken with multiple
cameras, and understanding election results in politics. One of the
examples they considered was predicting elections. The linear system
is represented by a matrix whose columns are candidates, the rows are
precincts, and the entries are the number of votes that the candidate
received in the precinct. If the data includes precincts that are not
in a candidate's district, then there is no vote recorded (and this is
different from the candidate receiving a vote of zero). They use the
method called singular value decomposition to get information that can
be used for reliable predictions for the election. Paul will discuss
this method and its applications.
Title: Modeling Statistical Features of Natural Language
Abstract: Almost sixty years have passed since the linguist
G. K. Zipf showed how the frequency of words is related to their rank
in occurrence. Since then, statistical analysis of natural languages
has revealed several other examples of "patterns" or "regularities".
The occurrence of such regularities, as in the case of Zipf’s
law, often transcends the borders of linguistics and finds
counterparts in the most diverse disciplines. Why, and to what
extent, are these laws "universal"? What can (or cannot) statistics
teach us about the inner structure of language? In the framework of
these questions, I will present some recent work that my collaborators
(M. Serrano and F. Menczer) and I have been doing in this area. Here
are a few things I would like to touch upon: Heaps' law and the coupon
collector's problem, the "burstiness effect" and the problem of topic
identification, the "unexpected" relation between simple generative
models for text and for complex networks.
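For readers unfamiliar with the rank-frequency relation mentioned above, the following is a minimal sketch of how such a table can be computed. Zipf's law says that a word's frequency is roughly inversely proportional to its rank, so the product of rank and count should stay roughly constant in a large corpus. The function name `rank_frequency` and the toy sentence are assumptions made for illustration; a sample this small is far too short to show the law itself.

```python
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(text.lower().split())
    # Sort by descending count; break ties alphabetically for stable output.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


# Toy input, for illustration only.
sample = "the cat and the dog and the bird"
table = rank_frequency(sample)
for rank, word, count in table:
    # Under Zipf's law, rank * count is roughly constant for large corpora.
    print(rank, word, count, rank * count)
```

Running the same function on a long text and plotting count against rank on log-log axes is the usual way to check how closely a corpus follows the law.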
Title: Introduction to the Self-Organizing Planet
Abstract: Earth presents a rare circumstance in the Universe
wherein matter can be organized over a wide range of scales in space
and time. The consequence is viruses, bacteria, eukaryotic cells,
crystals, tornados and hurricanes, tectonic motion of the continents,
and the Earth’s magnetic field. While these phenomena arise out of a
variety of physical mechanisms, they have a common theme -- they
emerge spontaneously, i.e., they self-organize. No external “guiding
hand” moves the individual atoms in the correct way at the right
time. How then is this possible? We shall discuss selected examples
in detail and then open the discussion with the observation that the
wondrous organization that characterizes planet Earth could emerge from
a very few basic laws of physics.