The Honors Seminar

CSCI H498 and INFO H498

Fall 2007



The Honors Seminar is listed as H498 in both the Computer Science Department and the School of Informatics. It is taught jointly by George Springer in the Computer Science Department and Santiago Schnell in Informatics. The faculty members in the Computer Science Department and the School of Informatics are engaged in research projects that investigate highly interesting problems that will influence computing in the future. Most undergraduate students do not have an opportunity to hear about this fascinating work in their normal coursework. The goal of this seminar is to have these professors present their research programs to interested juniors and seniors in a way that will be easily understood and, when possible, to offer the students the chance to participate in the projects.

The seminar meets each Monday evening from 7:00 to 9:00 p.m. in LH 101. Any student who meets the following requirements may attend the seminar meetings whether or not they are enrolled for credit. The requirements are (1) be an undergraduate majoring in computer science or informatics, (2) have a GPA of at least 3.3, and (3) have junior or senior standing (sophomores may attend or enroll with permission of one of the instructors). Those who meet these three requirements may enroll for one hour of credit each semester. Each week a different professor lectures about his/her research during the first hour. The second hour is devoted to an informal discussion of the research topic or any other questions that come up. This is a great way to get to know the faculty members more personally. To earn the grade for this course, the student must select one of the speakers he/she heard in the seminar and write a report of approximately ten pages about the goals of that speaker's project, what has been accomplished on the project to date, and what the plans are for the future. Attendance and participation in the discussions will also influence the grade.

The speakers change from semester to semester, so one may take the seminar several times to get a broader view of the research being done at IU. The speakers are mostly from Computer Science and Informatics, but some have come from such fields as astronomy, cognitive science, psychology, library science, law, chemistry, biology, and physics. For your information, the speakers for the entire Spring Semester 2007 are listed here along with their research interests and an abstract of their lecture. A complete list of speakers for the Fall semester is given here, with their research interests, the title, and a brief abstract of the talk.


August 27, 2007: Margaret Dolinsky, Assistant Professor of Fine Arts.

Research Interests: Margaret Dolinsky researches, designs, and creates for the CAVE Automatic Virtual Environment (CAVE). She investigates visual metaphors for navigation and guiding participants' roles in completing an art experience in a virtual environment.

Title: The use of visual analogy as a navigation tool in virtual environments.

Abstract: This research focuses on projection based interactive arts and the development of navigational strategies enhanced by visual imagery. Display theaters discussed include CAVE and CAVE-like devices and live performance. The CAVE is a virtual environment (VR) theater that uses stereo graphics and audio with head and hand tracking. The images and objects in the CAVE art environment act as navigational icons that show the visitor the way through a narrative. Perceptual shifts in cognition can be facilitated by a subversive confrontation with 3D worlds that are rich with the provocation of visual analogy. Virtual experience is enhanced by artistic applications combined with the use of such digital technologies as the CAVE, network collaboration and the international GRID.

September 3, 2007: Dirk Van Gucht, Professor of Computer Science.

Research Interests: Database theory and systems, data mining, machine learning.

Title: The Golden Age of Databases

Abstract: We find ourselves surrounded by systems and tools that allow us to generate and access large data sets that come in all kinds of shapes and variety. In this talk, I will describe the effort that has gone into the development from data systems developed in the 1960s to those we have now. This effort has arguably been the most important contribution to society by computer scientists, computer engineers, and informaticians. But more importantly, we find ourselves in a situation where there is a tremendous amount of work and discovery to be done by your generation. Thus, you find yourself in the Golden Age of Databases, and I look forward to your contribution!

September 10, 2007: Steve Johnson, Professor of Computer Science.

Research Interests: Formal methods for systems, design derivation, parallel symbolic computation, scientific instrumentation.

Title: ERTS: a robotic platform for education and research in Embedded Systems.

Abstract: The laboratory for CSCI P545---Embedded and Real-Time Systems---is a computer controlled golf car called ERTS. Each class undertakes projects to enhance or refine the ERTS architecture: last Spring we implemented GPS navigation; this Fall the goal is adding obstacle-avoidance capability. Ultimately, we hope to develop a vehicle that can autonomously make its way in the real world. I'll talk generally about embedded systems, ERTS itself, and the higher aims of research it enables.

September 17, 2007: Catherine A. Pilachowski, Professor and Daniel Kirkwood Chair of the Department of Astronomy;
Haldan N. Cohn, Professor of Astronomy;
Eric Ost, Senior Analyst/Programmer;
Scott Michael, Graduate Student.

Research Interests:
Caty Pilachowski, Stellar Compositions, Stellar Evolution and Nucleosynthesis, The Origin of the Elements in the Milky Way, Stellar Seismology. In addition to her astronomical research, Professor Pilachowski has been active in the areas of light pollution, astronomical instrumentation, large telescope design and construction, and electronic publications. She has served on numerous national and international boards and committees, and served as President of the American Astronomical Society from 2002 to 2004.
Haldan N. Cohn, Dynamical Evolution of Globular Star Clusters, Galactic Nuclei and Clusters of Galaxies, Optical and X-Ray Studies of Binary X-Ray sources, GRAPE-6 N-body Simulations of Globular Star Clusters.

Title: Computing among the Stars.

Abstract: Computing has always been an essential tool of astronomy, beginning with the measurement of time and the prediction of eclipses and continuing through modern-day simulations of highly complex physical systems. At the turn from the 19th to the 20th century, computing in astronomy was done by women hired as "computers" who carried out all of the detailed calculations needed to measure orbits, coordinates, and variability in stars. Astronomers were among the first scientists to take advantage of the development of computers, and IU's own Wrubel Computer Center is named for Marshall Wrubel, a pioneer in astronomical computations of stellar atmospheres in the 1950s and 1960s. Today, astronomers use computing in all aspects of the field, from real-time processing for telescope and instrument control to image processing and analysis, simulations and modeling, theory, and even publications. Astronomy offers some of the largest public domain, non-commercial data archives in the world, and proposed new instruments will dramatically increase requirements for information storage, access, and visualization. The International Virtual Observatory and its U.S. counterpart, the National Virtual Observatory, will seamlessly combine observations from diverse sources and over a broad wavelength range to allow astronomers to access all data available for astronomical sources from radio through gamma rays. Four members of the Astronomy Department will describe various aspects of computing used in the field today.

September 24, 2007: Minaxi Gupta, Assistant Professor of Computer Science.

Research Interests: Computer Networks and Distributed Systems, Security and Performance of Computer Networks.

Title: Watch Your Downloads.

Abstract: Peer-to-peer (P2P) networks such as Limewire continue to be a popular means of trading content. However, very little protection is in place to make sure that the files exchanged in these networks are not infected with malicious software, a.k.a. malware. In this talk, I will describe a 7-month-long measurement study we conducted at Indiana University that measured malware on two distinct P2P networks, Limewire and OpenFT. Our study found that quite a few different types of malware reside in these networks. However, only a few malware programs account for most of the observed infections. I will also discuss simple and effective approaches to filter malware.

October 1, 2007: Jonathan Mills, Associate Professor of Computer Science, also Leverhulme Trust Professor (United Kingdom), Research Fellow (The University of the West of England, Bristol, UK).

Research Interests: Physics and philosophy of Rubel's extended analog computer, the EAC; Wittgenstein’s picture theory; composing and performing classical, trance and jazz music on his many digital and analog synthesizers and workstations.

Title: What is computing?

Abstract: The Δ-digraph is a semantic modeling tool that structures nature, mathematics and computer architecture into hierarchical levels, and clearly defines the two fundamental types of computing machines: conventional computers that are based on the familiar paradigm algorithm, and unconventional computers based on the paradigm analogy. We will examine some properties of these paradigms as expressed with the Δ-digraph, and see how the simple question, “What is computing?” leads to the uncomfortable answer that, under the possible interpretations of quantum mechanics, we do not know what it means to compute even though we use computers daily. Indeed, the two paradigms of computing may lead to experiments that provide a new understanding of nature and physical law.

October 8, 2007: John Paolillo, Associate Professor of Informatics (INFO) and Information Science (SLIS).

Research Interests: Sociolinguistics and language acquisition, computational linguistics, second language acquisition, and South Asian languages.

Title: YouTube: Network, Structure, and Content.

Abstract: In this talk, I present results of an empirical investigation into the social structure of YouTube, addressing friend relations and their correlation with tags applied to uploaded videos. Results indicate that YouTube producers are strongly linked to others producing similar content. Furthermore, there is a socially cohesive core of producers of mixed content, with smaller cohesive groups around Korean music videos and anime music videos. Thus, social interaction on YouTube appears to be structured in ways similar to other social networking sites, but with greater semantic coherence around content. These results are explained in terms of the relationship of video producers to the tagging of uploaded content on the site.

October 15, 2007: L. Jean Camp, Associate Professor of Informatics.

Research Interests: Discovering and understanding the organizational, social, economic and technical interactions underlying technologies of trust.

Title: Privacy in ubiquitous and home computing: design, social, technical, and policy issues

Abstract: Ubiquitous computing (i.e. ubicomp) is moving from theory to true ubiquity. Individuals increasingly interact with ubiquitous computing in providing data to streams to video, location from their mobile handsets, personal data from RFID passports, and data in RFID-enabled retail locations. Ubiquitous computing offers new challenges to privacy and information autonomy. The nature of privacy with respect to ubicomp is hotly contested.

In designing for privacy in ubicomp systems the common practice is to select a framing of privacy from the range of definitions, and to use that to inform design. The classic sociological approach to addressing privacy is to develop hypotheses about communities, and then use large samples to predict what an individual might define as privacy. The idea of privacy as data protection requires clear delineation of data, data flows, and contractual consent.

In the case of in-home ubicomp, the subject of the data is the optimal source for the conceptualization of privacy. While data protection can offer valuable limits, and sociology can predict variance in population, only the particular individual can determine if his or her interaction with technology enhances his or her life.

Design for values can be utilized to leverage the complexity of privacy to improve ubicomp designs.  In design for values, also called value-sensitive design, every party that interacts with a system participates in developing a values statement. Design for values conceives of participants in ubicomp as stakeholders rather than as users and designers, while acknowledging that the interaction between different parties is limited by domain-specific knowledge. To support value-sensitive design in ubicomp and enhance the construction of a values statement, the paper presents an abbreviated overview of the various legal and philosophical constructs of privacy.

In summary, this presentation discusses privacy in ubicomp as a design, social, technical, and policy issue and outlines the research program at IU that is designed to meet the technical and social challenges of using sensor networks as a monitoring technology.

October 22, 2007: Yuqing (Melanie) Wu, Assistant Professor of Informatics.

Research Interests: Database systems, XML, database query languages, query optimization, data integration, data mining and knowledge discovery.

Title: What can we learn from the back of the book, GPS systems, and Google search?

Abstract: You may have noticed that books have a few pages of table of contents at the front and several pages of index at the end. They help us locate things quickly in a heavy 900-page book. Have you noticed that the table of contents and the index work quite differently from each other? But most importantly, they work very well.

Why can a GPS device identify your location precisely and instantly? Why can it locate the closest highway exits, gas stations, and restaurants relative to your current location?

Why is Google search so fast and precise? Why does typing in a few keywords return all relevant webpages?

Believe it or not, the idea behind all of these is the same: indexing, which is one of the most important techniques in managing data, especially for efficient data retrieval. We will discuss different types of indices, and how they help us in managing billions and trillions of data entries.
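As a rough illustration of the back-of-the-book idea, the sketch below builds a small inverted index in Python: each word maps to the pages on which it appears, so a lookup never has to scan the whole book. It is only a toy under stated assumptions and is not taken from the talk; the function names `build_index` and `lookup` and the three-page sample text are hypothetical.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the sorted list of page numbers containing it."""
    index = defaultdict(set)
    for page_no, text in pages.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?")].add(page_no)
    return {word: sorted(nums) for word, nums in index.items() if word}

def lookup(index, *words):
    """Return the pages containing every query word, via set intersection."""
    hits = [set(index.get(w.lower(), ())) for w in words]
    return sorted(set.intersection(*hits)) if hits else []

# Hypothetical three-page "book" used only for illustration.
pages = {
    1: "Databases store data in tables.",
    2: "An index speeds up data retrieval.",
    3: "Query optimization uses the index.",
}
index = build_index(pages)
```

Here `lookup(index, "data", "index")` consults only the two entries for "data" and "index" and intersects their page lists, which is why the cost depends on the size of those lists rather than on the size of the book.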

October 29, 2007: Luis Rocha, Associate Professor of Informatics.

Research Interests: Complex Systems Modeling: Network Analysis (Biological, Social and Knowledge Networks), Agent-based Modeling, Collective Knowledge Organization, Dynamical Systems. Computational and Mathematical Biology: Bioinformatics, Microarray Data Analysis, Automatic Functional Annotation, RNA Editing, Network Models, Systems Biology, Evolutionary Systems, Origin of Codes. Distributed Artificial Intelligence and Artificial Life: Adaptive and Evolutionary Computation, Cellular Automata, Emergent Computation, Embodied Cognition, Models of Cognitive Categorization, Origin of Representations and Symbols. Informatics: Intelligent Information Retrieval, Recommendation Systems, Knowledge Management, Data-Mining, Knowledge Discovery, Bioinformatics, Internet Development. Uncertainty Modeling: Fuzzy Set Theory, Evidence Theory, Measures of Uncertainty, Interval Computation, Evidence Sets, Fuzzy Graphs, Decision-Support Systems.

Title: A Computational Model of RNA Editing and the Evolution of Regulation and Memory in Dynamic Environments

Abstract: Evolutionary algorithms rarely deal with ontogenetic, non-inherited alteration of genetic information because they are based on a direct genotype-phenotype mapping. In contrast, in Nature several processes have been discovered which alter genetic information encoded in DNA before it is translated into amino-acid chains. Ontogenetically altered genetic information is not inherited but extensively used in regulation and development of phenotypes, giving organisms the ability to, in a sense, re-program their genotypes according to environmental cues. An example of post-transcriptional alteration of gene-encoding sequences is the process of RNA Editing. We introduce a novel Agent-based model of genotype editing and a computational study of its evolutionary performance in static and dynamic environments.

Our agent-based model of genotype editing is defined by two distinct genetic components: a coding portion encoding phenotypic solutions, and a non-coding portion used to edit the coding material. This setup leads to an indirect, stochastic genotype/phenotype mapping which captures essential aspects of RNA editing found in Nature. Previously, we have established the quantitative performance advantages of genotype editing against the canonical evolutionary algorithm in static and dynamic environments of various types. In this talk, we present a study of the qualitatively different evolutionary solutions attainable via genotype editing in drastically changing environments. In particular, we show how genotype editing leads to the emergence of regulatory signals, and allows agents to evolve an emergent memory of previous environments---a capacity not attainable by evolutionary algorithms that use only coding genetic material.

More information about our model:
http://informatics.indiana.edu/rocha/editing/

November 5, 2007: Kalpana Shankar, Assistant Professor of Informatics.

Research Interests: Recordkeeping and Scientific Memory, the Nature of Expertise.

Title: The social lives of bits: understanding the creation, sharing, and meanings of this thing we call 'data'

Abstract: We consider data management and recordkeeping (DM/RK) as fundamental to the conduct of research. Serious consequences to research and even human health can result from deliberate or accidental lapses. DM/RK and the use of information technology, in turn, reflect the complex intersections of policy, pedagogy, and practice. For social informatics, scientific DM/RK raises numerous intriguing questions around formal and informal policymaking, scientific practice, and ethics. Even more importantly, perhaps, data is not something that is only created by scientists. Data has become an important, almost invisible, part of all of our lives. In this talk, I’ll discuss how my interest in data and records has been shaped by theory and methodology, and some of the projects I am currently working on that explore the meanings, roles, and social dimensions of “data”.

November 12, 2007: Paul Purdom, Professor of Computer Science.

Research Interests: Analysis of Algorithms, Rewriting Systems, Compilers, Game Playing.

Title: Missing Values and Algorithms Similar to Singular Value Decomposition

Abstract: Paul Purdom in Computer Science and Dan Maki in Mathematics have been collaborating in a research project that attempts to extract useful information from systems of linear equations in which some of the data is missing. This kind of problem arises in many disciplines such as understanding microarray data in biology, editing the photographic image of a scene taken with multiple cameras, and understanding election results in politics. One of the examples they considered was predicting elections. The linear system is represented by a matrix whose columns are candidates, the rows are precincts, and the entries are the number of votes that the candidate received in the precinct. If the data includes precincts that are not in a candidate's district, then there is no vote recorded (and this is different from the candidate receiving a vote of zero). They use the method called singular value decomposition to get information that can be used for reliable predictions for the election. Paul will discuss this method and its applications.
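To make the missing-value setting concrete, here is a minimal Python sketch. It is not the speakers' method and it is not a full singular value decomposition: it is a rank-1 alternating least squares fit, a simple relative of SVD, that models each entry as a precinct factor times a candidate factor and skips entries marked `None`. The function name `rank1_fit` and the vote counts are hypothetical, and the data are constructed to be exactly rank 1 so the fit can recover the gaps.

```python
def rank1_fit(votes, iterations=200):
    """Fit votes[i][j] ~ row[i] * col[j] using only the observed (non-None) entries."""
    n_rows, n_cols = len(votes), len(votes[0])
    row = [1.0] * n_rows
    col = [1.0] * n_cols
    for _ in range(iterations):
        # Least-squares update of each row factor with the column factors held fixed.
        for i in range(n_rows):
            seen = [j for j in range(n_cols) if votes[i][j] is not None]
            denom = sum(col[j] ** 2 for j in seen)
            if denom > 0:
                row[i] = sum(votes[i][j] * col[j] for j in seen) / denom
        # Same update for each column factor with the row factors held fixed.
        for j in range(n_cols):
            seen = [i for i in range(n_rows) if votes[i][j] is not None]
            denom = sum(row[i] ** 2 for i in seen)
            if denom > 0:
                col[j] = sum(votes[i][j] * row[i] for i in seen) / denom
    return row, col

# Hypothetical precinct-by-candidate vote counts (rows are precincts, columns are
# candidates); None marks "candidate not on this precinct's ballot", not zero votes.
votes = [
    [100, 200, 50],
    [200, 400, None],
    [300, None, 150],
]
row, col = rank1_fit(votes)
estimate = row[1] * col[2]  # model's value for the missing entry in row 2, column 3
```

The key design point is that a missing entry is left out of every sum rather than treated as zero, which mirrors the distinction the abstract draws between no recorded vote and a vote of zero. Real election data would not be exactly rank 1, so more factors and some form of validation would be needed.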

November 26, 2007: Alessandro Flammini, Assistant Professor of Informatics, Adjunct Assistant Professor of Physics, College of Arts and Sciences, Affiliated Researcher, the Biocomplexity Institute.

Research Interests: Complex Networks, Applications to Information Systems, Ecological Systems, and Biological Systems.

Title: Modeling Statistical Features of Natural Language

Abstract: Almost sixty years have passed since the linguist G. K. Zipf showed how the frequency of words is related to their rank in occurrence. Since then, statistical analysis of natural languages has revealed several other examples of "patterns" or "regularities". The occurrence of such regularities, as in the case of Zipf’s law, often transcends the borders of linguistics and finds counterparts in the most diverse disciplines. Why, and to what extent, are these laws "universal"? What can (or cannot) statistics teach us about the inner structure of language? In the framework of these questions, I will present some recent work that my collaborators (M. Serrano and F. Menczer) and I have been doing in this area. Here are a few things I would like to touch upon: Heaps' law and the coupon collector's problem, the "burstiness effect" and the problem of topic identification, and the "unexpected" relation between simple generative models for text and for complex networks.
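For readers who want to look at these regularities themselves, the following Python sketch computes the two quantities the laws describe: a rank-frequency table (Zipf's law says frequency falls off roughly as a power of rank) and the vocabulary growth curve (Heaps' law says the number of distinct words grows sublinearly with text length). The function names and the sample sentence are hypothetical and are not drawn from the speaker's work.

```python
from collections import Counter

def rank_frequency(text):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(text.lower().split())
    # Sort by descending count, breaking ties alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count) for rank, (word, count) in enumerate(ranked, start=1)]

def vocabulary_growth(text):
    """Return the number of distinct words seen after each token."""
    seen, growth = set(), []
    for word in text.lower().split():
        seen.add(word)
        growth.append(len(seen))
    return growth

# Toy text, far too short to show a real power law.
sample = "the cat and the dog and the bird"
table = rank_frequency(sample)
growth = vocabulary_growth(sample)
```

On a large corpus, plotting count against rank on log-log axes, and vocabulary size against token position, would be the usual way to check how closely each law holds.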

December 3, 2007: Peter Ortoleva, Distinguished Professor of Chemistry, Adjunct Professor of Informatics and Geological Sciences, Director of the Center for Cell and Virus Theory.

Research Interests: The Center for Cell and Virus Theory is a research institute, the main objective of which is to develop mathematical and computational models of the physical and chemical processes underlying cell and virus behavior. We are addressing the challenge of understanding the workings of life on multi-, single- and sub-cellular scales. The interdisciplinary approach of the Center integrates methods from statistical mechanics, quantum chemistry, chemical kinetics, cell physiology, virology, biochemistry and computational sciences. Information theory is used to integrate models with data to arrive at a revolutionary automated model development, calibration and risk assessment approach.

Title: Introduction to the Self-Organizing Planet

Abstract: Earth presents a rare circumstance in the Universe wherein matter can be organized over a wide range of scales in space and time. The consequence is viruses, bacteria, eukaryotic cells, crystals, tornados and hurricanes, tectonic motion of the continents, and the Earth’s magnetic field. While these phenomena arise out of a variety of physical mechanisms, they have a common theme -- they emerge spontaneously, i.e., they self-organize. No external “guiding hand” moves the individual atoms in the correct way at the right time. How then is this possible? We shall discuss selected examples in detail and then open the discussion with the observation that the wondrous organization that characterizes planet Earth could emerge from a very few basic laws of physics.