Informatics faculty have independent research portfolios with peer-reviewed funding through NCI, NSF, DOD and other agencies. Some examples include:
Link Discovery and Pattern Learning (Page, Shavlik): Dr. Shavlik and Dr. Page have been involved in Defense Department-funded (DARPA, Air Force) methodological research focused on the development of multi-relational data-mining algorithms. This research has contributed to the growth of statistical relational learning (SRL), one of the most rapidly growing areas of data mining and machine learning research (http://www.dagstuhl.de/05051/, http://kdl.cs.umass.edu/events/srl2003/).
Advice-Taking Machine Learners (Shavlik): There is much more that human teachers can provide to learning machines than the traditionally used labeled examples and simple reinforcements. In another DARPA-funded project, Dr. Shavlik is developing robust algorithms that allow human teachers to provide advice, using simple English, to machine learners. The ability to instruct the software systems that one uses every day, thereby personalizing software to match one's style of working, promises to have a revolutionary impact on how humans operate in our complex information-technology society.
Biomedical Text Mining (Craven and Shavlik): Dr. Craven and Dr. Shavlik have both been developing novel algorithms and systems for automatically (i) filtering biomedical articles for their relevance to a particular information need, (ii) annotating the results of high-throughput experiments with literature-extracted key phrases, and (iii) populating databases by extracting assertions from the literature.
Machine Learning with Rich Data Sources and Interrelated Tasks (Craven): In the context of developing methods for uncovering gene-regulatory elements and networks, Dr. Craven's group has devised a number of novel machine-learning algorithms that are applicable to problem domains that involve multiple data sources, sequence data, and/or interrelated learning tasks. Among the specific contributions of this research program are state-of-the-art algorithms for (i) refining the structure of stochastic context-free grammars, (ii) taking advantage of relationships among multiple learning tasks to generate additional "weakly" labeled training examples from a pool of unlabeled examples, and (iii) representing and predicting elements in sequential data that overlap in arbitrary ways.