Research

Bill’s interests in computer science include data mining, graph theory, artificial intelligence, and software engineering. These interests grew out of three formative career experiences: Master’s and Ph.D. degrees in Artificial Intelligence, and over eighteen years of industry software engineering experience covering every phase of the software life cycle.

Bill’s current research involves the discovery of patterns and anomalies in data represented as graphs. In 1996, Bill was recruited to lead a fraud detection software implementation team at a major telecommunications company. His previous software development experience and degree in artificial intelligence opened a door to a fascinating area of research and work: the development of systems to detect fraudulent patterns in telecommunications calling traffic, internal network traffic, and e-commerce transactions. Using what were then the latest techniques in expert system development and artificial intelligence, Bill implemented various approaches to supervised data mining.

This led to Bill’s Ph.D. research on the discovery of anomalies using graph-based algorithms. Taking a graph-theoretic approach to analyzing data for anomalies, Bill and Dr. Larry Holder initially investigated graph properties as they related to the discovery of anomalies in cargo shipments. Subsequently, they formulated information-theoretic (using the minimum description length principle) and probabilistic algorithms with high discovery success rates and minimal false positives, and their work has been accepted in several publications and conference proceedings. Currently, they are investigating the scalability of graph-based anomaly detection. Their work has also received funding from both the Department of Homeland Security and the National Science Foundation.

Bill aims to expand on this area of research by further investigating the scalability of graph-based approaches, as well as other data mining techniques that can be applied to real-world problems such as fraud detection in domains such as telecommunications call records, financial transactions, and healthcare data.