A Theory of Aspects

We just got the news that our paper entitled "A Theory of Aspects" was accepted for the technical track at OOPSLA'08! This paper resulted from a close collaboration between Pierre Baldi and myself, through work done mostly by Erik Linstead, with Sushil doing a piece of the validation. The work is being supported by a grant from the NSF. Here's a bit of history behind this paper.

Way back in 1996-1997, when I was finishing my PhD thesis about AOP, I had a feeling aspects could be formalized in terms of information theory, specifically the concept of entropy; I just didn't know how to do it. It turns out that my intuitions were right, but slightly ahead of their time.

The late 90s were also when a lot of work in topic modeling started to appear. In 2003, David Blei, Andrew Ng and Michael Jordan published a paper describing Latent Dirichlet Allocation (LDA), a technique they devised for unsupervised probabilistic topic modeling. LDA has proven to be of great value in topic modeling of natural language texts. Among its by-products are two kinds of probability distributions, of topics over documents and of documents over topics, whose entropies, in the information theory sense, can be computed directly.

So… Pierre, who knows a lot about these mathematical techniques, and I, who know a thing or two about aspects, put these two things together and devised a way of formulating aspects as latent topics. In the process, tangling and scattering become measurable as the entropies mentioned above.
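To make the entropy idea a bit more concrete, here is a toy sketch in Python. It is not the code or the exact formulation from the paper. The topic names, the weights, and the helper names (entropy, scattering, tangling) are all made up for illustration, and I am assuming that scattering is the entropy of a topic's distribution over files, and that tangling is the entropy of a file's distribution over topics.

```python
import math

def entropy(weights):
    """Shannon entropy, in bits, of non-negative weights normalized to sum to 1."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return sum(-p * math.log2(p) for p in probs)

# Hypothetical topic-by-file weights (invented numbers, not data from the study).
# Each row is a topic; each column is one of four source files.
WEIGHTS = {
    "logging": [5, 5, 5, 5],   # spread evenly over every file
    "parsing": [20, 0, 0, 0],  # concentrated in a single file
}

def scattering(topic):
    """Assumed definition: entropy of one topic's distribution over files."""
    return entropy(WEIGHTS[topic])

def tangling(file_index):
    """Assumed definition: entropy of one file's distribution over topics."""
    return entropy([row[file_index] for row in WEIGHTS.values()])

if __name__ == "__main__":
    for topic in WEIGHTS:
        print(topic, "scattering:", scattering(topic))
    for i in range(4):
        print("file", i, "tangling:", tangling(i))
```

In this toy data, the evenly spread "logging" topic gets the maximum scattering possible for four files (2 bits), while "parsing", which lives in a single file, gets zero. The first file mixes both topics, so it is the only one with non-zero tangling.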

We tested this theory on a collection of 4,500+ Java projects comprising 37+ million lines of code, and, much to our surprise, we saw all the prototypical examples of aspects (and a few more) emerge as latent topics with high entropy! Indeed, it seems that those topics are pervasive in (Java) software development, and that they are highly scattered among the files.

There are lots of important details about this study, but you'll have to read the paper…