The following is a real review that one of our papers received at a very reputable CS conference. If you ran ICSERGen before reading this, you may think it is a joke, but it's not. I'm reproducing it here verbatim.

2007-12-11 UPDATE: For the record, in an email I got today, Michele Lanza said:
"I hope that people do not think that I have anything to do with that review, I've never been on a ICSE PC... would it be possible to clarify this?"
I am adding this quotation, with his consent, as the clarification he asked for.


-----

My major problem with this paper is that I could not find anything new in the paper. Using linguistic idioms (the paper calls is "meaningful names") has been used, for example, in concept assignment since the early 90s (Biggerstaff's paper in 89).

The heuristics are explained on a rather shallow level. No numbers are given, no intervals, no details. So, for example, how is "specifity" really defined and what is the detailed heuristics. Same for the other heuristics.

Some of the well-known concepts have been just renamed: "popularity" is nothing else than coupling; "specifity" is decomposition level; heuristics one has not even gotten a name.

When the paper gets to a bit more detail such as the page rank, it stops abruptly.

The assessment claims recall: how do you assure that? over 17 million lines of code? There another notion is needed (such as accuracy or conciseness) and a different way of measuring it. How can I know the recall for a search query over thousand projects and a multi-million line code base?

The paper uses the term "Code Crawler" which is a reverse engineering tool developed by Michele Lanza. The paper cannot use the same term and generate confusion (in many dimensions). A simple "google" search would have helped with the naming.

The paper talks about "any relations between two code entities", but I haven't seen any discussion on that (whether in the theoretical part of the paper nor in the validation part). It is just mentioned. For a multi-million code base this is _relevant_ how an approach can deal with that.

The paper mentions source code representations, but omits to mention "FAMIX" as one of the major oo source code meta-models.

The difference to Google code search facility is also not discussed.

I could not understand Fig 2; this kind of "visualization" is not effective (and also not intuitive).

In general, I found the paper disappointing, hardly any technical details, too many claims, not well described. I could not find convincing scientific depth in the paper other than we have a tool and it works.