MIT Artificial Intelligence Lab
Mark A. Foltz mfoltz@ai.mit.edu
Thesis Committee Meeting
Hello, today I am going to talk about my
progress on my thesis research on the relationship between software
diagramming and software refactoring.
My agenda is to first briefly describe
the problem I’m working on and my approach to it.
Then I will talk about
refactorings in Dr. Jones – what Dr. Jones knows about them and what space of
refactorings Dr. Jones will incorporate.
Then I’ll illustrate a scenario of multiple refactorings in Dr.
Jones and end with a status update.
First I’ll describe the problem I’m attacking.
That problem is the complexity of redesigning software.
When thinking about design, programmers prefer diagrams that abstract from the
details in source code.
However,
current tools provide diagrams like the one on the left that quickly become as
complicated as the program they’re trying to depict.
The programmer can
try to filter out what she doesn’t want to see, but usually it’s easier for
her to pick up a pen and paper and redesign by drawing only the parts she wants
to see (like the diagram on the right).
Unfortunately, the computer can’t help the programmer when she
redesigns with pen and paper – and I believe that it should.
I’d like to bring the computer
back into this process by starting with two observations.
First, programmers’ pen and paper diagrams are task-relevant –
they draw the parts of the program they want to change and the dependencies
that are involved, and leave out the rest.
Second, there’s a growing body of
commonly used design moves called refactorings – local, structural changes to
the program that involve a few of its related parts.
These observations led me to the
thesis that if a diagramming tool understood the refactorings the programmer
wanted to make, it could
(1) draw relevant, task-specific diagrams and
(2) use those diagrams to help the programmer
interactively explore the program’s design.
Stepping back for a bit, a redesign tool
could potentially support the user in a number of roles, because refactoring
is a multi-step process.
First, the tool can give the user a visual
representation where it’s easy to spot opportunities for refactoring.
Second,
the tool can diagnose design problems by looking for ‘bad smells’ or ‘antipatterns’,
known patterns of design weakness like code duplication.
Third, the tool
can show the user the results of proposed refactorings, allowing them to chain
together multiple refactorings to visualize new designs.
And finally the
tool can verify that the refactorings preserve behavior and implement them by
changing the source code.
Choosing the actual refactorings – between steps
two and three in this process – is the most difficult step and remains up to
the user.
Dr. Jones addresses the first and third roles listed here.
The metaphor is that of a fellow programmer who knows the program you’re refactoring
(although not what it does), can draw accurate diagrams of it, and can give the
programmer guidance while refactoring.
It innovates by decoupling the steps of planning and implementing
the refactorings
-- current tools
transform the source immediately when the user makes a refactoring
decision.
I see three main contributions resulting from this research.
The motivating contribution is Dr. Jones, the tool I am developing, which
requires two main innovations.
The
first innovation is a knowledge base of refactorings for Java programs, built
from the perspective of a tool that assists the user in
visual design exploration.
The second
innovation is a mechanism for keeping the contents of software diagrams
relevant across multiple refactorings by tracking the focus of refactoring
attention.
The rest of this talk
will focus on my progress towards realizing this first innovation.
First I will describe what Dr. Jones
knows about each individual refactoring for Java.
Briefly, a refactoring is a structural
change to a program that improves its design, without changing its visible
behavior.
Common examples are moving a method, generalizing classes to a
base class, and encapsulating methods into a delegate.
Let’s look at the move method refactoring
in detail, to illustrate what Dr. Jones knows about a typical
refactoring.
Suppose we decide, to reduce coupling, that miles per gallon is
really a property of a vehicle and not of its engine.
We
refactor by moving the method to Vehicle, and leaving a skeleton method behind
in Engine that delegates to the new location.
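As a rough sketch of what the program might look like after this move: only Engine, Vehicle and the miles-per-gallon method come from the example. The field names, constructor and arithmetic are assumptions made purely for illustration.

```java
// Hypothetical "after" state of the move method refactoring.
class Vehicle {
    private final double milesDriven;  // assumed field, not from the talk
    private final double gallonsUsed;  // assumed field, not from the talk

    Vehicle(double milesDriven, double gallonsUsed) {
        this.milesDriven = milesDriven;
        this.gallonsUsed = gallonsUsed;
    }

    // The method now lives on Vehicle.
    double milesPerGallon() {
        return milesDriven / gallonsUsed;
    }
}

class Engine {
    private final Vehicle vehicle;

    Engine(Vehicle vehicle) {
        this.vehicle = vehicle;
    }

    // Skeleton method left behind: it only delegates to the new location,
    // so existing callers of Engine.milesPerGallon() keep working.
    double milesPerGallon() {
        return vehicle.milesPerGallon();
    }
}
```

Existing callers of the Engine method see the same result, which is what makes this a refactoring and not a behavior change.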
What would Dr. Jones need to know to help me plan this refactoring?
In Dr. Jones, I represent a refactoring by four pieces of knowledge.
First, what are the obvious reasons not to perform the refactoring (the
guards).
Second, how does the refactoring change Dr. Jones’ representation of the program
design and thus what is shown in its diagrams.
Third, does the refactoring suggest other refactorings that
are likely to improve the program design.
And finally, where are the places in the source that might have to
be changed to implement the refactoring.
I’ll now examine these four
pieces of knowledge in detail for the move method refactoring.
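One way to picture these four pieces of knowledge is as an interface with one method per piece. This is only an illustrative sketch: the actual entries are written in a semi-formal language, and the interface name, method names and type parameter below are all hypothetical.

```java
import java.util.List;

// Hypothetical shape of one knowledge-base entry; D stands for whatever
// representation of the program design the tool keeps.
interface RefactoringKnowledge<D> {
    // 1. Guards: obvious reasons not to perform the refactoring.
    //    An empty list means no objection was found.
    List<String> guardViolations(D design);

    // 2. Impact: the design representation after the refactoring,
    //    which determines what the diagram shows.
    D applyToDesign(D design);

    // 3. Suggestions: follow-up refactorings likely to improve the design.
    List<String> suggestedFollowUps(D design);

    // 4. Source guidance: places in the source that might have to change
    //    when the refactoring is eventually implemented.
    List<String> sourceSitesToChange(D design);
}
```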
First, Dr. Jones can check for guards –
obvious reasons that one wouldn’t want to do the refactoring.
In this
case, Dr. Jones can check for name conflicts, and that you’re not trying to
move a constructor.
Also, a different set of rules applies for move method if
the source and target classes are related by inheritance.
Note that
these ‘guards’ don’t completely check that the refactoring preserves the
program’s behavior (since that would involve much more difficult analyses).
Rather
these are more like sanity checks to help the programmer avoid refactoring
mistakes.
(Dr. Jones can remind the programmer to check the more difficult safety conditions
when it’s time to implement the refactorings.
This runs the risk, however, of allowing the programmer to plan unsafe
refactorings with Dr. Jones.)
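The two guards mentioned for move method could be sketched as a simple check like the following. The class name, method name and parameters are hypothetical, and comparing bare method names is a simplification (a real check in Java would compare full signatures, since overloads are legal).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sanity check for move method; not a behavior-preservation proof.
class MoveMethodGuards {
    // Returns human-readable reasons not to move the method.
    // An empty list means no obvious objection was found.
    static List<String> check(String methodName,
                              boolean isConstructor,
                              Set<String> targetMethodNames) {
        List<String> problems = new ArrayList<>();
        if (isConstructor) {
            problems.add("constructors cannot be moved");
        }
        if (targetMethodNames.contains(methodName)) {
            problems.add("name conflict: target already declares " + methodName);
        }
        return problems;
    }
}
```

Returning a list of reasons, instead of a single boolean, fits the idea of guards as advice to the programmer.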
The impact on the design representation
is straightforward (as we saw a few slides ago in the move method
example).
Dr. Jones copies the method signature from the source to the
target and notes that the old method delegates to the new location.
Dr. Jones can make several design
suggestions even for this seemingly straightforward refactoring.
If the
method is polymorphic in the source hierarchy, it’s likely that the programmer
will want to express that polymorphism on the target hierarchy.
So we leave to-dos for the programmer
to move the overriding and overridden methods to appropriate places in the
target hierarchy.
Also, if the method is overloaded with functions of the
same name but different signatures, then the programmer might want to move
those as well.
Finally, if the method uses fields or methods in the
source class, the programmer will need to provide access to them (i.e., by
encapsulating those fields).
Finally we can give some guidance to the
programmer when he is ready to tackle the source, like moving the method body
to the target, implementing the delegation in the source, and converting the
uses of source members.
(Dr.
Jones builds a cross reference of which methods use which other fields and methods
for use in these last two steps.)
Dr. Jones decouples the choice of
refactoring steps from the actual manipulation of the source that
implements the refactorings.
In this way multiple alternatives can be more
easily explored, and the hard work of implementing the refactorings undertaken
once an alternative is chosen.
Dr. Jones uses these four pieces of
knowledge together to play its role as a diagramming assistant.
I’d like to compare Dr. Jones’ knowledge
about refactoring to that of another research project, the Smalltalk
Refactoring Browser developed at UIUC.
The Refactoring Browser was
primarily concerned with giving the user a safe and reliable tool – the user
could trust it to know when a refactoring is behavior-preserving, and if so to
transform the source correctly.
Dr. Jones, on the other hand, has knowledge
that will give the user visual feedback on the new designs generated by
refactoring, prevent bad refactorings, and suggest ‘follow-up’
refactorings.
These kinds of knowledge haven’t been explicitly considered
before in a refactoring tool, and I believe my specification of them represents
a contribution to refactoring research.
I also believe these two bodies of
knowledge are complementary, and a tool that integrates, for example, design
diagnosis, design exploration and source transformation would be a more
complete solution and a fruitful direction for future work.
Next I am going to give an overview of
what refactorings are included in the knowledge base.
The KB is structured around a set of
refactoring verbs that can be applied to the major program elements in
Java.
This vocabulary was
motivated by the desire to have an economical number of actions that the user
can apply to elements of the diagram, instead of a flat list that would have
to be learned and remembered.
The vocabulary also sets up a space of
possible refactorings whose cases can be filled in for a specific language (in
this case Java).
I have specified entries in the KB for
each of these check marks in a semi-formal language.
Most of the missing marks are cases that don’t make sense in Java, [next
slide]
e.g., hiding a package.
Creating and removing fields and methods aren’t included because they don’t seem
to be in the spirit of behavior preservation, and adding new functionality is
a separate concern.
The intention is to let programmers naturally express
typical sequences of refactorings they would use in practice.
We can compare the coverage of Dr. Jones
to the catalog in Fowler’s 1999 book and a leading refactoring CASE tool.
Although we’re comparing apples and oranges, in terms of expressiveness,
Dr. Jones covers a significant fraction of Fowler’s refactorings collected
from practice and more than a source-transformation-only CASE tool.
To bring the two main parts of the talk
together I’ll present a scenario that shows Dr. Jones’ body of knowledge in
action.
We start with a class that keeps a calendar of appointments.
The information for each appointment is kept in array fields (one for the date,
one for the start time, etc.)
Appointments can be made in three types:
regular, to-do, and all-day (indicated by a numeric type code).
The
programmer would like to refactor this to create an extensible abstraction for
an Appointment.
The first step is to encapsulate the array fields into a new Appointment
class.
Dr. Jones would ask the user to name the new class, and to choose a container for
the aggregation (in this case an array).
It would then change the program
representation as necessary and diagram the new design, including replacing
the multiple aggregation edges with a single new one.
Now that we have an Appointment class, it
makes sense to replace the multiple parameters to add() with a single
Appointment parameter.
The user does this with the encapsulate parameters refactoring.
Since Dr. Jones knows where the add() method is called in the original program,
it can tell the programmer where to change the calling syntax later.
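The first two steps of the scenario might look roughly like this in source form. Everything below is an assumed illustration: the field names, the string and integer types, and the fixed capacity are invented. The type code is left out to keep the sketch short, and the classes are nested in a wrapper only to keep the before and after versions side by side.

```java
// Hypothetical before/after sketch of the first two scenario steps.
class CalendarSketch {
    // Before: one parallel array per piece of appointment information.
    static class CalendarBefore {
        String[] dates = new String[100];   // fixed capacity, for brevity
        int[] startTimes = new int[100];
        int count = 0;

        void add(String date, int startTime) {
            dates[count] = date;
            startTimes[count] = startTime;
            count++;
        }
    }

    // New class produced by encapsulating the array fields.
    static class Appointment {
        final String date;
        final int startTime;

        Appointment(String date, int startTime) {
            this.date = date;
            this.startTime = startTime;
        }
    }

    // After: a single aggregation (here an array) of Appointment objects,
    // and add() takes one Appointment in place of several parameters.
    static class CalendarAfter {
        Appointment[] appointments = new Appointment[100];
        int count = 0;

        void add(Appointment appointment) {
            appointments[count] = appointment;
            count++;
        }
    }
}
```

Each call site of the old add() is a place where the calling syntax has to change once the refactoring is implemented.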
Now the user would like to make the
appointment-type-specific behavior explicit in the class hierarchy.
The
user prepares for this by moving the type code fields to the Appointment class.
[
include something about moving methods in Calendar here, or come up with an
example. ]
The user can then encapsulate the type codes of Appointment into
subclasses.
Things look pretty good, until she realizes that the user of the calendar
might want to change the type of an appointment.
Objects can’t change class in Java, so this creates a problem.
Here she can use Dr. Jones’ ability to explore alternatives to back up and try
a different refactoring.
Encapsulating the type codes in a separate class avoids this problem.
An Appointment can change its type dynamically by reassigning its AppointmentType
instance.
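A minimal sketch of this alternative design follows. Appointment and AppointmentType are the names from the scenario; the subclasses, the setType method and the hasStartTime behavior are hypothetical stand-ins for whatever type-specific behavior the real calendar has.

```java
// Type-specific behavior lives in a separate hierarchy.
abstract class AppointmentType {
    abstract boolean hasStartTime();  // hypothetical type-specific behavior
}

class RegularType extends AppointmentType {
    boolean hasStartTime() { return true; }
}

class AllDayType extends AppointmentType {
    boolean hasStartTime() { return false; }
}

class Appointment {
    private AppointmentType type;

    Appointment(AppointmentType type) {
        this.type = type;
    }

    // The object keeps its class; only its AppointmentType instance changes.
    void setType(AppointmentType type) {
        this.type = type;
    }

    boolean hasStartTime() {
        return type.hasStartTime();
    }
}
```

With subclasses of Appointment, the same change of type would require constructing a new object and replacing every reference to the old one.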
We can compare this design to the original to see the improvement [flip
slides].
[ slide copied here to illustrate
improvement in design ]
Before concluding, I wanted to bring the
talk back to the issue that got me started looking at refactoring, that of
keeping the diagrams simple and relevant while redesigning software.
This
is the goal of focus tracking, and I see it as one of Dr. Jones’ major payoffs
to the user.
The KB I’ve described will be a major component of the focus
tracking mechanism, since it knows what program elements are involved in the
current refactoring and which ones are likely to be further refactored.
The
focus tracking mechanism will use this information to render the elements at
appropriate levels of detail, shown here.
For instance, it could show some
historical context by showing elements refactored in the past at a low level
of detail.
The currently refactored elements will get the highest level of
detail.
Likely future refactorings will also get more detail, but since we can’t
predict the user’s next actions exactly, the drop off is quick.
The specifics of this mechanism remain future work in my research.
Today I’ve presented an overview of my
research progress on Dr. Jones, an interactive refactoring tool for Java
programs.
I’ve described the four kinds of design exploration knowledge I
have specified for each of Dr. Jones’
50 refactorings.
And I’ve also described a scenario that will drive the
next phases of my research.
The tasks I’ve completed are the
specification of 50 refactorings for Dr. Jones in a semi-formal language, and
the infrastructure to analyze and flexibly diagram existing Java
programs.
Next I plan to implement the set of refactorings used in the
scenario and understand what focus tracking would be for the scenario.
This
in turn will drive work on a more general focus tracking mechanism, and the
implementation of the remaining refactorings I have specified.
In the final phase of my research I
want to evaluate Dr. Jones by obtaining user reactions and feedback.
Before breaking for discussion I’ll give
you a tour of the prototype’s current capabilities.