by Baptiste Autin, April 2008
UML is a language that defines objects for modeling design. Some of them are well known (classes, packages, interfaces, actors, use cases...) but many people are unaware of how numerous they are (more than 200 objects in UML 2!).
The relations between those UML objects and the elements of a concrete programming language are not always obvious. For example, UML defines packages. But what is a package in PHP?
PHP_UML deals only with the most basic UML elements: classes, interfaces, methods, parameters, types, artifacts... and packages (by relying on the @package docblock tag).
Here is the modus operandi of the very first versions of PHP_UML:
First, PHP_UML_PHP_Parser parsed the PHP files, and stored the classes it found in a classes array.
Each class array was itself composed of other arrays, to store the functions, their parameters, and so on...
The classes arrays were pushed into a global package array.
Once the parsing was done, those resulting arrays were passed to an XMI builder class, which took every item and built the corresponding XMI code.
It worked, but I was ultimately disappointed by that approach, because the XMI builder class was extremely dependent on the way I had originally defined the resulting arrays.
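To make the coupling problem concrete, here is a minimal sketch of what such nested arrays, and a builder reading them, might look like. The array keys and the function name buildXmiFromArrays are hypothetical (the early PHP_UML code may well have differed), and the XML it emits is a simplified stand-in, not real XMI.

```php
<?php
// Hypothetical array layout: packages contain classes, classes contain functions.
$packages = [
    'MyPackage' => [
        'classes' => [
            'Foo' => [
                'extends'   => 'Bar',
                'functions' => [
                    'doIt' => ['params' => ['$a', '$b']],
                ],
            ],
        ],
    ],
];

// The builder has to know every key of the layout above;
// renaming 'classes' or 'functions' in the parser silently breaks it.
function buildXmiFromArrays(array $packages): string
{
    $xml = '';
    foreach ($packages as $pkgName => $pkg) {
        $xml .= '<package name="' . htmlspecialchars($pkgName) . '">';
        foreach ($pkg['classes'] as $className => $class) {
            $xml .= '<class name="' . htmlspecialchars($className) . '">';
            foreach ($class['functions'] as $fnName => $fn) {
                $xml .= '<operation name="' . htmlspecialchars($fnName)
                      . '" params="' . count($fn['params']) . '"/>';
            }
            $xml .= '</class>';
        }
        $xml .= '</package>';
    }
    return $xml;
}

echo buildXmiFromArrays($packages), "\n";
```

Every consumer of the arrays repeats the same knowledge of their shape, which is exactly the dependency described above.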
Then I discovered XMI and MOF.
MOF is an OMG standard that can be used to integrate various
types of tools for modeling, code generation, code analysis...
It defines abstract program elements, like classes, packages, types, and so on.
It is a key concept of the MDA (Model Driven Architecture).
MOF is sometimes called a meta-metamodel, since it is used to build metamodels.
As for XMI, it is just an XML interchange dialect, used to exchange the models and metamodels (like UML) that are defined with MOF.
In a way, MOF is a subset of UML, but it aims to be simpler and more directly implementable.
So instead of arrays, I could use MOF objects. I relied on EMOF (Essential MOF), a basic version of the standard. I specialized it a little so that it could store files (unknown in MOF, AFAIK) and interfaces in a simple manner. Otherwise I would probably have had to use the MOF "relationships" to implement interfaces, which would have increased the level of complexity. This might change in future releases, depending on whether strict compliance with MOF becomes a target.
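As a rough illustration of the idea, the sketch below replaces nested arrays with a few small classes loosely inspired by EMOF. All class and property names here (MNamedElement, MClass, implements, file...) are hypothetical and much reduced; they are not the classes shipped with PHP_UML, nor a faithful rendering of the EMOF metamodel.

```php
<?php
// Hypothetical, much-reduced EMOF-like classes; names are illustrative only.
class MNamedElement
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

class MOperation extends MNamedElement
{
    public $parameters = []; // list of parameter names
}

class MClass extends MNamedElement
{
    public $operations = [];   // list of MOperation
    public $superclass = null; // MClass, or null
    public $implements = [];   // specialization: interfaces listed directly
    public $file = null;       // specialization: path of the source file
}

class MPackage extends MNamedElement
{
    public $ownedTypes = []; // list of MClass
}

$op = new MOperation('doIt');
$op->parameters = ['a', 'b'];

$class = new MClass('Foo');
$class->operations[] = $op;
$class->implements[] = 'Countable';
$class->file = 'Foo.php';

$package = new MPackage('MyPackage');
$package->ownedTypes[] = $class;

echo $package->ownedTypes[0]->operations[0]->name, "\n"; // prints "doIt"
```

A builder written against such classes depends on a documented model rather than on ad hoc array keys.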
In addition, PHP_UML defines a generic class for gathering MOF objects: PHP_UML_Sequence.
It stores the program elements in a stack (managed by an internal iterator), in FILO fashion (first in, last out): you can always read the last element entered.
At any time, you can completely traverse a PHP_UML_Sequence through an external iterator class: PHP_UML_Sequence_Iterator.
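The behaviour described above can be sketched as follows. ElementSequence and its methods add() and last() are hypothetical names chosen for this illustration; they are not the actual API of PHP_UML_Sequence or PHP_UML_Sequence_Iterator.

```php
<?php
// Hypothetical stand-in for a sequence of program elements; not the PHP_UML API.
class ElementSequence implements IteratorAggregate
{
    private $items = [];

    // Push an element on the stack and return its position.
    public function add($element): int
    {
        $this->items[] = $element;
        return count($this->items) - 1;
    }

    // Read the most recently added element (the top of the stack), or null if empty.
    public function last()
    {
        if ($this->items === []) {
            return null;
        }
        return $this->items[count($this->items) - 1];
    }

    // External iteration over every stored element, in insertion order.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->items);
    }
}

$classes = new ElementSequence();
$classes->add('Foo');
$classes->add('Bar');

echo $classes->last(), "\n"; // prints "Bar"
foreach ($classes as $position => $name) {
    echo $position, ': ', $name, "\n";
}
```

While parsing, only last() is needed (the element currently being filled in); the full traversal is used afterwards, when the model is complete.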
Some MOF elements are contained, others are nested. For example, in MOF, you model inheritance through self-composition (the superclass
attribute of a class).
But the class that a PHP class extends may not be known yet, because the PHP file where it is defined has not been parsed.
Thus, we cannot create PHP references between the elements on the fly.
So I decided to store the real names temporarily, as strings, and to resolve them only at the end, by searching the
PHP_UML_Sequence collections.
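Here is a minimal sketch of that two-pass idea. The class ModelClass and the function resolveSuperclasses are hypothetical names for this example, and a plain PHP array stands in for the PHP_UML_Sequence collection.

```php
<?php
// Hypothetical model class; property names are illustrative only.
class ModelClass
{
    public $name;
    public $superclass; // a string name while parsing, a ModelClass once resolved

    public function __construct(string $name, ?string $superclassName = null)
    {
        $this->name = $name;
        $this->superclass = $superclassName;
    }
}

// Second pass: replace each superclass name with the matching object, if any.
function resolveSuperclasses(array $classes): void
{
    $byName = [];
    foreach ($classes as $class) {
        $byName[$class->name] = $class;
    }
    foreach ($classes as $class) {
        // Names that match nothing are left as strings, so no information is lost.
        if (is_string($class->superclass) && isset($byName[$class->superclass])) {
            $class->superclass = $byName[$class->superclass];
        }
    }
}

// "Child" is met before "Base", as can happen when files are parsed in any order.
$classes = [new ModelClass('Child', 'Base'), new ModelClass('Base')];
resolveSuperclasses($classes);

echo $classes[0]->superclass->name, "\n"; // prints "Base"
```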
Most relations are bidirectional, for easier later browsing.
PHP_UML does not rely on include or require to resolve the references (e.g. the reference a class holds to its nesting package).
Instead, PHP_UML looks in the "current package" and, if the element
is not found there, it searches the "root" package (which stores all the "orphaned" elements, those that have no nesting package).
By the way, this is, I think, how the new PHP instructions namespace and use (PHP >= 5.3) proceed with namespaces,
and I guess this is where PHP is heading...
(if it is not, this is where it should - but this is just a personal opinion :-).
So it may not always work for your code. If it does not, consider adding the proper @package(s) to your docblocks.
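The lookup order (current package first, then the root package) can be sketched like this. The function findElement is hypothetical, and packages are reduced to plain arrays mapping an element name to a label, purely for the sake of the example.

```php
<?php
// Hypothetical lookup; packages are plain arrays mapping element names to elements.
function findElement(string $name, array $currentPackage, array $rootPackage)
{
    // 1. Look in the package currently being processed.
    if (array_key_exists($name, $currentPackage)) {
        return $currentPackage[$name];
    }
    // 2. Fall back to the root package, where the "orphaned" elements live.
    if (array_key_exists($name, $rootPackage)) {
        return $rootPackage[$name];
    }
    // 3. Give up: the reference stays unresolved.
    return null;
}

$root    = ['Exception' => 'root Exception', 'Logger' => 'root Logger'];
$current = ['Logger' => 'app Logger'];

echo findElement('Logger', $current, $root), "\n";    // prints "app Logger"
echo findElement('Exception', $current, $root), "\n"; // prints "root Exception"
var_dump(findElement('Missing', $current, $root));    // NULL
```

An element that lives in a third package is never found by this strategy, which is why an explicit @package tag can be needed.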
PHP_UML reads namespaces. It only reads simple namespaces, like namespace Foo;, but "hierarchical" qualified names (like A::B::C) should be easy to implement in future releases.
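For illustration only, a simple namespace declaration can be detected with PHP's built-in tokenizer, as in the sketch below. The function readSimpleNamespace is hypothetical and is not how PHP_UML's own parser is written; it deliberately returns null for qualified names, mirroring the limitation described above. It assumes the tokenizer extension (enabled by default) and the namespace syntax of released PHP versions.

```php
<?php
// Hypothetical helper: return the name declared by "namespace Foo;", or null.
function readSimpleNamespace(string $source): ?string
{
    $tokens = token_get_all($source);
    $count  = count($tokens);

    // Return the index of the first non-whitespace token at or after $i.
    $skipSpace = function (int $i) use ($tokens, $count): int {
        while ($i < $count && is_array($tokens[$i]) && $tokens[$i][0] === T_WHITESPACE) {
            $i++;
        }
        return $i;
    };

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_NAMESPACE) {
            continue;
        }
        // A simple name is a single T_STRING token...
        $nameAt = $skipSpace($i + 1);
        if ($nameAt >= $count || !is_array($tokens[$nameAt]) || $tokens[$nameAt][0] !== T_STRING) {
            return null; // qualified or missing name: not handled in this sketch
        }
        // ...immediately followed by ";" or "{".
        $endAt = $skipSpace($nameAt + 1);
        if ($endAt < $count && ($tokens[$endAt] === ';' || $tokens[$endAt] === '{')) {
            return $tokens[$nameAt][1];
        }
        return null;
    }
    return null; // no namespace declaration found
}

echo readSimpleNamespace("<?php\nnamespace Foo;\nclass Bar {}\n"), "\n"; // prints "Foo"
```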
You can skip the PHP parser, and add program elements "manually" (through the API) to the PHP_UML_Sequence objects.
For example, if PHP_Documentor had to be interfaced with PHP_UML, this is how it could proceed.
A tool like PHP_Documentor could ask PHP_UML to generate the XMI corresponding to some program elements
that it has previously stored in PHP_UML_Sequence objects.
Here is a PHP sample showing how to create XMI without PHP parsing.
If you have ideas, suggestions or requests regarding the API, tell me! And if you know the OMG standards well, any advice is appreciated!
The OMG has released two very different versions of XMI (version 1 and version 2).
PHP_UML can generate XMI in both versions.
However, many UML editors only support the first version.
UML has also changed from version 1 to version 2 (so you may encounter two versions today: 1.4 and 2.1).
Some features of UML are supported by PHP_UML only in version 1 (like the stereotypes, which UML 2 handles through profiles - this is on the to-do list).
UML 2 and XMI 2 are still not very common among software design tools,
so if your tool is XMI 2 compliant, it is extremely likely that it is UML 2 compliant as well.
And nobody, AFAIK, bothers with the case UML=2 and XMI=1 (or the reverse).
Therefore, the class PHP_UML_XMI_AbstractBuilder is specialized into only two implementation classes
(PHP_UML_XMI_BuilderImpl1 and PHP_UML_XMI_BuilderImpl2), one for each version.
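The shape of such a hierarchy can be sketched as below. The class names AbstractXmiBuilder, XmiBuilderV1 and XmiBuilderV2, the create() factory and the header() method are all hypothetical; they only illustrate the "one abstract class, one implementation per version" design, not the real PHP_UML_XMI_* classes. The root tags are simplified (XML namespaces omitted).

```php
<?php
// Hypothetical builder hierarchy; names and output are illustrative, not PHP_UML's.
abstract class AbstractXmiBuilder
{
    // Pick the implementation matching the requested major version.
    public static function create(int $xmiVersion): AbstractXmiBuilder
    {
        switch ($xmiVersion) {
            case 1:
                return new XmiBuilderV1();
            case 2:
                return new XmiBuilderV2();
        }
        throw new InvalidArgumentException("Unsupported XMI version: $xmiVersion");
    }

    // Each version writes its own opening tag.
    abstract public function header(): string;
}

class XmiBuilderV1 extends AbstractXmiBuilder
{
    public function header(): string
    {
        return '<XMI xmi.version="1.2">'; // simplified XMI 1.x style root
    }
}

class XmiBuilderV2 extends AbstractXmiBuilder
{
    public function header(): string
    {
        return '<xmi:XMI xmi:version="2.1">'; // simplified XMI 2.x style root
    }
}

$builder = AbstractXmiBuilder::create(2);
echo $builder->header(), "\n"; // prints the version 2 root tag
```

The calling code only ever talks to the abstract class, so supporting another XMI/UML pair would mean adding one more subclass.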
If the PHP language changes, only PHP_Parser needs to be updated.
If XMI/UML standards change, only the XMI builder classes must be updated.
I believe that this is also a good reason to rely on a separate and abstract model like MOF.