PHP_UML, packages and namespaces

by Baptiste Autin, May 2008

What's a package?

In UML, a package is just a container. It contains typed elements or other packages.
"Typed elements" means data types but also, and most often, classes and interfaces.

UML, by itself, does not say how you should name and organize your packages - this is your business - but it is likely that you will define them according to logical rules.

For example, in a tool like PHP_UML, the classes responsible for XMI generation are gathered in a package called "XMI", while the PHP parser is put into a different package.
How the files are laid out on the filesystem does not matter here; that is a different topic.

Dependencies

A dependency between two packages A and B arises when, for example, a class belonging to A instantiates a class belonging to B.
Generally, in a given program, you will want to minimize the number of dependencies between the packages, especially if those dependencies are bidirectional, or if they form circular references.
For example, A->B->C->A (package A depends on package B, which depends on package C, which depends on package A) is not a good package design.
But A->B->C (package A depends on B, which depends on C) is a common package design.

Note that the A->B->C design means that you can deploy package C alone, without A and B.
This is an important point in reusable software development: the top-level packages in a package diagram (that is to say, the least dependent ones) are the best candidates to be reused in other contexts, for example through model libraries, or as parts of a framework.
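
The difference between the two designs above can be checked mechanically. The following is a minimal sketch, not part of PHP_UML: the function name hasCircularDependency and the array layout (package name => list of packages it depends on) are assumptions made for the illustration. It walks the dependency graph depth-first and reports a cycle as soon as it meets a package that is still being explored. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not a PHP_UML function): returns true if the
// dependency graph contains a circular reference.
// $deps maps a package name to the list of packages it depends on.
function hasCircularDependency(array $deps): bool
{
    $state = []; // package name => 'visiting' or 'done'

    $visit = function (string $pkg) use (&$visit, &$state, $deps): bool {
        $current = $state[$pkg] ?? null;
        if ($current === 'visiting') {
            return true;  // we came back to a package still being explored
        }
        if ($current === 'done') {
            return false; // already checked, no cycle through this package
        }
        $state[$pkg] = 'visiting';
        foreach ($deps[$pkg] ?? [] as $next) {
            if ($visit($next)) {
                return true;
            }
        }
        $state[$pkg] = 'done';
        return false;
    };

    foreach (array_keys($deps) as $pkg) {
        if ($visit((string) $pkg)) {
            return true;
        }
    }
    return false;
}

$chain = ['A' => ['B'], 'B' => ['C'], 'C' => []];    // A->B->C
$cycle = ['A' => ['B'], 'B' => ['C'], 'C' => ['A']]; // A->B->C->A

var_dump(hasCircularDependency($chain)); // bool(false)
var_dump(hasCircularDependency($cycle)); // bool(true)
```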

Packages and namespaces

Thus, the package is an important concept in object-oriented programming:

  • It helps to organize the classes in a large application, as their number grows.
  • It allows for better reuse and maintainability of the code.

    Unfortunately, for some strange reason, PHP still does not offer any form of package management.
    A framework like PEAR even had to develop its own package system from scratch, while many other languages offer such a feature natively.

    Starting with version 5.3, PHP introduces the concept of namespace, which is intended to avoid name conflicts between classes.
    This is not enough to achieve packages, but it is a first step.

    Why is it similar?

    Because a package is a namespace for its members. Inside a model, two classes can share the same name, provided that they live in separate packages. You can refer to them by prefixing their name with their nesting packages, like Java does in dotted style: java.io.File
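
As an illustration of that point, here is a minimal sketch in which two classes share the name File without any conflict. The namespace names Disk\Io and Network\Io are hypothetical. The sketch uses the backslash separator and the bracketed namespace syntax of released PHP versions (5.3 and later), and return types, which require PHP 7.

```php
<?php
// Two classes named File coexist because each lives in its own namespace.
// The namespace names are hypothetical.
namespace Disk\Io {
    class File
    {
        public function describe(): string
        {
            return 'file on disk';
        }
    }
}

namespace Network\Io {
    class File
    {
        public function describe(): string
        {
            return 'file over the network';
        }
    }
}

namespace {
    // From the global namespace, each class is reached by its qualified name.
    $local  = new \Disk\Io\File();
    $remote = new \Network\Io\File();

    echo $local->describe(), "\n";  // file on disk
    echo $remote->describe(), "\n"; // file over the network
    echo get_class($remote), "\n";  // Network\Io\File
}
```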

    And why is it not the same at all?

    In UML, a package can have a URI (Uniform Resource Identifier). There is good logic to that: when you get a package named DateTime, you don't want it to conflict with another package of the same name...

    To enforce such uniqueness in a development environment, one possible way is to match the packages with the filesystem folders, and to put the classes of each package into the files of the matching folder.
    This is not ideal, but it is better than no rule at all, and by using reversed domain names (like org.apache.xalan) for the first packages of the hierarchy, you also lower the risk of collision.
    Another possibility is to implement a specific relationship (sometimes called a manifest) between the source files and the logical program elements (like packages and classes). This is how .NET proceeds, with its assemblies.
    In both implementations, the compiler knows where to find all the classes of a given package (thus allowing for full imports like import javax.swing.*;).

    Namespaces offer nothing of the kind. They are just qualifying names for interfaces and classes.
    You can always "mentally" map the components of a namespace (A::B::C) to some package hierarchy (A, B, C), but then every class can claim that it belongs to this or that "package": without a specific implementation, the compiler has no means to determine if the class tells the truth or not.

    Ok, ok, and what about PHP_UML?

    Since packages are so important in UML, PHP_UML tries to use them as much as it can.

    First, PHP_UML can read docblocks. So you can specify the package of a class by inserting a @package Foo in the class docblock.
    It can also read file docblocks. So all the classes defined in a file whose main docblock contains a @package Foo will be considered as belonging to the package Foo.
    Then, PHP_UML interprets the new PHP namespacing instructions: namespace and use.
    The rule is simple: once it has parsed something like, say, namespace PEAR::PHP::PHP_UML, PHP_UML considers every subsequent class as belonging to a package called PHP_UML, which belongs to a package called PHP, which itself belongs to a package called PEAR.
    In other words, :: is considered as a package delimiter in a namespace.
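
The mapping described above amounts to splitting the namespace name on its delimiter. The following is only a sketch of the idea, not PHP_UML's actual code: the function name namespaceToPackages is hypothetical, and the separator is a parameter because the text above uses "::" while released PHP versions use the backslash.

```php
<?php
// Hypothetical helper (not PHP_UML's implementation): splits a namespace
// name into its chain of packages, outermost package first.
function namespaceToPackages(string $namespace, string $separator = '\\'): array
{
    $parts = explode($separator, $namespace);
    // Drop the empty parts produced by a leading or trailing separator,
    // so that an empty namespace yields an empty chain (the root package).
    return array_values(array_filter($parts, function ($part) {
        return $part !== '';
    }));
}

echo implode(' > ', namespaceToPackages('PEAR\\PHP\\PHP_UML')), "\n";
// PEAR > PHP > PHP_UML
echo implode(' > ', namespaceToPackages('PEAR::PHP::PHP_UML', '::')), "\n";
// PEAR > PHP > PHP_UML
```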

    And what about classes that are not docblock-commented and that are not preceded by a namespace instruction?
    They are simply put into the "default" (or root) package of the UML model.
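
To make the docblock rule concrete, here is a minimal sketch that reads the @package tag of a class docblock. It is not how PHP_UML works internally (PHP_UML parses source files, while this sketch uses reflection on classes that are already loaded), it ignores file docblocks, and the names packageFromDocblock, Documented and Undocumented are hypothetical. A null result stands for "no package specified", that is, the default package. It assumes PHP 7.1 or later and that docblocks are kept at runtime, which is the default.

```php
<?php
/**
 * @package Foo
 */
class Documented
{
}

class Undocumented
{
}

// Hypothetical helper: returns the value of the @package tag found in the
// docblock of a loaded class, or null if there is none.
function packageFromDocblock(string $className): ?string
{
    $doc = (new ReflectionClass($className))->getDocComment();
    if ($doc === false) {
        return null; // the class has no docblock at all
    }
    if (preg_match('/@package\s+(\S+)/', $doc, $matches) === 1) {
        return $matches[1];
    }
    return null; // a docblock exists, but without a @package tag
}

var_dump(packageFromDocblock('Documented'));   // string(3) "Foo"
var_dump(packageFromDocblock('Undocumented')); // NULL
```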

    Last thing to know about packages...

    Ok, PHP_UML can divide the classes into different packages, if specified.
    But how do you refer to the packaged classes (or interfaces) within other packaged classes?
    This is a delicate question. PHP_UML does not parse the require or include instructions, and this might lead to unwanted results.
    Take this class:

    /**
     * @package A
     */
    class Foo
    {
        function foo(Foobar $x) {
            // ...
        }
    }
    

    And here's Foobar:

    
    /**
     * @package B
     */
    class Foobar {
        // ...
    }
    

    Since it is defined in B, Foobar cannot be reached from context A... unless foo() is modified to:

        function foo(B::Foobar $x) {
         ...
        }
    

    ... but that notation is allowed only in namespaced PHP.
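
For reference, here is a sketch of what the same two classes could look like in namespaced PHP. It is written with the backslash separator of released PHP versions (5.3 and later) rather than the "::" notation used above, and it replaces the @package tags with namespace declarations of the same names. The method body, the return type (PHP 7) and the calling code at the end are additions made for the illustration.

```php
<?php
namespace B {
    class Foobar
    {
    }
}

namespace A {
    class Foo
    {
        // The parameter type is fully qualified, so it resolves to B\Foobar
        // and not to a class named Foobar inside namespace A.
        public function foo(\B\Foobar $x): string
        {
            return get_class($x);
        }
    }
}

namespace {
    $foo = new \A\Foo();
    echo $foo->foo(new \B\Foobar()), "\n"; // B\Foobar
}
```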

    A workaround should be available in a later release of PHP_UML (by defining some "default" packages in which PHP_UML will look each time an unnamespaced class must be resolved).

    Priority rules

    Finally, here are the priority rules, regarding package definition:

    1. Package specified in a namespace instruction (overrides everything)
    2. Package defined in a class comment
    3. Package defined in a file docblock (top of page, before every PHP instruction)
    4. Package specified in parseDirectory/parseFile($directory, $model, $package)
    5. Package by default (in the root package)
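
The priority rules above can be read as "take the first source that specifies a package". The following is a minimal sketch of that reading, not PHP_UML's actual code: the function resolvePackage, its parameter names, and the convention that an empty string stands for the root package are all assumptions made for the illustration. It assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not a PHP_UML function): picks the package of a
// class from the candidate sources, tried in the priority order listed
// above. A null or empty candidate means "this source says nothing".
function resolvePackage(
    ?string $fromNamespace,
    ?string $fromClassDocblock,
    ?string $fromFileDocblock,
    ?string $fromParserArgument
): string {
    $candidates = [
        $fromNamespace,      // rule 1
        $fromClassDocblock,  // rule 2
        $fromFileDocblock,   // rule 3
        $fromParserArgument, // rule 4
    ];
    foreach ($candidates as $candidate) {
        if ($candidate !== null && $candidate !== '') {
            return $candidate;
        }
    }
    return ''; // rule 5: assumed convention, '' stands for the root package
}

echo resolvePackage('N', 'A', 'B', 'C'), "\n";    // N
echo resolvePackage(null, null, 'B', 'C'), "\n";  // B
var_dump(resolvePackage(null, null, null, null)); // string(0) ""
```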
