Of course I have a backup!

Random blobs of wisdom about software development

A few years with Doctrine2

Thursday, June 26, 2014

I have been using Doctrine2 for about 3 years now. I started with the first 2.0 stable, integrating it into my employer's in-house framework, and for the last year I have been using it as part of Symfony2. There are quite a few things that I love about it, and a few that I dislike.

The Pros

It's a data mapper. As far as I know, no other PHP ORM currently implements the data mapper pattern; everything else implements active record. With Doctrine you don't have to write class User extends UserBase, or class User extends ActiveRecord, or anything like that. Your entities are simple PHP classes that only need to worry about their own job and nothing else, which is extremely liberating.
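To make the point concrete, here is a minimal sketch of what such an entity can look like. The User class, its fields, and the isAdult() method are made up for illustration; the only thing that matters is that nothing in it refers to persistence.

```php
<?php
// A plain PHP class: no base class, no save() method, no database code.
class User
{
    private $id;   // would be filled in by the ORM once the entity is persisted
    private $name;
    private $age;

    public function __construct($name, $age)
    {
        $this->name = $name;
        $this->age = $age;
    }

    public function getId() { return $this->id; }
    public function getName() { return $this->name; }

    // Domain logic lives here, free of any persistence concerns.
    public function isAdult()
    {
        return $this->age >= 18;
    }
}

$user = new User('Alice', 30);
var_dump(get_parent_class($user)); // bool(false): User extends nothing
```

Because the class has no ORM dependency, it can be instantiated and unit tested without a database connection.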

The configuration drivers. Since your entities are simple PHP classes, you have to define the persistence-related configuration somewhere else. If you like the validation that XSD provides, you can use XML. If that's too verbose for you, you can use YAML. If you want to keep the metadata next to your entities, you can write it as annotations on your entities. If you don't like any of the above, you can write it as native PHP code. There is an option for everyone. The performance is the same for all of them, since the metadata is cached after the first read.
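As an illustration of the annotation option, a mapped entity might look roughly like the sketch below. The Article class, the table name, and the column names are assumptions; the annotations themselves (Entity, Table, Id, GeneratedValue, Column, ManyToOne, JoinColumn) are the standard Doctrine ORM mapping annotations.

```php
<?php
use Doctrine\ORM\Mapping as ORM;

/**
 * The mapping lives in docblocks, so the class itself stays plain PHP.
 *
 * @ORM\Entity
 * @ORM\Table(name="article")
 */
class Article
{
    /**
     * @ORM\Id
     * @ORM\GeneratedValue
     * @ORM\Column(type="integer")
     */
    private $id;

    /** @ORM\Column(type="string", length=255) */
    private $title;

    /**
     * Assumed association to a Category entity (not shown here).
     *
     * @ORM\ManyToOne(targetEntity="Category")
     * @ORM\JoinColumn(name="category_id", referencedColumnName="id")
     */
    private $category;

    public function getTitle() { return $this->title; }
    public function setTitle($title) { $this->title = $title; }
}
```

The same information could be expressed in XML, YAML, or PHP instead; the entity class would then carry no mapping at all.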

The schema tool. Doctrine offers a very powerful database analyser, that can:

  1. Generate the whole database from your mapping configuration
  2. Do it the other way around, reverse engineer your database, and create entities/mapping data from it
  3. If your entities change, incrementally update the database

I can't stress enough how good this is. You don't have to worry about schema migrations! If you make a change to your entities, you don't have to write SQL by hand to modify the database, Doctrine will do it for you! Need to add a new $age property to your User? Add it to the User class, modify the mapping configuration, and Doctrine will issue an ALTER TABLE ... for you and for all the other developers. While this only works for the tables that are tied to an entity, it will take you a very long way before you have to write actual migrations. It also comes in extremely handy for functional testing, because you no longer have to write a script that drops and rebuilds the database between tests. If you have done that before, you know the pain of having to figure out the correct order for dropping and creating the tables (because of foreign keys), but Doctrine will happily calculate that for you.
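For the functional-testing case, a minimal sketch of rebuilding the schema through Doctrine's SchemaTool could look like this. The function name rebuildSchema is made up for illustration; $em is assumed to be an EntityManager and $tool a Doctrine\ORM\Tools\SchemaTool created from it.

```php
<?php
// Sketch: reset the schema between functional tests.
// Assumes $em is a Doctrine\ORM\EntityManager and
// $tool is a Doctrine\ORM\Tools\SchemaTool built from that $em.
function rebuildSchema($em, $tool)
{
    // Metadata for every mapped entity; the tables are derived from it.
    $classes = $em->getMetadataFactory()->getAllMetadata();

    // Drop the tables that belong to the mapped entities...
    $tool->dropSchema($classes);
    // ...and recreate them from the current mapping.
    $tool->createSchema($classes);

    // Number of mapped classes that were rebuilt.
    return count($classes);
}
```

In a test's setup you would call it as rebuildSchema($em, new \Doctrine\ORM\Tools\SchemaTool($em)). Passing the tool in as an argument is only a choice made for this sketch, to keep the function easy to exercise in isolation.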

The extensibility. Want to add slugging (generating SEO-friendly URLs for your articles) to your entities? Use Sluggable. Want to version your entities, so that you can roll them back? Versionable. Want translation support? Translatable. Nested sets? Yeah. Soft delete? Yes. So many things are already done, you just have to composer.phar require them.

The QueryBuilder. There is a QueryBuilder, provided in both the ORM, and the DBAL package, that lets you construct SQL/DQL queries in a programmatic way:

// $conn is assumed to be a Doctrine\DBAL\Connection
$qb = $conn->createQueryBuilder();
$qb->select('a.id, a.title')
   ->from('article', 'a')
   ->innerJoin('a', 'category', 'c', 'a.category_id = c.id')
...

This is most often useful for pagination, sorting, and filtering. You no longer have to concatenate strings together and collect the query parameters in some array; you can use the where(), andWhere(), orWhere(), join(), leftJoin(), and addOrderBy() methods to add new where clauses, join in more tables, or sort by different fields. If you are using the QueryBuilder provided by the ORM, it can be even more terse, because the ORM already knows what columns your entities are joined on:

// $em is the EntityManager
$qb = $em->createQueryBuilder();
$qb->select('a, c')
   ->from('Article', 'a')
   ->innerJoin('a.category', 'c')
...

The unit of work. I don't have to keep track of what entities I have modified; in fact, there is no $article->save() or anything like that. At the end of the request, I just call $em->flush() and Doctrine will update the entities that have been changed. I still have to issue a $em->persist($article) for newly created articles, and a $em->remove($article) for ones I want deleted, but for entities that I have loaded from the database, there is automatic change tracking. Doctrine will of course also figure out the correct insertion/deletion order, and it will assign the foreign keys to the entities after insertion.
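In code, that workflow looks roughly like the sketch below. The function name applyChanges and the setTitle() method are assumptions for illustration; find(), persist(), and flush() are the EntityManager calls described above.

```php
<?php
// Sketch of a typical unit-of-work interaction.
// Assumes $em is a Doctrine\ORM\EntityManager and that the hypothetical
// Article entity has a setTitle() method.
function applyChanges($em, $articleId, $newTitle, $newArticle)
{
    // Entities loaded through the entity manager are tracked automatically.
    $existing = $em->find('Article', $articleId);
    $existing->setTitle($newTitle); // no save() call needed

    // New entities have to be handed over explicitly.
    $em->persist($newArticle);

    // A single flush writes both the change to the existing article
    // and the insert for the new one.
    $em->flush();
}
```

Nothing is written to the database until flush() is called, which is why it is usually enough to call it once, at the end of the request.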

The cons

The DQL. This is actually a blessing and a curse at the same time. I understand why the DQL exists, and I think its presence is justified, but since Doctrine supports many database vendors, its support is limited to the common subset of those vendors. There is no SELECT ... OVER, and no SELECT GROUP_CONCAT. There is nothing that is specific to a vendor, but you have two options if you need something:

  • extend DQL with your own functions
  • write a native query, and map it to your entities

I have done the first one before, when I had to implement MD5() for MySQL. It was easy to do, because it was a simple function, applied to a single column, and the docs are very good, but I have no clue how I would implement something like PostgreSQL windowing functions, since they alter the whole result set, and not just a column. I haven't written a single native query yet, so I can't comment on that.

It wants to drop all the tables it doesn't know. The schema tool, by default, never removes tables. If you remove an entity that you no longer need, the schema tool will not generate the DROP TABLE statement for the corresponding table unless you also specify --complete on the command line. However, this gets rid of all the tables that Doctrine does not know about, like denormalized tables, search lookups, and so on. There is no way to specify some kind of whitelist; what I currently do is pipe the output to grep -v to filter out the tables I don't want to drop.

The dreaded N+1 problem. Everyone who has used an ORM knows about the N+1 problem. Quick refresher: if an association is set to "lazy" (which is the default), then the associated entity will be loaded the first time a field is accessed on it. This becomes a major problem if you have a loop like this:

$articles = $em->getRepository('Article')->findAll();
foreach ($articles as $article) {
    echo $article->getAuthor()->getName();
}

Article is in a Many-To-One relationship with Author, and you are looping through all the articles, touching the lazy-loaded author of each one. This will result in 1 query for the Articles, and up to N more queries for their Authors (one for each author that has not been loaded yet). The solution is easy (set the association to "eager" load, or write a DQL query that eager loads it), but it can be a major PITA to track down. It's usually a 4-step process:

  1. Find a page that performs slowly (or times out altogether)
  2. Identify the query on the page that is causing the problem
  3. Find out which relation you are accessing that is lazy loaded
  4. Fix the query to include said entity

Steps 2 and 3 are the problem, because you might be accessing the entity in multiple places, and through multiple methods. It would be awesome to have an option in Doctrine that denies lazy loads altogether, and throws an exception if an entity tries to lazy load itself. I could just switch this on in my test environment, run the tests, and if any of them fail, I know there is an N+1 query there.
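The DQL fix mentioned above can be sketched as a fetch join. This is a minimal sketch, assuming an Article entity with a many-to-one association named author (both names, and the function name, are assumptions for illustration), and an $em EntityManager as in the earlier snippets:

```php
<?php
// Sketch: load the articles together with their authors in one query.
// Assumes $em is a Doctrine\ORM\EntityManager and that Article has an
// association called "author".
function findArticlesWithAuthors($em)
{
    // Selecting both aliases makes this a fetch join: the authors are
    // hydrated from the same result set instead of being lazy loaded.
    $dql = 'SELECT a, au FROM Article a JOIN a.author au';

    return $em->createQuery($dql)->getResult();
}
```

Looping over the result and reading a field of each article's author then no longer triggers an extra query per article.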

Closing thoughts

Overall, I am extremely happy with Doctrine, and I'm really looking forward to future updates, e.g. the Embeddables docs (Value Objects, basically) are already up, although the feature is not released yet. A lot of people claim that Doctrine is too heavy. I wouldn't describe it as lightweight (which is all the rage nowadays, every library is tiny and lightweight), but it's not bloated either. Doctrine is an extremely well written ORM. My personal opinion is that people who complain about it being bloated have either:

  • Never tried it, but saw someone else say "it's heavy" and keep repeating it
  • Couldn't be arsed to learn it so they went with something easier
  • Didn't understand it

The docs and the library are larger than those of any other ORM currently available for PHP, but larger library != slower code. Just look at the numbers on Packagist: on the first page of the most popular packages, 7 are from Doctrine, with most of them hitting the 3 million downloads mark (I know that Symfony2 packages Doctrine2 by default, however doctrine-bundle is only at 2 million). For reference, paris is at ~8k, redbean at ~13k, and idiorm at ~17k. If Doctrine were really slow and heavy, it wouldn't show these numbers.

This was written by Norbert Kéri, posted on Thursday, June 26, 2014, at 23:47
