Complex Database Relationships with AtlasORM

How I used Atlas.ORM to perform complex queries in my Pen & Paper side project.

In my side project I’m working with a very well normalized database with many relationships.

One of the most complex pages is a Creator where it shows info about the creator and all of the works that they have contributed to.

creatordiagram

This is the database structure for this page. On the left, we have a Creator which has a one-to-many relationship to seven different join tables. Each of these tables represents this creators credit on that particular type of work. So if the creator worked on an RPG book there would be a RpgBookCreator that shows what credit that creator had in a particular RPG book. A credit could be something like Author or Illustrator.

When I first started working on this project I was doing plain SQL with PDO. Quickly I realized that for all the pages I was building I’d be writing a whole lot of queries and that didn’t sound particularly fun.

Then I tried porting part of this page over to Doctrine 2 ORM and that was a huge mess. All of the mapping files and entity objects was taking a long time to get right. Then, once I had it configured for Creator, RpgBookCreator, RpgBook, Credit, GameLine and Publisher (which is only 1/7 of the structure) I ran the query and the page just sat there spinning. I was not familiar enough with Doctrine to make it perform the way I wanted it to.

It was suffering from the “N+1 Problem”:

This problem occurs when the code needs to load the children of a parent-child relationship (the “many” in the “one-to-many”). Most ORMs have lazy-loading enabled by default, so queries are issued for the parent record, and then one query for EACH child record. As you can expect, doing N+1 queries instead of a single query will flood your database with queries, which is something we can and should avoid.

Well, it just so happens Paul M. Jones the creator of Radar wrote the book “Solving The N+1 Problem in PHP”. Not only that, he is the creator of AtlasORM described as:

Atlas is a data mapper implementation for your persistence model (not your domain model).

If anyone would have an ORM that could handle my data structure, it’s Paul.

It didn’t take very long to get my site working with Atlas. There is a very useful CLI that will analyze your database and generate the scaffolding needed for Atlas to work.

The two types of classes required for atlas are Tables and Mappers. The Table classes (ex: CreatorTable) described things like the column names, the primary key, and the default values.

The Mapper describes the relationships between tables. By default, the method setRelated is empty because the CLI doesn’t create them. However, it was very easy for me to write a script that did automatically create these relationships because I was using a consistent naming scheme.

Here is the final CreatorMapper which describes all of its one-to-many relationships.

Finally, in my CreatorsAtlasRepository class I query the database, specifying all of the relationships I’d like to include.

It was pretty fast. I picked a creator with many credits across different types of work (Monte Cook) and it loaded in about 1.8 seconds. However, I wanted to make sure Atlas was querying efficiently.

In an ideal situation, it would query each table once or join some tables and have less than 20 queries. I wasn’t expecting this.

In a more expected situation, it would query the tables close to the root once and some of the edge tables (publisher, game_line) might get queries a few times for different batches of data. Probably around 30-40 queries depending on the creator.

I found a project called PHP Debug Bar that could hook into different systems (including PDO) and show stats for each page.

It took a little work to get that configured with Atlas, but I got it working and went to see how many queries were made to render the Monte Cook page.

I was very surprised to see a total of 484 queries. That was an order of magnitude larger than I was expecting. Clearly, there was still an N+1 problem with Atlas. I dug through the query log and determined that the issue was with the edge relationships. There were many many queries for single publishers and game lines.

I submitted a PR explaining my issue. Today Paul commented that he had a patch in a development branch and wanted me to test it.

I’m happy to report that the same Monte Cook page now loads with only 32 queries which is right where I was expecting it to be. The page is a little faster at 1.5 seconds which is good enough for development. Before the site goes live I’ll implement a caching layer.

This patch should get merged into the main branch soon. This is really a fabulous library that I’m planning on using in all of my new projects.

LazyArray Officially Part of Aura.Di

My LazyArray feature has been officially added to Aura.DI.

In my prior posts about integrating Symfony Forms with Radar, I created a helper class called LazyArray.

It was designed so I could pass an array of lazily instantiated objects into a setter like Twigs setExtensions method.

Paul liked my implementation and asked me to submit a PR.

Today it was merged and included in Aura.Di 3.2.0 which makes me very excited.

Symfony Forms and Radar – Part 2

I refactor my code from Part One and integrate Aura.Filter.

In Part 1 I talked about the difficulties in getting Symfony Forms to play nicely with Aura.Di and Radars ADR model.

Today I’m picking up where I left off. My goal is to figure out what I need to do differently to achieve the clean separation I desire.

All of my issues with Aura.Di have been resolved and my LazyArray class is currently being reviewed for inclusion in Aura.Di.

Removing Validation

The first thing I did was remove Symfony Validator from the form (and my app). I did this for several reasons.

First, my initial attempts to validate independently from the Form showed that Validator is fairly complex. There also seems to be very little documentation for using outside of Symfony. The few examples I found didn’t seem like the sort of validation I’d like to use in my projects.

Second, the point of this series is to integrate Symfony Forms, not Validator. No sense going down another rabbit hole. If you’re interested in me figuring out Symfony Validator, please contact me and let me know.

Isolating Form to the Responder

The second thing I did was move all use of the Form object to the Responder. Since I’d like to handle the Request manually and validate separately in the domain, the form can be moved to the Responder.

You can follow along with my commits here.

How I’m handling things in Input and the Domain right now are just placeholders until I get the form figured out.

I’m still calling handleRequest at this point because I haven’t figured out how to pass in the payload from the domain. I’ll do that next.

Handling Request from Payload

This was far easier than I expected. Since I’m not doing anything crazy with file uploads or anything I was simply able to submit the form directly.

Validating and Showing Validation Errors

Now that the basics are working, I can implement validation and display the validation errors with the form.

I decided to use Aura.Filter for my validation. It integrates easily with Aura.Di and fits my idea of a clean validation library.

The first step is to create a filter. This is easily done by extending Aura\Filter\SubjectFilter and overriding the init method.

Next, since I’ll need this filter in my post domain, I convert my closure into an invokable class.

I then configure the container to instantiate this filter and pass it into RegistrationPost.

Next, I apply this filter on the data I get from RegistrationInput.

Finally, I consume the payload in RegistrationResponder going through the failure messages and adding FormError objects to the correct elements.

You can see the result of this in the 2.x branch of the project repo.

Conclusion

This exercise left me with a greater understanding of Symfony Forms and Aura.Filter. I think the solution is pretty good. If I was using this in production, I’d probably refactor some of this to be more reusable, especially if I plan on defining a lot of forms. I probably wouldn’t use arrays as the data transfer mechanism. I’d also probably create a factory that takes a generic definition and generates the form, validation and entity objects that I’d be using.

I’m very interested in hearing from you. I will be creating new content over the coming weeks and I want to customize the content for YOU and help you overcome any obstacles you might be experiencing. Send me a message!

Symfony Forms and Radar – Part 1

I explore how to integrate the Symfony Form Component with Radar.

I have several projects I’d like to build soon. One thing they have in common is there will be a lot of data entry via forms. Back in the day I used to use HTML_QuickForm but I haven’t used it for a long time.

For my projects at work, one uses a custom solution I built and the other uses AngularJS. I figured it was time to look around and find a current solution.

The two I came across was Symfony Form Component and Zend\Form. I’ve used many Symfony components so I figured I’d start with theirs.

I really dig Radar but I know that not everything easily fits in the Action-Domain-Responder model.

I leveraged the Symfony Forms documentation and webmozart/standalone-forms to create my proof of concept.

Configuring a Symfony Form

I managed to get it to work but it’s not ideal (yet). To see my first proof of concept check out futureproofphp/symfony-forms-radar.

The first caveat that I want to express is that I wanted to leverage as much of what Symfony Forms has to offer much of which is optional. My final solution may omit parts of this.

The first issue I found was with having to use multiple setters to build objects.

Symfony Translation Component uses addLoader and addResource to be configured correctly. FormFactoryBuilder and Twig_Environment use addExtension.

Aura.Di doesn’t really handle this use case. You can define a single parameter to be passed to the setter method. The was around this is with the two pass system. In a ContainerConfig you have two methods, define and modify. The first part define is where you define all of your dependencies. You should not be instantiating anything in this method other than LazyInterface objects. The second part modify actually takes an instantiated object from the container and further modifies it. Ideally, you’d do as much as possible in define so you’re not instantiating objects that you’re not actually using.

I was able to work around this using two strategies.

The first was to find or create setters that can take an array of objects instead of having to call the single method multiple times in the container. FormFactoryBuilder already has addExtensions that I can use, but Translator needed to be extended so I could have addLoaders and addResources methods. Twig_Extension has setExtensions but I was unable to use that (I’ll explain why later).

The only issue with this strategy was that if I included LazyInterface objects in an array, they didn’t get resolved when the setter was called.

This leads to my second strategy which was the creation of a new LazyInterface class LazyArray which takes an array as a parameter and resolves any LazyInterface objects within it, returning an array with the resolved objects.

Now, why wasn’t I able to use the setExtensions method from Twig_Environment? Circular dependencies.

The Twig_Environment extension FormExtension requires a TwigRenderer which requires a TwigRendererEngine which requires the Twig_Environment. Aura.di can’t resolve this.

The way around it is either to not use that extension, or to configure the Twig_Environment extensions in modify. I chose the second option.

Update!

After more digging, I found that calling setEnvironment on the TwigRendererEngine is not required. I had done it because that’s how it was done in the webmozart tutorial but then I noticed that setEnvironment is called from FormExtension::initRuntime. This makes sense because since it’s a Twig extension, shouldn’t it already know the Twig_Environment?

Once I removed this setter from the container, I was able to move the configuration of Twig extensions back into define.

Using the Form

Now that I got everything configured in the ContainerConfig I need to use the form I created. The issue I quickly ran into is that a single form does a whole lot of stuff.

First, the form handles the request. By default, it pulls data out of $_POST but can be configured via an extension to take an HTTP Foundation Request object.

Then, the form validates this input and will return the filtered/sanitized output via the getData method.

Lastly, through the Twig the form is rendered, prepopulated with data and any validation errors.

If you’re not familiar with Radar check out my post Radar Under the Hood where it talked in depth how it works.

The problem here is that the whole point of Radar is to separate your concerns.

For my form to work, I create the form in the input and call handleRequest. I return the form so it’s passed on to the domain.

In the domain, I’d check if the form was valid and potentially do something with the data. I return the form as the payload for the responder.

In the responder, I render the form in twig template and return the PSR-7 Response.

I don’t like that I need to have a single form object that’s passed through all three layers just to render a form.

Ideally, it would work more like this.

In the input, I’d pull out the values from the PSR-7 Request and return an array or domain request object with the data.

In the domain, I’d validate the data from the input and do something with the data if valid. I’d return a domain response to the responder that contained the filtered/sanitized data and any validation error messages I generated.

In the responder, I’d take the domain response and feed it into a form object that is then passed into a twig template for rendering. I’d return the PSR-7 Response.

Conclusion

I haven’t given up on this yet (hence the “Part 1” in the title of this post).

My next step is to look at the components and extensions being used in my proof of concept and see what they are doing and if there are ways to pull them apart.

For instance, I’m creating a Validation object that gets passed into the form via the ValidationExtension. I’m assuming I can define my validation separately, but I’ll have to research how to assign validation error messages to the form externally.

If validation is separate, I can probably pass in an array of data to be validated, instead of having the validator pull data directly from the request.

However, this post is already very long, so I’ll save the rest for Part 2.

Discovering Clean Architecture

Story about how I discovered Uncle Bob and Clean Architecture.

Several years ago I came across the following video. It was a keynote from a Ruby conference by some guy named Robert “Uncle Bob” Martin. It’s really worth your time to watch.

Watching this opened my eyes. The idea that the framework is a delivery mechanism and not your app was completely new to me. I had been working on an app that I built using Kohana which was getting really hard to maintain.

I had never worked at a company with senior developers who talked about best practices or strategies for architecting applications. The idea of writing your application separate from the framework and interacting with it through use cases was foreign to me.

I picked up Uncle Bob’s book Clean Code and bought some of his videos on Clean Coders.

I hope to talk more about what he covers in those resources, but today I want to talk more about Clean Architecture.

In August 2012 almost a year after he gave the talk at Ruby Midwest he wrote a post titled The Clean Architecture on his companies blog.

cleanarchitecture-8b00a9d7e2543fa9ca76b81b05066629

The concentric circles represent different areas of software. In general, the further in you go, the higher level the software becomes. The outer circles are mechanisms. The inner circles are policies.

The overriding rule that makes this architecture work is The Dependency Rule. This rule says that source code dependencies can only point inwards. Nothing in an inner circle can know anything at all about something in an outer circle. In particular, the name of something declared in an outer circle must not be mentioned by the code in an inner circle. That includes functions, classes, variables, or any other named software entity.

By the same token, data formats used in an outer circle should not be used by an inner circle, especially if those formats are generated by a framework in an outer circle. We don’t want anything in an outer circle to impact the inner circles.

This has been the number one influence on how I write software.

This is one of the key motivators for me creating this blog. I hope to help PHP developers discover tools and techniques that help them become better, more productive developers.

I’m also very interested in hearing from you. I will be creating new content over the coming weeks and I want to customize the content for YOU and help you overcome any obstacles you might be experiencing. Send me a message!