It seems like a dying art, but I still strongly feel that Entity Relationship Diagrams (ERDs) should be the starting point of all software development projects. Since they are my starting point anyway, I wanted a place to refer colleagues to for how to read these diagrams, and an Entity Relationship Diagram example seemed like a great place to start.
The Example: A Resource Management Application
Consider that we're writing a resource management application. The first step to creating an ERD is always to identify the nouns (entities). In this case let's start with:
- Company;
- Employee;
- Project; and
- Technology Project (which is a specific type of Project that perhaps requires special fields like "number of entities")
Here's the Example Entity Relationship Diagram I'll decipher piece by piece in this article:
(note that I'm now using singular names since my somewhat controversial decision to switch to naming entities in the singular)
To read the notations of an Entity Relationship Diagram:
An Entity Relationship Diagram conveys a lot of information with a very concise notation. The important part to keep in mind is to limit what you're reading using the following technique:
- Choose two entities (e.g. Company and Employee)
- Pick one that you're interested in (e.g. how a single Company relates to employees)
- Read the notation on the second entity (e.g. the crow's feet with the O above it next to the Employee entity).
The set of symbols consists of crow's feet (which Wikipedia describes as looking like the forward digits of a bird's claw), O, and dash, and they can be combined in four distinct ways. Here are the four combinations:
- Zero through Many (crow's feet, O)
- One through Many (crow's feet, dash)
- One and Only One (dash, dash)
- Zero or One (dash, O)
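One way to internalize the four combinations is to treat each as a minimum and maximum count of the second entity. The following is only a sketch: the `Cardinality` enum and the `allows` helper are hypothetical names made up for illustration, not part of any ERD tool.

```python
from enum import Enum


class Cardinality(Enum):
    # Each value is (symbols next to the second entity, minimum, maximum).
    # A maximum of None stands for "many" (no upper bound).
    ZERO_THROUGH_MANY = (("crow's feet", "O"), 0, None)
    ONE_THROUGH_MANY = (("crow's feet", "dash"), 1, None)
    ONE_AND_ONLY_ONE = (("dash", "dash"), 1, 1)
    ZERO_OR_ONE = (("dash", "O"), 0, 1)

    @property
    def symbols(self):
        return self.value[0]

    @property
    def minimum(self):
        return self.value[1]

    @property
    def maximum(self):
        return self.value[2]


def allows(cardinality, count):
    """Return True if `count` instances of the second entity fit the notation."""
    if count < cardinality.minimum:
        return False
    return cardinality.maximum is None or count <= cardinality.maximum
```

Read this way, the crow's feet always mean "no upper bound", while the symbol paired with them (O or dash) sets the lower bound to zero or one.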
Zero through Many
If, as in the diagram above, the notation closest to the second entity is a crow's feet with an O next to it, then the first entity can have zero, one, or many of the second entity. Consequently the diagram above would read: "A company can have zero, one, or many employees".
This is the most common relationship type, and consequently many people ignore the O. While you can consider the O optional, I consider it a best practice to be explicit to differentiate it from the less common one through many relationship.
One through Many
If, as the next diagram shows, the notation closest to the second entity is a crow's feet with a dash, then the first entity can have one through many of the second entity. More specifically it may not contain zero of the second entity. The example above would thus read (read bottom to top): "A Project can have one through many Employees working on it."
This is an interesting combination because it can't (and for various reasons probably shouldn't if it could) be enforced by a database. Thus, you will only see it in logical, but not physical, data models. It is still useful to distinguish, but your application will need to enforce the relationship in business rules.
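As a rough illustration of enforcing one through many in business rules, the sketch below refuses to create a Project without an Employee or to remove its last one. The `Project` class and its methods are hypothetical, and a real application would likely perform this check in its service layer.

```python
class Project:
    """A Project that must always have at least one Employee (one through many)."""

    def __init__(self, name, employees):
        employees = list(employees)
        if not employees:
            raise ValueError("a Project needs at least one Employee")
        self.name = name
        self.employees = employees

    def add_employee(self, employee):
        if employee not in self.employees:
            self.employees.append(employee)

    def remove_employee(self, employee):
        # Block the removal that would leave the Project with zero Employees.
        if self.employees == [employee]:
            raise ValueError("cannot remove the last Employee from a Project")
        self.employees.remove(employee)
```

For example, `Project("Intranet", [])` raises a `ValueError`, while `Project("Intranet", ["Ann"])` succeeds.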
One and Only One
If the notation closest to the second entity contains two dashes it indicates that the first entity can have one and only one of the second. More specifically it cannot have zero, and it cannot have more than one. The example would thus read: "An Employee can have one and only one Company."
This combination is the most common after zero through many, and so frequently people consider the second dash optional. In fact, some ignore both dashes, but I would highly recommend at least using one for clarity so as not to confuse the notation with "I'll fill in the relationship details later".
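In a physical model, one and only one typically shows up as a foreign key column that does not allow nulls. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table and column names are assumptions chosen to match the example, and `try_insert` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE company (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE employee (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    -- NOT NULL plus the foreign key: exactly one Company per Employee.
    company_id INTEGER NOT NULL REFERENCES company(id)
);
INSERT INTO company (id, name) VALUES (1, 'Acme');
INSERT INTO employee (name, company_id) VALUES ('Ann', 1);
""")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, an Employee row with a missing or unknown `company_id` is rejected, while a Company with no Employee rows is accepted, which matches the zero through many notation on the Employee side.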
Zero or One
A zero or one relationship is indicated by a dash and an O. It indicates that the first entity can have zero or one of the second, but not more than one. The relationship in the example above would thus read: "A Project can have zero or one Technology Project."
The zero or one relationship is quite common and is frequently abbreviated with just an O (however, it is most commonly seen in a one-to-many relationship rather than the one-to-one above; more on this later).
Having examined the four types of notation, the discussion wouldn't be complete without a quick overview of the three relationship types. These are:
- One to Many
- Many to Many
- One to One
A one-to-many (1:N) is by far the most common relationship type. It consists of either a one through many or a zero through many notation on one side of a relationship and a one and only one or zero or one notation on the other. The relationship between Company and Employee in the example is a one-to-many relationship.
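The relationship type follows mechanically from the notation at each end of the line, which a short function can express. This is only a sketch: `relationship_type` and the two sets are hypothetical names, and it simplifies by treating any pairing of two "single" notations as one-to-one.

```python
# Notations that end in crow's feet ("many") versus those that do not.
MANY_ENDS = {"zero through many", "one through many"}
SINGLE_ENDS = {"one and only one", "zero or one"}


def relationship_type(end_a, end_b):
    """Classify a relationship from the notation found at each end."""
    ends = (end_a, end_b)
    for end in ends:
        if end not in MANY_ENDS | SINGLE_ENDS:
            raise ValueError(f"unknown notation: {end!r}")
    # Count how many ends carry crow's feet.
    many_count = sum(end in MANY_ENDS for end in ends)
    return {0: "one-to-one", 1: "one-to-many", 2: "many-to-many"}[many_count]
```

For the Company and Employee example, `relationship_type("zero through many", "one and only one")` returns `"one-to-many"`.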
The next most common relationship is a many-to-many (N:M). It consists of a zero through many or one through many on both sides of a relationship. This construct only exists in logical data models because databases can't implement the relationship directly. Physical data models implement a many-to-many relationship by using an associative (or link or resolving) table via two one-to-many relationships.
The relationship between Employee and Project in the example is a many-to-many relationship. It would exist in logical and physical data models as follows:
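The physical form, an associative table joined by two one-to-many relationships, can be sketched with Python's built-in `sqlite3` module. The table name `employee_project`, the sample rows, and the two query helpers are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE project  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Associative table: one row per (Employee, Project) pairing.
CREATE TABLE employee_project (
    employee_id INTEGER NOT NULL REFERENCES employee(id),
    project_id  INTEGER NOT NULL REFERENCES project(id),
    PRIMARY KEY (employee_id, project_id)
);
INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO project  VALUES (10, 'Intranet'), (20, 'Website');
INSERT INTO employee_project VALUES (1, 10), (1, 20), (2, 10);
""")


def projects_for(employee_id):
    """Names of the Projects an Employee works on, in alphabetical order."""
    rows = conn.execute(
        "SELECT p.name FROM project p "
        "JOIN employee_project ep ON ep.project_id = p.id "
        "WHERE ep.employee_id = ? ORDER BY p.name",
        (employee_id,),
    )
    return [row[0] for row in rows]


def employees_on(project_id):
    """Names of the Employees working on a Project, in alphabetical order."""
    rows = conn.execute(
        "SELECT e.name FROM employee e "
        "JOIN employee_project ep ON ep.employee_id = e.id "
        "WHERE ep.project_id = ? ORDER BY e.name",
        (project_id,),
    )
    return [row[0] for row in rows]
```

The composite primary key keeps the same Employee from being linked to the same Project twice. Each foreign key is the "one" end of one of the two one-to-many relationships.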
Probably the least common and most misunderstood relationship is the one-to-one. It consists of a one and only one notation on one side of a relationship and a zero or one on the other. It warrants a discussion unto itself, but for now, note that the Project to Technology Project relationship in the example is a one-to-one. Because these relationships are easy to mistake for traditional one-to-many relationships, I have taken to drawing a red dashed line around them. The red dashed line is not standard at all (although a colleague, Steve Dempsey, uses a similar notation), but in my experience it can help eliminate confusion.
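One common way to implement a one-to-one physically is to have the child table's primary key double as its foreign key. The sketch below does that for Project and Technology Project using Python's built-in `sqlite3` module. The column names and the `can_insert` helper are assumptions, and this is one possible design rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE project (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE technology_project (
    -- Primary key and foreign key at once: at most one row per Project,
    -- and every row must point at exactly one existing Project.
    project_id         INTEGER PRIMARY KEY REFERENCES project(id),
    number_of_entities INTEGER
);
INSERT INTO project VALUES (1, 'Office move'), (2, 'Resource manager');
INSERT INTO technology_project VALUES (2, 12);
""")


def can_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

Here Project 1 has zero Technology Project rows and Project 2 has one. A second row for Project 2, or a row for a Project that does not exist, is rejected.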
I hope you've found this a useful example for deciphering and verifying entity relationship diagrams. As always please add any comments, disagreements, thoughts or related resources.