Schemaball: A New Spin on Database Visualization
Martin Krzywinski
Understanding relationships and patterns in complex database schemas is simplified
when the data can be effectively visualized. Schema diagrams are particularly
useful when numerous entities and relationships are involved. Conventions for
drawing schemas, such as the entity-relationship diagram (ERD) and the general
Unified Modeling Language (UML) framework, provide recipes to draw a wide
range of entities and relationships (see references). Their visual vocabulary
is highly controlled, to ensure consistency, and rich, to allow flexibility.
In an ERD, a table (entity) is represented by a rectangle, constraints (relationships)
by rhombuses that link the table symbols, while cardinality of relationships
is indicated by glyphs on the head and tail of the constraint lines. A large
array of open source and commercial tools exists to generate ERDs (see references).
Unfortunately, the ERDs for large databases quickly become difficult to follow.
At the same time, it is precisely in these cases that schema illustrations
become indispensable for development and optimization. For example, our MySQL
sequencing LIMS (laboratory information management system), which models
laboratory protocols and stores sequence data, contains 141 tables and 205
foreign keys. Information about a single laboratory
process involves multiple constraints and tables, making the process of tracing
data flow in an ERD very cumbersome.
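To make the tracing problem concrete, a schema can be treated as a directed graph in which tables are nodes and foreign keys are edges. The Python sketch below is illustrative only: the table names and foreign keys are hypothetical (they are not taken from the LIMS schema described above), and the helper functions build_graph and reachable_tables are invented for this example.

```python
from collections import deque

# Hypothetical foreign keys, written as (referencing table, referenced table).
# These names are assumptions for illustration, not the real LIMS schema.
FOREIGN_KEYS = [
    ("run", "plate"),
    ("plate", "library"),
    ("library", "sample"),
    ("run", "equipment"),
    ("sample", "project"),
]


def build_graph(foreign_keys):
    """Map each table to the set of tables it references."""
    graph = {}
    for child, parent in foreign_keys:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set())
    return graph


def reachable_tables(graph, start):
    """Return, sorted, every table reachable from `start` via foreign keys."""
    seen = {start}
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for parent in graph.get(table, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    seen.discard(start)
    return sorted(seen)


graph = build_graph(FOREIGN_KEYS)
# Tables that must be consulted to fully describe one "run" record.
print(reachable_tables(graph, "run"))
```

Even in this five-key toy schema, describing a single record means following every foreign key out of its table; with hundreds of constraints, doing the same by eye on an ERD is what becomes cumbersome.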
To more quickly grasp the large-scale structure of schemas, it is necessary
to reduce the complexity of a schema diagram while maintaining the pertinent
details. Drawing hundreds of tables with as many (or more) constraints in a
traditional ERD often produces a disheartening jumble.