Visualizing DB Schema Changes, even at a quick glance better structures stand out
Creating a robust Data Model for your business & application is one of the most important things to get right. We still don’t have great tools available for discussing proposed database schema changes. Fixing a bad data model after data has started flowing on a production system is more complicated & time consuming than folks estimate. A series of not fully thought out data decisions early on can cut a team’s velocity significantly. This is due to time being spent on trying to fix bad & invalid data, adding missing DB constraints & validations, and refactoring towards a more appropriate data model.
What can we do to ensure a more robust model from the start and increase the confidence we have in data model changes?
Below, I will lay out a proposal we are iterating on as part of our OGE team best practices.
Our Goals:
While this does add additional overhead while working on data changes to our system. It helps us to collaborate and socialize the data model. Working towards ubiquitous language across our teams and with our stakeholders.
It is important to note this is for data schema changes not for data migrations, which I have written about how we handle in the past.
The process creates some artifacts that help support a robust understanding and conversation around the data changes.
git diff
supportThe high level view are images that can include class or class and field information about the models under change. The PR will typically include both the before and after images so one can see how the classes and associations will change over time. If fields are moved you can also see a class shrinking as fields move to newly associated objects. This type of view can help see the bigger relationships and structures of the data model.
A high detail visual representation of the expected end result
By building image models from a textual diagramming format (DOT
files), one can get much more details and control for the diagrams.
The low level textual details will result in a git diff
easily viewed as part of the PR. It makes clear specific fields and associations that are added and removed. While this might be verbose for some folks, it is the level of details needed for a developer implementing and the business intelligence analyst reviewing. Seeing the specific fields makes it clear if the data will meet their needs. It can also be very helpful to discuss low level details such as field type.
A textual diff showing low level details when refactoring out a phone number
It might be easier to understand the goal with a quick animation to show the flow, than just describing it. Here is the flow one can expect to get into when working on DB model / schema changes.
This flow allows for a quick feedback loop, where folks in a room or on a screen share meeting could quickly diagram a model together. Making quick and small edits to the textual representation and generating new versions to discuss. This flow requires some initial work by a developer to prep the original model and DOT files.
Loop until the group is satisfied with the results
This should walk through the steps required to implement this process on your own project. This process could easily be implemented and integrated into various frameworks, but our specific details use tooling for Ruby on Rails to further simplify the efforts of generating these artifacts.
FOCUS=true VERBOSE=true OUTPUT=doc/customers.dot rake diagram:models:customers
neato -Tpng doc/customers.dot > doc/customers.png
git checkout -b feature/db_phone_change
open doc/customer.dot
neato -Tpng doc/customer.dot > doc/customers.png
This is a version of the presentation I gave internally when getting our team on board. Feel free to have a look Information Architecture Change Process
See below for details on how to implement this yourself.
There are many tools that could do this. If you don’t have the ability to automatically create diagrams from your code or access to graphviz, you could build out much of the diagrams by hand with a tool like draw.io. If you are using Rails, I recommend the below setup.
gemfile
in your dev/test groupWe use the Railroady gem to generate our initial DOT files & images.
group :test, :development do
gem 'railroady'
end
graphviz
and tools if you don’t already have itbrew install graphviz
By default Railroady
has some great Model diagrams.
rake diagram:models:all
rake diagram:models:brief
They tend to be to verbose by default. You can easily customize the output to focus in on the models under discussion. Below is some quick code to highlight how to customize the output.
namespace :diagram do
namespace :models do
def excluded_models
(['PaperTrail::Version']+(ENV['EXCLUDE_MODELS'] || '').split(',')).join(',')
end
####
#
# Convert dot files to images
# neato -Tpng doc/customer.dot > tmp/customers.png
#
####
def generate_for_files(files, output_file)
files = files + ['app/models/user.rb'] if ENV['WITH_USER']
options = ENV['VERBOSE'] ? '--alphabetize' : '--brief'
if output_file.ends_with?('dot')
`EXCLUDED_MODELS=#{excluded_models} railroady -M -s #{files.join(',')} --show-belongs_to #{options} > #{output_file}`
doc = File.read(output_file)
doc = doc.gsub('\l',"\\l\n")
File.open(output_file, 'w') {|f| f.write(doc) }
else
`EXCLUDED_MODELS=#{excluded_models} railroady -M -s #{files.join(',')} --show-belongs_to #{options} | neato -Tpng > #{output_file}`
end
puts "find the file in #{output_file}"
end
desc 'Generates an class diagram for all customer models.'
task :customers do
output = ENV['OUTPUT'] || 'tmp/customers.png'
customer_ignores = if ENV['FOCUS']
%w(Tagging Tag Ticket Task).join(',')
else
''
end
ENV['EXCLUDE_MODELS'] = ENV['EXCLUDE_MODELS'].to_s + customer_ignores
files = %w(
app/models/customer.rb
app/models/contact.rb
app/models/lead.rb
app/models/sms_notification.rb
app/models/phone_block_list.rb
app/models/call.rb
)
generate_for_files(files, output)
end
end
end