07 Nov 2011
Migrating JIRA to Bugzilla
Recently, my team needed to migrate from Atlassian’s JIRA to Mozilla’s Bugzilla. I expected this process to be uncommon but not unheard of. Not so. Atlassian has a nice tool for migrating Bugzilla data into its JIRA system, but the one googleable conversation I could find about migrating JIRA to Bugzilla was from 2006.
I’ll describe the process I used to migrate around 6K bugs.
Bugzilla’s importxml.pl: Bugzilla does ship with an import script. However, it’s specifically designed for moving bugs from one Bugzilla instance to another. As input, it takes an XML file of one or more bugs, assuming you just used the XML export built into Bugzilla. Makes sense. It meant that I just had to extract all the issues from JIRA and reformat them to match Bugzilla’s DTD. I found it very helpful to add the --verbose flag for better errors.
Bugzilla’s checksetup.pl: Bugzilla will check your setup with this script and let you know if you are missing any Perl modules. In particular, for importing, you will need XML::Twig. Oddly, when I started the import process, I found that I was still missing XML::Parser, so I had to install that from CPAN as well. Also, the script changed the ownership and permissions of the bugzilla directory such that it was no longer accessible on the web; I had to recursively revert those like so:
nokogiri: This handy Ruby gem allowed me to use XPath to search through the JIRA XML files and extract specific fields and values. It’s extremely useful. It installs effortlessly on Lion (10.7) with a simple gem install nokogiri. Unfortunately, I was using Snow Leopard (10.6). Installing the gem on that OS was a small battle in itself. Finally, I used this gist and then followed steps here. That worked for me. Here are some handy snippets of Nokogiri in action:
JIRA Issues: There are a couple of ways to export issues from JIRA. One obvious way is to simply search for the issues you want and click “View” at the top to select a different format, like XML. While straightforward, this method has two significant downsides. First, data loss: comments are not included. Second, size: thousands of issues take a long time to process as a single XML file, and neither JIRA nor Chrome seemed happy about the size.
An alternative to this export is using curl to export issues individually. This process includes comments and involves lots of quick JIRA queries. As an added bonus, I can avoid any XML SAX state logic that a single large XML file would have needed, so I can focus on transforming issues into bugs in isolation. Sweet.
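I used curl for this, but the same per-issue export can be sketched in Ruby with Net::HTTP. Everything instance-specific here is an assumption: the base URL is a placeholder, the issue-XML path is the pattern I understand JIRA to use for its single-issue XML view, and basic auth may not match how your server is set up.

```ruby
require 'net/http'
require 'uri'
require 'fileutils'

# Placeholder; replace with your own JIRA server.
JIRA_BASE = 'https://jira.example.com'

# Assumed URL pattern for JIRA's single-issue XML view.
def issue_xml_url(key, base = JIRA_BASE)
  "#{base}/si/jira.issueviews:issue-xml/#{key}/#{key}.xml"
end

# Issue keys are the project prefix plus a running number.
def issue_keys(project, first, last)
  (first..last).map { |n| "#{project}-#{n}" }
end

# Fetch one issue and save it as issues/KEY.xml. Returns false when the
# server does not answer with a 2xx status (e.g. a deleted issue).
def fetch_issue(key, user, password, dir = 'issues')
  uri = URI(issue_xml_url(key))
  request = Net::HTTP::Get.new(uri)
  request.basic_auth(user, password)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  return false unless response.is_a?(Net::HTTPSuccess)

  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, "#{key}.xml"), response.body)
  true
end
```

Looping fetch_issue over issue_keys('PROJ', 1, 6000) gives one small XML file per issue, which is what keeps the later transformation simple.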
JIRA Attachments: In addition to the issues, I wanted to extract all of the attachments. Luckily, JIRA provides a standard HTTP API for getting these files. We just need all of the attachment IDs from each extracted JIRA issue to access its attachments. We’ll save those to an attachments directory. Bugzilla actually imports attachments as embedded base64 strings in the XML files, but we’ll address that later. For now, we just want to save the attachments out of JIRA.
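As a sketch of the bookkeeping involved, assuming attachments are served under /secure/attachment/ID/FILENAME (the pattern I understand JIRA to use; verify it against a link in your own instance). The method names and the ID-prefixed local file naming are my own illustrative choices.

```ruby
require 'erb'

# Build the download URL for one attachment. File names can contain
# spaces and other characters that must be percent-encoded.
def attachment_url(base, id, filename)
  "#{base}/secure/attachment/#{id}/#{ERB::Util.url_encode(filename)}"
end

# Local path for the saved file. Prefixing the attachment ID keeps two
# attachments with the same file name from overwriting each other.
def attachment_path(id, filename, dir = 'attachments')
  File.join(dir, "#{id}_#{File.basename(filename)}")
end
```

Each ID and file name pair read from an issue's XML goes through attachment_url for the download and attachment_path for the save location.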
Bugzilla Field Values: Bugzilla will not automatically create people, products, versions, components, or milestones during the import process. Those need to already exist. Otherwise, Bugzilla will use the default product and component. In my case, they did already exist, but the names had been changed. To handle these changes, I map JIRA strings to Bugzilla strings in my main script.
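The mapping itself can be as simple as a hash per field with a fallback. This is a sketch of the idea rather than the code from my script, and every name and value below is made up for illustration.

```ruby
# JIRA name => Bugzilla name. All entries here are hypothetical.
COMPONENT_MAP = {
  'Web UI' => 'Frontend',
  'Backend Services' => 'Server'
}.freeze

USER_MAP = {
  'alice' => 'alice@example.com'
}.freeze

DEFAULT_COMPONENT = 'General'
DEFAULT_USER = 'nobody@example.com'

# Look up the Bugzilla value for a JIRA string, falling back to a default
# that is known to exist in Bugzilla when there is no mapping.
def map_value(map, jira_value, default)
  map.fetch(jira_value.to_s.strip, default)
end
```

Using an explicit default of my own choosing means an unmapped value never silently lands on whatever Bugzilla's default product and component happen to be.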
At this point, we have all the tools and the data we need to transform JIRA issues into Bugzilla bugs. I committed the full jira2bugzilla.rb Ruby script to a GitHub repo. It doesn’t work out of the box, as there are quite a few instance-specific variables, but the script provides a nice base. I’ll touch on a couple points:
Attachments: Bugzilla expects attachments embedded in the XML file, so we need to convert our binary files into base64 strings and then include them inline. Below is Ruby code to convert to base64. Keep in mind that the import process takes far longer when attachments are included.
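A minimal sketch of that conversion using Ruby's standard base64 library. The method names are illustrative and not necessarily the ones in jira2bugzilla.rb.

```ruby
require 'base64'

# Encode raw bytes as base64 text (line-wrapped, as encode64 does).
def encode_bytes(bytes)
  Base64.encode64(bytes)
end

# Read a file in binary mode so no newline or encoding translation
# touches the bytes, then encode it.
def encode_attachment(path)
  encode_bytes(File.binread(path))
end
```

File.binread matters here: reading an image or archive in text mode can corrupt it on some platforms before it is ever encoded.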
Description: While JIRA gives the bug description its own element (description), Bugzilla considers it the first comment on the bug. When rewriting a JIRA issue into a Bugzilla bug, I needed to move the description into the first comment.
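Roughly, that reshuffle looks like the sketch below. The issue hash layout is hypothetical, and the long_desc, who, bug_when, and thetext element names reflect my understanding of Bugzilla's XML format, so confirm them against the DTD shipped with your Bugzilla version.

```ruby
# Build the ordered comment list for one bug: the JIRA description
# becomes comment 0, followed by the JIRA comments in their original order.
def bugzilla_comments(issue)
  first = { who: issue[:reporter], when: issue[:created], text: issue[:description].to_s }
  rest = issue.fetch(:comments, []).map do |c|
    { who: c[:author], when: c[:created], text: c[:text].to_s }
  end
  [first] + rest
end

# Render one comment as XML. encode(xml: :text) escapes &, < and >.
def long_desc_xml(comment)
  <<~XML
    <long_desc isprivate="0">
      <who>#{comment[:who].to_s.encode(xml: :text)}</who>
      <bug_when>#{comment[:when].to_s.encode(xml: :text)}</bug_when>
      <thetext>#{comment[:text].encode(xml: :text)}</thetext>
    </long_desc>
  XML
end
```

Attributing the first comment to the reporter and the issue's creation time keeps the bug history reading the way it did in JIRA.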
Severity: JIRA doesn’t seem to have a notion of severity, so I assigned a default severity to all the bugs.
QA Contact: JIRA doesn’t seem to have a notion of QA contact, so when setting up the components, be sure to assign the default QA contact correctly. The import script will assign each bug to the correct person.
Again, see this GitHub repo for the full jira2bugzilla.rb script. I simply run ruby jira2bugzilla.rb in the directory with all of the JIRA XML issues.
Waiting for Import
At this point, we have converted all the JIRA issues into Bugzilla bugs. Next, we transfer them onto the Bugzilla server into a subdirectory of the bugzilla installation, like bugzilla/bugs/. To import the bugs, I ran the following one-liner. It finds all the XML files, sorts them alphabetically, and feeds each one by one into the Bugzilla import script.
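The same loop can be sketched in Ruby, which is handy if you want to add per-file bookkeeping later. This is a stand-in, not the shell one-liner itself; the script path, the log file name default, and the choice to pass each file as an argument to importxml.pl are assumptions to adjust for your installation.

```ruby
# All bug XML files in a directory, in alphabetical order.
def bug_files(dir)
  Dir.glob(File.join(dir, '*.xml')).sort
end

# Feed each file to importxml.pl, one at a time, appending the script's
# stdout and stderr to a single log file.
def import_all(dir, log_path = 'bug-import.log', script = './importxml.pl')
  File.open(log_path, 'a') do |log|
    bug_files(dir).each do |file|
      log.puts "== #{file}"
      log.flush # write our marker before the child process appends output
      system('perl', script, '--verbose', file, out: log, err: log)
    end
  end
end
```

Run import_all('bugs') from inside the Bugzilla installation directory so the import script can find its own modules.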
The script imported 6K bugs in two hours. However, I didn’t include attachments. When I tried to import a single bug with a 5MB attachment, it took around thirty minutes. With multiple gigabytes of attachments, I opted to not include them.
Most importantly, check the log. The above one-liner pipes all of the log information to bug-import.log. Run tail -f bug-import.log while it’s running to make sure the process is working. Grep through the output afterwards for terms like “Bad” or “Error” to ensure all the bugs were imported. With 6K bugs, most imported correctly, but a few did not.
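For the after-the-fact check, a few lines of Ruby do the same job as grep. This is only a sketch; the pattern just matches the two terms mentioned above, case-insensitively, and the sample log lines in the usage note are invented.

```ruby
# Matches "Bad" or "Error" anywhere in a line, ignoring case.
PROBLEM_PATTERN = /bad|error/i

# Return the log lines that look like failures, without trailing newlines.
def problem_lines(log_text)
  log_text.each_line.map(&:chomp).grep(PROBLEM_PATTERN)
end
```

For example, problem_lines(File.read('bug-import.log')) gives you a short list to work through instead of the whole log.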
I just wanted to highlight that migrating away from Atlassian’s JIRA is not a reflection of their product. I was simply tasked with making it happen. Frankly, Atlassian has been doing an amazing job at steadily growing into a vertically integrated company. They received $60M in VC funding in 2010 to grow their business, and they’ve scooped up businesses like BitBucket and SourceTree. Just a couple weeks ago, they announced that their main products will now be available as cloud services. Atlassian seems to have a long-term strategy that they’re executing very well.