bdunagan

27 Jun 2009
ibtool Caveats

I've previously covered how Apple's command-line utility ibtool can be used to automate localization tasks: generating a .strings file, localizing an English XIB, and incrementally updating a localized XIB. It's a fantastic tool, included in the Xcode development installation, and works well for both Mac OS X applications and iPhone apps. I use this automation extensively on Mac OS X (10.5): 14 XIBs across 11 languages means many, many files to keep in sync. However, the heavy use has exposed an annoying bug in ibtool: silent errors.

"Could not be parsed"

When ibtool encounters a bad character that it recognizes, like an unescaped double-quote, it fails verbally, supplying the message below.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>com.apple.ibtool.errors</key>
	<array>
		<dict>
			<key>description</key>
			<string>ibtool failed with exception: The stringsfile MainMenu.ES.strings could not be parsed.</string>
		</dict>
	</array>
</dict>
</plist>

This error is helpful, but it doesn't identify the offending line. To do that, we can use another Apple command-line tool: plutil. (Thanks to Cocoa Musings for pointing this out.) plutil can parse a .strings file and provide the line the bad character was on. It's a very useful program when the .strings file that failed has 3K lines, and the problem is a single unescaped double-quote. ibtool generates these verbal parse errors on all alphanumeric characters and a number of others (" ' : . / - _ $).
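As a rough sketch of that plutil step, the Ruby below shells out to plutil for each .strings file given on the command line and prints whatever plutil reports when a file fails to parse. It assumes plutil's `-lint` option is the check being described and that plutil is on the PATH (so it only runs on a Mac). The helper names plutil_lint_command and plutil_lint are made up for this example.

```ruby
require 'open3'

# Hypothetical helper: the argument list for checking one file with plutil.
# Assumption: `plutil -lint FILE` is the invocation used to parse a .strings file.
def plutil_lint_command(path)
  ["plutil", "-lint", path]
end

# Runs plutil and returns [passed, output], where output is plutil's
# combined stdout and stderr. Raises Errno::ENOENT if plutil is not installed.
def plutil_lint(path)
  output, status = Open3.capture2e(*plutil_lint_command(path))
  [status.success?, output]
end

# When run directly, lint every path given on the command line and
# print plutil's message only for the files that fail.
if __FILE__ == $0
  ARGV.each do |path|
    passed, output = plutil_lint(path)
    puts output unless passed
  end
end
```

Keep in mind that a clean result here only rules out the verbal errors; the silent ones described next pass through plutil as well.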

Silent Errors

Unfortunately, ibtool (and plutil) fail silently on these other characters:

{ } [ ] ; , < > ? \ | = + ! ` ~ @ # % ^ & * ( )

Let me say that again: ibtool fails silently. The program stops incorporating localized strings when it encounters one of these characters, but it does not produce an error. The bug becomes particularly nasty when scaling up to 150 localized XIBs. I only discovered the bug when I found a single XIB incorrectly localized and tracked it back to a set of brackets randomly inserted into a .strings file.

So, I wrote a Ruby script, validate_strings_files.rb, to validate .strings files more completely. The script parses through a directory of .strings files and produces verbal errors, with line numbers, for any out-of-place characters. It works well on the 11 languages I deal with. An error that would silently fail in ibtool produces the following result with the script:

VERIFYING FILE MainMenu.ES.strings
MainMenu.ES.strings (328): @

Below is the code for validate_strings_files.rb.

#!/usr/local/bin/ruby
# encoding: UTF-8

# NOTE: requires ruby 1.9
# MIT license

# validate_strings_files.rb
# This script reads through the .strings files in a directory and identifies any bad characters inside,
# both those that ibtool would catch and those that ibtool would miss.
# * ibtool verbal errors: " ' : . / - _ $
# * ibtool silent errors: { } [ ] ; , < > ? \ | = + ! ` ~ @ # % ^ & * ( )

# The library name is lowercase; 'FileUtils' only loads on case-insensitive filesystems.
require 'fileutils'

# Check for arguments.
if ARGV.length != 1
  puts "Usage: ruby validate_strings_files.rb path_to_strings"
  exit
end

# Get path argument and 'cd' to that path.
PATH = ARGV[0]
FileUtils.cd(PATH)

def verify_file(file)
  line_number = 0
  is_multi_line_comment = false

  # Use general method unless file encoding is UTF-16.
  file_handle = File.new(file, "r")
  # file_handle = File.new(file, "r:UTF-16LE:UTF-8") (See http://blog.grayproductions.net/articles/ruby_19s_three_default_encodings)
  
  puts "VERIFYING FILE #{file}"
  while (!file_handle.eof?)
    has_seen_equals_sign = false
    has_seen_semi_colon = false
    is_single_line_comment = false
    is_string = false
    line_number += 1
    previous_char = nil
    second_previous_char = nil

    line = file_handle.readline
    # Use each_char (rather than each_byte) to support unicode.
    line.each_char do |char|
      # Notes:
      # * line can only be 2 strings, N multi-line comments, 1 equals sign (=), 1 semi-colon (;)
      # * ignore character if it's in a string
      # * single-line comments are ended by \n
      # * multi-line comments are ended by */

      if is_string || is_single_line_comment || is_multi_line_comment
        if is_string && previous_char != "\\" && char == "\""
          is_string = false
        elsif is_string && second_previous_char == "\\" && previous_char == "\\" && char == "\""
          is_string = false
        elsif is_single_line_comment && char == "\n"
          is_single_line_comment = false
        elsif is_multi_line_comment && previous_char == "*" && char == "/"
          is_multi_line_comment = false
        end
        # Ignore.
      elsif previous_char == nil && char == "/"
        # Ignore.
      elsif previous_char == "/" && char == "/"
        is_single_line_comment = true
      elsif previous_char == "/" && char == "*"
        is_multi_line_comment = true
      elsif !has_seen_equals_sign && char == "="
        has_seen_equals_sign = true
      elsif !has_seen_semi_colon && char == ";"
        has_seen_semi_colon = true
      elsif char == " " || char == "\n" || char == "\r"
        # Ignore.
      elsif char == "\""
        is_string = true
      elsif char == "\000"
        # Ignore unicode padding.
      elsif line_number == 1 && previous_char == nil && char == "\377"
        # Ignore unicode file header.
      elsif line_number == 1 && previous_char == nil && char == "\376"
        # Ignore unicode file header.
      else
        puts "#{file} (#{line_number}): #{char}"
      end
      
      if char != "\000" && char != "\377" && char != "\376"
        # Save previous character if it's not unicode padding.
        second_previous_char = previous_char
        previous_char = char
      end
    end
  end
  # Close the handle so a large directory doesn't leave files open.
  file_handle.close
end

# Iterate through the current directory.
Dir.entries(".").each do |file|
  # Only deal with .strings.
  if File.extname(file) == ".strings"
    # Read the file and identify any bad characters.
    verify_file(file)
  end
end

One important sidenote is Unicode support. The script works with Unicode characters, but no single way of opening the file covers every encoding. The script has two alternative lines for opening a file, and between them they cover the files I deal with. The first works for most languages, and the second is for files saved as UTF-16, like Japanese (JA).

# Use general method unless file encoding is UTF-16.
file_handle = File.new(file, "r")
# file_handle = File.new(file, "r:UTF-16LE:UTF-8")

Keep in mind that Ruby 1.9 is required because of its Unicode support; refer to the Hivelogic article for compilation instructions. The r:UTF-16LE:UTF-8 code is from this article on Ruby encodings.
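Rather than commenting one of those lines in and out by hand, the choice could be made per file by looking at its first two bytes. This is only a sketch and not part of the script above. The helper names strings_file_mode and read_strings_lines are hypothetical, and it assumes UTF-16 files begin with a byte-order mark and that anything else can be read as UTF-8.

```ruby
# Hypothetical helper: pick a File mode string from the file's first two bytes.
# Assumption: UTF-16 .strings files start with a byte-order mark (BOM).
def strings_file_mode(path)
  bom = File.open(path, "rb") { |f| f.read(2) }
  case bom
  when "\xFF\xFE".force_encoding("BINARY") then "r:UTF-16LE:UTF-8"
  when "\xFE\xFF".force_encoding("BINARY") then "r:UTF-16BE:UTF-8"
  else "r:UTF-8"
  end
end

# Hypothetical helper: read every line as UTF-8, dropping a leading BOM
# character (U+FEFF) from the first line if one survived transcoding.
def read_strings_lines(path)
  lines = File.open(path, strings_file_mode(path)) { |f| f.readlines }
  lines[0] = lines[0].sub(/\A\uFEFF/, "") unless lines.empty?
  lines
end
```

Inside verify_file, the result of strings_file_mode(file) would take the place of the hard-coded "r" mode. A UTF-16 file saved without a BOM would still need the mode set by hand.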

"Class mismatch"

One other error I've encountered is "class mismatch". ibtool produces this error when the class of a UI element associated with a specific ObjectID changes. For instance, imagine two languages were modified independently, and both had a UI element added to them. The first might be an NSTextField whereas the second might be an NSMenuItem; however, the IDs might be the same. The two entries in the .strings files would look like so:

/* Class = "NSTextField"; title = "First Text"; ObjectID = "419"; */
/* Class = "NSMenuItem"; title = "Second Text"; ObjectID = "419"; */

Using the first as the base for localization, incorporating its strings into the second wouldn't make any sense for this UI element. Luckily, ibtool produces the following error when trying to localize those XIBs:

<plist version="1.0">
<dict>
  <key>com.apple.ibtool.errors</key>
  <array>
    <dict>
      <key>description</key>
      <string>Class mismatch during incremental localization for Object ID: 419.  New Base has class: NSTextField.  Old Base has class: NSTextField.  Old Loc has class: NSMenuItem.</string>
    </dict>
  </array>
</dict>
</plist>
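Since the class and ObjectID both appear in the comment lines of the .strings files, a mismatch like this can also be spotted before ibtool runs. The sketch below assumes the comments follow the exact layout shown in the two entries above. The constant and method names are invented for this example.

```ruby
# Matches comment lines of the form shown above, capturing Class and ObjectID.
# Assumption: Class always comes before ObjectID within one comment.
CLASS_COMMENT = /\/\*\s*Class = "([^"]+)";.*?ObjectID = "([^"]+)";\s*\*\//

# Hypothetical helper: build a hash of ObjectID => Class from .strings text.
def classes_by_object_id(strings_text)
  classes = {}
  strings_text.scan(CLASS_COMMENT) { |klass, id| classes[id] = klass }
  classes
end

# Hypothetical helper: list [ObjectID, base class, localized class] for
# every ObjectID present in both texts whose classes differ.
def class_mismatches(base_text, loc_text)
  base = classes_by_object_id(base_text)
  loc = classes_by_object_id(loc_text)
  (base.keys & loc.keys)
    .select { |id| base[id] != loc[id] }
    .map { |id| [id, base[id], loc[id]] }
end

base = '/* Class = "NSTextField"; title = "First Text"; ObjectID = "419"; */'
loc  = '/* Class = "NSMenuItem"; title = "Second Text"; ObjectID = "419"; */'
class_mismatches(base, loc).each do |id, base_class, loc_class|
  puts "ObjectID #{id}: base is #{base_class}, localized is #{loc_class}"
end
```

This only compares what the comments claim; ibtool's own error remains the authoritative check.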

08/10/2009 Update: I updated the validation script to account for escaped backslashes, such as "\\"," = "";.

11/17/2009 Update: I finally submitted ibtool's "silent error" bug to Apple's Radar as rdar://7384153 and added it to OpenRadar. Also, Wil Shipley posted his experiences with Cocoa localization, touching on ibtool's failures.
