So this was a bit of a sticky wicket that I recently extracted myself from and I wanted to make notes so I didn’t make the same mistake again. I was adding a new table that related two existing tables and added the following code
class CreateTfcListingPeople < ActiveRecord::Migration
def change
create_table :tfc_listing_people do |t|
t.integer :listing_id, index: true
t.string :person_id, limit: 22, index: true
t.timestamps null: false
end
add_foreign_key :tfc_listing_people, :people
add_foreign_key :tfc_listing_people, :listings
end
end
However, I didn’t notice that the datatype of the person.id column (which is a varchar) was `id` varchar(22) COLLATE utf8_unicode_ci NOT NULL
This led to the following error popping up in one of the non production environments:
2018-02-27T17:10:05.277434+00:00 app[web.1]: App 132 stdout: ActionView::Template::Error (Mysql2::Error: Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '=': SELECT COUNT(*) FROM `people` INNER JOIN `tfc_listing_people` ON `people`.`id` = `tfc_listing_people`.`person_id` WHERE `tfc_listing_people`.`listing_id` = 42):
I was able to fix this with the following alter statement (from this SO post): ALTER TABLE `tfc_listing_people` CHANGE `person_id` `person_id` VARCHAR( 22 ) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL.
But in other environments, there was no runtime error. There was, however, a partially failed migration, that had been masked by some other test failures and some process failures, since there was a team handoff that masked it. The create table statement had succeeded, but the add_foreign_key :tfc_listing_people, :people migration had failed.
I ran this migration statement a few times (pointer on how to do that): ActiveRecord::Migration.add_foreign_key :tfc_listing_people, :people and, via this SO answer, I was able to find the latest foreign key error message:
2018-03-06 13:23:29 0x2b1565330700 Error in foreign key constraint of table sharetribe_production/#sql-2c93_4a44d: FOREIGN KEY (person_id) REFERENCES people (id): Cannot find an index in the referenced table where the referenced columns appear as the first columns, or column types in the table and the referenced table do not match for constraint. Note that the internal storage type of ENUM and SET changed in tables created with >= InnoDB-4.1.12, and such columns in old tables cannot be referenced by such columns in new tables. Please refer to http://dev.mysql.com/doc/refman/5.7/en/innodb-foreign-key-constraints.html for correct foreign key definition.
So, again, just running the alter statement to change the collation of the tfc_listing_people table worked fine. However, while I could handcraft the fix on both staging and production and did so, I needed a way to have this change captured in a migration or two. I split apart the first migration into two migrations. The first created the tfc_listing_people table, and the second looked like this:
class ModifyTfcListingPeople < ActiveRecord::Migration
def up
execute <<-SQL
ALTER TABLE `tfc_listing_people` CHANGE `person_id` `person_id` VARCHAR( 22 ) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL
SQL
add_foreign_key :tfc_listing_people, :people
add_foreign_key :tfc_listing_people, :listings
end
def down
drop_foreign_key :tfc_listing_people, :people
drop_foreign_key :tfc_listing_people, :listings
end
end
Because I’d hand crafted the fixes on staging and production, I manually inserted a value for this migration into the schema_migrations table to indicate that the migration had been run in those environments. If I hadn’t had two related but different migration actions, I might not have had to go through these manual gyrations.
My lessons from this episode:
- pay close attention to any errors and failed tests, no matter how innocuous. This is a variation of the “broken window theory”
- break migrations into small pieces, which are easier to debug and to migrate back and forth
- knowing SQL and having an understanding of how database migrations work (they are cool, but they aren’t magic, and sometimes they leak) was crucial to debugging this issue
