So this was a bit of a sticky wicket that I recently extracted myself from and I wanted to make notes so I didn’t make the same mistake again. I was adding a new table that related two existing tables and added the following code
class CreateTfcListingPeople < ActiveRecord::Migration def change create_table :tfc_listing_people do |t| t.integer :listing_id, index: true t.string :person_id, limit: 22, index: true t.timestamps null: false end add_foreign_key :tfc_listing_people, :people add_foreign_key :tfc_listing_people, :listings end end
However, I didn’t notice that the datatype of the person.id column (which is a varchar) was `id` varchar(22) COLLATE utf8_unicode_ci NOT NULL
This led to the following error popping up in one of the non production environments:
2018-02-27T17:10:05.277434+00:00 app[web.1]: App 132 stdout: ActionView::Template::Error (Mysql2::Error: Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '=': SELECT COUNT(*) FROM `people` INNER JOIN `tfc_listing_people` ON `people`.`id` = `tfc_listing_people`.`person_id` WHERE `tfc_listing_people`.`listing_id` = 42):
I was able to fix this with the following alter statement (from this SO post): ALTER TABLE `tfc_listing_people` CHANGE `person_id` `person_id` VARCHAR( 22 ) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL
.
But in other environments, there was no runtime error. There was, however, a partially failed migration, that had been masked by some other test failures and some process failures, since there was a team handoff that masked it. The create table
statement had succeeded, but the add_foreign_key :tfc_listing_people, :people
migration had failed.
I ran this migration statement a few times (pointer on how to do that): ActiveRecord::Migration.add_foreign_key :tfc_listing_people, :people
and, via this SO answer, I was able to find the latest foreign key error message:
2018-03-06 13:23:29 0x2b1565330700 Error in foreign key constraint of table sharetribe_production/#sql-2c93_4a44d: FOREIGN KEY (person_id) REFERENCES people (id): Cannot find an index in the referenced table where the referenced columns appear as the first columns, or column types in the table and the referenced table do not match for constraint. Note that the internal storage type of ENUM and SET changed in tables created with >= InnoDB-4.1.12, and such columns in old tables cannot be referenced by such columns in new tables. Please refer to http://dev.mysql.com/doc/refman/5.7/en/innodb-foreign-key-constraints.html for correct foreign key definition.
So, again, just running the alter statement to change the collation of the tfc_listing_people
table worked fine. However, while I could handcraft the fix on both staging and production and did so, I needed a way to have this change captured in a migration or two. I split apart the first migration into two migrations. The first created the tfc_listing_people
table, and the second looked like this:
class ModifyTfcListingPeople < ActiveRecord::Migration def up execute <<-SQL ALTER TABLE `tfc_listing_people` CHANGE `person_id` `person_id` VARCHAR( 22 ) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL SQL add_foreign_key :tfc_listing_people, :people add_foreign_key :tfc_listing_people, :listings end def down drop_foreign_key :tfc_listing_people, :people drop_foreign_key :tfc_listing_people, :listings end end
Because I’d hand crafted the fixes on staging and production, I manually inserted a value for this migration into the schema_migrations
table to indicate that the migration had been run in those environments. If I hadn’t had two related but different migration actions, I might not have had to go through these manual gyrations.
My lessons from this episode:
- pay close attention to any errors and failed tests, no matter how innocuous. This is a variation of the “broken window theory”
- break migrations into small pieces, which are easier to debug and to migrate back and forth
- knowing SQL and having an understanding of how database migrations work (they are cool, but they aren’t magic, and sometimes they leak) was crucial to debugging this issue