In case anyone was wondering: yes, we were down for about 2 hours. I apologize for the inconvenience.

The upgrade from 0.17.4 to 0.18.0 went wrong. I spent some time debugging but eventually gave up and restored a snapshot (taken on Saturday, Jun 24, 2023 @ 11:00 UTC).

We’ll likely stick to 0.17.4 until I can figure out a safe path to upgrade to a bigger (and up-to-date) instance and carry over all the user data. Any help/advice is welcome. Hopefully this doesn’t occur again!

  • lemmyrs@lemmyrs.orgOPM
    1 year ago

    Well, the lemmy container kept running into:

    lemmy  | 2023-06-24T23:28:52.716586Z  INFO lemmy_server::code_migrations: Running apub_columns_2021_02_02
    lemmy  | 2023-06-24T23:28:52.760510Z  INFO lemmy_server::code_migrations: Running instance_actor_2021_09_29
    lemmy  | 2023-06-24T23:28:52.763723Z  INFO lemmy_server::code_migrations: Running regenerate_public_keys_2022_07_05
    lemmy  | 2023-06-24T23:28:52.801409Z  INFO lemmy_server::code_migrations: Running initialize_local_site_2022_10_10
    lemmy  | 2023-06-24T23:28:52.803303Z  INFO lemmy_server::code_migrations: No Local Site found, creating it.
    lemmy  | thread 'main' panicked at 'couldnt create local user: DatabaseError(UniqueViolation, "duplicate key value violates unique constraint \"local_user_person_id_key\"")', crates/db_schema/src/impls/local_user.rs:157:8
    

    despite the fact that:

    lemmy=# select id, site_id from local_site;
     id | site_id
    ----+---------
      1 |       1
    (1 row)
    

    So a local_site row already exists, yet the migration still took the "No Local Site found" path and then hit a unique-constraint error while creating the local user. I further narrowed it down to this piece of code:

    ///
    /// If a site already exists, the DB migration should generate a local_site row.
    /// This will only be run for brand new sites.
    async fn initialize_local_site_2022_10_10(
      pool: &DbPool,
      settings: &Settings,
    ) -> Result<(), LemmyError> {
      info!("Running initialize_local_site_2022_10_10");
    
      // Check to see if local_site exists
      if LocalSite::read(pool).await.is_ok() {
        return Ok(());
      }
      info!("No Local Site found, creating it.");
    

    At this point I gave up because I couldn’t really tell why LocalSite::read(pool).await.is_ok() was, well…not ok.
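
One thing worth noting about that check: `is_ok()` on a `Result` is false for any error, not only for "row not found", so it cannot tell a missing row apart from a row that failed to load. Below is a minimal sketch of the difference. It does not reproduce the actual cause here, and `ReadError`, `LocalSite`, `Decision`, `decide_with_is_ok`, and `decide_strict` are hypothetical stand-ins for illustration, not Lemmy's or Diesel's real types.

```rust
// Hypothetical stand-in for a database read error (assumption, not Lemmy's type).
#[derive(Debug, PartialEq)]
enum ReadError {
    // No matching row in the table.
    NotFound,
    // Any other failure: the row may exist but could not be loaded.
    Other(String),
}

// Hypothetical stand-in for the row returned by the read.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct LocalSite {
    id: i32,
    site_id: i32,
}

#[derive(Debug, PartialEq)]
enum Decision {
    AlreadyInitialized,
    CreateNew,
}

// Same shape as the `is_ok()` check in the snippet: every error means "create".
fn decide_with_is_ok(read: &Result<LocalSite, ReadError>) -> Decision {
    if read.is_ok() {
        Decision::AlreadyInitialized
    } else {
        Decision::CreateNew
    }
}

// Stricter variant: only `NotFound` means "create"; other errors are returned.
fn decide_strict(read: Result<LocalSite, ReadError>) -> Result<Decision, ReadError> {
    match read {
        Ok(_) => Ok(Decision::AlreadyInitialized),
        Err(ReadError::NotFound) => Ok(Decision::CreateNew),
        Err(other) => Err(other),
    }
}

fn main() {
    // Simulate a read that fails for a reason other than "no row".
    let failed: Result<LocalSite, ReadError> =
        Err(ReadError::Other("could not load row".to_string()));

    // The `is_ok()` style silently falls through to creation.
    println!("{:?}", decide_with_is_ok(&failed));

    // The strict style surfaces the underlying error instead.
    println!("{:?}", decide_strict(failed));
}
```

With a check shaped like `decide_strict`, the log would show the underlying read error rather than only the later constraint violation, which might make this kind of failure easier to debug.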