@jjoseba Just to let you know what I found and tried here…
On my local Moodle and Oppia server, I don’t get this error, but there I’m using MySQL 5.7. If I publish to our staging server, which runs MySQL 8.x, I get the same error. I noticed that the default collation in MySQL differs between the versions: utf8mb4_general_ci (5.7) vs utf8mb4_0900_ai_ci (8.x).
I tried exporting the database, changing the collations in the script, then reloading the database (so everything was specified as utf8mb4_general_ci), but the problem remains.
I was able to replicate this, and the problem is not caused by the different database collations. The unzipping error shown was not misleading: after debugging the upload process, I found that it fails directly at zip extraction. I searched the course XML for the string containing the character Django is complaining about (“Almaz’s cas”) and… it does not appear! It is part of the name of one of the images referenced in the course.
So the issue is not in writing that special character into the database, but in handling a zip file that contains non-ASCII characters in some of its paths. There are two possible solutions for this:
Instead of relying on the extractall() method of Python’s zipfile library, handle the unzipping of the course package manually and decode each path explicitly to avoid encoding mismatches.
Avoid using special characters in filenames in the Moodle export block.
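To make the first option more concrete, here is a rough sketch of what manual extraction could look like. It is not the current Oppia code: `extract_course` and `decoded_name` are hypothetical names, and the cp437-to-UTF-8 re-decoding is only a heuristic (when the zip entry lacks the UTF-8 flag, Python's zipfile decodes the name as cp437, so the sketch undoes that and tries UTF-8 instead). It also assumes the server's filesystem encoding can represent the resulting names.

```python
import os
import shutil
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: member name is stored as UTF-8


def decoded_name(info):
    """Return the member name, re-decoding it as UTF-8 when that is possible."""
    if info.flag_bits & UTF8_FLAG:
        return info.filename  # zipfile already decoded it as UTF-8
    try:
        # zipfile decoded the raw bytes as cp437; undo that and try UTF-8.
        return info.filename.encode("cp437").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return info.filename  # keep the name zipfile gave us


def extract_course(zip_path, dest_dir):
    """Extract every member one by one and return the list of written files."""
    dest_root = os.path.realpath(dest_dir)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            target = os.path.realpath(os.path.join(dest_root, decoded_name(info)))
            # Refuse members that would end up outside the destination folder.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe path in package: %r" % info.filename)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Extracting member by member also gives a single place to catch and report a problematic filename, instead of failing somewhere inside extractall().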
Meanwhile, as a temporary fix for this course (or others with similar issues), the affected files can be renamed manually to remove the special characters. In this specific course, these are the files whose names contain a non-ASCII character (a right quotation mark):
ILLU_2.2.2.A. Illustration for Almaz’s case study.jpg
ILLU_2.2.2.B. Illustration for Fatuma’s case study.jpg
ILLU_3.1.2.A. Illustration for Lemlem’s case study.jpg
ILLU_3.2.1.A. Illustration for Aynalem’s case study.jpg
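For other courses, a small check like the one below could list the affected files before uploading. This is only a sketch: `non_ascii_members` and `ascii_name` are hypothetical helpers, and replacing curly single quotes with a plain apostrophe is just one possible renaming convention.

```python
import unicodedata
import zipfile

# Curly single quotes mapped to a plain ASCII apostrophe (assumed convention).
QUOTES = {"\u2018": "'", "\u2019": "'"}


def ascii_name(name):
    """Suggest an ASCII-only replacement for a filename."""
    for curly, plain in QUOTES.items():
        name = name.replace(curly, plain)
    # Split accented letters into base letter + accent, then drop what is not ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def non_ascii_members(zip_path):
    """Return the member names in the package that contain non-ASCII characters."""
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if not name.isascii()]
```

Calling `non_ascii_members()` on a course package and then `ascii_name()` on each result gives a list of suggested renames.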
Glad to help!
Anyway, I think this should be detected and handled by the system, so it’s something to add to the development roadmap. Next time there will be no need for those extra checks.