-
-
Notifications
You must be signed in to change notification settings - Fork 9k
frontend: Use system locale with UTF-8 instead of 'C' #12624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Scouted the web and the codebase for potential issues. Here's some observations... General stuff about setlocale() on Windows
multibyte <-> utf8 <-> wcharIn platform.h there are various string conversion functions. From these only the multibyte functions with The The utf8 <-> wchar functions (ie Streams and file operationsC++ streams, like C-style
When to setlocale() ?On principle locale should be one of the first things to set, since many things inherit it and it's cumbersome to retroactively reset it's state to existing things. However, since Qt overwrites locale during construction of |
9d20b0c to
c0dbe23
Compare
3849250 to
92f93f4
Compare
Highlighting this because this is a severe change, even though I think it's correct in principle. Changing an application's display language should not change the regional settings (which encompass sorting rules as well as decimal point character, etc.), and at least that's how it works on macOS. @Warchamp7 @Fenrirthviti would be good to hear if you'd be fine with this change conceptually as well. |
|
My main concern here, as someone who only uses the English/USA locale/region, is that I'm unsure what the expectation for a Windows application is. The current motivation seems to be "Unix does it this way" and that to me, is not sufficient. Do we have examples and recommendations from Microsoft, or other prominent Windows applications on how they handle this kind of setting for apps that use translations? |
|
Microsoft general guidelines for globalization suggests:
My own take is that OS regional settings + app translations is the desired output with no additional settings in UI, good likelyhood being fine by default, but allows configuration when needed. The obvious drawback is that to change the regional settings one needs to change OS settings, which could be an issue when using a shared device, but OS should provide options for it. Many Microsoft apps understandably just follow the OS region settings. That includes the file explorer. Apps like Office do offer a separate in app setting for regional settings as well, but that's because they specialize in that sort of thing. For apps in general many have translation + maybe time formatting options and don't follow a specific region per se. The sorting order is a gamble. Steam seems to sort games and friends using the app display language. Spotify sorts playlists by Windows display language (not region settings, nor Spotify display language, I don't recommend this). Whatsapp uses OS regional settings. In any case, any locale is still better than the current situation with non-changeable "C" locale. |
|
Thanks for the additional context here. My, admittedly mostly uninformed opinion based on the discussion here, is that this seems fine. Without lack of a clear "best practice" on Windows, moving things in-line with our cross-platform implementation seems like the best option. @PatTheMav or @jcm93 Does macOS follow a similar approach to what is being proposed here? |
On macOS language and regional settings are also separate things and it's expected that your app's language is actually changed from the OS' language settings (rather than within the application itself) which in gendered languages also includes choice of preferred pronoun: The language setting indeed only changes the display language, decimal format, sorting, et. al. still follow the regional setting (at least in "native" apps). |
|
In the current state the PR works as designed, but I'm thinking of adding the following lines to obs.manifest to set active code page to utf-8 for Win32 APIs (the A versions of functions). While OBS uses those relatively little, with hard coded strings mostly, it could ease 3rd party code adaptation, plugins etc. It also allows utf-8 interpretation of commandline arguments, like loading a specific collection, scene or profile by their name. Lines to add to obs.manifest Doing this would mean the locale setup routine would need to be done much earlier in the program than the |
dd143ca to
29aa8fc
Compare
|
The obs.manifest changes has been added and locale setup routine moved to beginning of the app, with unix re-running it after OBSApp has been created. Here's some info about the manifest if you wish to read. https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page If there's any issues with locale setup routine being the first thing to do please let me know. |
Sets runtime locale to system locale with UTF-8 codepage. This is already default behavior on unix, but Windows defaults to minimal 'C' locale. Use CRT locale for C++ std::locale default OBS Studio language settings no longer change QLocale default locale, instead system locale is used for conformity. It is likely this is what user wants as well. Ie. sorting and formatting functions should follow OS locale instead of OBS Studio language (which also lacks country information).
Cast ctype function char parameters to unsigned char to ensure they are in correct range (0 to 255 vs -128 to 127) when used with utf-8 encoding (or extended ascii). Fixes dstr astrcmp* functions when used with utf-8 (or extended ascii) characters, so now they are treated greater than the base ascii and thus sorted after them, not before.
Switch locale-aware timestamping for logging / crash handling to %H:%M:%S
Utf-8 manifest allows Win32 API to use utf-8 instead of ANSI codepages. This changes the "A" versions of fucntions to work with utf-8. Manifest also treats command line arguments as utf8. This allows for example --profile <name> to load profiles with special characters. The locale setup routine is moved to the beginning of program to better cover io streams, without the need to reconfigure them later. The caveat is that unix will need to re-do the routine after initilizing OBSApp, because Qt overwrites the locales for unix during it's construction. Added logging for locale info and app language to better diagnose potential errors.

Description
Sets C runtime locale to system locale with UTF-8 codepage on Windows.
This has always been default behavior on unix, but Windows defaults to minimal 'C' locale.
LC_NUMERICis still set to"C". Now on all platforms instead of just unix.This is so decimal point is a dot (not a comma) for string <-> float conversions.
The configured CRT locale is copied to be the default
std::localefor C++.All platforms have been using minimal
"C"until now. This change affects newfacetandios_baseinstances without a specified locale orimbue()call.Sets UTF-8 as active codepage for Windows in
obs.manifestfile. This changes Win32 API to use utf-8 for theAfunctions instead of the language dependent ANSI codepage. TheWfunctions are untouched and are used by default because we use the UNICODE build flag. It also treats commandline arguments as utf-8, which allows for example--profile <name>to load profiles with special characters.OBS Studio language setting no longer changes
QLocale's default locale and instead always uses system locale.This gives conformity with non Qt functions, but most importantly is likely what user wants as well. Ie. sorting and formatting functions should follow OS locale rules instead of OBS Studio translations language. (Reverts c4840dd)
obs_get_locale()still returns OBS language locale, which is used for Python and LUA apis, GDI+ text widget transformations, and HTTP accepted languages header.Motivation and Context
Locale-aware operations like sorting and time formatting in C are not available on Windows, but are on unix, as pointed out in PR #12577.
Fixes #11133, fixes #12953
The C++ locale and QLocale changes make the locale-aware functions of all layers work in similiar fashion.
For example: On unix currently the used locales are: OS locale for CRT, minimal
"C"for C++ and OBS language for QLocale.A weekday name can be in three different languages depending if you used
strftime(),std::time_getfacet orQLocale.This makes string transformations between C, C++ and Qt very tricky.
How Has This Been Tested?
An important point is that the CRT locale settings introduced here have always been this way for unix, which suggests there aren't any insurmountable problems with the new locales. Windows specific functions should be tested for CRT locale. Changes for C++ and QLocale defaults affect all platforms.
Searched the codebase for affected areas and addressed as necessary:
strftime()formatting with%placeholdersscanf()andprintf()formatting with%placeholdersFILEoperationsfstreamoperationsfacetlocale usageQStringlocale-aware methodsSome general testing with Japanese characters
I'm on Windows 11 English US version, but with Finnish locale settings (fi_FI). OBS language is English.
Types of changes
_mbs_conversion functions directly or via file io operations have to input text in utf-8 and expect utf-8 output (except forwchar/_wcs_which is OS defined).Checklist: