@Dankirk Dankirk commented Sep 13, 2025

Description

  • Sets C runtime locale to system locale with UTF-8 codepage on Windows.
    This has always been default behavior on unix, but Windows defaults to minimal 'C' locale.

    • LC_NUMERIC is still set to "C". Now on all platforms instead of just unix.
      This is so decimal point is a dot (not a comma) for string <-> float conversions.
  • The configured CRT locale is copied to be the default std::locale for C++.
    All platforms have been using minimal "C" until now. This change affects new facet and ios_base instances without a specified locale or imbue() call.

  • Sets UTF-8 as active codepage for Windows in obs.manifest file. This changes Win32 API to use utf-8 for the A functions instead of the language dependent ANSI codepage. The W functions are untouched and are used by default because we use the UNICODE build flag. It also treats commandline arguments as utf-8, which allows for example --profile <name> to load profiles with special characters.

  • OBS Studio language setting no longer changes QLocale's default locale and instead always uses system locale.
    This is consistent with non-Qt functions and, more importantly, is likely what the user wants: sorting and formatting functions should follow OS locale rules rather than the OBS Studio translation language. (Reverts c4840dd)

  • obs_get_locale() still returns the OBS language locale, which is used for the Python and Lua APIs, GDI+ text widget transformations, and the HTTP Accept-Language header.
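A minimal sketch of the setup order described above, not the PR's actual code. The function name setup_locale_sketch() is hypothetical, and the ".UTF-8" locale string assumes a UCRT recent enough to support the UTF-8 code page (Windows 10 1803 or later). It sets the CRT locale from the system, pins LC_NUMERIC to "C", then copies the result into the C++ global locale, falling back to the classic locale if the C++ runtime rejects the name.

```cpp
#include <clocale>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the CRT locale name that ended up active.
std::string setup_locale_sketch()
{
#ifdef _WIN32
    // Assumption: the UCRT accepts ".UTF-8" (system locale, UTF-8 code page).
    const char *requested = ".UTF-8";
#else
    // Empty string means "use the environment's locale".
    const char *requested = "";
#endif
    if (!std::setlocale(LC_ALL, requested))
        std::setlocale(LC_ALL, "C"); // system locale unavailable

    // Keep '.' as the decimal point for string <-> float conversions.
    std::setlocale(LC_NUMERIC, "C");

    // Copy the name: later setlocale() calls may invalidate the pointer.
    const char *name = std::setlocale(LC_ALL, nullptr);
    const std::string crt_name = name ? name : "C";

    try {
        // Make the CRT locale the default for new C++ streams and facets.
        std::locale::global(std::locale(crt_name.c_str()));
    } catch (const std::runtime_error &) {
        // global(classic) also resets the CRT locale, so restore it after.
        std::locale::global(std::locale::classic());
        std::setlocale(LC_ALL, crt_name.c_str());
    }
    return crt_name;
}
```

Calling this once at the very start of main() matches the "set it before anything inherits it" reasoning; on unix it would have to be called again after QApplication is constructed.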

Motivation and Context

Locale-aware operations like sorting and time formatting in C are not available on Windows, but are on unix, as pointed out in PR #12577.
Fixes #11133, fixes #12953

The C++ locale and QLocale changes make the locale-aware functions of all layers work in a similar fashion.

For example, the locales currently used on unix are: OS locale for CRT, minimal "C" for C++ and OBS language for QLocale.
A weekday name can be in three different languages depending on whether you use strftime(), the std::time_get facet or QLocale.
This makes string transformations between C, C++ and Qt very tricky.
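To illustrate the mismatch, here is a small sketch with two hypothetical helpers (the names are made up for this example). The first formats a weekday through the CRT, which follows setlocale(); the second goes through a C++ stream, which follows whatever std::locale it is given. When the two locales differ, the same std::tm produces different strings.

```cpp
#include <clocale>
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Weekday name via the CRT: follows LC_TIME as set by setlocale().
std::string weekday_via_crt(const std::tm &tm)
{
    char buf[64] = {};
    std::strftime(buf, sizeof(buf), "%A", &tm);
    return buf;
}

// Weekday name via a C++ stream: follows the locale imbued into the stream,
// regardless of what setlocale() was called with.
std::string weekday_via_cpp(const std::tm &tm, const std::locale &loc)
{
    std::ostringstream out;
    out.imbue(loc);
    out << std::put_time(&tm, "%A");
    return out.str();
}
```

With the CRT on a Finnish locale and the C++ side still on the classic locale, weekday_via_crt() would return a Finnish name while weekday_via_cpp() returns the English one.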

How Has This Been Tested?

An important point is that the CRT locale settings introduced here have always been this way for unix, which suggests there aren't any insurmountable problems with the new locales. Windows specific functions should be tested for CRT locale. Changes for C++ and QLocale defaults affect all platforms.

Searched the codebase for affected areas and addressed as necessary:

  • CRT: ctype.h character classification function parameters and expected return values
  • CRT: strftime() formatting with % placeholders
  • CRT: scanf() and printf() formatting with % placeholders
  • CRT: FILE operations
  • C++: fstream operations
  • C++: facet locale usage
  • QLocale: Expected return values of formatting functions
  • QLocale: QString locale-aware methods

Some general testing with Japanese characters

  • Edited recording path with %A (weekday) variable and some Japanese characters. Recorded a video. Weekday name was localized and recording worked fine.
  • Remuxed said file. Worked fine.
  • Renamed some sources with Japanese characters and exported the scene collection, removed it from OBS and re-imported it. No problems.
  • Wrote names of those sources to logFile with blog()

I'm on Windows 11 English US version, but with Finnish locale settings (fi_FI). OBS language is English.

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Tweak (non-breaking change to improve existing functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
    • 3rd party Python, Lua and rtmp code that uses the _mbs_ conversion functions, directly or via file io operations, has to pass text in as utf-8 and expect utf-8 output (except for wchar/_wcs_, which is OS defined).
    • 3rd party plugins that use the A functions of the Win32 API instead of the W variants need to expect utf-8 instead of data encoded in the OS default ANSI codepage.

Checklist:

  • My code has been run through clang-format.
  • I have read the contributing document.
  • My code is not on the master branch.
  • The code has been tested.
  • All commit messages are properly formatted and commits squashed where appropriate.
  • I have included updates to all appropriate documentation.

@WizardCM WizardCM added the Enhancement Improvement to existing functionality label Sep 13, 2025
@Dankirk
Author

Dankirk commented Sep 15, 2025

Scouted the web and the codebase for potential issues. Here are some observations...
EDIT: These have been accounted for in the PR description.

General stuff about setlocale() on Windows
For a list of things the C runtime locale affects, see https://cppreference.com/w/c/locale/setlocale.html

  • An important distinction is that some functions only care about the codepage/encoding of the locale, not the language_region rules specifically.
  • We can ignore all number related things, because we use the minimal 'C' locale for LC_NUMERIC.
  • The affected string.h and time.h functions are all specifically locale-aware. Things like weekday names will now be localized by strftime(), and strcoll() will do a locale-aware comparison.
  • ctype.h character classification ranges are extended, i.e. isalnum() may return true for more characters, so any code using these functions should be checked for that. It also couldn't hurt to cast the parameters to unsigned char, since these functions expect a value in 0-255 (or EOF), which a utf-8 char converted to int might not be (char range is -128 to 127 where char is signed). Then again, the functions in use have worked fine on unix until now...
  • stdio.h: formatting of the % placeholders in scanf(), printf() and the like is affected. Decimals will still be dots (controlled by LC_NUMERIC), but %s will match more. More about file operations below.
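The unsigned char cast mentioned above can be sketched like this. The helper names are hypothetical and only show the pattern: passing a negative char value (any utf-8 byte above 0x7F where char is signed) straight to a ctype.h function is undefined behavior, so the value is converted to unsigned char first.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Safe wrapper: the cast keeps the argument in the 0-255 range that
// std::isalnum() requires, even for bytes of multibyte utf-8 sequences.
bool is_alnum_byte(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0;
}

// Counts bytes classified as alphanumeric under the current CRT locale.
std::size_t count_alnum_bytes(const std::string &text)
{
    std::size_t count = 0;
    for (char c : text) {
        if (is_alnum_byte(c))
            ++count;
    }
    return count;
}
```

Which non-ASCII bytes count as alphanumeric depends on the active locale, so only the ASCII results are predictable across systems.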

multibyte <-> utf8 <-> wchar

In platform.h there are various string conversion functions. Of these, only the multibyte functions with _mbs_ are affected by this change. The rest use the Windows API, which doesn't follow the C runtime locale modified by setlocale(). None of them care about the language_region, only about the codepage/encoding, which should be UTF-8 and not, for example, the Windows default ANSI 1252.

The _mbs_ functions are currently not used in OBS Studio itself, but are offered for external usage for Python, Lua and rtmp. This means there's no change for OBS Studio itself, but external code might see different results from these conversion functions on Windows, which will now be closer to the return values on unix. On Windows the mb* functions are affected by _setmbcp(), while setlocale() will suffice on unix.

The utf8 <-> wchar functions (e.g. os_utf8_to_wcs()) use the MultiByteToWideChar() and WideCharToMultiByte() functions with the utf-8 codepage, which will work after this update. Unlike mbstowcs() and the like in the _mbs_ implementations, these functions are independent of the CRT locale, but do follow the manifest declaration (though only for CP_ACP, which we don't use). On unix these functions use a custom implementation for conversion, which naturally assumes the text is, or is to be, utf-8 encoded.
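For readers unfamiliar with what a locale-independent conversion looks like, here is a rough sketch of a utf-8 decoder that never consults setlocale(). It is not the implementation in platform.h; the function name is hypothetical, and for brevity it does not reject overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <string>

// Decodes utf-8 bytes into UTF-32 code points without using the CRT locale.
// Returns false on truncated or structurally malformed input.
bool utf8_to_utf32(const std::string &in, std::u32string &out)
{
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected

        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F;
            extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F;
            extra = 2;
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07;
            extra = 3;
        } else {
            return false; // stray continuation byte or invalid lead byte
        }

        if (extra > in.size() - i - 1)
            return false; // sequence runs past the end of the input

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                return false; // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}
```

Because the encoding is fixed by the function itself, the result is the same under any CRT locale, which is the property the utf8 <-> wchar functions rely on.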

Streams and file operations

C++ streams, like fstream, are controlled by std::locale::global() or facets, which is a separate setting from CRT setlocale(). Thus, C++ stream operations have been using the minimal "C" locale by default (both unix and Windows). This change copies the CRT locale as the default for C++ too. Any fstreams initialized before the std::locale::global() call should call imbue() to match the new locale. Any streams initialized afterwards inherit the global locale.
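The inheritance rule can be demonstrated with a small sketch. Everything here is illustrative: CommaDecimal is a hypothetical facet standing in for a system locale, and a decimal comma is used only because it makes the active locale visible in the output (the PR itself keeps LC_NUMERIC at "C", so decimals stay dots).

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet standing in for "some non-C locale".
struct CommaDecimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

struct StreamDemo {
    std::string before_global; // stream created before global() changed
    std::string after_global;  // stream created after global() changed
    std::string after_imbue;   // early stream after an explicit imbue()
};

StreamDemo demo_stream_locales()
{
    // Start from the classic locale and remember what was active.
    const std::locale previous = std::locale::global(std::locale::classic());

    std::ostringstream early; // captures the classic locale at construction

    const std::locale custom(std::locale::classic(), new CommaDecimal);
    std::locale::global(custom);

    std::ostringstream late; // captures the new global locale

    StreamDemo result;
    early << 1.5;
    late << 1.5;
    result.before_global = early.str();
    result.after_global = late.str();

    // An existing stream only picks up the new locale through imbue().
    early.str("");
    early.imbue(std::locale());
    early << 1.5;
    result.after_imbue = early.str();

    std::locale::global(previous); // restore the original global locale
    return result;
}
```

The early stream keeps formatting with the locale it was constructed under until imbue() is called, which is why the setup routine should run before any stream is created.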

C-style FILE wide char streams pick up the locale that is active when the first io operation is performed and keep using it. So it is important that setlocale() is called before these streams are used, or that they are re-opened with freopen().

printf() and scanf() -type functions use locale for % placeholders, as explained above.

When to setlocale() ?

In principle the locale should be one of the first things set, since many things inherit it and it's cumbersome to retroactively reset its state in existing objects. However, since Qt overwrites the locale during construction of QApplication (OBSApp) on unix, we could use the OBSApp constructor, as we have been doing to reset LC_NUMERIC back to "C". If it is decided that the OBS translations locale should be followed instead of the OS's, initLocale() also seems acceptable.

@Dankirk Dankirk force-pushed the locale branch 8 times, most recently from 9d20b0c to c0dbe23 Compare September 19, 2025 20:30
@Dankirk Dankirk force-pushed the locale branch 2 times, most recently from 3849250 to 92f93f4 Compare September 28, 2025 18:19
@Dankirk Dankirk changed the title frontend: Use system locale on Windows instead of 'C' frontend: Use system locale instead of 'C' Sep 28, 2025
@Dankirk Dankirk marked this pull request as ready for review September 29, 2025 21:04
@PatTheMav
Member

  • OBS Studio language setting no longer changes QLocale's default locale and instead always uses system locale.
    This is consistent with non-Qt functions and, more importantly, is likely what the user wants: sorting and formatting functions should follow OS locale rules rather than the OBS Studio translation language. (Reverts c4840dd)

Highlighting this because this is a severe change, even though I think it's correct in principle. Changing an application's display language should not change the regional settings (which encompass sorting rules as well as decimal point character, etc.), and at least that's how it works on macOS.

@Warchamp7 @Fenrirthviti would be good to hear if you'd be fine with this change conceptually as well.

@Fenrirthviti
Member

My main concern here, as someone who only uses the English/USA locale/region, is that I'm unsure what the expectation for a Windows application is. The current motivation seems to be "Unix does it this way" and that to me, is not sufficient. Do we have examples and recommendations from Microsoft, or other prominent Windows applications on how they handle this kind of setting for apps that use translations?

@Dankirk
Author

Dankirk commented Jan 15, 2026

Microsoft's general guidelines for globalization suggest:

Don't use language to assume a user's region; and don't use region to assume a user's language.

My own take is that OS regional settings plus app translations is the desired behavior: no additional settings in the UI, a good likelihood of being right by default, and still configurable when needed. The obvious drawback is that changing the regional settings means changing OS settings, which could be an issue on a shared device, but the OS should provide options for that.

Many Microsoft apps understandably just follow the OS region settings. That includes the file explorer. Apps like Office do offer a separate in-app setting for regional settings as well, but that's because they specialize in that sort of thing. Many apps in general have translations plus maybe time formatting options, and don't follow a specific region per se. The sorting order is a gamble. Steam seems to sort games and friends using the app display language. Spotify sorts playlists by Windows display language (not region settings, nor Spotify display language; I don't recommend this). WhatsApp uses OS regional settings.

In any case, any locale is still better than the current situation with non-changeable "C" locale.

@Fenrirthviti
Member

Thanks for the additional context here. My admittedly mostly uninformed opinion, based on the discussion here, is that this seems fine. Without a clear "best practice" on Windows, moving things in line with our cross-platform implementation seems like the best option.

@PatTheMav or @jcm93 Does macOS follow a similar approach to what is being proposed here?

@PatTheMav
Member

Thanks for the additional context here. My admittedly mostly uninformed opinion, based on the discussion here, is that this seems fine. Without a clear "best practice" on Windows, moving things in line with our cross-platform implementation seems like the best option.

@PatTheMav or @jcm93 Does macOS follow a similar approach to what is being proposed here?

On macOS language and regional settings are also separate things and it's expected that your app's language is actually changed from the OS' language settings (rather than within the application itself) which in gendered languages also includes choice of preferred pronoun:

The language setting indeed only changes the display language; decimal format, sorting, etc. still follow the regional setting (at least in "native" apps).

@Dankirk
Author

Dankirk commented Jan 29, 2026

In its current state the PR works as designed, but I'm thinking of adding the following lines to obs.manifest to set the active code page to utf-8 for Win32 APIs (the A versions of functions). While OBS uses those relatively little, mostly with hard coded strings, it could ease adaptation of 3rd party code, plugins etc.

It also allows utf-8 interpretation of commandline arguments, like loading a specific collection, scene or profile by their name.
For example ./obs64.exe --profile キアラ only works with the utf-8 manifest or Japanese codepage (932) being the OS default.

Lines to add to obs.manifest

<application>
    <windowsSettings>
        <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
</application>

Doing this would mean the locale setup routine would need to be done much earlier in the program than the OBSApp constructor, preferably at the start of main(). However, since Qt overwrites locales in the base constructor on unix, unix platforms would need to re-do the locale setup routine after, which is a bit of an ugly design.

@Dankirk Dankirk force-pushed the locale branch 2 times, most recently from dd143ca to 29aa8fc Compare January 30, 2026 01:10
@Dankirk
Author

Dankirk commented Jan 30, 2026

The obs.manifest changes have been added and the locale setup routine moved to the beginning of the app, with unix re-running it after OBSApp has been created.

Here's some info about the manifest if you wish to read more: https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

If there are any issues with the locale setup routine being the first thing to run, please let me know.
The reasoning is that many things inherit the locale when they are initialized, and retroactively updating them is cumbersome. This includes things like writing anything to stdout or using fstream. From what I observed, load_debug_privilege() is the first function to use blog(), so the locale setup routine should happen before that.

@Dankirk Dankirk changed the title frontend: Use system locale instead of 'C' frontend: Use system locale instead of 'C' with UTF-8 Jan 30, 2026
Sets runtime locale to system locale with UTF-8 codepage. This is already default behavior on unix, but Windows defaults to minimal 'C' locale.

Use CRT locale for C++ std::locale default

OBS Studio language settings no longer change the QLocale default locale; instead the system locale is used for conformity. It is likely this is what the user wants as well, i.e. sorting and formatting functions should follow the OS locale instead of the OBS Studio language (which also lacks country information).
Cast ctype function char parameters to unsigned char to ensure they are in the correct range (0 to 255 vs -128 to 127) when used with utf-8 encoding (or extended ascii).

Fixes dstr astrcmp* functions when used with utf-8 (or extended ascii) characters, so they are now treated as greater than base ascii characters and thus sorted after them, not before.
Switch locale-aware timestamping for logging / crash handling to %H:%M:%S
Utf-8 manifest allows Win32 API to use utf-8 instead of ANSI codepages. This changes the "A" versions of functions to work with utf-8.
Manifest also treats command line arguments as utf-8. This allows for example --profile <name> to load profiles with special characters.

The locale setup routine is moved to the beginning of the program to better cover io streams, without the need to reconfigure them later. The caveat is that unix will need to re-do the routine after initializing OBSApp, because Qt overwrites the locales on unix during its construction.

Added logging for locale info and app language to better diagnose potential errors.
@Dankirk Dankirk changed the title frontend: Use system locale instead of 'C' with UTF-8 frontend: Use system locale with UTF-8 instead of 'C' Jan 31, 2026
Development

Successfully merging this pull request may close these issues.

  • OBS crashes if filename contains german Umlaut
  • Months and weekdays are not localized in filename formatting

4 participants