Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 25 additions & 25 deletions .annotation_safe_list.yml
Original file line number Diff line number Diff line change
Expand Up @@ -19,43 +19,43 @@ auth.User:
".. pii_retirement": "consumer_api"
contenttypes.ContentType:
".. no_pii:": "This model has no PII"
oel_collections.Collection:
openedx_content.Collection:
".. no_pii:": "This model has no PII"
oel_collections.CollectionPublishableEntity:
openedx_content.CollectionPublishableEntity:
".. no_pii:": "This model has no PII"
oel_components.Component:
openedx_content.Component:
".. no_pii:": "This model has no PII"
oel_components.ComponentType:
openedx_content.ComponentType:
".. no_pii:": "This model has no PII"
oel_components.ComponentVersion:
openedx_content.ComponentVersion:
".. no_pii:": "This model has no PII"
oel_components.ComponentVersionContent:
openedx_content.ComponentVersionContent:
".. no_pii:": "This model has no PII"
oel_contents.Content:
openedx_content.Content:
".. no_pii:": "This model has no PII"
oel_contents.MediaType:
openedx_content.MediaType:
".. no_pii:": "This model has no PII"
oel_publishing.Container:
openedx_content.Container:
".. no_pii:": "This model has no PII"
oel_publishing.ContainerVersion:
openedx_content.ContainerVersion:
".. no_pii:": "This model has no PII"
oel_publishing.Draft:
openedx_content.Draft:
".. no_pii:": "This model has no PII"
oel_publishing.EntityList:
openedx_content.EntityList:
".. no_pii:": "This model has no PII"
oel_publishing.EntityListRow:
openedx_content.EntityListRow:
".. no_pii:": "This model has no PII"
oel_publishing.LearningPackage:
openedx_content.LearningPackage:
".. no_pii:": "This model has no PII"
oel_publishing.PublishLog:
openedx_content.PublishLog:
".. no_pii:": "This model has no PII"
oel_publishing.PublishLogRecord:
openedx_content.PublishLogRecord:
".. no_pii:": "This model has no PII"
oel_publishing.PublishableEntity:
openedx_content.PublishableEntity:
".. no_pii:": "This model has no PII"
oel_publishing.PublishableEntityVersion:
openedx_content.PublishableEntityVersion:
".. no_pii:": "This model has no PII"
oel_publishing.Published:
openedx_content.Published:
".. no_pii:": "This model has no PII"
oel_tagging.ObjectTag:
".. no_pii:": "This model has no PII"
Expand All @@ -65,17 +65,17 @@ oel_tagging.TagImportTask:
".. no_pii:": "This model has no PII"
oel_tagging.Taxonomy:
".. no_pii:": "This model has no PII"
oel_sections.Section:
openedx_content.Section:
".. no_pii:": "This model has no PII"
oel_sections.SectionVersion:
openedx_content.SectionVersion:
".. no_pii:": "This model has no PII"
oel_subsections.Subsection:
openedx_content.Subsection:
".. no_pii:": "This model has no PII"
oel_subsections.SubsectionVersion:
openedx_content.SubsectionVersion:
".. no_pii:": "This model has no PII"
oel_units.Unit:
openedx_content.Unit:
".. no_pii:": "This model has no PII"
oel_units.UnitVersion:
openedx_content.UnitVersion:
".. no_pii:": "This model has no PII"
social_django.Association:
".. no_pii:": "This model has no PII"
Expand Down
9 changes: 7 additions & 2 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,12 @@ venv/
!.vscode/settings.json.example

# Media files (for uploads)
media/
/media/

# Media files generated during test runs
test_media/
/test_media/

# uv stuff
.lock
CACHEDIR.TAG
pyvenv.cfg
10 changes: 5 additions & 5 deletions .importlinter
Original file line number Diff line number Diff line change
Expand Up @@ -36,24 +36,24 @@ layers=
openedx_learning.api.authoring

# The "backup_restore" app handle the new export and import mechanism.
openedx_learning.apps.authoring.backup_restore
openedx_learning.apps.openedx_content.applets.backup_restore

# The "components" app is responsible for storing versioned Components,
# which is Open edX Studio terminology maps to things like individual
# Problems, Videos, and blocks of HTML text. This is also the type we would
# associate with a single "leaf" XBlock–one that is not a container type and
# has no child elements.
openedx_learning.apps.authoring.components
openedx_learning.apps.openedx_content.applets.components

# The "contents" app stores the simplest pieces of binary and text data,
# without versioning information. These belong to a single Learning Package.
openedx_learning.apps.authoring.contents
openedx_learning.apps.openedx_content.applets.contents

# The "collections" app stores arbitrary groupings of PublishableEntities.
# Its only dependency should be the publishing app.
openedx_learning.apps.authoring.collections
openedx_learning.apps.openedx_content.applets.collections

# The lowest layer is "publishing", which holds the basic primitives needed
# to create Learning Packages and manage the draft and publish states for
# various types of content.
openedx_learning.apps.authoring.publishing
openedx_learning.apps.openedx_content.applets.publishing
80 changes: 80 additions & 0 deletions docs/decisions/0020-merge-authoring-apps-into-openedx-content.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
20. Merge authoring apps into openedx_content (using Applets)
=============================================================

Context
-------

Up to this point, Learning Core has used many small apps with a narrow focus (e.g. ``components``, ``collections``, etc.) in order to make each individual app simpler to reason about. This has been useful overall, but it has made refactoring more cumbersome. For instance:

#. Moving models between apps is tricky, requiring the use of Django's ``SeparateDatabaseAndState`` functionality to fake a deletion in one app and a creation in another without actually altering the database. Moving models also introduces tricky dependencies with respect to migration ordering (described in more detail later in this document). We encountered this when considering how to extract container functionality out of the ``publishing`` app.
#. Renaming an app is also cumbersome, because the process requires creating a new app and transitioning the models over. This came up when trying to rename the ``contents`` app to ``media``.

There have also been minor inconveniences, like having a long list of ``INSTALLED_APPS`` to maintain in openedx-platform over time, or not having these tables easily grouped together in the Django admin interface.

Decisions
---------

1. Single openedx_content App
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All existing authoring apps will be merged into one Django app named ``openedx_content``. Some consequences of this decision:

- The tables will be renamed to have the ``openedx_content`` label prefix.
- All management commands will be moved to the ``openedx_content`` app.

2. Logical Separation via Applets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We will continue to keep internal API boundaries by using a new "Applets" convention. A Django applet is made to look like a miniature Django app, with its own ``models.py``, ``api.py``, and potentially other modules. The modules for the old authoring apps will be copied into various subpackages of ``openedx_content.applets``, such as ``openedx_content.applets.publishing``. Applets should respect each others' API boundaries, and never directly query models across applets. As before, we will use Import Linter to enforce dependency ordering.

3. Package Restructuring
~~~~~~~~~~~~~~~~~~~~~~~~

In one pull request, we are going to:

#. Rename the ``openedx_learning.apps.authoring`` package to be ``openedx_learning.apps.openedx_content``. (Note: We have discussed eventually moving this to a top level ``openedx_content`` app instead of ``openedx_learning.apps.openedx_content``, but that will happen at a later time.)
#. Create bare shells of the existing ``authoring`` apps (``backup_restore``, ``collections``, ``components``, ``contents``, ``publishing``, ``sections``, ``subsections``, ``units``), and move them to the ``openedx_learning.apps.openedx_content.backcompat`` package. These shells will have an ``apps.py`` file, the ``migrations`` package for each existing app, and in some cases a minimal ``models.py`` that will hold the skeletons of a handful of models. This will allow for a smooth schema migration to transition the models from these individual apps to ``openedx_content``.
#. Move the actual models files and API logic for our existing authoring apps to the ``openedx_learning.apps.openedx_content.applets`` package.
#. Convert the top level ``openedx_learning.apps.openedx_content`` package to be a Django app. The top level ``admin.py``, ``api.py``, and ``models.py`` modules will do wildcard imports from the corresponding modules across all applet packages.
#. Test packages will also be updated to follow the new structure.

4. Database Migration Specifics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When Django runs migrations, it calculates both an internal model state, as well as running database operations. We are going to take advantage of the fact that these two can be separated using the ``SeparateDatabaseAndState`` operation. We will use this to remove the model state from the old authoring apps and create the model state in the new ``openedx_content`` app without having to run database operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggested rewording, based on my understanding of how this works. but please correct me if this isn't accurate.

Suggested change
When Django runs migrations, it calculates both an internal model state, as well as running database operations. We are going to take advantage of the fact that these two can be separated using the ``SeparateDatabaseAndState`` operation. We will use this to remove the model state from the old authoring apps and create the model state in the new ``openedx_content`` app without having to run database operations.
When Django runs migrations, it both:
* Calculates an ephemeral logical model **state**, based on the contents of the Python migration files and the ``django_migration`` database table, which indicates the list of migrations that have been "run".
* Actually executes the migration operations on the app **database** tables as each migration is "run".
We are going to take advantage of the fact that these two can be separated using the ``SeparateDatabaseAndState`` operation. We will use this to remove the model state from the old authoring apps and create the model state in the new ``openedx_content`` app without having to run database operations.


There are a few high level constraints that we have to consider:

#. Existing openedx-platform migrations should not be modified. Existing openedx-platform migrations should remain unchanged. This is important to make sure that we do not introduce ordering inconsistencies for sites that have already run migrations for the old apps and are upgrading to a new release (e.g. Verawood).
#. The openedx-learning repo should not have any dependencies on openedx-platform migrations, because our dependencies strictly go in the other direction: openedx-platform calls openedx-learning, not the other way around. Furthermore, openedx-learning will often be run without openedx-platform, such as for local development or during CI.
#. Two of the openedx-platform apps that have foreign keys to openedx-learning models are only in Studio's INSTALLED_APPS (``contentstore`` and ``modulestore_migrator``), while ``content_libraries`` is installed in both Studio and LMS. Migrations may be run for LMS or Studio first, depending on the user and environment. Tutor runs LMS first, but we can't assume that will always be true.
#. We must support people who are installing from scratch, those who are upgrading from the Ulmo release, as well as those who are running off of the master branch of openedx-platform.

Therefore, the migrations will happen in the following order:

#. All pre-existing ``backcompat.*`` apps migrations run as before.
#. New ``backcompat.*`` apps migrations that drop most model state, but leave the database unchanged.
#. The first ``openedx_content`` migration creates logical models without any database changes.
#. The second ``openedx_content`` migration renames the underlying tables.
#. Each the ``openedx-platform`` apps that had foreign keys to the old authoring apps will get a new migration that switches those foreign keys to point to ``openedx_content`` apps instead. These are: ``content_libraries``, ``contentstore``, and ``modulestore_migrator``.
#. The above ``openedx-platform`` apps will also get a squashed migration that sets them up to point to the new ``openedx_content`` models directly.

The tricky part is that all the ``opendx-learning`` migrations will run before any of the ``openedx-platform`` migrations run. We can't force it to do otherwise without making ``openedx-learning`` aware of ``opendx-platform``, and we explicitly want to avoid that. This makes things tricky with respect to the model state dependencies. There are two scenarios we have to worry about:

Migration from Scrach
The ``openedx-platform`` apps will each run the squashed migration that jumps straight to making foreign keys against the new ``openedx_content`` models, so the fact that the old authoring app models have been removed and the tables have been renamed doesn't matter.

Migration from Ulmo/master
No actual database operations have to happen here, as the keys were already created earlier. That being said, the migration framework will error out if the state of the old app models that it had foreign keys have been dropped entirely. That's why the bare skeletons of those old models are preserved in the ``backcompat`` app models files, along with their primary key field. Everything else can be dropped from the state point of view—though again, we're not modifying database state in this operation.

The main downside of this approach is that it may break migrations for developers if they have a months old dev database that is in an in-between release state, e.g. after some ``modulestore_migrations`` referencing the old app mdoels were run, but before the most recent ``modulestore_migrations`` creating foreign keys to those models. No production environment is expected to deploy like this, so the main negative consequence is that the developer would have to drop and recreate their database.
Copy link
Member

@kdmccormick kdmccormick Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what this the affected range is. Is there a pair commits you could link to that would roughly correspond to it?

Any chance mitxonline could be in this situation?

I'll want to make a forum post and blast it out to #dev so that devs can recognize if they're in this state and know how to get out of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after some modulestore_migrations referencing the old app mdoels were run

oh, I think the overloaded use of the term "migration" is tripping me up here. @ormsbee , so this is only a problem for someone who ran Legacy Library Migrations before the Ulmo cutoff? is a whole dev database wipe really necessary, or could they just drop/clear the affected tables?

but before the most recent modulestore_migrations creating foreign keys to those models

I don't think I quite understand this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance mitxonline could be in this situation?

No. Anyone on the latest master is good. Anyone on Ulmo is good. Anyone who makes a stop at Ulmo and migrates before moving on to something more recent is good.


I struggle to communicate this without a whole lot of detail that I thought might distract/confuse further. FWIW, when I say "migration", I mean Django migration, and not v1 -> v2 lib migration. I should probably pick a different app to illustrate this, come to think on it. Or maybe choose my squash ranges more carefully than I'm doing now... anyhow....

Let's take the modulestore_migrator migrations on master today:

https://github.com/openedx/openedx-platform/tree/master/cms/djangoapps/modulestore_migrator/migrations

Migration 0001 creates foreign keys to a handful of old authoring app models, like Collection, DraftChangeLog, etc.

Migration 0004 modifies a foreign key to PublishableEntity.

Keep in mind that all our openedx-learning migrations will potentially run before any of these openedx-platform migrations. This means that the tables underlying the openedx_content migrations have potentially been renamed with their new prefix before modulestore_migrator migrations run.

If no modulestore_migrator migrations have run yet, the new squashed migration will run, and we're golden. The new squashed migration jumps straight to the new openedx_content models, so old authoring model state is irrelevant.

If we're in a state where migration 0004 has already run, we're likewise fine. We can't run the squashed migration at this point, but all the actual database-altering, authoring-app-referencing migrations for the modulestore_migrator app have already run. Our skeleton backcompat model app state is good enough to satisfy the framework.

The Problem: Say the database is in a state where migration 0001-0003 have run, but 0004 has not yet run. The modulestore_migrator app doesn't exist in LMS at all, so someone runs the LMS migrations and we get to a state where all the openedx_content migrations have run and the tables have been renamed. Now 0004 wants to run and alter its foreign key reference to oel_publishing.publishableentitty. It thinks it can do that because the phantom backcompat model state says that it's still there. But the underlying table has actually been renamed. So when 0004 tries to run, it blows up with a SQL error about a missing table.

...

Okay, as I write that out, I realize that I might be able to fold together in a more robust way by properly by reflecting the rename in the backcompat model states and migrations. My first attempt at that didn't work out because there were conflicts in index naming, but that was before I realized that I could hollow out the backcompat models to just their primary keys.

I'll look into this, but I've lost track of how many times I've iterated on the database migration approach, and I really want to just push it at this point. 😩

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, nvm, I can't do it. The migrations framework will error if multiple models try to point to the same database table.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. That makes sense.

Okay, as I write that out, I realize that I might be able to fold together in a more robust way by properly by reflecting the rename in the backcompat model states and migrations. My first attempt at that didn't work out because there were conflicts in index naming, but that was before I realized that I could hollow out the backcompat models to just their primary keys.
I'll look into this, but I've lost track of how many times I've iterated on the database migration approach, and I really want to just push it at this point.

Yeah, from my perspective, no need to make it any more leakproof than you already have it. I just want to know what to tell devs. We can tell them run migrations before pulling down this commit and rebuilding your openedx image but I'm sure many will miss that memo. So we can also tell them: when you see error $X during migrations, then either (a) drop your whole DB and re-run tutor-dev-init, or (b) run the following SQL and then try running migrations again:

DELETE FROM django_migrations WHERE app = 'modulestore_migrator';
DROP TABLE modulestore_migrator_modulestoreblockmigration;
DROP TABLE modulestore_migrator_modulestoreblocksource;
DROP TABLE modulestore_migrator_modulestoremigration;
DROP TABLE modulestore_migrator_modulestoresource;

# and the samea idea for the affected contentstore tables
DELETE FROM django_migrations WHERE app = 'contentstore' and name IN (
     '000x_blah', '000y_blah'
)
DROP TABLE contentstore_blah1;
DROP TABLE contentstore_blah2;

does that look right?


5. Rejected Alternatives
~~~~~~~~~~~~~~~~~~~~~~~~

An earlier attempt at this tried to solve the migration ordering issue by dynamically injecting migration dependencies into the second ``openedx_content`` migration module during the app config ``ready()`` initialization. This was later abandoned because it didn't solve the problem of CMS vs LMS differences in ``INSTALLED_APPS``, so the ordering could still get corrupted unless we added those apps to LMS—which would have introduced more risk.

6. The Bigger Picture
~~~~~~~~~~~~~~~~~~~~~

This overall refactoring means that the ``openedx_content`` Django app corresponds to a Subdomain in Domain Driven Design terminology, with each applet being a Bounded Context. We call these "Applets" instead of "Bounded Contexts" because we don't want it to get confused for Django's notion of Contexts and Context Processors (or Python's notion of Context Managers).
6 changes: 3 additions & 3 deletions olx_importer/management/commands/load_components.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,9 +28,9 @@
from django.db import transaction

# Model references to remove
from openedx_learning.apps.authoring.components import api as components_api
from openedx_learning.apps.authoring.contents import api as contents_api
from openedx_learning.apps.authoring.publishing import api as publishing_api
from openedx_learning.apps.openedx_content.applets.components import api as components_api
from openedx_learning.apps.openedx_content.applets.contents import api as contents_api
from openedx_learning.apps.openedx_content.applets.publishing import api as publishing_api

SUPPORTED_TYPES = ["problem", "video", "html"]
logger = logging.getLogger(__name__)
Expand Down
2 changes: 1 addition & 1 deletion openedx_learning/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,4 @@
Open edX Learning ("Learning Core").
"""

__version__ = "0.30.2"
__version__ = "0.31.0"
11 changes: 2 additions & 9 deletions openedx_learning/api/authoring.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,21 +2,14 @@
This is the public API for content authoring in Learning Core.

This is the single ``api`` module that code outside of the
``openedx_learning.apps.authoring.*`` package should import from. It will
``openedx_learning.apps.openedx_content.*`` package should import from. It will
re-export the public functions from all api.py modules of all authoring apps. It
may also implement its own convenience APIs that wrap calls to multiple app
APIs.
"""
# These wildcard imports are okay because these api modules declare __all__.
# pylint: disable=wildcard-import
from ..apps.authoring.backup_restore.api import *
from ..apps.authoring.collections.api import *
from ..apps.authoring.components.api import *
from ..apps.authoring.contents.api import *
from ..apps.authoring.publishing.api import *
from ..apps.authoring.sections.api import *
from ..apps.authoring.subsections.api import *
from ..apps.authoring.units.api import *
from ..apps.openedx_content.api import *

# This was renamed after the authoring API refactoring pushed this and other
# app APIs into the openedx_learning.api.authoring module. Here I'm aliasing to
Expand Down
14 changes: 7 additions & 7 deletions openedx_learning/api/authoring_models.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,10 +7,10 @@
"""
# These wildcard imports are okay because these modules declare __all__.
# pylint: disable=wildcard-import
from ..apps.authoring.collections.models import *
from ..apps.authoring.components.models import *
from ..apps.authoring.contents.models import *
from ..apps.authoring.publishing.models import *
from ..apps.authoring.sections.models import *
from ..apps.authoring.subsections.models import *
from ..apps.authoring.units.models import *
from ..apps.openedx_content.applets.collections.models import *
from ..apps.openedx_content.applets.components.models import *
from ..apps.openedx_content.applets.contents.models import *
from ..apps.openedx_content.applets.publishing.models import *
from ..apps.openedx_content.applets.sections.models import *
from ..apps.openedx_content.applets.subsections.models import *
from ..apps.openedx_content.applets.units.models import *
24 changes: 24 additions & 0 deletions openedx_learning/api/django.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
"""
Module for parts of the Learning Core API that exist to make it easier to use in
Django projects.
"""


def openedx_learning_apps_to_install():
"""
Return all app names for appending to INSTALLED_APPS.
This function exists to better insulate edx-platform and potential plugins
over time, as we eventually plan to remove the backcompat apps.
"""
return [
"openedx_learning.apps.openedx_content",
"openedx_learning.apps.openedx_content.backcompat.backup_restore",
"openedx_learning.apps.openedx_content.backcompat.collections",
"openedx_learning.apps.openedx_content.backcompat.components",
"openedx_learning.apps.openedx_content.backcompat.contents",
"openedx_learning.apps.openedx_content.backcompat.publishing",
"openedx_learning.apps.openedx_content.backcompat.sections",
"openedx_learning.apps.openedx_content.backcompat.subsections",
"openedx_learning.apps.openedx_content.backcompat.units",
]
24 changes: 0 additions & 24 deletions openedx_learning/apps/authoring/components/apps.py

This file was deleted.

25 changes: 0 additions & 25 deletions openedx_learning/apps/authoring/publishing/apps.py

This file was deleted.

Loading