Skip to content

Conversation

@nedvedba
Copy link
Collaborator

@nedvedba nedvedba commented Jan 23, 2026

Ticket

#1824

Description

Fixed the Mapped Collection Consent loop bug, and added single domain auth for endpoints that require it.

How Has This Been Tested?

Going through the consent flow, and then browsing the endpoint after being granted consent.

Summary by Sourcery

Improve Globus consent and single-domain authentication flow, ensuring mapped collection tokens and UI state are handled robustly through the consent/login process.

New Features:

  • Persist and restore endpoint browser and dialog UI state across Globus consent/login redirects.
  • Support single-domain session requirements when generating Globus consent URLs and handling permission-denied errors.

Bug Fixes:

  • Prevent mapped collection consent loops by correctly persisting collection IDs, associating transfer scopes with collections, and avoiding token handling failures.
  • Fix endpoint retrieval and logging when clients have no endpoints or when client IDs are missing or invalid.
  • Ensure dependent scopes are stored correctly for Globus collection tokens when scopes are not explicitly provided.

Enhancements:

  • Add detailed logging and error handling around session updates, token setting, and Globus collection/token persistence for easier debugging.
  • Extend access token handling to support asynchronous callbacks and error reporting instead of throwing synchronously in all cases.
  • Harden parameter parsing and validation for client IDs, consent query parameters, and OAuth state to avoid invalid input and session pollution.
  • Allow requested scopes to be specified flexibly and ensure offline access is only requested when needed.
  • Preserve transfer dialog controllers on DOM elements to support later state restoration.
  • Make registration token storage robust by cleaning up session data only after token persistence succeeds.

@nedvedba nedvedba self-assigned this Jan 23, 2026
@nedvedba nedvedba added Type: Bug Something isn't working Priority: High Highest priority Component: Web API Relates to web service / API Component: Web UI Relates to web appp user interface javascript Pull requests that update javascript code labels Jan 23, 2026
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Jan 23, 2026

Reviewer's Guide

Implements a more robust mapped collection consent/auth flow by persisting collection and UI state through the Globus consent redirect, tightening token handling, and hardening endpoint-related APIs and logging.

Sequence diagram for mapped collection consent and state restoration

sequenceDiagram
    actor User
    participant Browser
    participant EndpointBrowser
    participant WebAPI as WebServer
    participant ConsentHandler
    participant GlobusAuth
    participant AuthCallback as UiAuthnHandler
    participant TokenHandler
    participant Core as CoreSetAccessToken
    participant DB as FoxxUserRouter
    participant MainPage

    User->>Browser: Open endpoint browser
    Browser->>EndpointBrowser: List files on mapped collection
    EndpointBrowser->>WebAPI: API call to mapped collection
    WebAPI-->>EndpointBrowser: Error permission_denied / ConsentRequired

    EndpointBrowser->>EndpointBrowser: Build stateObj (endpoint, path, mode)
    EndpointBrowser->>EndpointBrowser: Optionally capture parent_dialog state
    EndpointBrowser->>WebAPI: api.getGlobusConsentURL(endpoint_id, required_scopes, refresh_tokens=false, query_params, state)

    WebAPI->>ConsentHandler: generateConsentURL(clientId, redirect_uri, requested_scopes, refresh_tokens, query_params, state)
    ConsentHandler-->>WebAPI: consent_url
    WebAPI-->>EndpointBrowser: consent_url
    EndpointBrowser-->>User: Render link to login with required identity

    User->>GlobusAuth: Click consent_url, login and grant consent
    GlobusAuth-->>AuthCallback: Redirect to /ui/authn with code and state

    AuthCallback->>AuthCallback: Validate user identity and derive username
    AuthCallback->>AuthCallback: Parse state JSON
    AuthCallback->>AuthCallback: Save restore_state to session
    AuthCallback->>AuthCallback: Store collection_id and collection_type in session

    AuthCallback->>TokenHandler: constructOptionalData(token_context)
    TokenHandler->>TokenHandler: If collection_id present, set type=GLOBUS_TRANSFER and other=collection_id|scope
    TokenHandler-->>AuthCallback: optional_data

    AuthCallback->>Core: setAccessToken(uid, access_token, refresh_token, expires_in, optional_data, cb)
    Core->>DB: Persist tokens for user and collection
    DB-->>Core: Ack
    Core-->>AuthCallback: cb(null)

    AuthCallback->>Browser: Redirect to /ui/main

    Browser->>WebAPI: GET /ui/main with session
    WebAPI->>WebAPI: Extract restore_state from session
    WebAPI-->>MainPage: Render main.ect with restore_state JSON

    MainPage->>MainPage: On load, check tmpl_data.restore_state
    MainPage->>MainPage: If parent_dialog.type is d_new_edit, import dlg_data_new_edit.js and reopen dialog
    MainPage->>MainPage: If parent_dialog.type is transfer, import transfer/index.js and reopen transfer dialog
    MainPage-->>User: Restored endpoint browser and dialogs
Loading

Entity relationship diagram for Globus collection and token after consent

erDiagram
    USER {
        string _key
        string eps
    }

    GLOBUS_COLLECTION {
        string _key
        string name
        string description
        string uuid
        string required_scopes
        string type
        bool ha_enabled
    }

    GLOBUS_TOKEN {
        string _key
        string _from
        string _to
        string type
        string dependent_scopes
        number request_time
        number last_used
        string status
        string access_token
        string refresh_token
        number expires_in
        string other
    }

    USER ||--o{ GLOBUS_TOKEN : has
    GLOBUS_COLLECTION ||--o{ GLOBUS_TOKEN : referenced_by
Loading

Updated class diagram for endpoint browsing and auth handlers

classDiagram
    class EndpointBrowser {
        +props endpoint
        +props mode
        +state path
        +handleError(error)
    }

    class TransferUIManager {
        -state frame
        -#controller
        +createDialog(labels)
    }

    class OAuthTokenHandler {
        +constructOptionalData(token_context)
    }

    class ConsentHandler {
        +generateConsentURL(clientId, redirectUri, requested_scopes, refresh_tokens, query_params, state)
    }

    class WebServer {
        +storeCollectionId(req, res, next)
        +getGlobusConsentURL(req, res)
        +uiAuthnHandler(req, res)
        +uiMain(req, res)
        +setAccessToken(a_uid, a_acc_tok, a_ref_tok, a_expires_sec, token_optional_params, a_cb)
    }

    class MainPage {
        +initFromTemplateData(tmpl_data)
        +restoreDialogs(restore_state)
    }

    EndpointBrowser --> WebServer : api.getGlobusConsentURL
    WebServer --> ConsentHandler : generateConsentURL
    WebServer --> OAuthTokenHandler : constructOptionalData
    WebServer --> WebServer : setAccessToken
    WebServer --> MainPage : render main with restore_state
    MainPage --> EndpointBrowser : restore endpoint_browser state
    MainPage --> TransferUIManager : reopen transfer dialog
    TransferUIManager --> TransferUIManager : createDialog attaches controller to frame
Loading

File-Level Changes

Change Details Files
Persist mapped collection ID and UI/dialog state across Globus consent/auth redirects to avoid consent loop and restore user context.
  • Update storeCollectionId middleware to save collection_id/collection_type to the session with explicit session.save callback and logging, only calling next() after completion or when no collection_id is present
  • In /ui/authn, parse optional state JSON from query, validate expected structure (endpoint_browser/restore_state), and stash as restore_state on the session for later use, with logging and error handling
  • In /ui/main, extract restore_state from the session, inject it into the main view model, then clear it from the session
  • Extend main.ect and main.js to pass restore_state into the client, and on load re-open the appropriate parent dialog (new data record or transfer dialog) using dynamic imports and reconstructed parameters
  • In EndpointBrowser error handling, when consent/login is required, construct a serializable state object capturing endpoint/path/mode and any open dialogs, pass it as the state parameter to getGlobusConsentURL, and update the user-facing message
web/datafed-ws.js
web/views/main.ect
web/static/main.js
web/static/components/endpoint-browse/index.js
web/static/components/transfer/transfer-ui-manager.js
web/static/dlg_data_new_edit.js
Improve consent URL generation and token handling so collection-specific transfer tokens and refresh-token scopes are handled correctly and safely.
  • In generateConsentURL, normalize requested_scopes to an array (supporting string/array/undefined) and only push offline_access when refresh_tokens is true and the scope is not already present
  • In OAuthTokenHandler for GLOBUS_DEFAULT tokens, if a collection_id is present in token_context, convert the token to type GLOBUS_TRANSFER and encode collection_id
scope into the optional other field, throwing if scope is missing
  • In /api/globus/consent_url, safely parse query_params if provided as a JSON string and log parse failures
  • Fix token_context.resource_server field to use the correct client_token.data.resource_server property
  • Make setAccessToken asynchronous-friendly and integrate it into auth and registration flows with better error handling and session cleanup.
    • Extend setAccessToken to accept an optional callback, invoking it with an error when the RPC reply is missing and returning early instead of throwing, or with the reply on success
    • Update /ui/authn to call setAccessToken with a callback, redirecting to /ui/error and clearing collection_id on error, logging detailed debug info, and only redirecting after the callback completes
    • Add defensive handling in /ui/authn so a missing/invalid username (uid) logs an error, redirects to /ui/error, and stops processing
    • Update /api/usr/register to pass an empty optional params object and a callback into setAccessToken, moving session-field cleanup and session reg flag setting into the callback, and returning a 500 on failure instead of throwing after cleanup
    web/datafed-ws.js
    Harden backend Globus collection/token persistence for mapped collections and improve debugging/logging around token storage.
    • In the GLOBUS_TRANSFER case in user_router, log detailed debug messages before and after collection lookup and token insert, including collection search key, scopes, and resulting documents
    • Change globus_coll.exists + save to an explicit exists check followed by document fetch or save, then fetch, logging whether the collection already existed or was created and the resulting metadata
    • When constructing token documents, derive dependent_scopes from the provided scopes or fall back to globus_collection.required_scopes, and log the full token_doc and insert result
    • Retain overwriteMode: "replace" but document its use as a full replacement upsert semantics
    core/database/foxx/api/user_router.js
    Tighten endpoint-related APIs and support utilities with better validation and logging details for clients and endpoints.
    • In support.getUserFromClientID, throw a clear ERR_INVALID_PARAM when a_client_id is missing to avoid ambiguous behavior
    • In /ep/get and /ep/set routers, move client variable to outer scope, fetch it inside try blocks, and replace optional chaining with explicit checks when computing first endpoint and logging extra data for success/failure events, preventing undefined-access issues when client or eps is missing
    • Adjust failure logging in /ep/set to guard against client being undefined when including extra endpoint data
    core/database/foxx/api/support.js
    core/database/foxx/api/user_router.js

    Possibly linked issues


    Tips and commands

    Interacting with Sourcery

    • Trigger a new review: Comment @sourcery-ai review on the pull request.
    • Continue discussions: Reply directly to Sourcery's review comments.
    • Generate a GitHub issue from a review comment: Ask Sourcery to create an
      issue from a review comment by replying to it. You can also reply to a
      review comment with @sourcery-ai issue to create an issue from it.
    • Generate a pull request title: Write @sourcery-ai anywhere in the pull
      request title to generate a title at any time. You can also comment
      @sourcery-ai title on the pull request to (re-)generate the title at any time.
    • Generate a pull request summary: Write @sourcery-ai summary anywhere in
      the pull request body to generate a PR summary at any time exactly where you
      want it. You can also comment @sourcery-ai summary on the pull request to
      (re-)generate the summary at any time.
    • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
      request to (re-)generate the reviewer's guide at any time.
    • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
      pull request to resolve all Sourcery comments. Useful if you've already
      addressed all the comments and don't want to see them anymore.
    • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
      request to dismiss all existing Sourcery reviews. Especially useful if you
      want to start fresh with a new review - don't forget to comment
      @sourcery-ai review to trigger a new review!

    Customizing Your Experience

    Access your dashboard to:

    • Enable or disable review features such as the Sourcery-generated pull request
      summary, the reviewer's guide, and others.
    • Change the review language.
    • Add, remove or edit custom review instructions.
    • Adjust other review settings.

    Getting Help

    Copy link
    Contributor

    @sourcery-ai sourcery-ai bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Hey - I've found 2 issues, and left some high level feedback:

    • In /ui/authn you’re calling logger.warning(...), but elsewhere only logger.info/logger.error appear to be used; if warning isn’t defined on this logger it will throw during error handling—consider switching to the existing logger.warn/logger.info style method or confirming the API.
    • The new console.log('DEBUG: ...') calls in the Foxx user_router (GLOBUS_TRANSFER branch) add a lot of noisy output and bypass your structured logging; consider either removing them or routing these through the existing logger with an appropriate log level.
    Prompt for AI Agents
    Please address the comments from this code review:
    
    ## Overall Comments
    - In `/ui/authn` you’re calling `logger.warning(...)`, but elsewhere only `logger.info`/`logger.error` appear to be used; if `warning` isn’t defined on this logger it will throw during error handling—consider switching to the existing `logger.warn`/`logger.info` style method or confirming the API.
    - The new `console.log('DEBUG: ...')` calls in the Foxx `user_router` (GLOBUS_TRANSFER branch) add a lot of noisy output and bypass your structured logging; consider either removing them or routing these through the existing logger with an appropriate log level.
    
    ## Individual Comments
    
    ### Comment 1
    <location> `web/datafed-ws.js:759-768` </location>
    <code_context>
    +                                                a_resp.redirect(redirect_path);
    +                                            },
                                             );
                                         } catch (err) {
                                             redirect_path = "/ui/error";
    -                                        logger.error("/ui/authn", getCurrentLineNumber(), err);
    </code_context>
    
    <issue_to_address>
    **🚨 issue (security):** Session cleanup for registration tokens no longer runs on error, leaving sensitive data in the session.
    
    Moving the cleanup from a `finally` block into the success callback means `name`, `email`, `uuids`, `acc_tok`, `acc_tok_ttl`, and `ref_tok` are only cleared on success. Any exception in the `try` path now leaves them in the session until expiry. Please reintroduce an unconditional cleanup path (e.g., a `finally` block or a dedicated `try/finally` around session field removal) to ensure PII and token data are always cleared, even on errors.
    </issue_to_address>
    
    ### Comment 2
    <location> `web/static/components/endpoint-browse/index.js:321-330` </location>
    <code_context>
    +                    const stateObj = {
    </code_context>
    
    <issue_to_address>
    **suggestion (bug_risk):** The serialized `state` object can become large and may risk exceeding URL length limits.
    
    Because this includes `endpoint.rawData`, dialog metadata, and possibly many record IDs, the resulting OAuth `state` value may become too large for some browsers or intermediaries to handle. Consider trimming `rawData` to only what’s required, capping the number of record IDs, or storing the full state server-side and passing only a short lookup token in `state`.
    
    Suggested implementation:
    
    ```javascript
                        // Serialize current state for restoration after consent flow.
                        // NOTE: Avoid putting the entire endpoint.rawData object into state,
                        // as it can become very large and cause URL length issues.
                        const rawEndpoint = this.props.endpoint?.rawData || {};
                        const endpointState = {
                            id: rawEndpoint.id,
                            name: rawEndpoint.name,
                            type: rawEndpoint.type,
                        };
    
                        const stateObj = {
                            endpoint_browser: {
                                endpoint: endpointState,
                                path: this.state.path,
                                mode: this.props.mode,
                            },
                        };
    
    ```
    
    To fully implement the optimization described in your comment, you will likely also want to:
    1. Identify where dialog metadata and record IDs are added to `stateObj` later in this function and:
       - Trim any large payloads to the minimal data needed to restore UI state (e.g., avoid embedding entire records).
       - Cap arrays of record IDs (for example, keep only the first N IDs that are required to resume the flow).
    2. Optionally introduce a server-side state store:
       - Generate a short token or ID on the server.
       - Store the full state object server-side keyed by that token.
       - Pass only the token in the OAuth `state` parameter instead of the full serialized state object.
    3. Update the code that restores state after the consent flow to:
       - Fetch the full state from the server using the token (if you adopt server-side state storage).
       - Adjust any assumptions that the full `endpoint.rawData` object is available directly from `state`.
    </issue_to_address>

    Sourcery is free for open source - if you like our reviews please consider sharing them ✨
    Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

    Comment on lines 759 to 768
    } catch (err) {
    redirect_path = "/ui/error";
    logger.error("/ui/authn", getCurrentLineNumber(), err);
    logger.error(
    "/ui/authn",
    getCurrentLineNumber(),
    "DEBUG: Exception in token handling: " + err,
    );
    delete a_req.session.collection_id;
    a_resp.redirect(redirect_path);
    }
    Copy link
    Contributor

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    🚨 issue (security): Session cleanup for registration tokens no longer runs on error, leaving sensitive data in the session.

    Moving the cleanup from a finally block into the success callback means name, email, uuids, acc_tok, acc_tok_ttl, and ref_tok are only cleared on success. Any exception in the try path now leaves them in the session until expiry. Please reintroduce an unconditional cleanup path (e.g., a finally block or a dedicated try/finally around session field removal) to ensure PII and token data are always cleared, even on errors.

    Comment on lines +321 to +330
    const stateObj = {
    endpoint_browser: {
    endpoint: this.props.endpoint.rawData,
    path: this.state.path,
    mode: this.props.mode,
    },
    };

    // Check if "New Data Record" dialog is open and save its state
    const new_data_dlg = $("#d_new_edit");
    Copy link
    Contributor

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    suggestion (bug_risk): The serialized state object can become large and may risk exceeding URL length limits.

    Because this includes endpoint.rawData, dialog metadata, and possibly many record IDs, the resulting OAuth state value may become too large for some browsers or intermediaries to handle. Consider trimming rawData to only what’s required, capping the number of record IDs, or storing the full state server-side and passing only a short lookup token in state.

    Suggested implementation:

                        // Serialize current state for restoration after consent flow.
                        // NOTE: Avoid putting the entire endpoint.rawData object into state,
                        // as it can become very large and cause URL length issues.
                        const rawEndpoint = this.props.endpoint?.rawData || {};
                        const endpointState = {
                            id: rawEndpoint.id,
                            name: rawEndpoint.name,
                            type: rawEndpoint.type,
                        };
    
                        const stateObj = {
                            endpoint_browser: {
                                endpoint: endpointState,
                                path: this.state.path,
                                mode: this.props.mode,
                            },
                        };

    To fully implement the optimization described in your comment, you will likely also want to:

    1. Identify where dialog metadata and record IDs are added to stateObj later in this function and:
      • Trim any large payloads to the minimal data needed to restore UI state (e.g., avoid embedding entire records).
      • Cap arrays of record IDs (for example, keep only the first N IDs that are required to resume the flow).
    2. Optionally introduce a server-side state store:
      • Generate a short token or ID on the server.
      • Store the full state object server-side keyed by that token.
      • Pass only the token in the OAuth state parameter instead of the full serialized state object.
    3. Update the code that restores state after the consent flow to:
      • Fetch the full state from the server using the token (if you adopt server-side state storage).
      • Adjust any assumptions that the full endpoint.rawData object is available directly from state.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    Component: Web API Relates to web service / API Component: Web UI Relates to web appp user interface javascript Pull requests that update javascript code Priority: High Highest priority Type: Bug Something isn't working

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    3 participants