Skip to content

Use of union results in 101x performance regression #8813

@mkannwischer

Description

@mkannwischer

We've been observing a vast proof performance regression when moving variables whose lifetime does not overlap into a union (e.g, to save stack space). Below is a (very) trimmed down example showcasing the problem including instructions to reproduce. When switching from a struct to a union cbmc runtime increases from 0.9 seconds to 91 seconds (CBMC 6.8.0, z3 4.15.3, Apple M4).

Context: This came up during memory reduction attempts in mldsa-native: A first step can be found in pq-code-package/mldsa-native#800 which resulted in proof runtime for the modified function increasing from 137 seconds to 1278 seconds. We plan many similar changes, but this quickly becomes infeasible.

/*
 * Copyright (c) The mldsa-native project authors
 * SPDX-License-Identifier: Apache-2.0 OR ISC OR MIT
 */

/* Minimal example illustrating performance degradation
 * when using `union`s for memory reduction.
 *
 * - z3 version 4.15.3
 * - cbmc version 6.8.0
 *
 * To reproduce, save as sign.c and do:
 *
   goto-cc -DCBMC sign.c --function harness -o a.out
   goto-instrument --dfcc harness      \
      --replace-call-with-contract bar \
      a.out b.out
   cbmc b.out --smt2 --slice-formula
  */

#include <stdint.h>
#include <string.h>

/***************************************************
 * CBMC contract macros (from mldsa-native)
 ***************************************************/
#ifdef CBMC

#define __contract__(x) x
#define assigns(...) __CPROVER_assigns(__VA_ARGS__)
#define requires(...) __CPROVER_requires(__VA_ARGS__)
#define ensures(...) __CPROVER_ensures(__VA_ARGS__)
#define memory_slice(...) __CPROVER_object_upto(__VA_ARGS__)
#define memory_no_alias(...) __CPROVER_is_fresh(__VA_ARGS__)

/* clang-format off */
#define forall(qvar, qvar_lb, qvar_ub, predicate)                 \
  __CPROVER_forall                                                \
  {                                                               \
    unsigned qvar;                                                \
    ((qvar_lb) <= (qvar) && (qvar) < (qvar_ub)) ==> (predicate)   \
  }

#define CBMC_CONCAT_(left, right) left##right
#define CBMC_CONCAT(left, right) CBMC_CONCAT_(left, right)

#define array_bound_core(qvar, qvar_lb, qvar_ub, array_var,            \
                         value_lb, value_ub)                           \
  __CPROVER_forall                                                     \
  {                                                                    \
    unsigned qvar;                                                     \
    ((qvar_lb) <= (qvar) && (qvar) < (qvar_ub)) ==>                    \
        (((int)(value_lb) <= ((array_var)[(qvar)])) &&		       \
         (((array_var)[(qvar)]) < (int)(value_ub)))		       \
  }

#define array_bound(array_var, qvar_lb, qvar_ub, value_lb, value_ub) \
  array_bound_core(CBMC_CONCAT(_cbmc_idx, __COUNTER__), (qvar_lb),      \
      (qvar_ub), (array_var), (value_lb), (value_ub))

#define array_abs_bound(arr, lb, ub, k) \
  array_bound((arr), (lb), (ub), -((int)(k)) + 1, (k))
/* clang-format on */

#else /* !CBMC */
#define __contract__(x)
#endif /* CBMC */

typedef struct
{
  int32_t coeffs[256];
} data;

typedef struct
{
  data vec[8];
} vec;

void bar(vec *r)
__contract__(
  requires(memory_no_alias(r, sizeof(vec)))
  assigns(memory_slice(r, sizeof(vec)))
  ensures(forall(k1, 0, 8, array_abs_bound(r->vec[k1].coeffs, 0, 256, 1 << 23)))
);

void harness(void)
{
  #if 1 /* CHANGE THIS */
  union
  #else
  struct
  #endif
  {
    vec y;
    vec h;
  } yh;

  vec *y = &yh.y;
  vec *h = &yh.h;

  bar(y);
  bar(h);
}

Thanks to @hanno-becker for helping minimize the example above.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions