When CodeGen_LLVM generates code for a predicated load or store involving a 1-lane vector (which happens whenever the vector length in the IR is congruent to 1 modulo the native vector length), it raises an error. The cause is that slice_vector turns 1-lane vectors into scalar elements via CreateExtractElement, which is incompatible with CreateMaskedStore and CreateMaskedLoad: both require a vector type. The simplest fix is to leave these 1-lane vectors as vectors instead of converting them to scalars.
Code to reproduce:
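// Declarations below are assumed; the original snippet omits them.
ImageParam input(Int(32), 1);  // element type is a guess
Func f1, f2;
Var x, xo, xi;
f1(x) = input(x) * 2;
f2(x) = select(x < 8, 0, f1(x) + f1(x + 1));
f2.split(x, xo, xi, 8);
f1.compute_at(f2, xo).vectorize(x);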
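
To see where the 1-lane slice comes from: each xo iteration of f2 reads f1(x) and f1(x+1) for 8 values of x, so f1 is presumably computed over 9 values and vectorized to 9 lanes. Assuming a native width of 8 lanes, that leaves a 1-lane remainder. The sketch below only models the lane arithmetic. `slice_lane_counts` is a hypothetical helper written for illustration, not part of Halide or LLVM, and it does not reproduce what slice_vector actually does internally.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not Halide code): returns the lane count of each
// slice when a vector of `total_lanes` is cut into consecutive chunks of
// at most `native_lanes` lanes.
std::vector<int> slice_lane_counts(int total_lanes, int native_lanes) {
    std::vector<int> counts;
    for (int start = 0; start < total_lanes; start += native_lanes) {
        // The last chunk may be narrower than the native width.
        counts.push_back(std::min(native_lanes, total_lanes - start));
    }
    return counts;
}
```

With these assumptions, `slice_lane_counts(9, 8)` yields slices of 8 lanes and 1 lane. The trailing 1-lane slice is the one that must stay a vector for the masked load/store intrinsics to accept it.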