Skip to content

Question: count_set_bits() and count_unset_bits() PR #86

@eadf

Description

@eadf

I've been playing around trying to create a faster .iter_unset_bits(..).count() method (like we discussed before)
This is what i came up with, i have not actually benchmarked it but since it is processing entire words instead of bit-by-bit, it should be substantially faster.

Is this something you want a PR for?

    /// Counts the number of set bits.
    /// Generates the same result as Vob::iterate_set_bits().count() only much faster.
    /// # Examples
    /// 
    /// use vob::Vob;
    /// let mut v = Vob::new();
    /// v.push(false);
    /// v.push(true);
    /// assert_eq!(v.iter_set_bits(..).count(), v.count_set_bits(..));
    /// 
    pub fn count_set_bits<R>(&self, range: R) -> usize
    where
        R: RangeBounds<usize>,
    {
        self._count_set_bits(self.process_range(range))
    }

    /// Counts the number of unset bits.
    /// Generates the same result as Vob::iterate_unset_bits().count() only much faster.
    /// # Examples
    ///  
    /// use vob::Vob;
    /// let mut v = Vob::new();
    /// v.push(false);
    /// v.push(true);
    /// assert_eq!(v.iter_unset_bits(..).count(), v.count_unset_bits(..));
    /// 
    pub fn count_unset_bits<R>(&self, range: R) -> usize
    where
        R: RangeBounds<usize>,
    {
        let range = self.process_range(range);
        (range.end - range.start) - self._count_set_bits(range)
    }

    /// Counts the number of set bits.
    /// This method assumes the range is processed with process_range()
    fn _count_set_bits(&self, range: Range<usize>) -> usize {
        // Early return for empty ranges
        if range.start >= range.end {
            return 0;
        }
        let start_word = block_offset::<T>(range.start);
        if start_word >= self.len {
            return 0;
        }
        // this -1 is safe since we already tested for range.start & range.end equality
        let end_word = blocks_required::<T>(range.end) - 1;

        if start_word == end_word {
            // Range entirely within one word
            let word = self.vec[start_word];
            let start_bit = range.start % bits_per_block::<T>();
            let end_bit = range.end % bits_per_block::<T>();

            // Remove bits before start_bit and bits after end_bit
            return count_ones_usize(if end_bit == 0 {
                // end_bit = 0 means we want everything from start_bit to end of word
                // After the right shift above, we already have what we want
                word >> start_bit
            } else {
                // We want bits from start_bit to end_bit
                // After right shift, we need to remove the high bits
                (word >> start_bit) << (start_bit + bits_per_block::<T>() - end_bit)
            });
        }

        // First word: shift out bits before start_bit
        let start_bit = range.start % bits_per_block::<T>();
        let mut count = count_ones_usize(self.vec[start_word] >> start_bit);

        // Middle words (unchanged)
        for word_idx in (start_word + 1)..end_word {
            count += count_ones_usize(self.vec[word_idx]);
        }

        // Last word: shift out bits after end_bit
        let end_bit = range.end % bits_per_block::<T>();
        count
            + if end_bit == 0 {
                count_ones_usize(self.vec[end_word])
            } else {
                count_ones_usize(self.vec[end_word] << (bits_per_block::<T>() - end_bit))
            }
    }

...
/// Convenience function that calls T::count_ones() and converts the result to usize
/// (The conversion is always safe even if T is u128 and usize is u16)
#[inline(always)]
fn count_ones_usize<T: PrimInt>(value: T) -> usize {
    use std::convert::TryFrom;
    usize::try_from(value.count_ones()).unwrap()
}

Issues:

  • I created a count_ones_usize() function only used in _count_set_bits(). Maybe a local function?
  • count_ones_usize() could safely use unwrap_unchecked()
  • n % bits_per_block::<T>() is frequently used, maybe it should be a function like block_offset() maybe bit_offset()?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions