Open
Description
Google group user JerLucid suggests adding a mechanism that would let users encode table columns as chars, shorts or ints, as an alternative to the longs used by the standard enumeration.
Testing supports the main arguments:
- in the cases tested, the encoded table and its mapping dictionaries used less disk space and memory than the enumerated table and its sym file
- the writing process was faster and used less memory
- simple filtering and grouping queries were faster and used less memory
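As a rough illustration of where the space saving comes from, the sketch below compares the raw per-row width of each code type. It is written in Python rather than q, the helper name `column_bytes` is hypothetical, and it assumes one fixed-width code per row while ignoring file headers, compression and the size of the mapping itself.

```python
import struct

# Fixed-width sizes in bytes for the four code types discussed in this issue.
# The "<" prefix selects struct's standard sizes: 1, 2, 4 and 8 bytes.
WIDTHS = {
    "char": struct.calcsize("<b"),
    "short": struct.calcsize("<h"),
    "int": struct.calcsize("<i"),
    "long": struct.calcsize("<q"),
}

def column_bytes(n_rows, code_type):
    """Raw payload size of a column holding n_rows codes of the given type.

    Hypothetical helper: ignores headers and the mapping dictionary.
    """
    return n_rows * WIDTHS[code_type]

# A one-million-row column as shorts versus as longs.
short_size = column_bytes(1_000_000, "short")
long_size = column_bytes(1_000_000, "long")
```

Under these assumptions a short-encoded column carries a quarter of the raw payload of a long-encoded one; the actual saving depends on the data and on the mapping size.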
The main downsides are:
- queries are more complicated, since values must be encoded (reverse lookup) before filtering and decoded (dictionary lookup) when results are returned
- encoding domains are smaller than with longs, so the remaining domain space may need to be monitored
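Both downsides can be sketched in a few lines. This is an illustrative Python model, not q and not TorQ code: the `Mapping` class, `filter_rows` and the `CAPACITY` figures are assumptions (the capacities assume only non-negative signed codes are used, and ignore any values q reserves for nulls or infinities).

```python
# Assumed number of usable codes per type (hypothetical: non-negative signed codes only).
CAPACITY = {"char": 2**7, "short": 2**15, "int": 2**31}

class Mapping:
    """In-memory stand-in for one mapping dictionary."""

    def __init__(self, code_type):
        self.code_type = code_type
        self.values = []   # code -> value, used for decoding
        self.codes = {}    # value -> code, used for encoding (reverse lookup)

    def encode(self, value):
        # Assign the next free code to unseen values, refusing to overflow the domain.
        if value not in self.codes:
            if len(self.values) >= CAPACITY[self.code_type]:
                raise OverflowError("mapping domain exhausted for " + self.code_type)
            self.codes[value] = len(self.values)
            self.values.append(value)
        return self.codes[value]

    def decode(self, code):
        return self.values[code]

    def usage(self):
        # Fraction of the domain already consumed; the figure one would monitor.
        return len(self.values) / CAPACITY[self.code_type]

def filter_rows(encoded_col, mapping, value):
    """Return row indices where the column equals value.

    The literal is translated to its code first, then codes are compared.
    An unknown value has no code and therefore matches nothing.
    """
    code = mapping.codes.get(value)
    return [i for i, c in enumerate(encoded_col) if c == code]
```

The extra translation step in `filter_rows` is the query complexity referred to above, and `usage` is one possible basis for an alert when a domain is close to full.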
One way to implement this in TorQ would be to add some new configuration (e.g. a .csv file with table -> column -> mapping file name -> mapping data type) and a new function, similar to .Q.en, which would take this config and a table, update the mapping files on disk and encode the table. If a call to this function were inserted just before every call to .Q.en in the code, .Q.en would then pick up any symbols left unencoded, i.e. it would replicate the current behaviour when nothing is configured for encoding.
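The flow described above might look roughly like the following. This is a Python sketch of the idea only, not a q implementation: the CSV column names, the mapping file names, the one-value-per-line file format, the `LIMITS` figures and the functions `load_config` and `encode_table` are all hypothetical. Columns not listed in the config are returned untouched, which is where the existing .Q.en step would take over.

```python
import csv
import io
import os

# Hypothetical config: table, column, mapping file name, mapping data type.
CONFIG_CSV = """table,column,mapfile,type
trade,sym,symmap,short
trade,ex,exmap,char
"""

# Assumed usable codes per type (non-negative signed codes only).
LIMITS = {"char": 2**7, "short": 2**15, "int": 2**31}

def load_config(text):
    """Parse the config into {(table, column): (mapfile, type)}."""
    rows = csv.DictReader(io.StringIO(text))
    return {(r["table"], r["column"]): (r["mapfile"], r["type"]) for r in rows}

def encode_table(config, directory, name, table):
    """Encode the configured columns of table (a dict of column -> list).

    Each mapping file under directory is extended with unseen values and
    the column is replaced by integer codes; other columns are left as-is.
    """
    out = dict(table)
    for (tab, col), (mapfile, typ) in config.items():
        if tab != name or col not in table:
            continue
        path = os.path.join(directory, mapfile)
        known = []
        if os.path.exists(path):
            with open(path) as f:
                known = f.read().splitlines()   # line number is the code
        index = {v: i for i, v in enumerate(known)}
        for v in table[col]:
            if v not in index:
                index[v] = len(known)
                known.append(v)
        if len(known) > LIMITS[typ]:
            raise OverflowError("mapping " + mapfile + " exceeds " + typ + " domain")
        with open(path, "w") as f:
            f.write("".join(v + "\n" for v in known))
        out[col] = [index[v] for v in table[col]]
    return out
```

Because the mapping is append-only, codes written on earlier days stay valid when new values arrive. A real implementation would also need to handle concurrent writers and values containing newlines, which this sketch does not.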