feat: GPU support for building DISKANN based indexes. #1460
liorf95 wants to merge 4 commits into zilliztech:main
Conversation
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com> Signed-off-by: Hiroshi Murayama <hiroshi4.murayama@kioxia.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: liorf95. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files.
@liorf95 🔍 Important: PR Classification Needed! For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:
For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”. Thanks for your efforts and contribution to the community!
constexpr uint32_t kK = 10;
constexpr uint32_t kK = 64;
#ifdef KNOWHERE_WITH_CUVS
constexpr uint32_t defaultMaxDegree = 64;
Why is there a difference in the value of defaultMaxDegree between the CUVS and non-CUVS versions of the tests? Any particular reason?
The default value of max_degree for DiskANN-based indexes is 56, so we did not want to modify the UTs for those basic scenarios.
However, cuVS-related limitations require max_degree to be a power of 2 (e.g., 32, 64, 128, etc.).
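A minimal sketch of how the test constant could keep the DiskANN default of 56 on the CPU path while satisfying the cuVS power-of-2 constraint (the CPU-side value is taken from the reply above, not from the diff):

#ifdef KNOWHERE_WITH_CUVS
constexpr uint32_t defaultMaxDegree = 64;  // cuVS Vamana requires a power-of-2 graph degree
#else
constexpr uint32_t defaultMaxDegree = 56;  // DiskANN default, keeps existing UT scenarios unchanged
#endif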
constexpr uint32_t kLargeDim = 256;
constexpr uint32_t kK = 10;
constexpr uint32_t kK = 64;
#ifdef KNOWHERE_WITH_CUVS
What are the requirements for the GPU memory for a cuVS-based version of this test? It would be nice to have them as a comment at the beginning of the file, including a minimum sufficient GPU, if appropriate. Something like: "This test is expected to run on a GPU with at least 2 GB of RAM, such as NVIDIA A10 or NVIDIA T4".
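A sketch of the kind of header comment being requested; the 2 GB figure and the GPU models are the reviewer's examples, not measured requirements:

// NOTE: when built with KNOWHERE_WITH_CUVS, the GPU test cases in this file
// are expected to run on a GPU with at least 2 GB of free memory,
// such as an NVIDIA T4 or NVIDIA A10.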
diskann::get_bin_metadata(base_file, base_num, base_dim);
#ifdef KNOWHERE_WITH_CUVS
raft::device_resources dev_resources;
if(compareMetric == diskann::L2 && is_gpu_available()) {
and what is going to happen if (compareMetric != diskann::L2)?
Currently, the cuVS Vamana build only supports the L2Expanded metric, so in that case the index will be built on the CPU as usual.
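A minimal sketch of the fallback described in the reply, using the names visible in the diff; build_gpu_index and build_cpu_index are placeholders for the actual build calls, not functions from the PR:

#ifdef KNOWHERE_WITH_CUVS
raft::device_resources dev_resources;
if (compareMetric == diskann::L2 && is_gpu_available()) {
    // cuVS Vamana currently supports only the L2Expanded metric,
    // so only L2 builds are routed to the GPU.
    build_gpu_index(dev_resources, /* ... */);
} else {
    // Any other metric, or no visible GPU: build on the CPU as usual.
    build_cpu_index(/* ... */);
}
#else
build_cpu_index(/* ... */);
#endif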
| " Gib free memory out of " << gpu_total_mem/(1024*1024*1024L) << " Gib total"; | ||
| ram_budget = std::min<double>(ram_budget,(double)0.9*(gpu_free_mem/(1024*1024*1024))); | ||
| shard_r = std::max<uint32_t>((uint32_t)32,(uint32_t)R/2); | ||
| } |
} else {
    // add to log why GPU is not going to be used
}

bool built_with_gpu=false;
#ifdef KNOWHERE_WITH_CUVS
//currently cuvs vamana build only supports L2Expanded metric
if (compareMetric == diskann::L2 && is_gpu_available() &&
This and the following calls imply that if the metric is right, the library is built with #define KNOWHERE_WITH_CUVS, and is_gpu_available() returns true, then the index will ALWAYS be built using a GPU.
Is it possible to introduce a configurable boolean flag that indicates whether a GPU is expected to be used by a user? Say, the default value is true. At least, this would be helpful for running unit tests, so that it would be possible to validate both GPU and non-GPU branches of the code on the same machine without tricks.
You can already control that with the environment variable NVIDIA_VISIBLE_DEVICES: set it to none and the index will be built on the CPU.
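A minimal sketch of what an is_gpu_available() check can look like (the actual implementation in the PR may differ): with NVIDIA_VISIBLE_DEVICES=none the CUDA runtime sees no devices, so the check fails and the CPU path is taken.

#include <cuda_runtime.h>

bool is_gpu_available() {
    int device_count = 0;
    // cudaGetDeviceCount fails or reports 0 when no GPU is visible,
    // e.g. with NVIDIA_VISIBLE_DEVICES=none in a containerized deployment.
    cudaError_t err = cudaGetDeviceCount(&device_count);
    return err == cudaSuccess && device_count > 0;
}

Outside the NVIDIA container runtime, setting CUDA_VISIBLE_DEVICES to an empty value has a similar effect on the CUDA runtime.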
1);
}
built_with_gpu=true;
}
} else {
    // add to log why GPU is not going to be used
}

Same comment to other possible else branches, if appropriate.
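A sketch of the logging being asked for; the message text is illustrative and uses the LOG_KNOWHERE_INFO_ macro that appears elsewhere in the diff:

} else {
    LOG_KNOWHERE_INFO_ << "Skipping GPU build: it requires KNOWHERE_WITH_CUVS, "
                       << "an available GPU, and the L2 metric; falling back to CPU.";
}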
thirdparty/DiskANN/src/aux_utils.cpp (outdated)
#ifdef KNOWHERE_WITH_CUVS
if(is_gpu_available()) {
if (R != 32 && R != 64 && R != 128) {
LOG_KNOWHERE_ERROR_ << "Invalid R value for cuvs - should be only 32 or 64 or 128";
LOG_KNOWHERE_ERROR_ << "Invalid R value (" << R << ") for cuvs - should be only 32 or 64 or 128";

Same comment applies for the following lines as well.
int iters);
bool is_gpu_available();

void kmeans_gpu(
Personally, I would prefer FAISS-based names here. So, that would be train (kmeans_gpu) and assign (predict_gpu).
Alternatively, please add comments on what these functions do, because it looks confusing for those who are not aware.
We will add comments on what these functions do.
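For illustration, a sketch of the kind of documentation the reply promises; the signatures are guessed from the call site in the diff and from the reviewer's mention of predict_gpu, so they may differ from the actual headers:

// Runs k-means on the GPU and writes the k cluster centers (k x dim floats)
// to `centroids`; comparable to FAISS "train".
void kmeans_gpu(raft::resources& res, const float* train_data, size_t num_train,
                size_t train_dim, uint32_t k, int iters, float* centroids);

// Assigns each input vector to its nearest centroid and writes the closest
// cluster id per vector; comparable to FAISS "assign".
void predict_gpu(raft::resources& res, const float* centroids, uint32_t k,
                 const float* data, size_t num, size_t dim, uint32_t* labels);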
raft::resources res;
LOG_KNOWHERE_INFO_ << "Running k-means with " << k << " clusters...using GPU!";
kmeans_gpu(res,train_data_float.get(), num_train, train_dim,
k, 10, centroids.data());
Would it be possible to extract 10 (the number of iterations?) as a constexpr value nearby, please? Same for the CPU branch.
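A sketch of the suggested refactor; the constant name is illustrative:

constexpr int kKmeansIters = 10;  // number of k-means iterations, shared by the GPU and CPU branches

raft::resources res;
LOG_KNOWHERE_INFO_ << "Running k-means with " << k << " clusters...using GPU!";
kmeans_gpu(res, train_data_float.get(), num_train, train_dim,
           k, kKmeansIters, centroids.data());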
void gpu_get_mem_info(raft::resources &dev_resources, size_t &gpu_free_mem, size_t &gpu_total_mem);

#endif
What about selecting the right GPU to deal with? It is expected to be done by a caller using a raft::resources object, correct?
The current implementation doesn't handle multi-GPU logic. It is assumed that each process uses one GPU. However, on a multi-GPU platform in k8s with Milvus you can assign each datanode a different GPU exclusively and accelerate the build further.
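A sketch of how a caller could pin the process to a specific GPU before creating the raft resources, under the stated one-GPU-per-process assumption; the way device_id is chosen is illustrative:

// Select the device for this process (e.g. assigned per datanode), then
// create the raft resources; they operate on the current CUDA device.
int device_id = 0;  // could be derived from an environment variable set per datanode
cudaSetDevice(device_id);
raft::device_resources dev_resources;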
/kind improvement

issue: #1422
thirdparty/DiskANN/src/aux_utils.cpp (outdated)
LOG_KNOWHERE_ERROR_ << "Invalid R value for cuvs - should be only 32 or 64 or 128";
return -1;
}
if (L != 32 && L != 64 && L != 128 && L != 256) {
What is the specific reason to choose the seven magic numbers? Can it be more flexible?
With cuVS 25.10, which is the version used, cuVS has validation for the visited size, which must be a power of 2.
This list is reasonable given that constraint, as we see that an L higher than 256 doesn't benefit much.
Maybe restricting it to 2^k with a larger boundary, like L ∈ [32, 2048], would be better than a few magic numbers. Some larger or harder datasets, or higher recall requirements, may need a higher parameter :)
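A sketch of the suggested check, using the reviewer's example bounds:

constexpr uint32_t kMinL = 32;
constexpr uint32_t kMaxL = 2048;
// Accept any power of two in [kMinL, kMaxL] instead of a fixed list of values.
const bool is_pow2 = (L != 0) && ((L & (L - 1)) == 0);
if (!is_pow2 || L < kMinL || L > kMaxL) {
    LOG_KNOWHERE_ERROR_ << "Invalid L value (" << L << ") for cuvs - must be a power of 2 in ["
                        << kMinL << ", " << kMaxL << "]";
    return -1;
}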
See issue #1422.