diff --git a/catalogs/README.md b/catalogs/README.md
new file mode 100644
index 0000000..51883d6
--- /dev/null
+++ b/catalogs/README.md
@@ -0,0 +1,3 @@
+This directory contains files that 'catalog' libfabric primitives (types,
+variables, function calls) used in existing C or C++ codes (libraries,
+applications, etc).
diff --git a/catalogs/hpx_parcelport.csv b/catalogs/hpx_parcelport.csv
new file mode 100644
index 0000000..0f2a048
--- /dev/null
+++ b/catalogs/hpx_parcelport.csv
@@ -0,0 +1,11 @@
+libfabric-primitive,file,line,description,,
+fi_info * fabric_info_,/plugins/parcelport/libfabric/libfabric_controller.hpp,124,variable decl,,
+fid_fabric * fabric_,/plugins/parcelport/libfabric/libfabric_controller.hpp,125,variable decl,,
+fid_domain * fabric_domain_,/plugins/parcelport/libfabric/libfabric_controller.hpp,126,variable decl,,
+fid_pep * ep_passive_,/plugins/parcelport/libfabric/libfabric_controller.hpp,128,variable decl - server/listener for RDMA connections,,
+fid_ep * ep_active_,/plugins/parcelport/libfabric/libfabric_controller.hpp,129,variable decl - server/listener for RDMA connections,,
+fid_ep * ep_shared_rx_ctx_,/plugins/parcelport/libfabric/libfabric_controller.hpp,130,variable decl - server/listener for RDMA connections,,
+fid_eq * event_queue_,/plugins/parcelport/libfabric/libfabric_controller.hpp,133,variable decl - one event queue for all connections,,
+fid_cq * txcq_,/plugins/parcelport/libfabric/libfabric_controller.hpp,134,variable decl - one event queue for all connections,,
+fid_cq * rxcq_,/plugins/parcelport/libfabric/libfabric_controller.hpp,134,variable decl - one event queue for all connections,,
+fid_av * av_,/plugins/parcelport/libfabric/libfabric_controller.hpp,135,variable decl - one event queue for all connections,,
diff --git a/catalogs/hpx_parcelport.md b/catalogs/hpx_parcelport.md
new file mode 100644
index 0000000..a6b0f2d
--- /dev/null
+++ b/catalogs/hpx_parcelport.md
@@ -0,0 +1,123 @@
+the file hpx_parcelport.csv contains a table
cataloging the following
+information about the use of libfabric primitives in the hpx runtime.
+
+libfabric-primitive,file,line,description,,
+
+this table includes information about how the hpx runtime uses libfabric in
+its parcelport, or communication, system.
+
+here is a description of the columns:
+
+* libfabric-primitive - contains the name of the type and variable used in hpx
+parcelport
+* file - the file name containing the libfabric primitive
+* line - the line number referencing the libfabric primitive's instantiation
+* description - a quick explanation of how the libfabric primitive's instance
+is used
+
+this document will contain a more contextual analysis describing the hpx
+parcelport's design and implementation. this document will attempt to provide
+a more general perspective, or analysis, that ties together contents included
+in the csv file.
+
+file reviewed /plugins/parcelport/libfabric/libfabric_controller.hpp
+
+*** summary of lines 142:176
+
+libfabric_controller::libfabric_controller(
+    std::string provider,
+    std::string domain,
+    std::string endpoint,
+    int port=7910);
+
+constructor uses a function called open_fabric(provider, domain, endpoint) to
+access the hardware.
+
+constructor creates a memory pool object which manages a memory containing
+rma_memory_pool(fabric_domain_) data segments.
+the constructor also initializes a passive listener (or an active RDM endpoint)
+using the function called create_local_endpoint()
+
+*** summary of lines 315:402
+
+libfabric_controller::open_fabric(
+    std::string provider,
+    std::string domain,
+    std::string endpoint_type);
+
+this method calls fi_allocinfo() and stores results into a local variable called
+struct fi_info * fabric_hints_. the method checks to see if fi_allocinfo returns
+correctly.
+
+    fabric_hints_->caps is set to FI_MSG|FI_RMA|FI_SOURCE|FI_WRITE
+    |FI_READ|FI_REMOTE_READ|FI_REMOTE_WRITE|FI_RMA_EVENT
+
+    fabric_hints_->mode is set to FI_CONTEXT|FI_LOCAL_MR
+
+    fabric_hints_->fabric_attr->prov_name is set to provider.c_str()
+
+    fabric_hints_->domain_attr->name is set to domain.c_str()
+
+    fabric_hints_->domain_attr->mr_mode is set to FI_MR_BASIC (basic IB
+    registration)
+
+progress threads are disabled by setting
+
+    fabric_hints_->domain_attr->control_progress to FI_PROGRESS_MANUAL
+    fabric_hints_->domain_attr->data_progress to FI_PROGRESS_MANUAL
+
+thread safe mode is enabled (the source notes that this does not work with the
+psm2 provider) by setting
+
+    fabric_hints_->domain_attr->threading to FI_THREAD_SAFE
+
+resource management is enabled by setting
+
+    fabric_hints_->domain_attr->resource_mgmt to FI_RM_ENABLED
+
+shared recv context is enabled for active endpoints
+
+    fabric_hints_->ep_attr->rx_ctx_cnt = FI_SHARED_CONTEXT
+
+if the endpoint_type value is set to "msg" then the following value
+is set:
+
+    fabric_hints_->ep_attr->type to FI_EP_MSG
+
+if the endpoint_type value is set to "rdm" then the following value
+is set:
+
+    fabric_hints_->ep_attr->type to FI_EP_RDM
+
+if the endpoint_type value is set to "dgram" then the following value
+is set:
+
+    fabric_hints_->ep_attr->type to FI_EP_DGRAM
+
+by default, the method wants completions on both tx/rx events and sets
+
+    fabric_hints_->tx_attr->op_flags to FI_COMPLETION
+    fabric_hints_->rx_attr->op_flags to FI_COMPLETION
+
+fi_getinfo is called, and the method then tests whether the following values
+are set:
+
+    (fabric_info_->rx_attr->mode & FI_RX_CQ_DATA) != 0
+    (fabric_hints_->mode & FI_CONTEXT) != 0
+
+fi_fabric is called and given the following arguments
+
+    fi_fabric(fabric_info_->fabric_attr, &fabric_, nullptr)
+
+fi_domain is called and given the following arguments
+
+    fi_domain(fabric_, fabric_info_, &fabric_domain_, nullptr)
+
+a method called '_set_disable_registration()' is invoked for Cray systems;
it
+disables memory registration caching.
+
+fi_freeinfo is called and passed the following arguments
+
+    fi_freeinfo(fabric_hints_)
+
+
diff --git a/proposal.md b/proposal.md
index 2b45b06..8b6fb6d 100644
--- a/proposal.md
+++ b/proposal.md
@@ -16,24 +16,24 @@ VII. Acknowledgements
 ## I. Introduction

 In the fall of 2017, the Open Fabrics Working Group (OFIWG) discussed proposing an extention to the current version of the ISO C++
-Networking Technical Specification (N4643) to include support for HPC Fabrics. The intent of this proposal is to improve the
+Networking Technical Specification (N4695) to include support for HPC Fabrics. The intent of this proposal is to improve the
 programmability and accessibility of HPC interconnect hardware.

 ## II. Motivation and Scope

-N4643 currently targets commodity, ethernet based, interconnects. Developing a fabric extension to N4643 will increase the accessibility
+N4695 currently targets commodity, ethernet based, interconnects. Developing a fabric extension to N4695 will increase the accessibility
 of fabric interconnects to HPC applications, runtimes, and languages. A fabric extension to the C++ Networking Technical Specification
 will provide new mechanisms improving HPC application, runtime, and language performance and efficiencies.

 ## III. Impact On the Standard

-The proposed fabric extension will depend on N4643. The proposed fabric extension is a "pure extension" of N4643. The current suite of
+The proposed fabric extension will depend on N4695. The proposed fabric extension is a "pure extension" of N4695. The current suite of
 libraries used for HPC fabrics are implemented in C99. This proposed fabric extension can be implemented, at a minimum, using C++11
 compilers and libraries.

 ## IV. Design Decisions

-Design decisions in this proposal are presented as an extension to N4643. This proposal may impact N4643.
+Design decisions in this proposal are presented as an extension to N4695. This proposal may impact N4695.

 ## V. Technical Specifications

@@ -41,5 +41,5 @@ Design decisions in this proposal are presented as an extension to N4643. This p

 ## VII. Acknowledgements

-This document is based on N4643, the ISO C++ Networking Technical Specification.
+This document is based on N4695, the ISO C++ Networking Technical Specification.