From 5ea3fbfa9991c9046fb4ef2f43e48feee4b73b25 Mon Sep 17 00:00:00 2001
From: ct-clmsn
Date: Fri, 5 Jan 2018 15:00:01 -0500
Subject: [PATCH 1/8] create proposal.md provide a skeleton in markdown
---
 proposal.md | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)
 create mode 100644 proposal.md

diff --git a/proposal.md b/proposal.md
new file mode 100644
index 0000000..b25fd2f
--- /dev/null
+++ b/proposal.md
@@ -0,0 +1,45 @@
+**Document number:** Nnnnn=yy-nnnn
+**Date:** yyyy-mm-dd
+**Project:** HPC Fabrics Proposal
+**Reply-to:** Your name and email address; list multiple authors if applicable. It is OK to obfuscate the address like this: "Jane Proposer
+
+## Table of Contents
+
+I. Introduction
+II. Motivation and Scope
+III. Impact on the Standard
+IV. Design Decisions
+V. Technical Specifications
+VI. References
+VII. Acknowledgements
+
+## I. Introduction
+
+In the fall of 2017, the Open Fabrics Working Group (OFIWG) discussed proposing an extension to the current version of the ISO C++
+Networking Technical Specification (N4643) to include support for HPC Fabrics. The intent of this proposal is to improve the
+programmability and accessibility of HPC interconnect hardware.
+
+## II. Motivation and Scope
+
+N4643 currently targets commodity, Ethernet-based interconnects. Developing a fabric extension to N4643 will increase the accessibility
+of fabric interconnects to HPC applications, runtimes, and languages. A fabric extension to the C++ Networking Technical Specification
+will provide new mechanisms improving HPC application, runtime, and language performance and efficiencies.
+
+## III. Impact On the Standard
+
+The proposed fabric extension will depend on N4643. The proposed fabric extension is a "pure extension" of N4643. The current suite of
+libraries used for HPC fabrics is implemented in C99. This proposed fabric extension can be implemented, at a minimum, using C++11
+compilers and libraries.
+
+## IV. Design Decisions
+
+Design decisions in this proposal are presented as an extension to N4643. This proposal may impact N4643.
+
+## V. Technical Specifications
+
+## VI. References
+
+## VII. Acknowledgements
+
+This document is based on N4643, the ISO C++ Networking Technical Specification.
+

From 56d8f0b8df5f8e896dc79fb2f6c16eeddf9f7642 Mon Sep 17 00:00:00 2001
From: ct-clmsn
Date: Fri, 5 Jan 2018 15:24:32 -0500
Subject: [PATCH 2/8] Update proposal.md

Signed-off-by: Chris Taylor
---
 proposal.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposal.md b/proposal.md
index b25fd2f..15e6271 100644
--- a/proposal.md
+++ b/proposal.md
@@ -1,7 +1,7 @@
 **Document number:** Nnnnn=yy-nnnn
 **Date:** yyyy-mm-dd
 **Project:** HPC Fabrics Proposal
-**Reply-to:** Your name and email address; list multiple authors if applicable. It is OK to obfuscate the address like this: "Jane Proposer
+**Reply-to:** Your name and email address; list multiple authors if applicable. Obfuscate the address like this: "Jane Proposer
 
 ## Table of Contents
 

From 49e8993bd3a2f630c63580144afed25acb348d49 Mon Sep 17 00:00:00 2001
From: ct-clmsn
Date: Sat, 20 Jan 2018 11:24:07 -0500
Subject: [PATCH 3/8] hpx-parcelport analysis started
---
 catalogs/README.md | 3 +++
 catalogs/hpx_parcelport.csv | 11 +++++++++++
 catalogs/hpx_parcelport.md | 16 ++++++++++++++++
 3 files changed, 30 insertions(+)
 create mode 100644 catalogs/README.md
 create mode 100644 catalogs/hpx_parcelport.csv
 create mode 100644 catalogs/hpx_parcelport.md

diff --git a/catalogs/README.md b/catalogs/README.md
new file mode 100644
index 0000000..51883d6
--- /dev/null
+++ b/catalogs/README.md
@@ -0,0 +1,3 @@
+This directory contains files that 'catalog' libfabric primitives (types,
+variables, function calls) used in existing C or C++ codes (libraries,
+applications, etc.).
diff --git a/catalogs/hpx_parcelport.csv b/catalogs/hpx_parcelport.csv
new file mode 100644
index 0000000..0f2a048
--- /dev/null
+++ b/catalogs/hpx_parcelport.csv
@@ -0,0 +1,11 @@
+libfabric-primitive,file,line,description,,
+fi_info * fabric_info_,/plugins/parcelport/libfabric/libfabric_controller.hpp,124,variable decl,,
+fid_fabric * fabric_,/plugins/parcelport/libfabric/libfabric_controller.hpp,125,variable decl,,
+fid_domain * fabric_domain_,/plugins/parcelport/libfabric/libfabric_controller.hpp,126,variable decl,,
+fid_pep * ep_passive_,/plugins/parcelport/libfabric/libfabric_controller.hpp,128,variable decl - server/listener for RDMA connections,,
+fid_ep * ep_active_,/plugins/parcelport/libfabric/libfabric_controller.hpp,129,variable decl - server/listener for RDMA connections,,
+fid_ep * ep_shared_rx_ctx_,/plugins/parcelport/libfabric/libfabric_controller.hpp,130,variable decl - server/listener for RDMA connections,,
+fid_eq * event_queue_,/plugins/parcelport/libfabric/libfabric_controller.hpp,133,variable decl - one event queue for all connections,,
+fid_cq * txcq_,/plugins/parcelport/libfabric/libfabric_controller.hpp,134,variable decl - one event queue for all connections,,
+fid_cq * rxcq_,/plugins/parcelport/libfabric/libfabric_controller.hpp,134,variable decl - one event queue for all connections,,
+fid_av * av_,/plugins/parcelport/libfabric/libfabric_controller.hpp,135,variable decl - one event queue for all connections,,
diff --git a/catalogs/hpx_parcelport.md b/catalogs/hpx_parcelport.md
new file mode 100644
index 0000000..9c8ddff
--- /dev/null
+++ b/catalogs/hpx_parcelport.md
@@ -0,0 +1,16 @@
+the file hpx_parcelport.csv contains a table cataloging the following
+information about the use of libfabric primitives in the hpx runtime.
+
+libfabric-primitive,file,line,description,,
+
+this table includes information about how the hpx runtime uses libfabric in
+its parcelport, or communication, system.
+
+here is a description of the columns:
+
+* libfabric-primitive - contains the name of the type and variable used in hpx
+parcelport
+* file - the file name containing the libfabric primitive
+* line - the line number referencing the libfabric primitive's instantiation
+* description - a quick explanation of how the libfabric primitive's instance
+is used

From 481811d64104c5edba45516befe8044147776ba0 Mon Sep 17 00:00:00 2001
From: ct-clmsn
Date: Sat, 20 Jan 2018 11:30:29 -0500
Subject: [PATCH 4/8] added more information to md
---
 catalogs/hpx_parcelport.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/catalogs/hpx_parcelport.md b/catalogs/hpx_parcelport.md
index 9c8ddff..cb97838 100644
--- a/catalogs/hpx_parcelport.md
+++ b/catalogs/hpx_parcelport.md
@@ -14,3 +14,8 @@ parcelport
 * line - the line number referencing the libfabric primitive's instantiation
 * description - a quick explanation of how the libfabric primitive's instance
 is used
+
+this document will contain a more contextual analysis describing the hpx
+parcelport's design and implementation. this document will attempt to provide
+a more general perspective on the contents included in the csv file.
+

From f8303e58998dda0aa7a690ad0a557a664e0d0437 Mon Sep 17 00:00:00 2001
From: ct-clmsn
Date: Sat, 20 Jan 2018 11:35:36 -0500
Subject: [PATCH 5/8] fix for issue #3
---
 proposal.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/proposal.md b/proposal.md
index 2b45b06..8b6fb6d 100644
--- a/proposal.md
+++ b/proposal.md
@@ -16,24 +16,24 @@ VII. Acknowledgements
 ## I. Introduction
 
 In the fall of 2017, the Open Fabrics Working Group (OFIWG) discussed proposing an extension to the current version of the ISO C++
-Networking Technical Specification (N4643) to include support for HPC Fabrics. The intent of this proposal is to improve the
+Networking Technical Specification (N4695) to include support for HPC Fabrics. The intent of this proposal is to improve the
 programmability and accessibility of HPC interconnect hardware.
 
 ## II. Motivation and Scope
 
-N4643 currently targets commodity, Ethernet-based interconnects. Developing a fabric extension to N4643 will increase the accessibility
+N4695 currently targets commodity, Ethernet-based interconnects. Developing a fabric extension to N4695 will increase the accessibility
 of fabric interconnects to HPC applications, runtimes, and languages. A fabric extension to the C++ Networking Technical Specification
 will provide new mechanisms improving HPC application, runtime, and language performance and efficiencies.
 
 ## III. Impact On the Standard
 
-The proposed fabric extension will depend on N4643. The proposed fabric extension is a "pure extension" of N4643. The current suite of
+The proposed fabric extension will depend on N4695. The proposed fabric extension is a "pure extension" of N4695. The current suite of
 libraries used for HPC fabrics is implemented in C99. This proposed fabric extension can be implemented, at a minimum, using C++11
 compilers and libraries.
 
 ## IV. Design Decisions
 
-Design decisions in this proposal are presented as an extension to N4643. This proposal may impact N4643.
+Design decisions in this proposal are presented as an extension to N4695. This proposal may impact N4695.
 
 ## V. Technical Specifications
 
@@ -41,5 +41,5 @@ Design decisions in this proposal are presented as an extension to N4643. This p
 
 ## VII. Acknowledgements
 
-This document is based on N4643, the ISO C++ Networking Technical Specification.
+This document is based on N4695, the ISO C++ Networking Technical Specification.
 
From 4ec10dac04b37ac53147dcd930aa47fc3e8c7651 Mon Sep 17 00:00:00 2001
From: Chris Taylor
Date: Sat, 20 Jan 2018 11:41:52 -0500
Subject: [PATCH 6/8] added content to the hpx_parcelport.md

Signed-off-by: Chris Taylor
---
 catalogs/hpx_parcelport.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/catalogs/hpx_parcelport.md b/catalogs/hpx_parcelport.md
index cb97838..6484b42 100644
--- a/catalogs/hpx_parcelport.md
+++ b/catalogs/hpx_parcelport.md
@@ -17,5 +17,6 @@ is used
 
 this document will contain a more contextual analysis describing the hpx
 parcelport's design and implementation. this document will attempt to provide
-a more general perspective on the contents included in the csv file.
+a more general perspective, or analysis, that ties together contents included
+in the csv file.
 

From eba505265b44f74f82ef3b43595b7a72841414c1 Mon Sep 17 00:00:00 2001
From: Chris Taylor
Date: Sat, 20 Jan 2018 11:43:18 -0500
Subject: [PATCH 7/8] added content to hpx_parcelport.md

Signed-off-by: Chris Taylor
---
 catalogs/hpx_parcelport.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/catalogs/hpx_parcelport.md b/catalogs/hpx_parcelport.md
index 6484b42..c8af914 100644
--- a/catalogs/hpx_parcelport.md
+++ b/catalogs/hpx_parcelport.md
@@ -20,3 +20,5 @@ parcelport's design and implementation. this document will attempt to provide
 a more general perspective, or analysis, that ties together contents included
 in the csv file.
 
+file reviewed /plugins/parcelport/libfabric/libfabric_controller.hpp
+

From 0e49e7e879fb2d58ddd3121079fbbf8385d5102c Mon Sep 17 00:00:00 2001
From: Chris Taylor
Date: Sat, 20 Jan 2018 16:30:12 -0500
Subject: [PATCH 8/8] updated with beginning of 'light narrative walk through' of libfabric_controller

Signed-off-by: Chris Taylor
---
 catalogs/hpx_parcelport.md | 99 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)

diff --git a/catalogs/hpx_parcelport.md b/catalogs/hpx_parcelport.md
index c8af914..a6b0f2d 100644
--- a/catalogs/hpx_parcelport.md
+++ b/catalogs/hpx_parcelport.md
@@ -22,3 +22,102 @@ in the csv file.
 
 file reviewed /plugins/parcelport/libfabric/libfabric_controller.hpp
 
+*** summary of lines 142:176
+
+libfabric_controller::libfabric_controller(
+    std::string provider,
+    std::string domain,
+    std::string endpoint,
+    int port=7910);
+
+constructor uses a function called open_fabric(provider, domain, endpoint) to
+access the hardware.
+
+constructor creates a memory pool object, rma_memory_pool(fabric_domain_), which
+manages the memory that holds data segments.
+the constructor also initializes a passive listener (or an active RDM endpoint)
+using the function called create_local_endpoint().
+
+*** summary of lines 315:402
+
+libfabric_controller::open_fabric(
+    std::string provider,
+    std::string domain,
+    std::string endpoint_type);
+
+this method calls fi_allocinfo() and stores the result in a local variable called
+struct fi_info * fabric_hints_. the method then checks that fi_allocinfo returned
+successfully.
+
+    fabric_hints_->caps is set to FI_MSG|FI_RMA|FI_SOURCE|FI_WRITE
+|FI_READ|FI_REMOTE_READ|FI_REMOTE_WRITE|FI_RMA_EVENT
+
+    fabric_hints_->mode is set to FI_CONTEXT|FI_LOCAL_MR
+
+    fabric_hints_->fabric_attr->prov_name is set to provider.c_str()
+
+    fabric_hints_->domain_attr->name is set to domain.c_str()
+
+    fabric_hints_->domain_attr->mr_mode is set to FI_MR_BASIC (basic IB
+    registration)
+
+progress threads are disabled by setting
+
+    fabric_hints_->domain_attr->control_progress to FI_PROGRESS_MANUAL
+    fabric_hints_->domain_attr->data_progress to FI_PROGRESS_MANUAL
+
+thread safe mode is enabled (the source notes that this does not work with the psm2
+provider) by setting
+
+    fabric_hints_->domain_attr->threading to FI_THREAD_SAFE
+
+resource management is enabled by setting
+
+    fabric_hints_->domain_attr->resource_mgmt to FI_RM_ENABLED
+
+shared recv context is enabled for active endpoints by setting
+
+    fabric_hints_->ep_attr->rx_ctx_cnt to FI_SHARED_CONTEXT
+
+if the endpoint_type value is set to "msg" then the following value
+is set:
+
+    fabric_hints_->ep_attr->type to FI_EP_MSG
+
+if the endpoint_type value is set to "rdm" then the following value
+is set:
+
+    fabric_hints_->ep_attr->type to FI_EP_RDM
+
+if the endpoint_type value is set to "dgram" then the following value
+is set:
+
+    fabric_hints_->ep_attr->type to FI_EP_DGRAM
+
+by default, the method wants completions on both tx/rx events and sets
+
+    fabric_hints_->tx_attr->op_flags to FI_COMPLETION
+    fabric_hints_->rx_attr->op_flags to FI_COMPLETION
+
+fi_getinfo is called, and the method then tests whether the following values
+are set:
+
+    (fabric_info_->rx_attr->mode & FI_RX_CQ_DATA) != 0
+    (fabric_hints_->mode & FI_CONTEXT) != 0
+
+fi_fabric is called and given the following arguments
+
+    fi_fabric(fabric_info_->fabric_attr, &fabric_, nullptr)
+
+fi_domain is called and given the following arguments
+
+    fi_domain(fabric_, fabric_info_, &fabric_domain_, nullptr)
+
+a method called '_set_disable_registration()' is called for Cray systems; it
+disables memory registration caching.
+
+fi_freeinfo is called and passed the following arguments
+
+    fi_freeinfo(fabric_hints_)
+
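
the open_fabric() walk through above can be sketched, loosely, as two small pieces of logic: mapping the endpoint_type string to an endpoint type, and testing a mode bit. the sketch below is a self-contained C++11 illustration and does not link against libfabric. the names endpoint_kind, parse_endpoint_type, demo_flag_context, demo_flag_rx_cq_data and has_flag are hypothetical stand-ins, not hpx or libfabric identifiers, and the flag values are arbitrary. what the real code does for an unrecognized endpoint_type is not described above, so returning false here is an assumption.

```cpp
#include <cstdint>
#include <string>

// hypothetical stand-in for the libfabric endpoint types named in the
// walk through (FI_EP_MSG, FI_EP_RDM, FI_EP_DGRAM); not the real enum.
enum class endpoint_kind { msg, rdm, dgram };

// map the endpoint_type string given to open_fabric() to an endpoint kind.
// returns false for an unrecognized string (assumed behaviour, see lead-in).
inline bool parse_endpoint_type(const std::string& endpoint_type,
                                endpoint_kind& out)
{
    if (endpoint_type == "msg")   { out = endpoint_kind::msg;   return true; }
    if (endpoint_type == "rdm")   { out = endpoint_kind::rdm;   return true; }
    if (endpoint_type == "dgram") { out = endpoint_kind::dgram; return true; }
    return false;
}

// hypothetical mode bits with arbitrary values; these are NOT the values
// of libfabric's FI_CONTEXT or FI_RX_CQ_DATA.
constexpr std::uint64_t demo_flag_context    = 1ull << 0;
constexpr std::uint64_t demo_flag_rx_cq_data = 1ull << 1;

// test whether a flag is set in a mode bitmask. the parentheses matter:
// `mode & flag != 0` parses as `mode & (flag != 0)` in C and C++.
constexpr bool has_flag(std::uint64_t mode, std::uint64_t flag)
{
    return (mode & flag) != 0;
}
```

in real code the three branches would presumably assign the libfabric endpoint type to the hints structure instead of a local enum, and has_flag would be applied to the mode fields returned by fi_getinfo.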